- 1798.100 – Consumer's right to receive information on privacy practices and access information
- 1798.105 – Consumer's right to deletion
- 1798.110 – Information required to be provided as part of an access request
- 1798.115 – Consumer's right to receive information about onward disclosures
- 1798.120 – Consumer's right to prohibit the sale of their information
- 1798.125 – Price discrimination based upon the exercise of the opt-out right
What qualifies as aggregate or deidentified information under the CCPA?
The CCPA defines both “aggregate consumer information” and “deidentified information.” Aggregate consumer information is defined to mean “information that relates to a group or category of consumers, from which individual consumer identities have been removed, that is not linked or reasonably linkable to any consumer or household, including via a device. ‘Aggregate consumer information’ does not mean one or more individual consumer records that have been deidentified.”1
Deidentified information is defined under the CCPA to mean “information that cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer, provided that a business that uses deidentified information:
(1) Has implemented technical safeguards that prohibit reidentification of the consumer to whom the information may pertain.
(2) Has implemented business processes that specifically prohibit reidentification of the information.
(3) Has implemented business processes to prevent inadvertent release of deidentified information.
(4) Makes no attempt to reidentify the information.”2
Notably, the definition of “aggregate consumer information” explicitly excludes deidentified information from its scope, even though it is possible that both definitions could apply to the same data set. The functional difference between the two definitions is primarily that the definition of aggregate consumer information applies solely to the data itself, whereas the definition of deidentified information also incorporates and considers the conditions under which such data is held. In any event, the effect is the same: whether aggregated or deidentified, the data is no longer “personal information.”
Is encrypted data out of the scope of the CCPA?
In some cases yes, and in other cases no.
The CCPA defines “personal information” as information that, among other things, “is capable of being associated with” a particular consumer.1 Conversely, the CCPA refers to information as “deidentified” if it “cannot reasonably” be “associated with” a particular consumer.2
In situations in which a company encrypts personal information, but maintains the means to decrypt the information (e.g., a password or an encryption key), an argument exists that while the encrypted information remains in the possession of the business, it is “capable” of being associated with a consumer. In such a situation, most of the requirements of the CCPA would apply with one important exception. The private right of action conferred by the CCPA to bring suit following a data breach only applies in the context of “nonencrypted” information that has been disclosed.3 As a result, if the business accidentally disclosed the encrypted information (or if the encrypted information were accessed by a malicious third party), the business should not be liable for the statutory liquidated damages identified in the Act.
In situations in which a company receives, stores, or transmits encrypted information, but does not have the means to decrypt it (e.g., acts simply as a transmission conduit), a strong argument exists that the information “cannot reasonably” be associated with a particular consumer and, as a result, is not personal information subject to the CCPA.
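The distinction between the two scenarios can be illustrated with a minimal Python sketch, offered here only as a rough analogy (it assumes the third-party “cryptography” package; any symmetric cipher would make the same point): a business that retains the key can always recover the plaintext, while a pure conduit that sees only the ciphertext cannot.

```python
from cryptography.fernet import Fernet  # assumes the third-party "cryptography" package

# The business generates a key and retains it alongside the encrypted data.
key = Fernet.generate_key()
ciphertext = Fernet(key).encrypt(b"Jane Doe, 123 Main St.")  # hypothetical consumer record

# Key holder: the data remains "capable of being associated with" the consumer,
# because decryption is always available to the business.
plaintext = Fernet(key).decrypt(ciphertext)
assert plaintext == b"Jane Doe, 123 Main St."

# A transmission conduit that receives only `ciphertext` (and never the key)
# has no reasonable means of linking it back to a particular consumer.
```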
In comparison to the CCPA, the European GDPR recognizes encryption as a security technique that may help keep personal data safe, but the GDPR does not state that encrypted data is no longer personal data; nor does the GDPR state that encrypted data is not governed by the Regulation.4 To the contrary, the Article 29 Working Party5 held the opinion that encryption does not “per se lend[ ] itself to the goal of making a data subject unidentifiable” and “it does not necessarily result in anonymisation.”6
Is it possible for data that has undergone salted-hashing to still be considered “personal information?”
Maybe.
“Salting” refers to the insertion of a random value (e.g., a number or a letter) into personal data before that data is hashed.
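By way of illustration, the following Python sketch shows salting followed by hashing; the salt length and the SHA-256 algorithm are arbitrary choices for the example, not requirements of any statute.

```python
import hashlib
import secrets

def salted_hash(value: str, salt: str) -> str:
    """Prepend a random salt to the input value before hashing it."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

salt = secrets.token_hex(16)  # 32 random hex characters, generated per record or data set
digest = salted_hash("jane.doe@example.com", salt)  # hypothetical input value

# Without knowledge of the salt, replaying candidate inputs through the hash
# no longer suffices: an attacker must also guess the random salt value.
```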
Whether personal information that has undergone salting and hashing is still considered “personal information” depends upon the particular law or regulation at issue.
In the context of the CCPA, information is not “personal information” if it has been “deidentified.”1 Deidentification means that the data “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.”2 A strong argument could be made that data that is salted and then hashed cannot reasonably be associated with an individual. That argument is strengthened under the CCPA if a business takes the following four steps to help ensure that the salted and hashed data will not be re-identified:3
- Implement technical safeguards that prohibit reidentification. Technical safeguards may include the process or techniques by which data has been deidentified. For example, this might include the hashing algorithm being used or the number of characters inserted as part of the salting process.4
- Implement business processes that specifically prohibit reidentification. This might include an internal policy or procedure that prevents employees or vendors from attempting to reidentify data or reverse the salted and hashed values.
- Implement business processes to prevent inadvertent release of deidentified information. This might include a policy against disclosing hashed values to the public.
- Make no attempt to reidentify the information. As a functional matter, this entails taking steps to prohibit reidentification by the business’s employees.
In comparison, in the context of the European GDPR the Article 29 Working Party5 has stated that while the technique of salting and then hashing data “reduce[s] the likelihood of deriving the input value,” because “calculating the original attribute value hidden behind the result of a salted hash function may still be feasible within reasonable means,” the salted-hashed output should be considered pseudonymized data that remains subject to the GDPR.6
1. CCPA, Section 1798.145(a)(5).
2. CCPA, Section 1798.140(h).
3. CCPA, Section 1798.140(v).
4. Salting refers to the insertion of characters into data before it is hashed to make brute force reidentification more difficult.
5. The Article 29 Working Party was the predecessor to the European Data Protection Board.
6. Article 29 Working Party, WP 216: Opinion 05/2014 on Anonymisation Techniques at 20 (adopted 10 April 2014).
Is it possible for a token to still be considered “personal information?”
Maybe.
“Tokenization” refers to the process by which you replace one value (e.g., a credit card number) with another value that would have “reduced usefulness” for an unauthorized party (e.g., a random value used to replace the credit card number).1 In some instances, tokens are created through the use of algorithms, such as hashing techniques.
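One common design can be sketched in a few lines of Python: tokens are drawn at random (rather than derived from the underlying value) and the token-to-value mapping is held in a separate “vault.” The names token_vault, tokenize, and detokenize below are illustrative, not an industry standard.

```python
import secrets

# The "vault" mapping tokens back to real values; in practice it would be
# stored and access-controlled separately from the tokenized data set.
token_vault: dict[str, str] = {}

def tokenize(value: str) -> str:
    """Replace a sensitive value with a random, non-sequential token."""
    token = secrets.token_hex(16)  # random; not derived from the value itself
    token_vault[token] = value
    return token

def detokenize(token: str) -> str:
    """Only a party with access to the vault can recover the original value."""
    return token_vault[token]

card_token = tokenize("4111 1111 1111 1111")  # the token has "reduced usefulness" on its own
```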
Whether personal information that has been tokenized is still considered “personal information” depends upon the particular law or regulation at issue.
In the context of the CCPA, information is not “personal information” if it has been “deidentified.”2 Deidentification means that the data “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.”3 A strong argument could be made that data that is fully tokenized, and no longer connected to a particular consumer, cannot reasonably be associated with an individual. That argument is strengthened under the CCPA if a business takes the following four steps to help ensure that the tokenized data will not be re-identified:4
- Implement technical safeguards that prohibit reidentification. Technical safeguards may include the process or techniques by which tokens are assigned. For example, a business might take steps to randomly generate tokens, or ensure that tokens are not assigned sequentially in a manner that might allow a third party to guess to whom the token relates.
- Implement business processes that specifically prohibit reidentification. This might include an internal policy or procedure that separates tokens from any “key” that might allow an individual to match a token to a consumer.
- Implement business processes to prevent inadvertent release of deidentified information. This might include a policy against disclosing information about individuals even if the names of the individuals have been replaced with tokens.
- Make no attempt to reidentify the information. As a functional matter, this entails taking steps to prohibit reidentification by the business’s employees.
In comparison, in the context of the European GDPR, the Article 29 Working Party5 has stated that even when a token is created by choosing a random number (i.e., it is not derived using an algorithm), the resulting token typically does not make it impossible to re-identify the data and, as a result, the token is best described as “pseudonymized” data which would still be “personal data” subject to the GDPR.6
Is it possible for data that has undergone hashing to still be considered “personal information?”
Maybe.
Hashing refers to the process of using an algorithm to transform data of any size into a unique, fixed-size output (e.g., a combination of numbers and letters). To put it in layman's terms, some piece of information (e.g., a name) is run through an equation that creates a unique string of characters. Any time the exact same name is run through the equation, the same unique string of characters will be created. If a different name (or even the same name spelled differently) is run through the equation, an entirely different string of characters will emerge.
While the output of a hash cannot be immediately reversed to “decode” the information, if the range of input values submitted to the hash algorithm is known, those values can be replayed through the algorithm until a matching output is found. The matching output then confirms, or indicates, what the initial input had been. For instance, if a Social Security Number were hashed, the number might be reverse engineered by hashing all possible Social Security Numbers and comparing the resulting values. When a match is found, the attacker knows which Social Security Number created the hash string. The net result is that while hash functions are designed to mask personal data, they can be subject to brute force attacks.
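The Social Security Number example can be expressed as a short Python sketch. Because the input space is only one billion nine-digit values, replaying every candidate through the same algorithm is feasible, which is precisely the brute force risk described above.

```python
import hashlib

def sha256_hex(value: str) -> str:
    """Hash an input string and return the fixed-size hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

target = sha256_hex("123-45-6789")  # a disclosed hash value (SSN shown for demonstration only)

# Brute force: hash every possible SSN until the output matches the target.
# (Illustrative; a real attack would use a faster language or precomputed tables.)
for n in range(10**9):
    digits = f"{n:09d}"
    candidate = f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"
    if sha256_hex(candidate) == target:
        print("Recovered input:", candidate)
        break
```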
Whether a hash value in and of itself is considered “personal information” depends upon the particular law or regulation at issue.
In the context of the CCPA, information is not “personal information” if it has been “deidentified.”1 Deidentification means that the data “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.”2 An argument could be made that data once hashed cannot reasonably be associated with an individual. That argument is strengthened under the CCPA if a business takes the following four steps to help ensure that the hashed data will not be re-identified:3
- Implement technical safeguards that prohibit reidentification. Technical safeguards may include the process or techniques by which data has been deidentified. For example, this might include the hashing algorithm being used, or combining the hashing algorithm with other techniques that are designed to further obfuscate information (e.g., salting).4
- Implement business processes that specifically prohibit reidentification. This might include an internal policy or procedure that prevents employees or vendors from attempting to reidentify data or reverse hashed values.
- Implement business processes to prevent inadvertent release of deidentified information. This might include a policy against disclosing hashed values to the public.
- Make no attempt to reidentify the information. As a functional matter, this entails taking steps to prohibit reidentification by the business’s employees.
In comparison, in the context of the European GDPR, the Article 29 Working Party5 considered hashing to be a technique for pseudonymization that “reduces the linkability of a dataset with the original identity of a data subject” and thus “is a useful security measure,” but is “not a method of anonymisation.”6 In other words, from the perspective of the Article 29 Working Party, while hashing might be a useful security technique, it was not sufficient to convert “personal data” into deidentified data.