Anonymizing data

If you manage to anonymize data, the GDPR no longer applies to that data. Unfortunately, the term is often used erroneously. Anonymization means that it is no longer possible to identify a natural person from the data, and this requires stringent proof. Replacing the identities of persons in a dataset with numbers, GUIDs, or hashes does not satisfy this requirement, since it is still possible to reverse engineer the replacement and identify the original persons, especially if you have a list of possible persons to compare against. This process is called pseudonymization, and the GDPR still considers pseudonymized data to be personal data. Pseudonymized data is data from which a person can be identified only with the help of additional, external information. Although it does not anonymize, pseudonymization is still considered a data protection mechanism.
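The weakness of hash-based replacement can be illustrated with a minimal sketch. The records, e-mail addresses, and the `pseudonymize` helper below are hypothetical and exist only for illustration; the sketch assumes an unsalted SHA-256 hash and an attacker who holds a list of candidate identities.

```python
import hashlib


def pseudonymize(identifier: str) -> str:
    """Replace an identifier with its unsalted SHA-256 hash (hypothetical helper)."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


# A dataset in which identities have been replaced by hashes (made-up data).
records = [
    {"person": pseudonymize("alice@example.com"), "purchases": 3},
    {"person": pseudonymize("bob@example.com"), "purchases": 7},
]

# The attacker's list of possible persons to compare against.
candidates = ["alice@example.com", "bob@example.com", "carol@example.com"]

# Hash every candidate once, then look up each record's hash.
lookup = {pseudonymize(candidate): candidate for candidate in candidates}
reidentified = [lookup.get(record["person"]) for record in records]

print(reidentified)
```

Every record whose owner appears in the candidate list is re-identified, which is why such a dataset remains personal data.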

While anonymization is difficult, valid methods still exist. One example is statistical aggregation. Operations such as sums, averages, variances, and standard deviations make it impossible to say anything about individuals, provided the population is sufficiently large. Another method is data obfuscation, also called data masking. This means you destroy certain details in the dataset, which accomplishes the same thing as aggregation. For example, you can destroy one of the four bytes in an IPv4 address. From the three remaining bytes, you can deduce the region, but not the individual performing an action.
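Both methods can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive implementation: the function names and the `MIN_GROUP_SIZE` threshold are hypothetical (what counts as "sufficiently large" depends on the data), and the masking example assumes that the last byte of the address is the one destroyed.

```python
import ipaddress
import statistics

# Hypothetical threshold; the right value depends on the dataset.
MIN_GROUP_SIZE = 10


def aggregate(values, min_group_size=MIN_GROUP_SIZE):
    """Return summary statistics, or None if the group is too small to publish."""
    if len(values) < min_group_size:
        return None
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def mask_ipv4(address: str) -> str:
    """Destroy the last byte of an IPv4 address by overwriting it with zero."""
    packed = ipaddress.IPv4Address(address).packed  # the four address bytes
    return str(ipaddress.IPv4Address(packed[:3] + b"\x00"))


print(aggregate([1, 2, 3]))            # too few values, so nothing is released
print(aggregate(list(range(1, 11))))   # ten values, so statistics are released
print(mask_ipv4("192.0.2.123"))
```

Masking should happen before the data is stored. If the original addresses are kept alongside the masked ones, nothing has been destroyed.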
