Dealing with Sensitive and PII Data and How to Manage it

Every piece of software, web portal, or application that stores client information has to adopt specific policies for sensitive data and PII (personally identifiable information). It is therefore crucial to understand how to manage this kind of information.

In this article, we will define sensitive data and PII and look at the differences between them. Then we’ll talk about prevention through data security policies. Finally, we’ll see which technical measures can keep this data secure while at rest and while being transferred.

Sensitive Data vs. PII

Let’s first draw an essential distinction between PII and sensitive information. PII is any data that can be used to uniquely identify a person. PII usually consists of a combination of data points, but in some cases even a single piece of information can be considered PII. For example, a name and surname alone are often not enough to single out one person, whereas a tax code identifies a person uniquely.

The definition of sensitive information, by contrast, changes based on the law of the region where the data is handled. Generally speaking, sensitive information is data that needs particular care in handling because it reveals details about race, religion, health, or other particularly delicate areas.

Finally, it’s worth mentioning that there is a particular classification of data that goes under the acronym SPII (sensitive personally identifiable information): data that is both sensitive and contains personally identifiable information.

Security by Prevention 

Employees of any company must know whether they are handling data with special requirements. Keeping staff regularly updated on data privacy and data management is therefore a preventive measure that every employer should adopt, starting from the ingestion phase of the data lifecycle.

Prevention, as in most other cases, is the best strategy, because dealing with data privacy leaks can cost a company a huge amount of money. One privacy policy I’ve seen violated several times concerns the handling of PII for personal purposes, for example, checking the account and permissions of relatives in a software application. This kind of operation should never be permitted.

One of the most important ways to improve security policies regarding PII is to share with workers a precise and up-to-date list of what is considered PII or sensitive PII in a specific environment. While it’s essential for everyone involved to have a general understanding of what information can be regarded as sensitive, a list of elements to cross-check helps every database manager and software developer in their job. New information handled by the company should always be checked against this list.
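As an illustration of the cross-check described above, here is a minimal sketch. The checklist entries, the `PII`/`SPII` labels, and the function names are all hypothetical; a real list would come from the company's own privacy policy.

```python
# Hypothetical checklist of field names a company treats as PII or sensitive PII.
PII_CHECKLIST = {
    "tax_code": "PII",
    "email": "PII",
    "phone_number": "PII",
    "home_address": "PII",
    "health_condition": "SPII",
    "religion": "SPII",
}


def normalize(field_name: str) -> str:
    """Lower-case a field name and unify separators so 'Tax-Code' matches 'tax_code'."""
    return field_name.strip().lower().replace("-", "_").replace(" ", "_")


def classify_fields(field_names):
    """Return {original field name: label} for every field found in the checklist."""
    flagged = {}
    for name in field_names:
        label = PII_CHECKLIST.get(normalize(name))
        if label is not None:
            flagged[name] = label
    return flagged


# Columns of a new table that is about to be ingested (made-up example).
new_table_columns = ["order_id", "Tax-Code", "Email", "health condition", "total_amount"]
print(classify_fields(new_table_columns))
```

A check like this only catches fields whose names appear in the list, so it complements rather than replaces a manual review of new data.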

Data exists in only two states: at rest or in movement. Policies that handle PII and sensitive information must cover data in both states.

Protecting Data at Rest

When data is at rest, in a database or a data warehouse, we still need to care about several things. First, both the region where we acquire the PII or sensitive information and the region where we store it matter, because regional regulations differ. For example, if our database is in Europe, we must comply with the GDPR.

To protect data at rest, we can apply encryption (we use a key to encrypt the data so that it cannot be read without the key) or static masking. Masking means the PII is never stored as it is; it is obfuscated first. Of course, this can be applied only in cases where we do not need the complete information. Consider also that the best way to protect PII and sensitive data is to avoid storing it at all. For example, ask yourself: Do we need to store that residence address, or can we rely on an external system to retrieve it? The less sensitive information we store at rest, the better.
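The following is a minimal sketch of static masking using only the Python standard library. It is one possible approach, not a prescribed one: `mask_phone` hides all but the last digits, and `pseudonymize` replaces a value with a keyed hash (HMAC-SHA256) so that the original is not stored. The key, the function names, and the sample record are assumptions made for illustration. Encryption itself is not shown here, since it normally relies on a dedicated library or on the database's built-in features.

```python
import hashlib
import hmac

# Assumption: in a real system this key comes from a secrets manager, never from source code.
MASKING_KEY = b"example-key-do-not-use-in-production"


def mask_phone(phone: str, visible: int = 4) -> str:
    """Replace every digit except the last `visible` ones with '*'."""
    total_digits = sum(ch.isdigit() for ch in phone)
    to_hide = max(total_digits - visible, 0)
    masked = []
    for ch in phone:
        if ch.isdigit() and to_hide > 0:
            masked.append("*")
            to_hide -= 1
        else:
            masked.append(ch)
    return "".join(masked)


def pseudonymize(value: str) -> str:
    """Keyed hash: the same input always gives the same token, so joins still work."""
    return hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up sample record; only the masked forms are written to storage.
record = {"phone": "+39 333 1234567", "tax_code": "RSSMRA85M01H501Z"}
stored = {
    "phone": mask_phone(record["phone"]),
    "tax_code_token": pseudonymize(record["tax_code"]),
}
print(stored)
```

Both transformations are one-way by design, which is why they fit only the cases where the complete value is not needed later.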

Specific retention policies also apply when storing PII or sensitive information. That’s why it is often useful to store this kind of data in a separate location, without mixing it with “standard” business information.

Finally, most modern cloud data warehouse providers advise developers to tag PII data. Google Cloud BigQuery even offers an auto-tagging solution through its data loss prevention feature. Automatically tagging a column as “sensitive or PII information” raises awareness that certain columns must be handled with enhanced care and promotes the use of restrictive policies for reading those columns.
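To give an idea of what auto-tagging does conceptually, here is a small sketch that samples column values and tags a column when most values match a pattern. This is not how any specific provider implements it; the detectors, the threshold, and all names are assumptions, and real tools use far richer rules.

```python
import re

# Hypothetical detectors; two simple patterns are enough for the illustration.
DETECTORS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "PHONE": re.compile(r"^\+?[\d\s\-()]{7,20}$"),
}


def tag_columns(rows, threshold=0.8):
    """Tag a column when at least `threshold` of its non-empty values match a detector."""
    tags = {}
    columns = rows[0].keys() if rows else []
    for column in columns:
        values = [str(row[column]) for row in rows if row.get(column) not in (None, "")]
        if not values:
            continue
        for tag, pattern in DETECTORS.items():
            matches = sum(1 for value in values if pattern.match(value))
            if matches / len(values) >= threshold:
                tags[column] = tag
                break
    return tags


# Made-up sample rows.
sample_rows = [
    {"id": 1, "contact": "jane@example.com", "mobile": "+39 333 1234567"},
    {"id": 2, "contact": "john@example.org", "mobile": "333-765-4321"},
]
print(tag_columns(sample_rows))
```

Pattern-based detection produces false positives (a long numeric identifier can look like a phone number), so the resulting tags are best treated as suggestions to review.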

Protecting Data In Movement

When data is moving, that is, being transferred from one point to another, it is particularly vulnerable. In this case, it is essential to encrypt the data securely, because a leak in the pipeline or a wrongly configured destination can always happen.
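As one example of encrypting data in movement, the sketch below builds a client-side TLS configuration with Python's standard `ssl` module. The minimum protocol version chosen here and the URL in the usage comment are assumptions, not requirements stated by any regulation.

```python
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Client-side TLS context that verifies certificates and refuses old protocol versions."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_tls_context()

# Usage sketch (hypothetical URL, not executed here):
# import urllib.request
# urllib.request.urlopen("https://internal-api.example.com/export", context=context)
```

Hostname verification guards against talking to an impostor server, but it does not help if the pipeline is configured with the wrong (yet valid) destination, so destination settings still deserve a separate review.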

Dynamic masking is another technique we can adopt. In this case, the PII or sensitive information is stored in a readable format, and the masking is applied when the data is read from the database (for example, hiding some of the digits of a telephone number). We can automate dynamic masking if we use column tagging techniques.
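A minimal sketch of dynamic masking driven by column tags could look like the following. The tags, the role names, and the helper functions are hypothetical; in practice many databases and data warehouses offer this as a built-in, policy-based feature.

```python
# Hypothetical column tags and roles; names are illustrative only.
COLUMN_TAGS = {"phone": "PII", "email": "PII", "order_total": None}
ROLES_ALLOWED_CLEARTEXT = {"privacy_officer"}


def mask_value(value: str, visible: int = 3) -> str:
    """Keep the last `visible` characters and hide the rest."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def read_row(row: dict, role: str) -> dict:
    """Return a copy of `row`, masking tagged columns unless the role may see cleartext."""
    if role in ROLES_ALLOWED_CLEARTEXT:
        return dict(row)
    return {
        column: mask_value(str(value)) if COLUMN_TAGS.get(column) else value
        for column, value in row.items()
    }


# The stored row stays readable; masking happens only at read time.
stored_row = {"phone": "3331234567", "email": "jane@example.com", "order_total": 42.5}
print(read_row(stored_row, role="support_agent"))
print(read_row(stored_row, role="privacy_officer"))
```

Note that this sketch leaves untagged columns unmasked. A stricter design would mask any column that has no explicit tag, so that a forgotten tag does not expose data.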

Operational roles and profiles should be configured with particular care when it comes to assigning permissions to read data and to transfer it from one location to another.

Finally, it can be useful to have an updated list of locations where this information is being read or shown. For example, if the application uses a frontend client, you should have an updated list of all the web pages where the PII is shown and all the HTTP endpoints that expose this information. This way, we can put particular care into checking for security and privacy vulnerabilities in those locations.
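One way to keep such a list from going stale is to build it from the code itself. The sketch below uses a decorator to register every handler that exposes PII; the decorator, the paths, and the field names are all hypothetical and not tied to any web framework.

```python
# In-memory inventory of endpoints that expose PII (illustrative only).
PII_EXPOSURE_REGISTRY = []


def exposes_pii(path: str, fields):
    """Decorator that records which PII fields a handler returns at a given path."""
    def decorator(handler):
        PII_EXPOSURE_REGISTRY.append(
            {"path": path, "handler": handler.__name__, "fields": sorted(fields)}
        )
        return handler
    return decorator


@exposes_pii("/api/customers/{id}", fields=["email", "phone"])
def get_customer(customer_id):
    ...  # handler body omitted in this sketch


@exposes_pii("/api/invoices/{id}", fields=["tax_code"])
def get_invoice(invoice_id):
    ...  # handler body omitted in this sketch


def endpoints_exposing(field: str):
    """List the paths of all registered endpoints that return the given PII field."""
    return [entry["path"] for entry in PII_EXPOSURE_REGISTRY if field in entry["fields"]]


print(endpoints_exposing("email"))
```

The registry is only as complete as the annotations, so it works best when a code review rule requires the decorator on every new endpoint that returns personal data.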

Conclusion

In this article, we have seen how to handle and manage sensitive data and PII. After a brief introduction, we discussed some prevention techniques and explained why we should keep a list of the PII we handle. Last, we saw some technical actions that we can put into practice to protect data at rest and in movement, such as data tagging, dynamic and static masking, encryption, and keeping an updated list of PII-exposing endpoints.
