Mitigating GDPR Risks in Unstructured Data

If you like action movies, you’ll be familiar with that moment when the star narrowly escapes death and finally breathes a sigh of relief, only to be nearly killed all over again by some unexpected disaster. When it comes to GDPR compliance, today’s CISOs are like the action hero who totals his car, climbs out and is then nearly run over by an out-of-control 18-wheeler. Wait for it. Here comes unstructured data. We’re all on the edge of our seats wondering if the CISO is going to be roadkill…

Well, not all of us. Varonis is tackling this risk head-on. According to Brian Vecci, Field CTO at Varonis, documents like PDFs can be a hidden source of GDPR vulnerability. (GDPR is the EU’s General Data Protection Regulation.) “Most GDPR compliance efforts have focused on the big databases,” Vecci explained.

“This makes sense, up to a point,” he said. “The kind of personal data relevant to GDPR is largely in customer databases. However, as our clients invariably discover when they deploy our tools, there’s a huge amount of PII [personally identifiable information] tucked away in file storage volumes. That’s scary, because a lot of the time they didn’t know the files were even there. They certainly don’t know who has access to them.”

For example, your customer service or HR department might have thousands of Word documents that contain names, addresses, phone numbers, national identification numbers, vehicle information and so forth. A determined hacker, especially one equipped with automated search capabilities, could vacuum up confidential personal data from such records. A breach of that kind could expose the organization to GDPR fines.

The problem is multi-layered. As Vecci pointed out, the first issue is basic awareness: what do you have? Then, the challenge is to understand how a particular document might create compliance risk. GDPR contains multiple requirements, and national rules and identifier formats vary across the 28 EU member states. Varonis addresses these difficulties through automated discovery and classification of unstructured data. Its new release uses 340 patterns that cover all 28 EU countries. It goes beyond regular expressions (regexes) to include keywords, proximity, negative keywords and validation algorithms.
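To make the idea of layering checks on top of a regex more concrete, here is a minimal sketch in Python. It is not Varonis’s implementation or one of its patterns. The rule, the keyword lists, the 40-character proximity window and the function names are all assumptions chosen for illustration. It looks for 16-digit payment card numbers, then keeps a match only if it passes a Luhn checksum (the validation algorithm), has a supporting keyword nearby (proximity) and has no negative keyword nearby.

```python
from __future__ import annotations

import re

# Hypothetical rule for illustration only; not an actual Varonis pattern.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")  # 16 digits, optional separators
KEYWORDS = ("card", "visa", "mastercard")         # supporting keywords (assumed)
NEGATIVE_KEYWORDS = ("test", "sample", "dummy")   # keywords that veto a match (assumed)
WINDOW = 40                                       # proximity window in characters (assumed)


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers that survive all three extra checks."""
    hits = []
    lowered = text.lower()
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        # Text surrounding the match, used for the keyword proximity checks.
        context = lowered[max(0, match.start() - WINDOW): match.end() + WINDOW]
        if not luhn_valid(digits):
            continue  # fails validation: probably not a real card number
        if not any(k in context for k in KEYWORDS):
            continue  # no supporting keyword close to the match
        if any(k in context for k in NEGATIVE_KEYWORDS):
            continue  # a negative keyword close to the match vetoes it
        hits.append(digits)
    return hits
```

Calling find_card_numbers on the text of a document returns only the digit strings that pass every check. A bare regex would also flag order numbers or placeholder values, which is why the additional layers matter for cutting false positives.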

Automated identification and classification of files that contain GDPR data is the first step. “Once you know what you have, you have to protect those files,” Vecci added. “Who’s accessing it? Who owns it?” The Varonis toolset lets compliance managers monitor GDPR-affected data and take measures to prevent it from being misused or breached. “You might find, unfortunately, that an insider is systematically looking at PII that has nothing to do with his or her job. In that case, we help you identify the extent of that person’s activities and remediate the situation by cutting off access.”

Securing unstructured data that contains information covered by GDPR brings a range of benefits. The most obvious is avoiding a breach that can trigger GDPR penalties. Being able to demonstrate that you’ve undertaken countermeasures to mitigate privacy risks can also reduce the sting of any penalties, because GDPR allows regulators to weigh a company’s preventative measures and mitigation when setting fines.

Security and anti-fraud efforts also improve when GDPR compliance extends to unstructured data. PII discovered in documents can help a malicious actor commit fraud or mount a cyber attack, and it can be the basis for impersonation, a serious threat vector.

Solving the GDPR unstructured data challenge resonates with one of the most dominant themes at this year’s RSA Conference: the critical importance of automation. “You simply cannot do this manually,” Vecci said. “Automation is essential… and it has to be smart automation, with tools and dashboards that make it easy to classify and prioritize the risks you face.”
