GDPR Challenge 2: State of the art security of documents and metadata

Posted by Daniela Di Noi on 1/18/18 1:06 PM


Do you remember the first GDPR challenge? We talked about how to identify documents that contain personal data and label them appropriately.

Continuing our GDPR series, in this post Bart van Bouwel and Jean-Luc Goedermans from CDI-Partner elaborate on the second challenge: state-of-the-art security of the documents and the metadata.

When we store documents and other files, we not only store files that might contain personal data, we also create personal data by doing this.

When we run a classification algorithm, as discussed in the previous post, we extract personal data from a document and store it in metadata. As a result, databases and indexes are filled with personal data in a semi-structured way. This makes finding and retrieving documents far easier, but it also increases the risk associated with keeping personal data.

But there is more. Personal data is information related to an identified or identifiable natural person. For compliance reasons we will keep records of persons accessing, copying, and printing documents containing personal data. This audit trail links natural persons, for instance employees, to the documents they use. By definition, this information is also personal data.

And usage history is not harmless: it can lead to criminal convictions. In the UK there are examples of hospital staff convicted for accessing medical files for non-professional reasons. Detecting such misuse is the very reason for keeping an audit trail, but unauthorized access to the audit trail itself is a data breach in its own right.

Security of processing is covered in Article 32 of the GDPR. We need a level of security appropriate to the risk to the rights and freedoms of natural persons. This is a very broad description that will lead to a lot of interpretation and discussion.


[Image: GDPR - Security of processing]


Typically, documents are stored on local or network drives or in the cloud, and distributed by email or by copying them onto USB keys. Protecting documents in this way is not easy, and the options are limited to granting read or write access.

When we store documents in Alfresco, we gain a lot of options. Alongside the original document, we can generate and store previews. These can even be personalized with a watermark. It is far easier to address people leaving documents on the printer (a possible personal data breach!) when their name is on every page.

Instead of sending documents by email, we will send links to documents or links to PDF previews. For very sensitive documents a separate login process can be enforced. This reduces the risk of sending documents to the wrong person (the number one type of data breach). And when a person asks to be forgotten, we can automatically inform the receiving third party of this request.

We will need to protect the documents, the database, the indexes, and the log files. For sensitive metadata, for instance the national number, we will use one-way encryption, that is, cryptographic hashing. A user who already knows the national number can retrieve the documents related to it, but the indexes cannot be used to list all the national numbers they contain.
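
To make the idea concrete, here is a minimal sketch of such a hashed index in Python. It is an illustration, not a description of how Alfred GDPR implements it: the choice of HMAC-SHA-256, the names blind_index, add_document and find_documents, the secret, and the sample national number are all assumptions made for this example. A keyed hash is used because identifiers such as national numbers have a limited number of possible values, so a plain unkeyed hash could be reversed by simply trying them all.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real deployment would load it
# from a key store, never from source code.
INDEX_SECRET = b"replace-with-a-secret-from-your-key-store"


def blind_index(national_number: str, secret: bytes = INDEX_SECRET) -> str:
    """Return a keyed one-way hash of the identifier for use in an index."""
    # Keep digits only so that differently formatted input hashes the same.
    normalized = "".join(ch for ch in national_number if ch.isdigit())
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Toy in-memory index: hashed identifier -> document ids.
index: dict[str, list[str]] = {}


def add_document(national_number: str, doc_id: str) -> None:
    index.setdefault(blind_index(national_number), []).append(doc_id)


def find_documents(national_number: str) -> list[str]:
    return index.get(blind_index(national_number), [])


# Made-up identifier, written in two different formats.
add_document("85.07.30-033.28", "doc-001")
add_document("85073003328", "doc-002")
```

Looking up the made-up number with find_documents returns both documents, while the index itself only holds hash values, so someone who obtains a copy of the index without the secret cannot read the national numbers from it.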

Documents stored in Alfred Archive are always protected and set up according to best object storage practices. There is no direct access on a file system basis, which neutralizes 99% of known malware and intrusion risks. Furthermore, documents in the store are encrypted, with proper handling of the private and public keys. Access to the document store is under detailed auditing. Life points can set the document to be 'immutable'. Moreover, a health processor continuously checks the binary integrity of all content, which can include the validity of a digital signature.
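
A binary integrity check of this kind can be sketched in a few lines of Python. This is a simplified illustration and not the Alfred Archive health processor: it assumes that a SHA-256 digest was recorded when each document was stored, it reads plain files from disk rather than an object store, and the function names sha256_of and find_corrupted are hypothetical. Digital signature validation is left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(expected: dict[Path, str]) -> list[Path]:
    """Return the paths that are missing or whose digest no longer matches.

    `expected` maps each stored file to the digest recorded at ingestion.
    """
    corrupted = []
    for path, recorded in expected.items():
        if not path.is_file() or sha256_of(path) != recorded:
            corrupted.append(path)
    return corrupted
```

Run periodically over the recorded digests, a check like this flags any document whose bytes have changed or disappeared since it was stored.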

As for metadata, the database that contains the metadata can be encrypted at the database level. Specific metadata fields with personal data can be encrypted so that they cannot be exploited through direct access to the database. Index information is protected with encryption as well. No system can directly access the indexes in our reference architecture, because a safety gateway sits in front of the archive and Alfresco Content Services sits in front of the document store.

[Image: Alfred Archive Architecture]


As you can see, storing files and documents in Alfresco with Alfred GDPR provides adequate and appropriate security for your company’s most vulnerable digital assets.

Coming up next week: Challenge 3 “Monitor and control access to the documents and store the right metadata to prove legitimate purpose”.

The series is not legal advice for your company to use in complying with EU data privacy laws like the GDPR. Instead, it provides background information to help you better understand the GDPR. 



About Xenit 


Xenit is a Belgium-based IT company that focuses on content services solutions and covers all document-related business processes, from data migration to digital archiving to hybrid and cloud hosting, to help organizations get control of their information. A Premier Partner and System Integrator of the Alfresco Digital Business Platform, Xenit has more than 10 years of experience in Alfresco Content and Process Services.

