GDPR and Test Data

Data is at the heart of any enterprise application, and test data is at the heart of a good test environment. However, test data should not, and under GDPR must not, contain information that can be used to identify an individual. Our Test Data Management solution addresses issues of disk space, data verification, data confidentiality and protracted test durations. Controlling and managing test data ensures that every test starts from a consistent data state, which is essential if the data is to be in a predictable state at the end of the test. Checking both the visible test results and the effects on the database is a key principle of AQM (Application Quality Management), and it is a task that is practically impossible to accomplish manually.



What is GDPR?

The General Data Protection Regulation (GDPR) is a European Union (EU) regulation that has applied since 25 May 2018. It protects the privacy of individuals in the EU and gives them control over their personal data, including how it is used and the right to see it.

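GDPR applies to personal data processed by organisations established in the EU, irrespective of where the individual lives, and to personal data of individuals in the EU that is held by organisations outside the EU which offer them goods or services or monitor their behaviour. Since Brexit, UK residents are covered by the UK GDPR, which sets out largely equivalent rules. Access to information that can be used to identify an individual must be limited to those with a need to access that information.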

For example, a person responsible for shipping orders needs access to a customer’s name and address, and the items that have been ordered, in order to fulfil the customer’s request. They do not need, nor should they have, access to that individual’s credit card details, date of birth etc.

It is also imperative that data relating to an individual is accurate.

What does GDPR mean for Test Data?

Imagine that your test system holds details of an individual’s bank account balance and, for testing purposes, you have made the account overdrawn. What happens when that individual requests details of the information you hold about them? Will they be happy to learn that, in one of your systems, their account is showing a negative balance? More importantly, have they given you permission to use that data for testing purposes?

It has generally been accepted that ‘live’ data should not be used for testing, and in some cases this has effectively been mandated (HIPAA in the US, for example). With GDPR it is wholly inappropriate to use data that can identify an individual for testing purposes. GDPR refers to pseudonymisation, a process that transforms personal data so that it can “no longer be attributed to a specific data subject without the use of additional information”, provided that the additional information is kept separately and protected. That means that you need to limit the potential exposure and pseudonymise the data.
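As an illustration of the idea, here is a minimal Python sketch of pseudonymisation using a keyed hash (HMAC-SHA256). This is one common technique, not a method prescribed by GDPR, and the key, table layouts and field names are hypothetical. The key plays the role of the "additional information" and would need to be stored separately from the test data.

```python
import hashlib
import hmac

# Hypothetical secret: in practice it would be held outside the test environment.
SECRET_KEY = b"hypothetical-key-stored-outside-the-test-environment"


def pseudonymise(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for a value using HMAC-SHA256."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Shorten the digest for readability; the "P" prefix marks it as a pseudonym.
    return "P" + digest[:16]


# Hypothetical sample data: two tables linked by customer_id.
customers = [
    {"customer_id": "C1001", "name": "Alice Example"},
    {"customer_id": "C1002", "name": "Bob Sample"},
]
orders = [
    {"order_id": 1, "customer_id": "C1001"},
    {"order_id": 2, "customer_id": "C1002"},
    {"order_id": 3, "customer_id": "C1001"},
]

# The same input always maps to the same pseudonym, so the link between
# orders and customers survives the transformation.
pseudo_customers = [
    {"customer_id": pseudonymise(c["customer_id"]), "name": pseudonymise(c["name"])}
    for c in customers
]
pseudo_orders = [
    {"order_id": o["order_id"], "customer_id": pseudonymise(o["customer_id"])}
    for o in orders
]
```

Because the mapping is deterministic for a given key, every order still points at an existing customer after the change, while the original identifiers cannot be recovered without the key.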

Focusing on your production systems for a minute, you are, inevitably, going to need to make changes to your systems to comply with GDPR. This means testing, potentially lots of testing. You are going to need test data and that data needs to be compliant.



So What Can You Do About Test Data?

The obvious thing to do is to scramble, de-identify, obfuscate and/or mask sensitive information. This requires analysis of the data to highlight where and how sensitive data is stored, something that is likely to have been done already or to be in progress. Next, you need to decide how to transform the data so that a real person can no longer be identified from it.
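To make the masking step concrete, the following Python sketch applies a simple rule to each sensitive field of a record. The field names and the rules themselves are assumptions chosen for illustration; which fields count as sensitive, and how far each must be altered, depends on the outcome of your own data analysis.

```python
import re


def mask_card_number(card: str) -> str:
    """Replace all but the last four digits of a card number with '*'."""
    digits = re.sub(r"\D", "", card)
    if len(digits) <= 4:
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + digits[-4:]


def generalise_date_of_birth(dob: str) -> str:
    """Keep only the year of an ISO date (YYYY-MM-DD)."""
    return dob[:4] + "-01-01"


# Hypothetical rules: field name -> function that masks the value.
MASKING_RULES = {
    "name": lambda value: "REDACTED",
    "card_number": mask_card_number,
    "date_of_birth": generalise_date_of_birth,
}


def mask_record(record: dict) -> dict:
    """Return a copy of the record with every sensitive field masked."""
    return {
        field: MASKING_RULES[field](value) if field in MASKING_RULES else value
        for field, value in record.items()
    }


# Hypothetical sample record.
record = {
    "customer_id": "C1001",
    "name": "Alice Example",
    "card_number": "4111 1111 1111 1111",
    "date_of_birth": "1985-07-23",
    "order_total": 42.50,
}
masked = mask_record(record)
```

Fields without a rule, such as the order total, pass through unchanged. Note that partially masked values can still help identify someone when combined with other fields, so rules like these need reviewing against the whole record.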

Reducing the size of the test data, through sub-setting, will limit the potential for non-compliance. Sub-setting will also make testing easier and quicker, saving time and increasing the speed at which you can test your GDPR compliance changes.
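A small Python sketch shows what sub-setting involves: choose the parent rows you want, then keep only the child rows that refer to them. The three tables and their columns are hypothetical, and a real schema would have many more relationships to follow.

```python
def subset(customers, orders, order_lines, keep_customer_ids):
    """Keep the chosen customers plus only the rows that depend on them."""
    kept_customers = [c for c in customers if c["customer_id"] in keep_customer_ids]
    kept_customer_ids = {c["customer_id"] for c in kept_customers}

    # Orders are kept only if their customer was kept.
    kept_orders = [o for o in orders if o["customer_id"] in kept_customer_ids]
    kept_order_ids = {o["order_id"] for o in kept_orders}

    # Order lines are kept only if their order was kept.
    kept_lines = [line for line in order_lines if line["order_id"] in kept_order_ids]
    return kept_customers, kept_orders, kept_lines


# Hypothetical sample data.
customers = [{"customer_id": "C1"}, {"customer_id": "C2"}, {"customer_id": "C3"}]
orders = [
    {"order_id": 10, "customer_id": "C1"},
    {"order_id": 11, "customer_id": "C2"},
    {"order_id": 12, "customer_id": "C3"},
]
order_lines = [
    {"order_id": 10, "item": "widget"},
    {"order_id": 11, "item": "gadget"},
    {"order_id": 12, "item": "gizmo"},
    {"order_id": 12, "item": "widget"},
]

small_customers, small_orders, small_lines = subset(
    customers, orders, order_lines, keep_customer_ids={"C1", "C3"}
)
```

Working from parent to child in this way means the smaller data set never contains an order without its customer, or an order line without its order.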

Test Data Management Solutions

Original Software’s TestBench provides code-free, easy-to-define mechanisms to pseudonymise data while maintaining its referential integrity.

TestBench also provides an easy-to-use, code-free solution for extracting and sub-setting data to create test data environments. As with pseudonymisation, referential integrity is always maintained. The end result is a smaller, more focused test data environment with no information that can be attributed to a real individual.


