Given an organisation's large-scale dependency on production data for testing (due to its complexities and nuances), a proven and trusted method of obfuscation that does not compromise data integrity is vital. How does iData’s obfuscation tool perform this?
iData has a large selection of obfuscation functions built in. With the flexibility of generating seed keys, or referencing table identity keys as the seed, we provide a no-way-back approach to generating truly obfuscated or masked data in non-production systems.
The majority of these functions, and those for data generation, do not use any part of the original field to generate a new obfuscated value. Instead, the new value is generated by combining a seed value, such as the row's ID, with some criteria such as a numeric range, a character template pattern or a regular expression. iData then uses the seed value to randomly generate a completely new value that matches the criteria. The original value of the field is not recoverable because it was not used in any way to generate the obfuscated value. The seed value is usually set from the primary key of the table so that the output for each row is consistent from one run to the next.
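The seed-plus-criteria idea described above can be illustrated with a minimal Python sketch. This is not iData's implementation: the function names, the template syntax ("A" for a letter, "9" for a digit) and the use of Python's `random` module are assumptions made purely for illustration.

```python
import random
import string


def generate_from_template(seed, template):
    """Build a new value from a seed and a template (hypothetical helper).

    Assumed template syntax: 'A' -> random uppercase letter,
    '9' -> random digit, anything else is copied through unchanged.
    The original field value is never passed in.
    """
    rng = random.Random(seed)  # the same seed always yields the same sequence
    out = []
    for ch in template:
        if ch == "A":
            out.append(rng.choice(string.ascii_uppercase))
        elif ch == "9":
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)
    return "".join(out)


def generate_in_range(seed, low, high):
    """Pick a number in [low, high] driven only by the seed (hypothetical helper)."""
    return random.Random(seed).randint(low, high)


row_id = 1042  # stands in for the table's primary key
print(generate_from_template(row_id, "AA-9999"))
print(generate_in_range(row_id, 18, 90))
```

Because the only inputs are the seed and the criteria, re-running the sketch with the same row ID gives the same output, while nothing in the output depends on what the field originally contained.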
iData has specific functions for generating completely new personal details such as names, email addresses, phone numbers and addresses. These fields are populated by randomly selecting values from a pool of names and address components and combining them to give a unique result. Once again, the output values have no relationship to the original values, so there is no method of reversing the output to discover what the original values were.
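A rough sketch of pool-based generation in Python follows. The pools, the function name and the way uniqueness is achieved (appending the seed to the email address) are all illustrative assumptions, not iData's actual data or method.

```python
import random

# Tiny illustrative pools; a real tool would draw on far larger ones.
FIRST_NAMES = ["Alex", "Sam", "Priya", "Jordan", "Maria"]
LAST_NAMES = ["Taylor", "Nguyen", "Okafor", "Smith", "Rossi"]
DOMAINS = ["example.com", "example.org"]


def generate_person(seed):
    """Assemble new personal details from pools, driven only by the seed.

    Hypothetical helper: no original name, email or phone number is read.
    """
    rng = random.Random(seed)
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    # Assumption: appending the seed keeps email addresses unique per row.
    email = f"{first}.{last}{seed}@{rng.choice(DOMAINS)}".lower()
    phone = "07" + "".join(rng.choice("0123456789") for _ in range(9))
    return {"name": f"{first} {last}", "email": email, "phone": phone}


print(generate_person(7))
```

Since the result is assembled entirely from pool entries and the seed, there is nothing in it from which the original details could be worked back.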
By design, some obfuscation functions do use information from the original field. Masking functions permit part of the original value to remain in the output, and users can specify how much of the original field should stay visible. The masked part of the field is replaced with a fixed character, so anything in that part of the field is unrecoverable. However, it is down to the configurator to set these approaches up and to choose what can remain visible and what needs to be masked.
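Partial masking of this kind can be sketched in a few lines of Python. The function name and its parameters (`visible_end`, `mask_char`) are hypothetical and only show the principle that the configurator decides how much stays visible.

```python
def mask(value, visible_end=4, mask_char="*"):
    """Replace everything except the last `visible_end` characters.

    Hypothetical helper: the masked portion is overwritten with a fixed
    character, so that part of the original cannot be recovered.
    """
    visible_end = max(visible_end, 0)
    hidden = max(len(value) - visible_end, 0)  # number of characters to mask
    return mask_char * hidden + value[hidden:]


print(mask("4111111111111111", visible_end=4))
print(mask("secret", visible_end=0))
```

For instance, `mask("4111111111111111", 4)` yields twelve asterisks followed by `1111`, and setting `visible_end=0` masks the whole field.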
Also by design, there are two obfuscation functions that use the original value only to create a template for generating new values. For example, “abcd-5678” is converted to the pattern “AAAA-9999”, and this pattern is then used to randomly select new characters, numerals or symbols. The only information revealed is the original format of the field.
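The format-only approach can be sketched as two steps: derive a pattern from the value, then fill the pattern using the seed. The function names are hypothetical, and this sketch makes two simplifying assumptions: symbols are copied through unchanged rather than re-selected, and letters are always generated in uppercase.

```python
import random
import string


def to_pattern(value):
    """Reduce a value to its format: letters -> 'A', digits -> '9'."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)  # assumption: symbols are kept as-is
    return "".join(out)


def fill_pattern(pattern, seed):
    """Generate a new value matching the pattern, driven only by the seed."""
    rng = random.Random(seed)
    out = []
    for ch in pattern:
        if ch == "A":
            out.append(rng.choice(string.ascii_uppercase))
        elif ch == "9":
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)
    return "".join(out)


pattern = to_pattern("abcd-5678")
print(pattern)
print(fill_pattern(pattern, seed=1042))
```

Only the pattern is carried forward to the second step, so the generated value shares the layout of the original but none of its characters by design.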
With hash or tokenisation functions, the original field value is used and influences the outcome, so that a given input value produces a specific and unique output value.
When iData processes data through its library of obfuscation functions, it does not tokenise, hash or encrypt any of the values; it replaces the original value with a new value selected at random. For example, the input value of a password field is thrown away and has no influence on the output value, so 'Password123' -> 'abcd' on one record and 'Password123' -> 'cdef' on another.
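The difference between hashing and random replacement can be seen in a small Python comparison. Here `hashlib.sha256` stands in for "a hash function" in general, and `replaced` is a hypothetical stand-in for seed-driven replacement; neither is taken from iData itself.

```python
import hashlib
import random
import string


def hashed(value):
    """Hashing: the output is derived from the original value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def replaced(row_id, length=4):
    """Replacement: the output depends only on the row's seed.

    Hypothetical helper; the original value is not even a parameter.
    """
    rng = random.Random(row_id)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


rows = [(1, "Password123"), (2, "Password123")]
for row_id, original in rows:
    # Same input -> identical hash on every row, which leaks that the
    # two records share a value. The replacement ignores the input.
    print(row_id, hashed(original)[:12], replaced(row_id))
```

Both rows show the same hash prefix because the hash is a function of the input, whereas the replacement values are driven by the row IDs and so will normally differ between rows, even though the original passwords were identical.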
This does not mean that an iData user could not decide to build a custom transformation to do so, but using the documented features of iData will ensure the data is not exposed to any potential reverse-engineering techniques.