Data Cleansing
See Data Cleansing for more information about this algorithm framework.
Creating a Data Cleansing Algorithm via API
- Retrieve the frameworkId for the Data Cleansing Framework. This can be done via the following endpoint:
  algorithm GET /algorithm/frameworks
  The framework information should look similar to the following:
  {
    "frameworkId": 24,
    "frameworkName": "Data Cleansing",
    "frameworkType": "STRING",
    "plugin": {
      "pluginId": 7,
      "pluginName": "dlpx-core",
      "pluginAuthor": "Delphix Engineering",
      "pluginType": "EXTENDED_ALGORITHM"
    }
  }
- Upload a lookup file via the following endpoint:
  fileUpload POST /file-uploads
  Copy the fileReferenceId value returned in the response body.
- Create a Data Cleansing algorithm via the following endpoint:
  algorithm POST /algorithms
  Use JSON-formatted input similar to the following example:
  {
    "algorithmName": "demoDataCleansing",
    "algorithmType": "COMPONENT",
    "frameworkId": 24,
    "algorithmExtension": {
      "lookupFile": {
        "uri": "delphix-file://upload/f_52b19f8a9125435a83a1237fa53aeaf5/sample.txt"
      },
      "delimiter": "=",
      "caseSensitive": false,
      "trimWhitespace": true
    }
  }
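The steps above can be sketched in Python as follows. This is only an illustration of how the pieces fit together: it makes no HTTP calls, the helper names find_framework_id and build_algorithm_payload are hypothetical, and it assumes the framework listing has already been parsed into a plain list of objects shaped like the example response (the real response may wrap the entries differently).

```python
import json


def find_framework_id(frameworks, name="Data Cleansing"):
    """Return the frameworkId of the entry whose frameworkName matches.

    Assumption: `frameworks` is a list of dicts shaped like the example
    framework object shown in step 1.
    """
    for framework in frameworks:
        if framework.get("frameworkName") == name:
            return framework["frameworkId"]
    raise LookupError(f"framework {name!r} not found")


def build_algorithm_payload(algorithm_name, framework_id, file_reference_id,
                            delimiter="=", case_sensitive=True,
                            trim_whitespace=True):
    """Assemble a request body shaped like the POST /algorithms example."""
    # The extension documents minLength=1 and maxLength=50 for the delimiter.
    if not 1 <= len(delimiter) <= 50:
        raise ValueError("delimiter must be 1 to 50 characters long")
    return {
        "algorithmName": algorithm_name,
        "algorithmType": "COMPONENT",
        "frameworkId": framework_id,
        "algorithmExtension": {
            "lookupFile": {"uri": file_reference_id},
            "delimiter": delimiter,
            "caseSensitive": case_sensitive,
            "trimWhitespace": trim_whitespace,
        },
    }


# Framework entry copied from the example response in step 1.
frameworks = [
    {"frameworkId": 24, "frameworkName": "Data Cleansing", "frameworkType": "STRING"}
]

payload = build_algorithm_payload(
    "demoDataCleansing",
    find_framework_id(frameworks),
    # fileReferenceId copied from the example request in step 3.
    "delphix-file://upload/f_52b19f8a9125435a83a1237fa53aeaf5/sample.txt",
    case_sensitive=False,
)
print(json.dumps(payload, indent=2))
```

The printed JSON matches the example input for POST /algorithms; sending it, and uploading the lookup file beforehand, is left to whichever HTTP client and authentication setup is in use.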
Data Cleansing Algorithm Extension
- lookupFile (required)
  String
  The fileReferenceId value returned from the fileUpload endpoint for uploading files to the Masking Engine. The file should contain a newline-separated list of {value, replacement} pairs, with each value separated from its replacement by the delimiter. No extraneous whitespace should be present.
- delimiter (required, minLength=1; maxLength=50; default="=")
  String
  The delimiter string used to separate {value, replacement} pairs in the lookup file.
- caseSensitive (optional, default=true)
  Boolean
  Whether the case of the input string must match the values in the lookup file.
- trimWhitespace (optional, default=true)
  Boolean
  Whether to trim leading and trailing whitespace from the input string.
  Note: This must be true to cleanse fixed-width files and fixed-length database data types such as CHAR and NCHAR.
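To make the lookup file format and the effect of these options concrete, here is a minimal Python sketch of lookup-based cleansing. It is not the Masking Engine's implementation. The function names parse_lookup and cleanse and the sample lookup contents are hypothetical, and two behaviors are assumptions rather than documented facts: blank lines in the lookup text are skipped, and input values with no match are returned unchanged.

```python
def parse_lookup(text, delimiter="=", case_sensitive=True):
    """Build a value -> replacement map from newline-separated pairs."""
    table = {}
    for line in text.splitlines():
        if not line:
            continue  # assumption: ignore blank lines
        # Split on the first occurrence of the delimiter only.
        value, sep, replacement = line.partition(delimiter)
        if not sep:
            raise ValueError(f"missing delimiter in line: {line!r}")
        key = value if case_sensitive else value.lower()
        table[key] = replacement
    return table


def cleanse(value, table, case_sensitive=True, trim_whitespace=True):
    """Return the replacement for `value`, or `value` itself if unmatched."""
    candidate = value.strip() if trim_whitespace else value
    key = candidate if case_sensitive else candidate.lower()
    # Assumption: values without a lookup entry pass through unchanged.
    return table.get(key, value)


# Hypothetical lookup file contents: one value=replacement pair per line.
lookup_text = "Ariz=AZ\nCalif=CA\nCalifornia=CA"

exact = parse_lookup(lookup_text)
relaxed = parse_lookup(lookup_text, case_sensitive=False)

# A CHAR(10)-style padded value matches only when whitespace is trimmed.
print(repr(cleanse("Calif     ", exact)))                         # 'CA'
print(repr(cleanse("Calif     ", exact, trim_whitespace=False)))  # 'Calif     '
# Case-insensitive matching requires caseSensitive=false.
print(repr(cleanse("ARIZ", relaxed, case_sensitive=False)))       # 'AZ'
print(repr(cleanse("ARIZ", exact)))                               # 'ARIZ'
```

The padded example shows why the note above matters: with trimming disabled, the trailing spaces of a fixed-length value prevent it from matching any lookup entry.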