Tokenization

See Tokenization for more information about this algorithm framework.

Creating a Tokenization Algorithm via API

Retrieve the frameworkId for the Tokenization framework using the following endpoint:

algorithm   GET /algorithm/frameworks

The framework information should look similar to the following:

{
    "frameworkId": 13,
    "frameworkName": "Tokenization",
    "frameworkType": "STRING",
    "plugin": {
        "pluginId": 7,
        "pluginName": "dlpx-core",
        "pluginAuthor": "Delphix Engineering",
        "pluginType": "EXTENDED_ALGORITHM"
    }
}
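
As a minimal sketch, the frameworkId can be looked up by name rather than hard-coded. The helper below is hypothetical and assumes the response body has already been parsed into a list of framework objects shaped like the one above; how the endpoint wraps or pages its results is not covered here.

```python
def find_framework_id(frameworks, name="Tokenization"):
    """Return the frameworkId of the first framework whose frameworkName matches."""
    for framework in frameworks:
        if framework.get("frameworkName") == name:
            return framework["frameworkId"]
    raise LookupError(f"no framework named {name!r}")


# Sample data copied from the framework information shown above.
sample_frameworks = [
    {
        "frameworkId": 13,
        "frameworkName": "Tokenization",
        "frameworkType": "STRING",
        "plugin": {
            "pluginId": 7,
            "pluginName": "dlpx-core",
            "pluginAuthor": "Delphix Engineering",
            "pluginType": "EXTENDED_ALGORITHM",
        },
    }
]

tokenization_id = find_framework_id(sample_frameworks)
print(tokenization_id)
```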

Create a Tokenization algorithm instance via the following endpoint:

algorithm   POST /algorithms

Configure the new algorithm with JSON-formatted input similar to the following:

{
    "algorithmName": "exampleTokenization",
    "algorithmType": "COMPONENT",
    "frameworkId": 13,
    "algorithmExtension": {
        "ivLength": 16,
        "fallback": "CHARACTER_MAPPING",
        "cmCharacterGroups": [
            "[A-Za-z0-9+/]"
        ],
        "cmMinMaskedPositions": 1
    }
}
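
The request body can also be assembled programmatically. The following is a sketch only: `build_tokenization_payload` is a hypothetical helper, its checks mirror the limits documented for the algorithm extension, and leaving out the cm* fields when the fallback is NONE is an assumption. Sending the payload to POST /algorithms is omitted because the base URL and authentication are outside the scope of this page.

```python
import json


def build_tokenization_payload(name, framework_id, iv_length=16,
                               fallback="CHARACTER_MAPPING",
                               cm_character_groups=None,
                               cm_min_masked_positions=1):
    """Build a request body shaped like the JSON example above."""
    if not 0 <= iv_length <= 16:
        raise ValueError("ivLength must be between 0 and 16")
    if fallback not in ("NONE", "CHARACTER_MAPPING"):
        raise ValueError("fallback must be NONE or CHARACTER_MAPPING")
    if cm_min_masked_positions < 0:
        raise ValueError("cmMinMaskedPositions must be 0 or greater")

    extension = {"ivLength": iv_length, "fallback": fallback}
    if fallback == "CHARACTER_MAPPING":
        # Assumption: the cm* fields are only sent with this fallback.
        extension["cmCharacterGroups"] = cm_character_groups or ["[A-Za-z0-9+/]"]
        extension["cmMinMaskedPositions"] = cm_min_masked_positions

    return {
        "algorithmName": name,
        "algorithmType": "COMPONENT",
        "frameworkId": framework_id,
        "algorithmExtension": extension,
    }


# frameworkId 13 is the value from the example; look up the real one first.
payload = build_tokenization_payload("exampleTokenization", 13)
print(json.dumps(payload, indent=4))
```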

Tokenization Algorithm Extension

ivLength (default=16, minimum=0, maximum=16)

Integer
The length of the initialization vector (IV) used for AES in CBC-CTS mode. The default length of 16 offers the most security, at the cost of a longer masked result; a shorter IV shortens the masked result. Select an IV length of 0 only if you require each input to mask to the same single output consistently across jobs.

fallback (required, no default)

String
Specifies how to handle a value whose AES-encrypted result is too long to fit in the column. There are two fallback options:
- NONE - the job fails if the masked result is too long
- CHARACTER_MAPPING - the Character Mapping algorithm is used to tokenize the value, which produces a result that is the same length as the input
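
The decision described by the two options can be sketched as follows. This illustrates the documented behavior, not the engine's implementation: `resolve_masked_value` is a hypothetical name, and `character_mapping` is a caller-supplied stand-in for the Character Mapping algorithm.

```python
def resolve_masked_value(original, encrypted, column_size, fallback,
                         character_mapping):
    """Pick the final masked value according to the fallback setting."""
    if len(encrypted) <= column_size:
        return encrypted  # the AES result fits, so no fallback is needed
    if fallback == "NONE":
        raise ValueError("masked result is too long for the column; the job fails")
    if fallback == "CHARACTER_MAPPING":
        # The fallback result has the same length as the input.
        return character_mapping(original)
    raise ValueError(f"unknown fallback {fallback!r}")


def fake_character_mapping(value):
    """Stand-in only: reverses the input to keep its length. NOT real masking."""
    return value[::-1]


masked = resolve_masked_value(
    original="4111111111111111",
    encrypted="x" * 44,  # pretend AES output that is too long for the column
    column_size=16,
    fallback="CHARACTER_MAPPING",
    character_mapping=fake_character_mapping,
)
print(len(masked))
```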

Extension for Character Mapping fallback

cmCharacterGroups (default=["[A-Za-z0-9+/]"])

Array of Strings
A list of String values defining the characters to be masked. Each group must be either:
- a Java regex style character group beginning with '['
- a String of the literal characters that comprise the group

Duplication of characters within or among groups is not permitted.
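
A rough pre-check for duplicated characters, run before submitting a configuration, might look like the sketch below. It makes several assumptions: the function names are hypothetical, regex-style groups are expanded with Python's `re` module over printable ASCII only, and Java-specific class syntax (such as `&&` intersections) is not handled. Because a regex class is expanded to a set of characters, repeats inside a single regex-style group are not detected; repeats inside a literal group are.

```python
import re
import string


def expand_group(group, alphabet=string.printable):
    """Expand one character group into the list of characters it covers."""
    if group.startswith("["):
        # Assumption: simple classes behave the same in Java and Python.
        pattern = re.compile(group)
        return [ch for ch in alphabet if pattern.fullmatch(ch)]
    return list(group)  # a literal group lists its characters directly


def find_duplicates(groups):
    """Return the sorted characters that appear more than once across groups."""
    seen = set()
    duplicates = set()
    for group in groups:
        for ch in expand_group(group):
            if ch in seen:
                duplicates.add(ch)
            seen.add(ch)
    return sorted(duplicates)


print(find_duplicates(["[A-Za-z0-9+/]"]))
print(find_duplicates(["[A-Z]", "ABC"]))
```

An empty result means no duplicates were found within the limits of this check; the engine's own validation remains authoritative.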

cmMinMaskedPositions (default=1, minimum=0)

Integer
The minimum number of positions that must be replaced for masking to be considered successful. Non-conformant data handling is triggered whenever fewer positions are masked. Inputs containing only whitespace never trigger non-conformant data handling.
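
The rule above can be sketched as a small check. This is an illustration under assumptions, not the engine's logic: it treats a position as masked when its character belongs to a configured group, and the function names are hypothetical. The source does not say how an empty string is handled; here it is treated like any other input.

```python
import string

# Characters covered by the default group "[A-Za-z0-9+/]".
DEFAULT_MASKABLE = set(string.ascii_letters + string.digits + "+/")


def count_maskable_positions(value, maskable_chars=DEFAULT_MASKABLE):
    """Count positions whose character belongs to a character group."""
    return sum(1 for ch in value if ch in maskable_chars)


def is_nonconformant(value, min_masked_positions=1,
                     maskable_chars=DEFAULT_MASKABLE):
    """Return True if the value would trigger non-conformant data handling."""
    if value.isspace():
        return False  # whitespace-only inputs never trigger it
    return count_maskable_positions(value, maskable_chars) < min_masked_positions


for sample in ["abc-123", "---", "   "]:
    print(repr(sample), is_nonconformant(sample))
```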