Tokenization

See Tokenization for more information about this algorithm framework.

Creating a Tokenization Algorithm via API

  1. Retrieve the frameworkId of the Tokenization framework from the following endpoint:

    algorithm   GET /algorithm/frameworks
    

    The framework information should look similar to the following:

    {
        "frameworkId": 13,
        "frameworkName": "Tokenization",
        "frameworkType": "STRING",
        "plugin": {
            "pluginId": 7,
            "pluginName": "dlpx-core",
            "pluginAuthor": "Delphix Engineering",
            "pluginType": "EXTENDED_ALGORITHM"
        }
    }
    
  2. Create a Tokenization algorithm instance via the following endpoint:

    algorithm   POST /algorithms
    

    Configure the new algorithm with JSON-formatted input similar to the following:

    {
        "algorithmName": "exampleTokenization",
        "algorithmType": "COMPONENT",
        "frameworkId": 13,
        "algorithmExtension": {
            "ivLength": 16,
            "fallback": "CHARACTER_MAPPING",
            "cmCharacterGroups": [
                "[A-Za-z0-9+/]"
            ],
            "cmMinMaskedPositions": 1
        }
    }
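
The two steps above can be sketched in Python as a lookup followed by building the request body. This is a minimal illustration, not the product's client library: the helper names `find_framework_id` and `build_tokenization_request` are hypothetical, the framework list is assumed to be available as a list of objects shaped like the sample in step 1, and omitting the `cm*` fields when the fallback is not CHARACTER_MAPPING is an assumption. Sending the body to POST /algorithms (base URL, authentication) is left out because those details are not covered here.

```python
import json


def find_framework_id(frameworks, name="Tokenization"):
    """Return the frameworkId of the framework whose frameworkName matches."""
    for framework in frameworks:
        if framework.get("frameworkName") == name:
            return framework["frameworkId"]
    raise LookupError(f"no framework named {name!r}")


def build_tokenization_request(framework_id, algorithm_name,
                               iv_length=16, fallback="CHARACTER_MAPPING"):
    """Build a request body shaped like the example in step 2."""
    extension = {"ivLength": iv_length, "fallback": fallback}
    if fallback == "CHARACTER_MAPPING":
        # Assumption: these fields only matter for the Character Mapping fallback.
        extension["cmCharacterGroups"] = ["[A-Za-z0-9+/]"]
        extension["cmMinMaskedPositions"] = 1
    return {
        "algorithmName": algorithm_name,
        "algorithmType": "COMPONENT",
        "frameworkId": framework_id,
        "algorithmExtension": extension,
    }


# Sample framework entry copied from step 1 (plugin details omitted).
frameworks = [
    {"frameworkId": 13, "frameworkName": "Tokenization", "frameworkType": "STRING"}
]
body = build_tokenization_request(find_framework_id(frameworks), "exampleTokenization")
print(json.dumps(body, indent=4))
```

Looking the ID up by name, rather than hard-coding 13, avoids depending on a value that may differ between installations.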
    

Tokenization Algorithm Extension

  • ivLength (default=16, minimum=0, maximum=16)

    Integer
    The length of the initialization vector (IV) used for AES in CBC-CTS mode. The default length of 16 offers the most security, but it also produces the longest masked result; a lower IV length shortens the masked result. Select an IV length of 0 only if you require each input to mask to the same single output, consistently between jobs.

  • fallback (required, no default)

    String
    Specifies how to handle a value whose encrypted result does not fit in the column. If an AES-encrypted result is too long to fit into the field, there are two fallback options:
    - NONE - the job fails if the masked result is too long
    - CHARACTER_MAPPING - the Character Mapping algorithm is used to tokenize the value, which produces a result that is the same length as the input
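
The constraints on ivLength and fallback described above can be checked on the client side before a request is sent. The following is a rough sketch under that assumption; `check_core_extension` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
ALLOWED_FALLBACKS = {"NONE", "CHARACTER_MAPPING"}


def check_core_extension(extension):
    """Return a list of problems found in the ivLength and fallback fields."""
    problems = []

    # ivLength is optional and defaults to 16; allowed range is 0..16.
    iv_length = extension.get("ivLength", 16)
    if isinstance(iv_length, bool) or not isinstance(iv_length, int):
        problems.append("ivLength must be an integer")
    elif not 0 <= iv_length <= 16:
        problems.append("ivLength must be between 0 and 16")

    # fallback is required and has no default.
    if "fallback" not in extension:
        problems.append("fallback is required")
    elif extension["fallback"] not in ALLOWED_FALLBACKS:
        problems.append("fallback must be NONE or CHARACTER_MAPPING")

    return problems


print(check_core_extension({"ivLength": 16, "fallback": "CHARACTER_MAPPING"}))
print(check_core_extension({"ivLength": 32}))
```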

Extension for Character Mapping fallback

  • cmCharacterGroups (default=["[A-Za-z0-9+/]"])

    Array of Strings
    A list of String values defining the characters to be masked. Each group must be either:
    - a Java regex style character group beginning with '['
    - a String of the literal characters that comprise the group

    Duplication of characters within or among groups is not permitted.

  • cmMinMaskedPositions (default=1, minimum=0)

    Integer
    The minimum number of positions that must be replaced for masking to be considered successful. Non-conformant data handling is triggered whenever fewer positions are masked. Inputs containing only whitespace never trigger non-conformant data handling.
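
To illustrate the rule that characters may not be duplicated within or among groups, here is a simplified sketch that expands each group and reports repeated characters. The helpers `expand_group` and `find_duplicates` are hypothetical, and the parsing is deliberately limited: it handles literal strings and plain bracket groups made of single characters and `a-z` style ranges, not negation, escapes, or nested classes from the full Java regex syntax.

```python
def expand_group(group):
    """Expand one character group into the list of characters it contains."""
    if not group.startswith("["):
        return list(group)  # literal group: every character stands for itself
    if not group.endswith("]"):
        raise ValueError(f"unterminated character group: {group!r}")

    inner = group[1:-1]
    chars = []
    i = 0
    while i < len(inner):
        # A '-' between two characters denotes an inclusive range.
        if i + 2 < len(inner) and inner[i + 1] == "-":
            start, end = ord(inner[i]), ord(inner[i + 2])
            chars.extend(chr(code) for code in range(start, end + 1))
            i += 3
        else:
            chars.append(inner[i])
            i += 1
    return chars


def find_duplicates(groups):
    """Return the sorted characters that occur more than once across all groups."""
    seen = set()
    duplicates = set()
    for group in groups:
        for ch in expand_group(group):
            if ch in seen:
                duplicates.add(ch)
            seen.add(ch)
    return sorted(duplicates)


print(find_duplicates(["[A-Za-z0-9+/]"]))  # the documented default
print(find_duplicates(["[a-c]", "cd"]))    # 'c' appears in both groups
```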