Segment Mapping

Segment Mapping algorithms produce no overlaps or repetitions in the masked data. They let you create unique masked values by dividing a target value into separate segments and masking each segment individually.

You can mask up to a maximum of 36 values using segment mapping. You might use this method if you need columns with unique values, such as Social Security Numbers, primary key columns, or foreign key columns. When you mask primary and foreign keys with segment mapping, you must use the same Segment Mapping algorithm for both so that the masked keys still match. You can set the algorithm to produce alphanumeric results (letters and numbers) or only numbers.

With Segment Mapping, you can set the algorithm to ignore specific characters. For example, you can choose to ignore dashes (-) so that the same Social Security Number is identified no matter how it is formatted. You can also preserve certain values. For example, to increase the randomness of masked values, you can preserve a single number such as 5 wherever it occurs. Or, if you want to leave some information unmasked, such as the last four digits of Social Security Numbers, you can preserve that information.
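As a rough illustration of why ignoring characters makes differently formatted inputs equivalent, the sketch below removes an ignore list from a value before comparing. The function name `strip_ignored` and the sample number are hypothetical; this is not the Masking Engine's implementation.

```python
def strip_ignored(value: str, ignore_chars: str = "-") -> str:
    """Return value without any character listed in ignore_chars (illustrative only)."""
    return "".join(ch for ch in value if ch not in ignore_chars)


# Two formattings of the same made-up number reduce to the same string.
formatted = strip_ignored("123-45-6789")
unformatted = strip_ignored("123456789")
print(formatted == unformatted)  # prints True
```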

To decide whether Character Mapping or Segment Mapping is the correct option for your use case, see Choosing Between Character and Segment Mapping Frameworks.

Creating a Segment Mapping Algorithm via UI

  1. In the upper right-hand region of the Algorithm tab, click Add Algorithm.

  2. Select Segment Mapping Algorithm. The Create Segment Mapping Algorithm pane appears.

  3. Enter an Algorithm Name.

    Info

    This MUST be unique.

  4. Enter a Description.

  5. From the No. of Segment drop-down menu, select how many segments you want to mask.

    NOTE

    This number does NOT include the values you want to preserve.

    The minimum number of segments is 2; the maximum is 9. A box appears for each segment.

  6. For each segment, choose the Type of segment from the drop-down: Numeric or Alphanumeric.

    Info

    Numeric segments are masked as whole segments. Alphanumeric segments are masked by individual characters.

  7. For each segment, select its Length (number of characters) from the drop-down menu. The maximum is 4.

  8. Optionally, for each segment, specify range values, for example to satisfy particular application requirements. See Specifying Range Values below.

  9. Preserve Original Values by entering Starting position and length values. (Position starts at 1.) For example, to preserve the second, third, and fourth values, enter Starting position 2 and length 3.

    If you need additional value fields, click Add.

  10. To override how the segment mapping algorithm handles data values in an unexpected format, change the selection under Nonconforming Data behavior:

    • Use global setting (default): the algorithm follows the setting specified on the Algorithm Settings page.

    • Mark job as Failed: the algorithm throws an exception, which causes the job to fail.

    • Mark job as Succeeded: the algorithm ignores the nonconforming data and does not throw an exception. The nonconforming data is left unmasked, but the Monitor page displays a warning that you can use to report the nonconforming data events.

  11. When you are finished, click Save.

  12. Before you can use the algorithm in a profiling job, you must add it to a domain. If you are not using the Masking Engine Profiler to create your inventory, you do not need to associate the algorithm with a domain.
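The Starting position and length convention from step 9 (positions start at 1) can be sketched as follows. The helper names `preserved_positions` and `split_preserved` are hypothetical and only illustrate the arithmetic; they are not part of the product.

```python
def preserved_positions(start: int, length: int) -> list[int]:
    """1-based positions covered by a Starting position / length pair."""
    if start < 1 or length < 1:
        raise ValueError("start and length must both be at least 1")
    return list(range(start, start + length))


def split_preserved(value: str, start: int, length: int) -> tuple[str, str]:
    """Return (preserved characters, characters left for masking)."""
    keep = set(preserved_positions(start, length))
    preserved = "".join(ch for i, ch in enumerate(value, start=1) if i in keep)
    maskable = "".join(ch for i, ch in enumerate(value, start=1) if i not in keep)
    return preserved, maskable


print(preserved_positions(2, 3))        # [2, 3, 4]
print(split_preserved("ABCDEF", 2, 3))  # ('BCD', 'AEF')
```

Starting position 2 with length 3 therefore covers the second, third, and fourth characters, matching the example in step 9.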

Specifying Range Values

You can specify ranges for Real Values and Mask Values. With Real Values ranges, you specify all the possible real values that map to the ranges of masked values. Any values NOT listed in the Real Values ranges mask to themselves, that is, they are left unchanged.

Specifying range values is optional. If you need unique values (for example, masking a unique key column), you MUST leave the range values blank. If you plan to certify your data, you must specify range values.

When determining a numeric or alphanumeric range, remember that a narrow range will likely generate duplicate values, which will cause your job to fail.

  1. To ignore specific characters, enter one or more characters in the Ignore Character List box. Separate values with a comma.

  2. To ignore the comma character (,), select the Ignore comma (,) checkbox.

  3. To ignore control characters, select Add Control Characters. The Add Control Characters window appears.

  4. Select the individual control characters that you would like to ignore, or choose Select All or Select None.

  5. When you are finished, click Save.

  6. You are returned to the Segment Mapping pane.

Numeric Segment Type

  • Min# — A number; the first value in the range. The value can be from 1 digit up to the length of the segment. For example, for a 3-digit segment, you can specify 1, 2, or 3 digits. Acceptable characters: 0-9.

  • Max# — A number; the last value in the range. The value should be the same length as the segment. For example, for a 3-digit segment, you should specify 3 digits. Acceptable characters: 0-9.

  • Range# — A range of numbers; separate values in this field with a comma (,). Each value should be the same length as the segment. For example, for a 3-digit segment, you should specify 3 digits. Acceptable characters: 0-9.

Info

If you do not specify a range, the Masking Engine uses the full range. For example, for a 4-digit segment, the Masking Engine uses 0-9999.
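The default range above, and the earlier caution that a narrow range is likely to generate duplicates, can be checked with simple arithmetic. This is a minimal sketch with hypothetical helper names. It counts values only and does not model how the Masking Engine assigns masked values.

```python
def full_numeric_range(segment_length: int) -> tuple[int, int]:
    """Default (min, max) for a numeric segment of the given length."""
    if segment_length < 1:
        raise ValueError("segment_length must be at least 1")
    return 0, 10 ** segment_length - 1


def range_size(low: int, high: int) -> int:
    """Number of distinct values in the inclusive range low..high."""
    return high - low + 1


def duplicates_unavoidable(distinct_inputs: int, low: int, high: int) -> bool:
    """True when there are more distinct inputs than values in the range."""
    return distinct_inputs > range_size(low, high)


print(full_numeric_range(4))               # (0, 9999)
print(duplicates_unavoidable(100, 1, 77))  # True
```

If a segment can take more distinct real values than the mask range contains, at least two of them must share a masked value, which is why unique key columns need the range fields left blank.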

Alphanumeric Segment Type

  • Min# — A number from 0 to 9; the first value in the range.

  • Max# — A number from 0 to 9; the last value in the range.

  • MinChar — A letter from A to Z; the first value in the range.

  • MaxChar — A letter from A to Z; the last value in the range.

  • Range# — A range of alphanumeric characters; separate values in this field with a comma (,). Individual values can be a number from 0 to 9 or an uppercase letter from A to Z. (For example, B,C,J,K,Y,Z)

Info

If you do not specify a range, the Masking Engine uses the full range (A-Z, 0-9). If you do not know the format of the input, leave the range fields empty. If you know the format of the input (for example, always alphanumeric followed by numeric), you can enter range values such as A2 and S9.

Warning

The Segment Mapping pattern and sub-patterns must match the data for it to be masked. If the data is longer than the defined pattern, it is passed through unmasked. To avoid this, set the patterns (segments), Ignore Characters, and Preserve Original Values so that together they match the data.
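One way to catch mismatches before running a job is a length pre-check on sample data. The sketch below assumes that a value conforms when its length, after removing ignored characters, equals the sum of the segment lengths plus the preserved lengths. That rule and the function names are assumptions for illustration; the text above only states what happens when data is longer than the pattern.

```python
def expected_length(segment_lengths: list[int], preserved_lengths: list[int]) -> int:
    """Total characters accounted for by masked segments and preserved spans."""
    return sum(segment_lengths) + sum(preserved_lengths)


def fits_pattern(
    value: str,
    segment_lengths: list[int],
    preserved_lengths: list[int],
    ignore_chars: str = "",
) -> bool:
    """Assumed pre-check: compare the stripped length with the pattern length."""
    stripped = "".join(ch for ch in value if ch not in ignore_chars)
    return len(stripped) == expected_length(segment_lengths, preserved_lengths)


# Hypothetical layout: 2 preserved characters, then segments of 2, 4, and 2.
print(fits_pattern("NM831026-04", [2, 4, 2], [2], ignore_chars="-"))    # True
print(fits_pattern("NM831026-0455", [2, 4, 2], [2], ignore_chars="-"))  # False
```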

For information on creating Segment Mapping algorithms through the API, see API Calls for Creating Algorithms - Segment Mapping.

Examples

Suppose you need to create a segment mapping algorithm for an account number. You can separate the account number into segments, preserving the first two-character segment, replacing a segment with a specific value, and preserving a hyphen. The following is a sample value for this account number:

NM831026-04

Where:

  • NM is a plan code number that you want to preserve, always a two-character alphanumeric code.

  • 831026 is the uniquely identifiable account number. To ensure that you do not inadvertently create actual account numbers, you can replace the first two digits with a sequence that never appears in your account numbers in that location. (For example, you can replace the first two digits with 98 because 98 is never used as the first two digits of an account number.) To do that, you want to split these six digits into two segments.

  • -04 is a location code. You want to preserve the hyphen and you can replace the two digits with a number within a range (in this case, a range of 1 to 77).
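The breakdown above can be sketched as a small parser that splits the sample value into its parts. The group names (`plan`, `prefix`, `rest`, `location`) are hypothetical, and the pattern assumes that the plan code uses uppercase letters or digits. It illustrates the segmentation only and performs no masking.

```python
import re

# Assumed layout: 2-character plan code, 2 + 4 account digits, hyphen, 2-digit location.
ACCOUNT_PATTERN = re.compile(
    r"^(?P<plan>[A-Z0-9]{2})(?P<prefix>\d{2})(?P<rest>\d{4})-(?P<location>\d{2})$"
)


def split_account(value: str) -> dict[str, str]:
    """Split an account number into the parts described in the example."""
    match = ACCOUNT_PATTERN.match(value)
    if match is None:
        raise ValueError(f"value does not match the expected layout: {value!r}")
    return match.groupdict()


print(split_account("NM831026-04"))
# {'plan': 'NM', 'prefix': '83', 'rest': '1026', 'location': '04'}
```

Here `plan` and the hyphen would be preserved, `prefix` is the two-digit segment to replace with 98, `rest` is the remaining four-digit segment, and `location` is the segment to mask within the range 1 to 77.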