Configuring Profiling Settings¶
In addition to using your Rule Set to determine the inventory of what to profile, a Profiling job uses a Profiler Set to determine which Expressions to apply when identifying your sensitive data. In the Profiler Settings, you can add the regular expressions and data type constraints that Profiler Sets use.
To display the Profiler Settings, click on the Settings tab and select Profiler on the left-hand side of the page.
The Profiler Settings screen displays each Expression along with its Domain, Expression Text or Constraints (Data Type) and minimum column length, Expression Name, Owner, and Expression profiling Level.
Column level expressions apply regex text patterns to column metadata, and data level expressions apply them to the data within the column, to identify sensitive data.
Column and data level expressions are case insensitive.
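The difference between the two levels can be illustrated with a minimal Python sketch. It is not the Masking Engine's implementation: the function names and sample patterns are hypothetical, Python's `re` module stands in for whatever regex engine the product uses, and treating a column as matched when any sampled value matches is an assumption made for illustration only.

```python
import re

def column_level_match(pattern: str, column_name: str) -> bool:
    """Return True if the pattern is found in the column name, ignoring case."""
    return re.search(pattern, column_name, re.IGNORECASE) is not None

def data_level_match(pattern: str, sample_values: list) -> bool:
    """Return True if any sampled value contains a match, ignoring case.

    The "any value" rule is an assumption of this sketch.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return any(regex.search(value) for value in sample_values if value is not None)

# Hypothetical patterns: one aimed at column names, one at data values.
FIRST_NAME_COLUMN = r"first_?name"
SSN_DATA = r"^\d{3}-\d{2}-\d{4}$"

name_hit = column_level_match(FIRST_NAME_COLUMN, "FIRST_NAME")
data_hit = data_level_match(SSN_DATA, ["123-45-6789", "n/a"])
```

Because both functions pass `re.IGNORECASE`, a pattern written in lower case still matches an upper-case column name such as `FIRST_NAME`.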
Matches with column level expressions can be limited by data type using Type Expressions. Type Expressions consist of a user-chosen name, a data type, an optional minimum field length and a domain to which the constraint applies. Supported data types are: String, Number, Date, Binary. Each type represents a number of native datatypes in the database. For example, VARCHAR2, NVARCHAR and TEXT fields will all be recognized as being String types for the purposes of profiling. The minDataLength field limits matches to fields that are at least as long as the minDataLength. Finally, the domainName field specifies which domain the constraint applies to.
String: All character types supported by the database, such as VARCHAR, NVARCHAR, CLOB and NCLOB are considered String types by the profiling logic. The minLength parameter considers the length specification of the column type, which may be characters or bytes. For example, Oracle supports VARCHAR2 fields measuring in either characters or in bytes. A VARCHAR2(20) column can hold 20 characters whereas a VARCHAR2(20 BYTE) column can hold 20 bytes, which may be fewer than 20 characters if multibyte characters are present. A type expression with a minLength of 20 will match to both.
Number: All numeric types are considered Number types by the profiling logic, including INTEGER, FLOAT, BIG_INTEGER, etc. The minLength parameter considers the number of base-10 digits supported by the type. For floating point values, minLength refers to the integral part of the number.
Date: The Date type includes all calendar date and date/time types, such as DATE and LOCAL_DATE_TIME types. The minLength parameter is not permitted for Date Type Expressions.
Binary: The Binary type includes large object types such as BLOB and BINARY. The minLength parameter considers the maximum storage size of the column in bytes.
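The four data types and the minimum length check can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: the `LOGICAL_TYPE` table contains only the native types named in the text above, and the `Column` and `TypeExpression` classes and the `type_matches` function are hypothetical. Skipping the length check when a column reports no length (as for large objects) is also an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Native type names taken from the examples in the text; a real database
# exposes many more, and the engine's actual mapping is not shown here.
LOGICAL_TYPE = {
    "VARCHAR": "String", "VARCHAR2": "String", "NVARCHAR": "String",
    "TEXT": "String", "CLOB": "String", "NCLOB": "String",
    "INTEGER": "Number", "FLOAT": "Number", "BIG_INTEGER": "Number",
    "DATE": "Date", "LOCAL_DATE_TIME": "Date",
    "BLOB": "Binary", "BINARY": "Binary",
}

@dataclass(frozen=True)
class Column:
    name: str
    native_type: str
    # Characters or bytes (String), integral digits (Number), bytes (Binary).
    length: Optional[int] = None

@dataclass(frozen=True)
class TypeExpression:
    name: str
    data_type: str  # "String", "Number", "Date" or "Binary"
    domain_name: str
    min_length: Optional[int] = None

    def __post_init__(self):
        # The text states that a minimum length is not permitted for Date.
        if self.data_type == "Date" and self.min_length is not None:
            raise ValueError("minimum length is not permitted for Date")

def type_matches(expr: TypeExpression, column: Column) -> bool:
    """Return True if the column satisfies the type expression."""
    if LOGICAL_TYPE.get(column.native_type.upper()) != expr.data_type:
        return False
    # Assumption: no length check when either side has no length.
    if expr.min_length is None or column.length is None:
        return True
    return column.length >= expr.min_length
```

With a String type expression whose minimum length is 5, a `VARCHAR2` column of length 20 would match in this sketch, while a `VARCHAR2` column of length 1 or an `INTEGER` column would not.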
If more than one type expression is assigned to a domain, a column matches the domain if the regular expression matches and at least one of the type expressions matches. For example, dates of birth are often stored in string types instead of dates, so you might assign both a String type expression and a Date type expression to the Date of Birth domain to allow columns of either type to match. Two type expressions of the same type can't be assigned to the same domain in the same Profiler Set. If no type expressions are assigned to a domain, the profile expression alone determines matching, without regard to data type. Like Profile Expressions, Profile Type Expressions must be part of a Profiler Set in order to be effective. Profile Type Expressions have no effect on Data Level Profiling.
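The matching rule described above can be expressed as a short Python sketch. It assumes the column's native type has already been reduced to one of String, Number, Date or Binary; the `TypeExpr` tuple, the `column_matches_domain` function and the sample pattern are hypothetical names chosen for illustration, not part of the product.

```python
import re
from typing import NamedTuple, Optional, Sequence

class TypeExpr(NamedTuple):
    data_type: str                    # "String", "Number", "Date" or "Binary"
    min_length: Optional[int] = None  # optional minimum column length

def column_matches_domain(
    column_name: str,
    column_type: str,
    column_length: Optional[int],
    pattern: str,
    type_exprs: Sequence[TypeExpr],
) -> bool:
    """Regex must match; if type expressions exist, at least one must match too."""
    if re.search(pattern, column_name, re.IGNORECASE) is None:
        return False
    if not type_exprs:
        # No type expressions for the domain: the regex alone decides.
        return True
    return any(
        t.data_type == column_type
        and (
            t.min_length is None
            or column_length is None
            or column_length >= t.min_length
        )
        for t in type_exprs
    )

# Hypothetical Date of Birth setup: accept String or Date columns.
DOB_PATTERN = r"dob|birth"
DOB_TYPES = [TypeExpr("String", 8), TypeExpr("Date")]

string_dob = column_matches_domain("DOB", "String", 10, DOB_PATTERN, DOB_TYPES)
numeric_flag = column_matches_domain("DOB_FLAG", "Number", 1, DOB_PATTERN, DOB_TYPES)
```

In this sketch a String column named `DOB` matches, while a Number column named `DOB_FLAG` is rejected even though its name matches the regex, because neither type expression accepts it.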
Profile Type Expressions are only supported for database profiling. They have no effect on profiling of file data.
Currently only Oracle and MS SQL Server are fully supported. On other platforms, type expressions may result in unexpected matches.
To add an Expression¶
Click Add Expression at the top of the Profiler screen.
Select a Domain from the Domain dropdown.
- Domains are used by Profiling jobs to determine the masking Algorithm to apply to your sensitive data. When an Expression is matched, the Profiling job will associate the specified Domain to the sensitive data. The Masking Engine comes out of the box with over 30 pre-defined Domains. Domains can be added, edited, and deleted from the Settings Domains screen.
Enter the following information for the Expression:
- Expression Name— The name used to select this expression as part of a Profiler Set.
Select an Expression Level for the Expression:
Column Level— To identify sensitive data based on column names.
Data Level— To identify sensitive data based on data values, not column names.
Type Level— To identify sensitive data based on column data type.
The subsequent fields show up in the UI dynamically depending on the Expression Level selected.
- Enter the Expression Text — The regular expression used to identify sensitive data.
Select Constraints (Data Type) for the expression: String, Numeric, Binary, Date.
Select Minimum Column Length for the data type.
Length constraints are not applied to large object types such as CLOBs and BLOBs.
For example, if we want to ensure that column level profiling only identifies a column with the FIRST_NAME domain if the column is a string type and has a capacity of at least 5 characters, we could add a Type Level expression for the FIRST_NAME domain with a String data type constraint and a Minimum Column Length of 5.
When you are finished, click Save.
To edit a saved Expression, click the Edit icon to the right of the Expression.
To delete an Expression¶
Click the Delete icon to the far right of the name.
A Profiler Set is a grouping of Expressions for a particular purpose. For instance, First Name, Last Name, Address, Credit Card, SSN, and Bank Account Number Expressions could constitute a Financial Profiler Set.
The Masking Engine comes with two predefined Profiler Sets: Financial and Healthcare vertical. A Delphix Masking Engine administrator (a user with the appropriate role privileges) can add, update, and delete these Profiler Sets.
If you want to edit or add a Profiler Set, click Profiler Set at the top of the Profiler Settings screen. The Profiler Set dialog appears, listing the Profiler Sets along with their Purpose, Owner, and Date Created.
To add a Profiler Set¶
- Click Add Set at the top of the dialog window.
- Enter a Profiler Set Name.
- Optionally, enter a Purpose for this Profiler Set.
- Enter or select which Expressions to include in this set.
- Enter or select which Type Expressions to include in this set.
- When you are finished, click Submit.
To edit an existing Profiler Set, click the Edit icon to the right of the Profiler Set name.
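As a mental model of what the steps above assemble, a Profiler Set can be pictured as a named group of Expressions plus Type Expressions. The Python sketch below is purely illustrative: the `ProfilerSet` class and its methods are hypothetical and do not reflect the Masking Engine's data model or API. It also enforces the rule that two type expressions of the same type cannot be assigned to the same domain within one set.

```python
from dataclasses import dataclass, field

@dataclass
class ProfilerSet:
    name: str
    purpose: str = ""                                     # optional
    expressions: list = field(default_factory=list)       # expression names
    type_expressions: list = field(default_factory=list)  # (domain, data_type) pairs

    def add_expression(self, expression_name: str) -> None:
        """Include an Expression in the set, ignoring repeats."""
        if expression_name not in self.expressions:
            self.expressions.append(expression_name)

    def add_type_expression(self, domain: str, data_type: str) -> None:
        """Include a Type Expression, rejecting a duplicate type for a domain."""
        if (domain, data_type) in self.type_expressions:
            raise ValueError(
                f"{data_type} type expression already assigned to {domain}"
            )
        self.type_expressions.append((domain, data_type))

# Hypothetical set resembling the Financial example in the text.
financial = ProfilerSet("Financial", purpose="Financial data")
for expr in ["First Name", "Last Name", "Credit Card", "SSN"]:
    financial.add_expression(expr)
financial.add_type_expression("DOB", "String")
financial.add_type_expression("DOB", "Date")
```

Adding a second String type expression for the `DOB` domain to `financial` would raise a `ValueError` in this sketch.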
To delete a Profiler Set¶
Click the Delete icon to the right of the Profiler Set name.