Creating Masking Jobs¶

This section describes how users can create a masking job.

Creating New Jobs¶

In the Environment Overview screen, select one of the jobs icons to create the corresponding job:

Profile
Mask

Creating a New Masking Job¶

To create a new masking job:

Click Mask. The Create Masking Job window appears.
You will be prompted for the following information:
1. Job Name — A free-form name for the job you are creating. Must be unique across the entire application.
2. Masking Method — Select either In-Place or On-The-Fly. In-Place jobs update the source environment with the masked values. On-The-Fly jobs read unmasked data from the source environment and writes the masked data to the target environment.
  
  INFO: On-The-Fly Masking Jobs.
  
  Only certain combinations of connector types are supported. On-The-Fly jobs where the source and target connectors are of the same type (e.g. Oracle to Oracle, delimited file to delimited file), and jobs with a database source (e.g. Oracle, MS SQL) and the target is delimited files are supported.
  
  The target tables or files must be created in advance and the names must match the names of the source tables or files. In the case of a database to delimited file job, the file names should match the table names.
3. Multi Tenant — Checkbox if the job is for a multi-tenant database.
  
  INFO: Provisioning Masked VDBs.
  
  A job must be Multi-Tenant to use it when creating a masked virtual database (VDB). This option allows existing rulesets to be reused to mask identical schemas via different connectors. The connector can be selected at job execution time.
4. Rule Set — Select a rule set that this job will execute against.
5. Source Environment (only for On-The-Fly Masking Method) - Select the Source Environment that this job will get the data from.
6. Source Connector (only for On-The-Fly Masking Method) - Select the Source Connector that provides the connection to the the chosen Source Environment.
7. Streams: Number—The number of parallel streams to use when running the job. For example, you can select two streams to mask two tables in the Rule Set concurrently in the job instead of one table at a time.
  
  INFO: Choosing the Number of Streams
  
  Jobs - even with a single stream - will have separate execution threads for input, masking, and output logic. While it is not necessary to increase the number of streams to engage multiple CPU cores in a job, doing so may increase overall job performance dramatically, depending on a number of factors. These factors include the performance characteristics of the data source and target, the number of processor cores available to the Delphix Masking Engine, and the number and types of masking algorithms applied in the Rule Set. The memory requirements for a job increase proportionately with the number of streams.
8. Streams: Row Limit—The number of data rows that may be in process simultaneously for each masking stream. For file jobs, this controls the number of delimited or fixed-width lines, mainframe records, or XML elements in process at one time. Setting this value to 0 allows unlimited rows into each stream, while leaving it blank will select a default limit based on job type.
  
  INFO: Choosing the Row Limit
  
  The default Row Limit values have been selected to allow typical jobs to run successfully with the default job memory and streams number settings. This assumes a maximum row or record size of approximately 2000 bytes with 100 masked columns. If masked row or record size, or column count, exceed these values, it may be necessary to either allocate more memory to the job by increasing Max Memory, or reduce the Row Limit to a smaller value. Conversely, if the masked rows are quite small and have few masking assignments, increasing the Row Limit may improve job performance. Remember to consider the worst case (the largest rows, the most masking assignments) table or file format in the Rule Set when making this determination.
9. Min Memory (MB) — Minimum amount of memory to allocate for the job, in megabytes.
10. Max Memory (MB) — Maximum amount of memory to allocate for the job, in megabytes.
  
  Info
  
  It is recommended that the Min/Max Memory should be set to at least to 1024.
11. Update Threads — The number of update threads to run in parallel to update the target database.
  
  Warning
  
  Multiple threads should not be used if the masking job contains any table without an index. Multi-threaded masking jobs can lead to deadlocks on the database engine. Multiple threads can cause database engine deadlocks for databases using T-SQL If masking jobs fail and a deadlock error exists on the database engine, then reduce the number of threads.
12. If Nonconforming Data is encountered
  - Stop job on first occurrence - (optional) To abort a job on the first occurrence of nonconforming data. The default is for this checkbox to be clear.
  Info
  
  The job behavior depends on the setting specified in the If Nonconforming data is encountered field on the Algorithm Settings page. If Mark job as Failed is selected then the job would be aborted on the first occurrence of nonconforming data. If Mark job as Succeeded is selected then the job will not be aborted.
13. Commit Size — (optional) The number of rows to process before issuing a commit to the database.
14. Feedback Size — (optional) The number of rows to process before writing a message to the logs. Set this parameter to the appropriate level of detail required for monitoring your job. For example, if you set this number significantly higher than the actual number of rows in a job, the progress for that job will only show 0 or 100%.
15. Disable Constraint — (optional) Whether to automatically disable database constraints. The default is for this check box to be clear and therefore not perform automatic disabling of constraints. For more information about database constraints see Enabling and Disabling Database Constraints.
16. Batch Update — (optional) Enable or disable use of a batch for updates. A job's statements can either be executed individually, or can be put in a batch file and executed at once, which is faster.
17. Disable Trigger — (optional) Whether to automatically disable database triggers. The default is for this check box to be clear and therefore not perform automatic disabling of triggers.
18. Drop Indexes — (optional) Whether to automatically drop indexes on columns which are being masked and automatically re-create the index when the masking job is completed. The default is for this check box to be clear and therefore not perform automatic dropping of indexes.
19. Prescript — (optional) Specify the full pathname of a file that contains SQL statements to be run before the job starts, or click Browse to specify a file. If you are editing the job and a prescript file is already specified, you can click the Delete button to remove the file. (The Delete button only appears if a prescript file was already specified.) For information about creating your own prescript files.
20. Postscript — (optional) Specify the full pathname of a file that contains SQL statements to be run after the job finishes, or click Browse to specify a file. If you are editing the job and a postscript file is already specified, you can click the Delete button to remove the file. (The Delete button only appears if a postscript file was already specified.) For information about creating your own postscript files see Creating SQL Statements to Run Before and After Jobs
21. Comments — (optional) Add comments related to this masking job.
22. Email — (optional) Add e-mail address(es) to which to send status messages.
When you are finished, click Save.

Enabling and Disabling Database Constraints¶

Depending on the type of target database you are using, the Delphix Engine can automatically enable and disable database constraints.

The ability to enable and disable constraints ensures that the Delphix Engine can update columns that have primary key or foreign key relationships. You can set Delphix to handle constraints automatically by enabling the Disable Constraint checkbox on a Masking job.

Note

Delphix does not support the enable/disable constraints feature for all databases. To see which databases are supported, see the Data Source Support page.

Creating SQL Statements to Run Before and After Jobs¶

When you create a masking job or a certification job, you can specify standard, static SQL statements to run before (prescript) you run a job and/or after (postscript) the job has completed. For example, if you want to mask a column that has a foreign key constraint to another table, you could use a prescript to disable the constraint and a postscript to re-enable the constraint.

You create prescripts and postscripts by creating a text document with the SQL statement(s) to execute. If the text file contains more than one SQL statement, each statement must be separated by a semicolon [;]. For example to remove records with date_column before December 12th, 2017 before masking a table (owner.table), one would create a prescript file containing the following and associate the prescript file to the masking job that includes the table in its ruleset:

DELETE FROM owner.table WHERE date_column < ‘20171207’;

Database-specific, SQL programming extensions (such as PL/SQL and Transact-SQL) and dynamic SQL statements are not supported in prescripts and postscripts. However, you can create procedures and functions using your database tooling of choice and call them using standard SQL statements from a prescript or postscript.