Version 4 release 4.2, August 2024
General Description
DS
··· DS Life Cycle
··· ··· Create DS
··· ··· Transform DS
··· ··· Store DS
Type System
··· DS Types
··· ··· PlainText
··· ··· Structured
··· ··· Passthru
··· ··· Columnar
··· ··· Spatial DS Types
··· ··· ··· Point
··· ··· ··· Track
··· ··· ··· Polygon
··· Attribute, Variable, and Parameter Types
··· ··· Names
··· ··· String Literals
··· ··· Numerics
··· ··· Boolean Type
··· ··· NULL
··· ··· Typed Arrays
··· Custom Object Types
Script Structure
··· Top-level Structure
··· Expressions
··· ··· Automatic Evaluation in Strings and Names
··· ··· Expression Operators
··· ··· Expression Functions
··· Parameter List Expressions
Data Storage Interface Statements
··· CREATE DS Statement
··· COPY DS Statement
Data Transformation Statements
··· TRANSFORM DS Statement
Data Manipulation Statements
··· SELECT Statement
··· CALL Statement
··· ANALYZE Statement
Compound Control Flow Statements
··· LET Statement
··· IF Statement
··· LOOP Statement
··· CREATE PROCEDURE Statement
··· DROP PROCEDURE Statement
Settings Statements
··· OPTIONS Statement
Transform Definition Language version 4 (TDL4) is a domain-specific language designed to describe ETL processes performed with the Data Cooker ETL tool. The strongest emphasis is on the Transformation phase.
TDL4 syntax is inspired by and based on SQL, but it has substantial semantic differences because Data Cooker differs from a traditional RDBMS. Instead of SQL tables that consist of column-based rows, Data Cooker operates on strongly typed collections of Record Objects called 'DS' (for Data Sets / Data Streams). Also, the Transformation phase of ETL does not imply analytic workloads or the complex data manipulation that traditional SQL facilities usually provide.
As an SQL dialect, TDL4 is mainly declarative, but it also has substantial means of flow control and even procedural elements.
Another difference from SQL engines is that Data Cooker requires neither a global data catalog nor an information schema. A rudimentary partial schema is defined ad hoc by language statements only if required by the current context. It implies no control over data types, integrity, or consistency, and it does not enforce data constraints.
TDL4 allows high-level manipulation of DSs as a whole, and intentionally omits features like aggregation, window functions, and anything else that reduces several records of a given DS to a single value. Only record-level expression evaluation over record attributes is provided by SELECT and similar statements. Those Expression Operators and Functions are implemented as Java classes and thus are extensible.
At the same time, Data Cooker's language interpreter also provides built-in support for pluggable 'Operations' and 'Transforms' to allow more complex manipulation of DSs, written in Java. By implementing an Operation interface, developers can seamlessly integrate custom logic into ETL processes, including generating new DSs and changing data inside them as they like. With Transforms, it is possible to change the type of DS Record Objects.
Another level of extensibility is provided by the 'Storage Adapters' layer of plug-ins, which covers the Extract and Load phases of ETL processing.
TDL4 has a specific type system, customized for DSs that contain not only relatively flat Records but also complex Spatial Objects with embedded geometry data, because Data Cooker's heritage begins with a geo-informatics processing toolchain.
In Data Cooker terms, a DS is the basic entity of every ETL process.
A DS is a typed collection of Objects that may or may not have a complex type-dependent internal structure and/or a set of simple Attributes, accessible from the process by Names. Each Object, together with an opaque Key, forms an elementary unit of processing, the Record.
DSs are partitioned, i.e. split into chunks called Partitions, to balance the load of executors, because each executor processes a single Partition at a time. In most cases the number of Partitions must be set manually, but if it is not, it will be defined by the Input Adapter using intrinsic properties of the data source and the execution environment. Records are assigned to Partitions by the hash codes of their Keys. Correct definition of Keys is crucial to avoid 'data skew' (a situation when a small number of Partitions contain most of the records while all others are empty).
According to the definition of ETL, DSs can be created (once), transformed (many times), and stored (several times).
In Data Cooker, a DS can be created by several means, from an external source of data or from other existing DSs.
Method #1: Input Adapters extract data from a data source, which could be an object or file-based storage, a traditional database, or something else. Each Input Adapter supports creation of a single type of DS, but may support different data sources.
The initial Key of a newly created Record is set by the Input Adapter, and then the DS is partitioned using those initial Keys. Rules for a concrete Input Adapter are documented by its developer.
Method #2: A DS can be derived from one or many existing DSs by performing an SQL-like SELECT query. Type and partitioning of a derived DS are inherited from the parent DSs. See the SELECT section of this manual for details.
Method #3 is to call an Operation that generates a new DS from existing ones using its own rules. There are some common patterns described below in the CALL section, but exact rules are documented by the developer of the Operation.
After a DS is created, it may be transformed by Transforms, which change its Record type and/or Key, as well as the number of Partitions. The number of objects can also change if one source object corresponds to many resulting ones, and vice versa. As many consecutive Transforms as needed can be performed on a given DS. As Transforms are pluggable, transformation rules are documented by their developers.
If Record Keys are changed by applying a Transform, affected Records may be assigned to other Partitions.
The final step of a DS's life cycle is storage to some external data store, performed by an Output Adapter. An Output Adapter may store different types of DSs into different formats. Rules, as usual, are documented by the developer.
Differing from SQL, TDL4 data types are defined first at the level of the DS as a whole, for each Record Object, and then at the level of Record Attributes and language constructs.
Record Objects may or may not have an internal structure, and if a Record Object type does allow Attributes, each Attribute's type is dynamically coalesced to match the context of the expression that references it, if possible.
Supported types are:
PlainText: after loading a DS from an unstructured text source, each line is represented as text in the system encoding (usually UTF-8), or as a byte array. Before performing meaningful transformations on the data contained in such a set, it should be converted to another, structured type.
Vice versa, before saving structured DSs to text-based file formats, they should be converted to PlainText.
Structured: each object of a Structured DS is backed by a JSON representation, so it can be as arbitrary as fits within JSON limitations. Namely, String, Numeric, Boolean, and NULL properties are supported, as well as homogeneous arrays (with every member of the same type). The top-level object may be an array itself.
Each top-level property is an Attribute itself (read-write), and nested properties' Attribute names are compound (read-only). They may include an array index or range part too.
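For illustration, consider a Structured Record backed by the following JSON document (the property names here are made up):

```
{ "user": { "name": "alice", "age": 33 }, "tags": ["red", "green", "blue"] }
```

Here "user" and "tags" are top-level, read-write Attributes, while compound names such as "user.name", "user.age", "tags[0]", or "tags[1:]" address nested properties and array members or ranges in a read-only manner.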
Passthru: this type is not a data type per se, but just a marker reported from API calls to plugin Transforms, meaning that the Transform doesn't change the data type of the set.
Columnar: each object of a DS of this type is a record consisting of uniquely named columns, enumerated in the specified order. After enumeration, they can be accessed via Attributes of the same name.
Data in a column doesn't have an explicit type, nor does it have any metadata except the name. Columns omitted from a record are automatically assigned NULL values.
Because Data Cooker inherited the legacy of a preceding geo-informatics project, TDL4 natively supports geometry-specific Spatial data types, implemented as extensions to the JTS Topology Suite and GeoJSON.
Point: objects of this type are abstractions of Points on a map, with an optional set of arbitrary Attributes.
Being an extension of JTS Point, each object has the mandatory internal attributes of geographic 'latitude' and 'longitude'.
Track: an object of a Track DS is an abstraction of a route traveled by some physical object over the real-world map. It consists of Points, grouped together by an appointed 'tracking' Attribute and sorted by another 'timestamp' Attribute, optionally cut into segments by a 'segmentation' Attribute. If a segmentation Attribute is not defined, there will still be a single segment that includes all points of a track.
Each level of nested objects (Track, Segments, Points) may have an arbitrary number of other named Attributes.
Polygon: a region on a map, defined as a collection of a single external border outline and any number of internal cut-out outlines, with an arbitrary number of other named Attributes.
There is no direct access to the vertices of Polygon outlines, but there is an automatically computed Centroid Point, with its radius set to the distance to the most distant border vertex.
Outside of DSs, there are also script-level Variables and Expression- / Pluggable Entity-level Parameters, which may have their own types, more or less similar to SQL: Names, String Literals, Numerics, Booleans, NULL, and Typed Arrays.
Names are used to define and identify any entities that have to be referred to in the script, i.e. DSs, Attributes, Parameters, Variables, and so on.
The canonical representation of a Name is a string of Latin alphabet letters, digits (in any position except the first), and underscores, or a string of any valid Unicode characters enclosed in double quotes. If a Name should contain a double quote character itself, it must be doubled. If a Name clashes with a TDL4 keyword, it must be quoted.
ALL Names are case-sensitive.
The quoted representation of Names provides automatic expression evaluation (see the dedicated paragraph later).
Names can be joined by a dot to create compound Names, used in SELECTs with JOINs.
Names of Variables are always explicitly identified by a preceding 'dollar sign' $ sigil. Names of Parameters are likewise preceded by a 'commercial at' @ sigil.
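A few illustrative Names (all of them made up) are shown below; the first is canonical, the second and third are quoted because they contain a space and clash with a keyword respectively, and the last two carry the Variable and Parameter sigils:

```
signals_2024
"monthly signals"
"SELECT"
$threshold
@delimiter
```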
String literals are fragments of arbitrary Unicode text enclosed in apostrophes. If a String should contain an apostrophe, it must be doubled.
String literals are always automatically evaluated (see later).
There are different representations of immediate Numeric values allowed:
99 here means a sequence of decimal digits, HH a sequence of hexadecimal digits, and s a sign sigil (+ or -). Parts in square braces are optional; the symbols E, L, . (dot) and 0x are used as is.
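Under that notation, plausible literal forms include the following (an illustrative sketch only; consult your distribution's grammar for the exact set of accepted formats):

```
42      -128      42L      0x1F      3.1415      -0.5      2.5E+3
```

These cover plain decimal Integers, L-suffixed Longs, 0x-prefixed hexadecimals, dotted floating-point values, and exponent notation with E.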
Literal TRUE is used for Boolean truth and literal FALSE for Boolean falseness.
Like in SQL, NULL is used for non-value.
Most of the expressions that include NULL evaluate to NULL, except the ones that use a truth table similar to the OR operator, and some other specific cases (see the dedicated topic). As in SQL, no NULL is equal to another NULL.
A Variable or Parameter that is not defined in the current context evaluates to NULL as well.
Names, String Literals, and Numerics can be collected into Arrays. The syntax for an array constructor is a simple comma-separated list of values, enclosed in square brackets. The optional keyword ARRAY may be used before the opening bracket.
All values in an array must be of the same type as the first one, otherwise Data Cooker will throw an error and halt script execution. Arrays may be empty.
For Numeric Arrays that consist of Long values over some continuous range, there is a special syntax as well.
Ranges may be ascending or descending and include both boundaries. For example, RANGE[1,-1] is equivalent of ARRAY[1,0,-1].
Arrays may be used in LOOP control structures, with the IN operator, to pass collections to CALL parameters, and in other contexts.
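As a sketch of the constructor and range syntax (the Variable names are made up, and the LET assignment form is described later in the LET section):

```
LET $empty = ARRAY[];
LET $cities = ['Bremen', 'Hamburg', 'Kiel'];
LET $codes = ARRAY[200, 301, 404];
LET $days = RANGE[1, 31];
```

The last assignment produces the same Array of Longs as listing all 31 values explicitly.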
Some Operations and even Expression Functions define their own specific subtypes of data types, which they interpret with their own set of rules. Those types are referred to as 'Something-like Objects'.
For example, the standard date/time manipulation functions operate with DateTime-like Objects, which are represented by Long Epoch seconds or milliseconds, or by a String formatted as an ISO 8601-like timestamp.
As another example, some spatial Operations may extend the existing Point Record type with a custom 'radius' attribute measured in meters, which makes them POI-like Objects (for Points of Interest).
A TDL4 script is a sequence of any number of statements, each ending with a mandatory semicolon.
Each simple language statement is executed in the current execution context, starting with the top-level one, while compound statements run nested statements in their own nested context. The language statements are: CREATE DS, COPY DS, TRANSFORM DS, SELECT, CALL, ANALYZE, LET, IF, LOOP, CREATE PROCEDURE, DROP PROCEDURE, and OPTIONS.
Whitespace and tabulation characters are ignored and may be freely used for readability. A script may be empty; that is not an error.
Expressions can be evaluated in two distinct contexts. The first is the 'Loose Expression' context, which can refer to any script-level entities. The other is the 'Record-level Expression' context, which can additionally refer to Record Attributes by their Names.
Loose Expression context | Record-level Expression context |
---|---|
Record-level evaluation rules take place only where it makes sense.
Automatic expression evaluation is performed inside quoted Names and String literals. Any text between curly braces { and } inside a quoted Name or String is evaluated as an expression, and the result is substituted as a substring in place.
To turn off automatic evaluation, the opening brace should be prefixed with a backslash \.
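A minimal sketch of both behaviors, using made-up Variable names (the LET assignment form is described later in the LET section):

```
LET $name = 'signals';
LET $path = 'hdfs:///data/{$name}/2024';
LET $raw = 'these \{braces} stay as they are';
```

After evaluation, $path holds 'hdfs:///data/signals/2024', while $raw keeps the braces and their content as literal text.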
There are three SQL-specific 'peculiar' Operators implemented directly by the TDL4 interpreter: IN, IS, and BETWEEN.
Symbol (Variants) | Priority | Associativity | # of Operands | Description |
---|---|---|---|---|
Operand [NOT] IN ARRAY[], Operand [NOT] IN $array_var, Operand [NOT] IN array_attr | 35 | Right | 2 | TRUE if the left operand is present in the right one, cast to an Array, FALSE otherwise. Vice versa for the NOT variant |
Operand IS [NOT] NULL | 35 | Right | 1 | TRUE if the left operand is NULL, FALSE otherwise. Vice versa for the NOT variant |
Operand [NOT] BETWEEN Min AND Max | 35 | Right | 3 | TRUE if the left operand is inclusively between the Min and Max operands, cast to Numerics, FALSE otherwise. For the NOT variant, TRUE if it is outside the range, excluding boundaries |
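An illustrative use of all three, in Loose Expressions via LET assignments (Variable names are made up, and the BETWEEN Min AND Max spelling follows the table above):

```
LET $status = 301;
LET $redirected = $status IN ARRAY[301, 302, 307];
LET $defined = $status IS NOT NULL;
LET $client_error = $status BETWEEN 400 AND 499;
```

Here $redirected and $defined evaluate to TRUE, and $client_error to FALSE.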
All other Operators that can be used in Expressions are modeled after Java operators, are implemented elsewhere in the classpath, and their list is extensible / dependent on the distribution.
If an Operator requires operands of a concrete data type, they will be directly cast or stringified-then-parsed to that type. If the cast or parsing is impossible, Data Cooker throws an error and halts script execution.
Operators with higher priority are evaluated before lower-priority operators. In the case of the same priority, evaluation is left to right, except if the Operator is right-associative. For expression grouping and changing of priority, parentheses can be used.
Most Operators result in NULL if any of their operands are NULL, but some do handle NULLs in a specific way.
For exact list of available Operators, their Priority, Associativity, and NULL handling, look into automatically generated distro docs.
A function call has the following syntax:
Functions may have any number of strictly positional Arguments (1st, 2nd, 3rd, and so on). Some have a fixed list of strongly typed Arguments, some accept any number of Arguments that are automatically cast to a single type. There are also functions that have no arguments and therefore return some predefined value, or even refer to Object Record as a whole and access its internal data, such as Spatial Object's geometry.
In any case, Arguments are evaluated from left to right.
Record-level Functions can be called only from Record-level Expressions, and are invalid in the Loose Expression context.
Functions are provided in the same way as Operators, by Java implementations from the classpath. For the complete list of Functions and their return and Argument types, refer to your distro docs.
Parameter lists are expressions composed of parameters (at least one), separated by commas.
Each parameter is a name-to-value pair, where the value may be an expression or an array literal.
Parameter lists are usually passed to named function-like expressions (TRANSFORM, CALL, and so on).
Parameter lists are bound to Record-level context and can refer to DS Attributes.
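As a sketch, a parameter list might look like the one passed to the hypothetical Operation below (the Operation name, parameter names, and the name = value pairing shown here are illustrative assumptions; the CALL statement itself is described later in this manual):

```
CALL parseLines(@delimiter = ',', @columns = ARRAY[userid, lat, lon], @limit = $threshold * 2)
    INPUT FROM raw_lines OUTPUT INTO parsed;
```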
The CREATE DS statement creates a new DS from a data source. Its name must be new and unique.
The functional expression includes the Input Adapter name and its parameter list. The list of available Input Adapters depends on the distribution.
The FROM expression evaluates to a path String pointing to the data source. Each Input Adapter has its own path syntax, which could be a URI, a query, or something more specific.
The number of partitions is set by the PARTITION clause. Its expression evaluates to a Java Integer number of partitions. If not set, a single-partition DS is created.
Initial object keying, and thus initial partitioning, can be changed by the BY algorithm clause. The default algorithm is HASHCODE, which sets the initial Record Key to the Java hash code of the Record Object itself, so identical objects will be placed into the same partitions. SOURCE derives Keys from the source file name (or a storage-dependent analog, if it's not file-based), so all Records from the same source file will go to the same partition. RANDOM sets Keys to random values, effectively shuffling Records evenly between partitions.
In any case, the initial Key is a Java Integer numeric.
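A minimal sketch of the statement (the 'hadoopText' adapter name and its parameter are hypothetical, and the exact clause order should be checked against your distribution's grammar):

```
CREATE DS signals hadoopText(@delimiter = ',')
    FROM 'hdfs:///source/2024/signals'
    PARTITION 64 BY SOURCE;
```

Here the DS is split into 64 partitions, with Records keyed by their source file, so lines of one file stay together.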
The COPY DS statement saves an existing DS to storage.
Likewise, the INTO expression evaluates to a path String (usually a URI) pointing to a data storage location, with a syntax specific to each Output Adapter. The list of available Output Adapters depends on the distribution.
If a star is used after the DS name, it is treated as a prefix, and all matching DSs will be copied to storage (usually into subdirectories under the specified INTO path, for file-based storages). Each partition usually goes to a separate file.
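A sketch by analogy with CREATE DS (the 'hadoopText' Output Adapter name and its parameter are hypothetical, and the presence and position of the adapter's functional expression are assumptions):

```
COPY DS signals hadoopText(@delimiter = '|') INTO 'hdfs:///target/2024/signals';
COPY DS monthly* hadoopText() INTO 'hdfs:///target/2024/monthly';
```

The second statement uses the star syntax to store every DS whose name starts with 'monthly'.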
The TRANSFORM DS statement invokes a Transform to convert a DS from one type to another. Instead of the keyword TRANSFORM, ALTER may be used.
If a star is used after the DS name, it is treated as a prefix, and all matching DSs will be transformed with the same parameters.
The functional expression includes the name of the Transform and the list of its parameters. As Transforms are pluggable, the exact list depends on the distribution, and thus they're documented separately.
An item of column definitions of a certain type has the following structure:
In the case of a COLUMNS list with the ($variable) syntax, the referenced Variable must be of the type Array of Names.
The 'type of columns' in this context depends on the DS type.
DS Type | Type of Attributes | Record Object Level |
---|---|---|
PlainText | VALUE | There are no Attributes |
Columnar | VALUE | Column names of the record object |
Structured | VALUE | Property names of the record object |
Point | POINT or POI or VALUE | Attributes of Point object |
Track | TRACK or SEGMENTEDTRACK or VALUE | Attributes of Track (level 1 object) |
Track | SEGMENT or TRACKSEGMENT | Attributes of Segments (level 2 objects) |
Track | POINT or POI | Attributes of Points (level 3 objects) |
Polygon | POLYGON or VALUE | Attributes of Polygon object |
The Key item is a simple expression referring to existing DS columns:
If the Key is set, DS Records will be reassigned to new partitions according to the new Keys. Be aware that the raw result of the Key expression is used to set the Record Key, so if it evaluates to a constant, all records will go into one partition.
When the Key is evaluated depends on the Transform. Most evaluate Keys before the record is transformed, but there are some that do it after (for example, if the source Record is PlainText, no Attributes are available before transformation). See the Transforms documentation to find out.
If the statement includes a PARTITION clause, the DS will be repartitioned to the desired number of partitions after both transforming and rekeying, even if that number is the same as before.
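A sketch of a complete statement (the 'textToColumnar' Transform name and its parameter are hypothetical, and the COLUMNS, KEY, and PARTITION clause spellings are inferred from this section rather than quoted from the grammar):

```
TRANSFORM DS raw_lines textToColumnar(@delimiter = ',')
    COLUMNS (userid, lat, lon, ts)
    KEY userid
    PARTITION 32;
```

After this, records are Columnar with the four named columns, rekeyed by the userid attribute, and redistributed into 32 partitions.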
The SELECT statement performs an SQL-like query against one or more existing DSs and creates a new, resulting DS. Its name, specified in the INTO clause, must be new and unique.
The query part is a star (meaning 'select all available attributes') or a list of 'what to select' expressions. Resulting DS attributes are named according to aliases (if specified) or the stringified expression (the expression text with all whitespace removed):
If the same alias is set for several expressions, the rightmost will overwrite the result of all preceding ones (there can't be many attributes with the same name in a DS object). Aliases are set per column type (same as in TRANSFORM).
SELECT expressions are bound to Record-level context for all DS participating in the SELECT.
For Structured SELECTs, compound property names must be used to access nested properties, using dot notation ("top.inner.innermost"), and zero-based array indices or ranges to access array members. For example, "prop.inner[0]" addresses the 1st member of the "prop.inner" array, and "[2].prop" the property "prop" of the 3rd topmost array member. Ranges may be specified only for innermost properties (no dots allowed afterward): "prop[*]" or "prop[]" for the entire array "prop", "prop[1:]" from the 2nd member onwards, "prop[:5]" up to the 6th, "prop[2:4]" from the 3rd to the 5th. Range boundaries may be negative, which means 'Nth member from the array end'. If the right boundary goes past the array size, the result will be NULL-padded. If the boundaries overlap or both go beyond the beginning, an empty array will be returned.
In the case of straight JOINs, fully qualified compound names must be used for attributes from different DSs, in the same dot notation.
The source scope is specified in the FROM clause:
Queries can be performed against a single DS or many DSs (with UNION) of any type, or against multiple JOINed KeyValue DSs.
A query against a single DS is equivalent to iteration over all its data and simple evaluation of the what-to-select items, then application of the constraints.
In the case of a single DS with a * query and no other constraints, a 'fast shallow copy' of the DS is performed (just a new name is assigned to the same underlying data).
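Two single-DS sketches (DS and attribute names are made up; the AS alias keyword and the relative clause order are assumed to follow common SQL practice):

```
SELECT * INTO signals_copy FROM signals;

SELECT userid, lat, lon, ts / 1000 AS ts_sec
    INTO recent_signals FROM signals
    WHERE ts IS NOT NULL
    LIMIT 100000;
```

The first statement is the 'fast shallow copy' case described above; the second evaluates its what-to-select expressions per record, keeps only records with a non-NULL ts, and randomly trims the result to at most 100000 records.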
UNIONs can unite either all listed DSs or, if the star syntax is used, all DSs whose names have the specified prefix. There are several types of UNION operation available:
UNION Operation | Result |
---|---|
By default, CONCAT or CAT | Records of all DSs are queried, in order of occurrence |
XOR | Only records that are unique to their source DSs are queried, others are discarded |
AND | Only records that occur in all participating DSs are queried, others are discarded |
UNION can be applied to DSs of the same type only. Also, for a UNION to be successful, all top-level objects of the participating DSs must have top-level attributes of the same names.
JOINs can be used to form new objects from participating DSs' Objects that have the same Record Keys. The following JOIN operations are available:
Resulting record objects of straight JOINs take their internal data from the first meaningful Object encountered in DS iteration order, while Attributes are read from all JOINed sets, using compound names in dot notation (the 1st part is the set name, then the attribute name). For subtracting JOINs, however, only one set is the source of resulting objects, thus only that object's Attributes are available, using simple names.
JOIN Operation | Type | Result Evaluation Rules | Result Type |
---|---|---|---|
By default, INNER | Straight | Records that have matching Keys in all participating DSs are queried, others discarded. Records always have all Attributes from all participating sets | 1st set |
LEFT | Straight | DSs are iterated from left to right. Result records always have columns from the 1st DS, plus those from subsequent DSs' records with the same Keys. If there is no record with a matching Key, NULLs are used for its attributes | 1st set |
LEFT ANTI | Subtracting | DSs are iterated from left to right. Records from the 1st DS that have no matching Keys in any subsequent DS are queried, others discarded | 1st set |
RIGHT | Straight | DSs are iterated from right to left. Result records always have columns from the last DS, plus those from preceding DSs' records with the same Keys. If there is no record with a matching Key, NULLs are used for its attributes | Last set |
RIGHT ANTI | Subtracting | DSs are iterated from right to left. Records from the last DS that have no matching Keys in any preceding DS are queried, others discarded | Last set |
OUTER | Straight | DSs are iterated from left to right. Result records always have columns from all DSs' records with the same Keys. If there is no record with a matching Key, NULLs are used for its attributes | 1st set |
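A sketch of a straight JOIN (DS and attribute names are made up, and the exact FROM ... JOIN spelling should be checked against your distribution's grammar; per the rules above, the join is by Record Key, so no explicit join condition appears):

```
SELECT signals.userid, signals.ts, areas.name AS area_name
    INTO enriched FROM signals LEFT JOIN areas;
```

Records keep the internal data of the signals Objects, with the area_name attribute taken from the areas record with the same Key, or NULL where no such record exists.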
There are 3 types of constraints available to narrow down the query results: the WHERE, DISTINCT, and LIMIT clauses.
The WHERE constraint is a single Boolean expression (applicable only to one level of object, selected by type alias, if there are many), evaluated for each record before the 'what-to-select' expressions and all other constraints. If the result is TRUE, the record is queried, otherwise discarded.
DISTINCT, if specified, is applied after the resulting query is evaluated. Of all records that have been evaluated to exactly the same Key and object, only one is left, and any repeats are discarded.
LIMIT is applied at the very last, and comes in two flavors: by count and by percentage. A count can be specified in any valid Numeric format, and the resulting DS will have at most the specified number of records, randomly selected. If set by percentage, it must fall into the range from 0 (exclusive) to 100 (inclusive), and the result will be randomly trimmed down to the set percentage of records. An invalid percentage leads to an error.
To summarize, the order of SELECT evaluation is as follows: the WHERE constraint, then the 'what-to-select' expressions, then DISTINCT, and finally LIMIT.
The CALL statement serves to call pluggable Operations, akin to SQL stored procedures (but with a substantial semantic difference), or TDL4 Procedures (see further below).
The functional expression includes the name of an Operation or Procedure and its parameters. The full list of available Operations depends on the distribution, so they and their parameters are documented separately. TDL4 Procedures are ephemeral and exist only in the current execution context.
Operations always create new DSs from existing ones and require INPUT FROM and OUTPUT INTO DS references (in either order).
Like SELECT, there is a source scope of DSs, set by the INPUT clause. There are two flavors of source scope, depending on the Operation.
A positional INPUT is either a list of DS names, or a prefix followed by a star. In the case of a list, the Operation tries to use matching inputs in the specified order. In the case of the star syntax, order doesn't matter. Some Operations may require an exact number of positional inputs, while some will use all positional inputs to create an equal number of positional outputs. Please refer to the Operation documentation for its requirements.
Named INPUTs have internal aliases, specific to Operations. In that case, each DS has an explicit role for the calling Operation. Some Operations may list some named inputs as optional; for the exact meaning of an omitted input, please see their respective docs.
Quite unlike SELECT, an Operation may produce a number of resulting DSs. There are 3 distinct varieties of them. Filtered outputs have records internally unchanged from the input, but selected according to some set of rules. Augmented outputs extend or otherwise change the internal structure of records coming from inputs. Finally, there may be outputs consisting of records generated anew from the input. Please refer to the Operation docs for each case.
Positional OUTPUT has a meaning similar to positional INPUT. A specified star name is used to set the resulting DSs' prefix (with suffixes coming from the INPUT names), or a literal list of names is used, if they are comma-separated.
Named OUTPUTs are explicitly different, and for each Operation OUTPUT alias the resulting DS generally has a specified structure. Some Operations may have optional named OUTPUTs.
An Operation may have positional INPUTs and named OUTPUTs, or vice versa.
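Two sketches with a hypothetical Operation name and parameter (the positional and star forms follow the INPUT/OUTPUT rules above):

```
CALL trackStats(@metric = 'distance')
    INPUT FROM pedestrian_tracks
    OUTPUT INTO pedestrian_stats;

CALL trackStats(@metric = 'distance')
    INPUT FROM monthly* OUTPUT INTO stats*;
```

In the second form, each DS whose name starts with 'monthly' produces a corresponding output DS under the 'stats' prefix.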
The ANALYZE statement performs calculation of basic statistics about the specified DS (or multiple DSs, found by prefix with the star syntax).
This statement uses a single selected Attribute as a 'counting key' for statistics. If it is not set, the record's own Key is used automatically. After the 'counting key' is decided, all objects of the DS are iterated, and the following stats are calculated:
Statistical indicator | Meaning |
---|---|
_streamName | DS name |
_streamType | DS type |
_counterColumn | Property name set in KEY clause, or NULL if inapplicable |
_totalCount | Count of objects in the DS, like COUNT(*) in SQL |
_uniqueCounters | Number of unique 'counting key' values |
_counterAverage | Average number of objects per each of unique 'counting key' value |
_counterMedian | Median number of objects per each of unique 'counting key' value |
The results are appended to a special Columnar DS named '_metrics' as a new record. Its record Keys are always the names of the analyzed DSs.
Because each DS may be TRANSFORMed multiple times in its life cycle, and because ANALYZE may be called another time for a DS (e.g. with a different 'counting key'), each successive invocation of ANALYZE for the same DS adds a new record to the '_metrics' DS instead of replacing the existing one.
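Two sketches (the DS name and the 'userid' attribute are made up; the KEY clause name follows the table above):

```
ANALYZE signals KEY userid;
ANALYZE monthly*;
```

The first gathers per-userid statistics for the 'signals' DS; the second analyzes every DS whose name starts with 'monthly', using each record's own Key as the counting key.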
Compound statements have nested statements (including other compound statements) inside their clauses, which execute under various conditions in a separate execution context.
The LET statement defines (or undefines) a script Variable in the current execution context.
The simple expression form of a Variable assignment is the same as a parameter expression, as is the simple Array definition. If the expression result is NULL, the Variable is removed from the context.
Arrays may also be returned as the result of a limited SELECT query:
There are a few differences from a regular SELECT query. First, only one DS may be queried (no JOINs nor UNIONs are allowed). Second, only one expression in the what-to-select list is allowed. Non-Numeric results will be stringified, because arrays don't support data types other than Numerics and Strings. Query constraints work as expected.
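A sketch of all three forms (names are made up, and the exact spelling of the array-returning SELECT should be checked against your distribution's grammar):

```
LET $threshold = 100;
LET $cities = SELECT DISTINCT city FROM signals WHERE city IS NOT NULL;
LET $threshold = NULL;
```

The second assignment collects the distinct non-NULL city values of a single DS into an Array of Strings; the third removes $threshold from the context.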
The IF statement performs control flow branching. It evaluates a Boolean expression and, if it is TRUE, executes the statements nested in the THEN clause. Otherwise, it executes the statements in the ELSE clause, if present, or just does nothing.
Nested statements in either branch can be any number of TDL4 statements (even none).
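A sketch (names are made up; the THEN and ELSE clauses follow this section, while the closing keyword shown here is an assumption to be checked against the grammar):

```
IF $threshold IS NOT NULL THEN
    SELECT * INTO filtered FROM signals WHERE weight > $threshold;
ELSE
    SELECT * INTO filtered FROM signals;
END IF;
```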
The LOOP statement executes the statements nested in its BEGIN clause as many times as there are values in the array, each time setting the loop Variable to the next value. If the expression evaluates to NULL or the array is empty, and an ELSE clause is present, its nested statements will be executed once instead.
If there was an existing Variable with the same name as the loop Variable in the outer context, it will be restored to its previous value afterwards, because LOOP creates a new nested context.
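A sketch combining LOOP with quoted-Name evaluation (the 'hadoopText' adapter is hypothetical, and the IN keyword and the closing of the BEGIN clause shown here are assumptions):

```
LOOP $month IN RANGE[1, 12] BEGIN
    CREATE DS "signals_{$month}" hadoopText()
        FROM 'hdfs:///source/2024/{$month}'
        PARTITION 16;
END LOOP;
```

Each iteration creates a separately named DS, because the quoted Name and the FROM String both evaluate {$month} in place.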
The CREATE PROCEDURE statement creates a sequence of statements in the current global execution context that can be CALLed as a whole, with its own context of Variables.
Procedures exist in the same namespace as Operations, and they can't override existing Operations. They can, however, be replaced further down in the script code. For example, if your library script defines some Procedures, you may redefine some of them in your own script by using the OR REPLACE clause.
Each @-Parameter becomes a $-Variable inside the BEGIN clause, with the value assigned in the CALL context. Because Parameters become Variables, they are always named, never positional.
Parameters can be optional; in that case they should have a default value set by an expression (evaluated in the current context, not in the CALL context). If defined without a default value (just as a @Name), a Parameter becomes mandatory and must always be given a CALL value.
Use the DROP PROCEDURE statement to remove one or many PROCEDUREs from the current global execution context.
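A sketch of the whole Procedure life cycle (names are made up; the parameter list parentheses, the BEGIN clause delimiters, and the 'hadoopText' adapter are assumptions):

```
CREATE OR REPLACE PROCEDURE load_month (@month, @year = 2024) AS BEGIN
    CREATE DS "signals_{$month}" hadoopText()
        FROM 'hdfs:///source/{$year}/{$month}';
END;

CALL load_month(@month = 6);

DROP PROCEDURE load_month;
```

Inside the BEGIN clause, @month and @year are available as $month and $year; @year is optional because it has a default, while @month is mandatory.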
The OPTIONS statement sets internal Data Cooker parameters to the specified values.
Unlike all other statements, Data Cooker pulls all OPTIONS statements to the topmost execution context and executes them once, before evaluating all other script contents. They are completely ignored thereafter, in the normal script execution flow.
The parameters that may be set by OPTIONS depend on the Data Cooker distribution and version, so please see the dedicated doc in the distribution for exact definitions.