Departition components - combine or merge multiple flow partitions of data records into a single flow:
(Concatenate, Gather, Interleave, Merge.)
Concatenate: Appends multiple flow partitions of data records one after another.
Gather: Combines data records from multiple flow partitions arbitrarily.
Interleave: Combines blocks of data records from multiple flow partitions in round-robin fashion.
Merge: Combines data records from multiple flow partitions that have been sorted
according to the key specifier, and maintains the sort order.
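
The departition behaviors described above can be modelled outside Ab Initio in plain Python, treating each flow partition as a list of records. This is only an illustrative sketch, not Ab Initio code: the function names, the block_size parameter, and the use of lists are assumptions made for the example. Gather is modelled as a simple append because any arrival order is valid for it.

```python
import heapq
from itertools import chain


def concatenate(partitions):
    """Append the partitions one after another, in partition order."""
    return list(chain.from_iterable(partitions))


def gather(partitions):
    """Arbitrary order is allowed; this sketch just reuses concatenate."""
    return concatenate(partitions)


def interleave(partitions, block_size=1):
    """Take blocks of block_size records from each partition in round-robin order."""
    iterators = [iter(p) for p in partitions]
    out = []
    while iterators:
        still_open = []
        for it in iterators:
            # zip stops as soon as the range runs out, so no extra record is consumed
            block = [rec for _, rec in zip(range(block_size), it)]
            out.extend(block)
            if len(block) == block_size:  # a short block means the partition is exhausted
                still_open.append(it)
        iterators = still_open
    return out


def merge(partitions, key):
    """Combine partitions that are each already sorted on key, keeping the sort order."""
    return list(heapq.merge(*partitions, key=key))
```

For example, interleave([[1, 4], [2, 5], [3, 6]]) returns [1, 2, 3, 4, 5, 6], while concatenate on the same input returns [1, 4, 2, 5, 3, 6]. Note that merge only gives a sorted result when every input partition is already sorted on the same key.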
Partition components - distribute data records to multiple flow partitions to support data parallelism, as follows:
(Broadcast, Partition by expression, Partition by key, Partition by percentage, Partition by range, Partition by round-robin, Partition by load balance.)
Broadcast: Distributes data by combining input data records into a single flow and writing a copy of that flow to each output flow partition.
Partition by expression: Distributes data records to its output flow partitions
according to a specified DML expression.
Partition by key: Distributes data records to its output flow partitions according to key values.
Partition by percentage: Distributes a specified percentage of the total number
of input data records to each output flow.
Partition by range: Distributes data records to its output flow partitions according to ranges of key values specified for each partition.
Partition by round-robin: Distributes data records evenly to each output flow in round-robin fashion. Use the Interleave component to reverse the effects of Partition by round-robin.
Partition by load balance: Distributes data records to output flow partitions, writing more records to the flow partitions that consume records faster.
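
Several of the partitioning rules above can be sketched in plain Python, with each output flow partition modelled as a list. This is an illustration under stated assumptions rather than Ab Initio code: the function names are hypothetical, crc32 is only a stand-in for whatever hash the real component uses, and the range splitters are treated as inclusive upper bounds. Partition by percentage and Partition by load balance are left out because they depend on total record counts and on consumer speed at run time.

```python
from bisect import bisect_left
from zlib import crc32


def broadcast(records, n):
    """Give every output partition its own copy of the full input flow."""
    flow = list(records)
    return [list(flow) for _ in range(n)]


def partition_by_expression(records, n, expr):
    """expr(record) must return a partition number from 0 to n - 1."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[expr(rec)].append(rec)
    return parts


def partition_by_key(records, n, key):
    """Records with equal key values always land in the same partition."""
    # crc32 is an assumed stand-in hash, chosen because it is stable across runs
    return partition_by_expression(
        records, n, lambda rec: crc32(repr(key(rec)).encode()) % n
    )


def partition_by_round_robin(records, n):
    """Deal records out one at a time, cycling through the partitions."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts


def partition_by_range(records, splitters, key):
    """splitters is a sorted list of upper bounds; yields len(splitters) + 1 partitions."""
    parts = [[] for _ in range(len(splitters) + 1)]
    for rec in records:
        parts[bisect_left(splitters, key(rec))].append(rec)
    return parts
```

For example, partition_by_round_robin(range(1, 7), 3) returns [[1, 4], [2, 5], [3, 6]]. Reading one record from each of those partitions in turn recovers 1 through 6 in order, which is why Interleave reverses Partition by round-robin.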
Miscellaneous components - perform a variety of utility tasks, as follows:
(Documentation, Gather Logs, Meta Pivot, Redefine Format, Replicate, Run Program, Throttle, Trash.)
Documentation: Provides a facility for documenting a transform.
Gather Logs: Collects the output from log ports of components for analysis of a
graph after execution.
Meta Pivot: Pivots around one or more fields in the input.
Redefine Format: Copies data records from its input to its output without changing the values. Use Redefine Format to change a record format or rename fields.
Replicate: Arbitrarily combines all the data records it receives into a single flow and writes a copy of that flow to each of its output flows.
Run Program: Executes a standard UNIX or Windows NT program.
Throttle: Copies data records from its input to its output, limiting the rate at
which records are processed.
Trash: Ends a flow by discarding all input data records.
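
Replicate, Throttle, and Trash are simple enough to model in a few lines of Python. The sketch below is only an analogy for how they treat a flow of records: the function names and the max_per_second parameter are assumptions made for the example, not Ab Initio parameters.

```python
import time


def replicate(records, n_outputs):
    """Write a copy of the single input flow to each of n_outputs output flows."""
    flow = list(records)
    return [list(flow) for _ in range(n_outputs)]


def throttle(records, max_per_second):
    """Pass records through unchanged, pausing after each one to cap the rate."""
    interval = 1.0 / max_per_second
    for rec in records:
        yield rec
        time.sleep(interval)


def trash(records):
    """Consume and discard every record, ending the flow."""
    for _ in records:
        pass
```

Here throttle is a generator, so a downstream consumer receives the same values in the same order, only more slowly, and trash returns nothing once it has drained its input.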