Transformations of Mapping Data Flow
Azure Data Factory #MappingDataFlows Azure Synapse
Activity Description SSIS equivalent SQL Server equivalent
Multiple inputs/outputs
New branch Create a new flow branch SELECT INTO
with the same data Multicast (+icon) SELECT OUTPUT
Join data from two streams INNER | LEFT | RIGHT JOIN,
Join based on a condition Merge join CROSS | FULL OUTER JOIN
Conditional Route data into different SELECT INTO WHERE condition1
streams based on conditions Conditional Split SELECT INTO WHERE condition2
Split CASE ... WHEN
Check the existence of data in Lookup / Merge SELECT * FROM Table
Exists another stream WHERE EXISTS(SELECT ... ) | JOIN
Join | NOT EXISTS
Collect data from multiple SELECT col1a UNION (ALL)
Union streams Union All SELECT col1b
Lookup additional data from LEFT | RIGHT JOIN
Lookup another stream Lookup
Schema modifier
Cast columns to different SELECT
Cast types Derived Column CAST(NumStr as INT) as Number,
CONVERT(DATE, '14/07/2022', 103)
Derived Compute new columns based SELECT Column1 * 1.09 as NewColumn
on the existing once Derived Column
Column
Choose columns to flow to OUTPUT in components, SELECT Column1, Column4
Select the next stream FROM Table
mapping columns
Calculate aggregation on the SELECT Year(DateOfBirth) as Year,
Aggregate stream Aggregate MIN(), MAX(), AVG()
GROUP BY Year(DateOfBirth)
Surrogate Add a surrogate key column Script SELECT ROW_NUMBER()
to output stream from a OVER(ORDER BY n ASC) AS R#, n
Key Component FROM sys.databases
specific value + Incremental Primary Key (with limited capabilities)
Pivots row values into SELECT rowCol, c1, c2 FROM
Pivot columns, groups columns and Pivot ( SELECT sourceCols FROM Table)
PIVOT
aggregates data ( SUM(sumCol) FOR col IN (...) )
Unpivots columns into row SELECT rowCol, col, X FROM (
Unpivot values and ungroups columns Unpivot SELECT rowCol, c1, c2 FROM pvt)
UNPIVOT
(X FOR col FROM (c1, c2)) AS unpvt
Aggregates data based on a [Sort] + SELECT fun() OVER(
Window window and joins with PARTITION BY pc ORDER BY oc) newc,
Custom Script pc, oc, otherCols FROM Table
original data
Generates an ordered ranking [Sort] + SELECT ROW_NUMBER() OVER (
Rank based upon sort conditions PARTITION BY pc ORDER BY oc) newc,
Custom Script pc, oc, otherCols FROM Table
specified by the user
Enables custom function by Script Not available out of the box.
External Call calling out to external REST Some tricks are possible: OLE Automation,
Component CLR function, although not recommended.
endpoints row-by-row
BLOG:
https://SQLPlayer.net/tag/ADFDF @SQLPlayer
Transformations of Mapping Data Flow
Azure Data Factory #MappingDataFlows Azure Synapse
Activity Description SSIS equivalent SQL Server equivalent
Formatters
Takes array values from Script Component or SELECT * FROM OPENJSON(json) WITH (
Flatten hierarchical structures such as id INT 'strict $.id',
3rd Party Component Name NVARCHAR(50) '$.info.name');
JSON (denormalization) (ZappySys SSIS JSON Parser Transform)
Parses columns in document Script Component or SELECT * FROM OPENJSON ...
Parse form data (JSON, XML, SELECT * FROM OPENXML ...
3rd Party Component SELECT value FROM
delimited text) (ZappySys SSIS JSON Parser Transform)
STRING_SPLIT('clothing,road', ',');
Turns complex data types into Script
Stringify strings Component
Row modifier
Filter rows in the stream SELECT * FROM Table
Filter based on a condition Conditional Split WHERE [Column] LIKE '%pattern%'
Order data in the stream SELECT * FROM Table
Sort based on column(s) Sort ORDER BY [Column] ASC
Set action policy on rows Conditional Split MERGE |
Alter Row when database is sink INSERT, UPDATE, DELETE
+ n Destinations IF | WHERE
Enables you to build custom Conditional Split or WHERE, REPLACE, SUBSTRING,
Assert rules for data quality and data IF, RAISERROR, THROW
DQS Cleansing
validation
Source / Destination
Source for your data flow. OLE DB Source SELECT * FROM SourceTable
Source Obligatory first element of and more …
every Data Flow in ADF
Destination for your data flow OLE DB Destination and INSERT INTO TargetTable
Sink stream more…
Version: 2.1 (Aug 2022)
BLOG:
https://SQLPlayer.net/tag/ADFDF @SQLPlayer