Azure Data Factory Interview Questions
Ankur Bhattacharya
Basic ADF Interview Questions
What is Azure Data Factory (ADF)? How does it
work?
What are the key components of ADF?
(Pipelines, Datasets, Linked Services,
Triggers, Activities)
What are Linked Services in ADF? How are they
different from Datasets?
What is the difference between Copy Activity
and Data Flow Activity?
How do you schedule an ADF pipeline? What
are the different types of triggers?
What is a Self-hosted Integration Runtime
(SHIR)? Why is it used?
How do you pass parameters between activities
in a pipeline?
What is the difference between a Lookup
Activity and Get Metadata Activity?
How can you monitor an ADF pipeline?
What types of data sources can ADF connect to?
Intermediate ADF Interview Questions
How do you implement an incremental load in
ADF?
How do you handle errors in ADF pipelines?
What is the difference between Mapping Data
Flows and Wrangling Data Flows (Power Query)?
How do you use Filter, ForEach, and If Condition
activities?
How do you handle retries and timeouts in ADF?
What are the different authentication methods
available for Linked Services?
How do you copy data from on-premises to
Azure without opening ports?
Explain the concept of Pipeline Concurrency
and how to manage it in ADF.
How do you secure sensitive credentials in ADF?
What are Integration Runtimes? When should
you use each type?
Advanced ADF Interview Questions
How do you implement Change Data Capture
(CDC) in ADF?
Explain how to implement Slowly Changing
Dimensions (SCD) Type 1 and Type 2 in ADF.
How do you optimize large data movement
operations in ADF?
What is the difference between Data Flow
Debug Mode and Pipeline Debug Mode?
How do you manage dynamic ETL pipelines in
ADF using parameters and expressions?
Explain Data Flow Partitioning and its impact on
performance.
How do you use ADF with Azure Key Vault for
secure access?
How can ADF integrate with Databricks?
How do you implement end-to-end logging in
ADF pipelines?
How do you migrate SSIS packages to ADF
using the Azure-SSIS IR?
Scenario-Based ADF Questions
You need to load 500 GB of data from an
on-prem SQL Server to Azure Blob Storage daily.
How would you design the ADF pipeline?
A pipeline has failed due to a transient error in
the destination system. How would you handle
retries and alert mechanisms?
You need to process files dynamically based on
their arrival in an Azure Blob Storage
container. How would you implement this?
You have multiple ADF pipelines that need to
execute in sequence. How do you orchestrate
them?
How do you ensure a pipeline processes only
new or changed data efficiently?
Your ADF pipeline runs but takes too long to
execute. How would you troubleshoot and
optimize it?
You need to move data from multiple sources
(on-prem SQL, APIs, Blob Storage) into a
Synapse Data Warehouse. How would you
design the pipeline?
How do you manage and version-control ADF
pipelines in a CI/CD environment using Azure
DevOps?
A Data Flow activity fails due to a schema
mismatch. How would you troubleshoot and fix
it?
How do you set up a data pipeline that
automatically scales based on workload in
ADF?