
ADF Interview Question & Answer

Q1. What do we need Azure Data Factory for?


Data forms the backbone of every business, whether it is a global enterprise or a budding
startup, and it needs to be reliable, secure, and easily accessible.
Azure Data Factory is Microsoft's managed data integration platform, enabling data
transformation pipelines to be created and deployed in minutes instead of hours.
It lets users connect to diverse data sources and automate and orchestrate data
integration workflows with ease.
With its ability to convert between data formats seamlessly, Azure Data Factory simplifies
complex workflows, streamlining operations and enhancing efficiency.
Q2. What is Azure Data Factory?
Azure Data Factory is a fully managed, cloud-based data integration and orchestration
service. It can connect to on-premises databases and data warehouses, cloud services, and
many other data stores.
With ADF, you can easily create, manage, and orchestrate a wide range of data integration
scenarios. It also provides a visual interface for data movement and transformation,
allowing you to build workflows with little or no code.
Q3. What are the main components of ADF?
1. Pipelines: Group of activities performing a task.
2. Activities: Individual steps in a pipeline (e.g., Copy, Lookup).
3. Datasets: Metadata definitions pointing to data structures.
4. Linked Services: Connection information (endpoints and credentials) for external data stores and compute services.
5. Integration Runtimes (IRs): Compute environments.
Q4. Explain Integration Runtimes in ADF?
In Azure Data Factory (ADF), Integration Runtimes (IR) are the backbone that helps ADF
move data and connect to various systems. Think of IR as a bridge between ADF and your
data—whether it’s in the cloud, on-premises, or a mix of both
Types of Integration Runtimes
1) Azure Integration Runtime (Azure IR)
What it does:
Handles all cloud-to-cloud data transfers and data transformation tasks.

Key points:
Fully managed by Azure, so you don’t need to set up anything.
Works only with cloud-based data (e.g., Azure SQL, Blob Storage).
Uses encrypted communication to keep data safe.
Example: Copy data from Azure Blob Storage to Azure SQL Database.
2) Self-Hosted Integration Runtime (SHIR)
What it does:
Allows ADF to access data stored on your company’s local servers or private
networks.
Key points:
You need to install this on your local server or a virtual machine (VM).
Ideal for hybrid setups where you want to move data between your on-premises
database and Azure.
Securely connects to ADF through HTTPS.
Example: Move data from your on-premises SQL Server to Azure Data Lake.
3) Azure-SSIS Integration Runtime (Azure-SSIS IR)
What it does: Runs SQL Server Integration Services (SSIS) packages in the cloud.
Key points:
Useful if you're already using SSIS for ETL and don't want to rewrite everything for
Azure. Scales up and down depending on your needs.
Uses Azure SQL Database or SQL Managed Instance to host the SSIS catalog (SSISDB).
Example: Run your existing SSIS workflows in Azure with minimal changes.
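Not part of the original answer, but as a rough sketch of how an IR can be set up programmatically: the snippet below assumes the azure-mgmt-datafactory Python SDK, and every resource name is a placeholder. It registers a self-hosted IR and fetches the authentication key that the on-premises installer asks for.

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
)

subscription_id = "<subscription-id>"       # placeholder
rg_name = "<resource-group>"                # placeholder
df_name = "<data-factory-name>"             # placeholder

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Register (or update) a self-hosted IR definition in the factory.
shir = IntegrationRuntimeResource(
    properties=SelfHostedIntegrationRuntime(description="IR for on-premises sources")
)
adf_client.integration_runtimes.create_or_update(rg_name, df_name, "OnPremSHIR", shir)

# The authentication key is pasted into the SHIR installer on the local server or VM.
keys = adf_client.integration_runtimes.list_auth_keys(rg_name, df_name, "OnPremSHIR")
print(keys.auth_key1)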
Q5. Is there a limit on the number of integration runtimes?
There is no given limit on the number of integration runtimes.
Q6. What are the types of triggers in ADF?
In ADF, triggers are used to start pipelines (your data workflows) automatically. There are
three main types of triggers, each designed for different scenarios.

1) Schedule Trigger
What it does: Runs your pipeline on a specific schedule.
When to use: If you want your pipeline to run at regular intervals, like daily, hourly,
or weekly.
Example:
Run a pipeline every day at 9 AM to process sales data.
2) Tumbling Window Trigger
What it does: Runs pipelines at fixed time intervals but ensures no overlaps. It keeps
track of missed runs and ensures they are executed.
When to use: For scenarios where processing every interval (or time slice) is
important.
Example:
Process hourly data from a file system. If one hour is missed due to downtime, the
trigger will retry that hour later.
3) Event-Based Trigger
What it does: Starts a pipeline when a specific event occurs, like a file being added to
or modified in storage.
When to use: For real-time or on-demand processing.
Example:
Automatically start a pipeline when a new file is uploaded to Azure Blob Storage.
Summary:
 Schedule Trigger: time-based execution on a fixed schedule.
 Tumbling Window Trigger: recurring, non-overlapping time windows, with catch-up for missed slices.
 Event-Based Trigger: fires on storage events, such as a blob being created or deleted.
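As an illustration of the first trigger type, here is a hedged sketch using the azure-mgmt-datafactory Python SDK; the pipeline name "ProcessSalesData", the 9 AM daily schedule, and all resource names are assumptions.

from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference, ScheduleTrigger, ScheduleTriggerRecurrence,
    TriggerPipelineReference, TriggerResource,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, df = "<resource-group>", "<data-factory-name>"

# Fire the pipeline once a day at 09:00 UTC (values are illustrative).
recurrence = ScheduleTriggerRecurrence(
    frequency="Day", interval=1,
    start_time=datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), time_zone="UTC",
)
trigger = ScheduleTrigger(
    recurrence=recurrence,
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(type="PipelineReference",
                                             reference_name="ProcessSalesData"),
        parameters={},
    )],
)
adf_client.triggers.create_or_update(rg, df, "DailySalesTrigger",
                                     TriggerResource(properties=trigger))

# Triggers are created stopped; start them explicitly
# (begin_start in recent SDK versions, start in older ones).
adf_client.triggers.begin_start(rg, df, "DailySalesTrigger").result()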
Q7. What do you mean by Blob Storage?
Blob storage is an extremely scalable and low-cost service for storing, processing, and
accessing large volumes of unstructured data in the cloud. Blob storage can be used for
anything from storing large amounts of data to storing images and videos. It provides highly
efficient access to any type of file, and it is fast and reliable for both block
operations and object-level uploads and downloads.

Q8. What is the difference between Azure data lake and Azure Data warehouse?
A data lake is a storage solution for big data analytics. It holds large volumes of raw data
in its native format (structured, semi-structured, and unstructured) in a central location
that can be accessed by multiple applications and processing engines.
A data warehouse, by contrast, stores processed, structured data that has already been
modelled for reporting and BI queries; data is typically loaded into it through ETL (Extract,
Transform and Load) or ELT operations.
Q9. Differentiate between Data Lake Storage and Blob Storage?
Data Lake Storage:
 An Azure storage service optimized for storing large amounts of data in a single
location for big data analytics.
 Follows a hierarchical file system (folders and files).
 Data is stored as files inside folders.
 Suited to batch, interactive, and streaming analytics as well as machine learning data.
Blob Storage:
 A general-purpose object store for files and static content such as images and videos.
 Follows an object store with a flat namespace.
 Data is stored as objects in containers inside a storage account.
 Suited to text files, videos, binary data, media storage for streaming, and general-purpose data.

Q10. Differentiate between the Mapping data flow and Wrangling data flow
transformation activities in Adf?
The Mapping data flow activity is a visually designed data transformation that runs on
ADF-managed, scaled-out Apache Spark clusters, while the Wrangling data flow activity is a
code-free data preparation activity that integrates with Power Query Online and uses the
Power Query M engine for data wrangling.
Q11. Which Data Factory version is used for creating data flows?
Data Factory V2 version is used to create data flows.
Q12. Can we pass parameters to a pipeline run?
Yes, we can pass parameters to a pipeline run. Parameters are defined on the pipeline and
their values are supplied when the run starts, whether the pipeline is invoked by a trigger
or executed manually. Each run is an instance of the pipeline identified by a run ID.
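For illustration only (resource names and parameter keys are assumptions), a pipeline run with parameters can be started through the azure-mgmt-datafactory Python SDK like this:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Parameter names must match those declared on the pipeline.
run = adf_client.pipelines.create_run(
    "<resource-group>", "<data-factory-name>", "CopySalesPipeline",
    parameters={"sourceFolder": "sales/2024-01-01", "targetTable": "dbo.Sales"},
)
print(run.run_id)   # keep the run ID to check the status of this run later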
Q13. What are the two levels of security in ADLS Gen2?
Role-Based Access Control – RBAC includes built-in Azure roles such as Reader,
Contributor, and Owner, as well as custom roles. It is used for two purposes: to control
who can manage the service itself, and to permit users to work with the built-in data
explorer tools.
Access Control List – An ACL is a set of rules applied to a resource such as a directory or
file. The rules specify which users or service principals can read, write, or execute the
resource.
Q14. What are the two types of compute environments that are supported by Data
Factory?
On-demand compute environment – a compute environment that is fully managed by
ADF: the service creates the compute (for example, an on-demand HDInsight cluster) just
before an activity runs and tears it down when the job is finished.
Bring your own environment – you register compute that you already manage (for
example, your own HDInsight cluster, Azure Databricks, or Azure SQL) as a linked service,
and ADF submits activities to it.
Q15. What do you mean by Azure Table Storage?
Azure Table Storage helps to store structured data in the cloud, allowing the data to be
accessed from any device. It makes it easy to store and access data in the cloud and is ideal
for storing large amounts of structured and semi-structured data.
Q16. Is Azure Data Factory ETL or ELT Tool?
Azure Data Factory (ADF) is a cloud service that helps you automate, orchestrate, and
manage data warehouse workloads and supports both ETL and ELT.
Q17. What separates Azure Data Factory from the conventional ETL Tools?
When it comes to big data analytics, many companies are still using traditional ETL Tools
like Informatica and Talend. ETL tools were designed for batch processing and they’re not
optimized for real-time processing.
These limitations make them a poor choice for big data analytics. With the help of Azure
Data Factory, organizations can use the power of the cloud to perform a variety of data
transformations and extractions, as well as load data into databases and other systems.
Q18. What is the role of Linked services in Azure Data Factory?
Linked services are used to connect Azure Data Factory with on-premises or cloud-based
systems. It provides a mechanism for connecting Azure Data Factory to external systems.

Q19. What are ARM Templates in Azure Data Factory and what is it used for?
An Azure Resource Manager (ARM) template is a JSON (JavaScript Object Notation) file
that defines the infrastructure and configuration of Azure resources. In ADF, the exported
ARM template contains the JSON definitions of the factory's pipelines, datasets, linked
services, and triggers.
You can use ARM templates to move a data factory's artifacts to another environment (for
example, from development to test or production) and to deploy a new version of the
factory without recreating everything manually.
Q20. Which activity should you use if you want to use the output by executing a query?
The Lookup activity can be used if you want to use the output obtained by executing a
query. The output can be a single value or an array of rows/attributes.
Q21. Have you ever used Execute Notebook activity in Data Factory? How to pass
parameters to a notebook activity?
Yes, I have used the Notebook activity in Data Factory. We can pass parameters to a
notebook activity with the help of the baseParameters property. If the parameters are not
specified in the activity, the default values defined in the notebook are used.
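A minimal sketch of this, assuming the azure-mgmt-datafactory Python SDK; the notebook path, parameter names, and linked service name are made up for illustration.

from azure.mgmt.datafactory.models import (
    DatabricksNotebookActivity, LinkedServiceReference,
    ParameterSpecification, PipelineResource,
)

# The pipeline parameter 'runDate' is forwarded to the notebook via baseParameters.
notebook_activity = DatabricksNotebookActivity(
    name="RunTransformNotebook",
    notebook_path="/Shared/transform_sales",            # assumed notebook path
    base_parameters={
        "run_date": "@pipeline().parameters.runDate",    # ADF expression, resolved at run time
        "environment": "dev",
    },
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="AzureDatabricksLS"),
)
pipeline = PipelineResource(
    activities=[notebook_activity],
    parameters={"runDate": ParameterSpecification(type="String")},
)

Inside the notebook, the values arrive as widgets and are read with dbutils.widgets.get("run_date").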
Q22. Can you push code and have CI/CD in ADF?
Data Factory fully supports CI/CD for data pipelines using Azure DevOps and GitHub. This
allows the ETL process to be developed and deployed in stages before releasing the finished
product. Once the raw data is refined into a ready-to-consume format, it can be loaded into
Azure Synapse Analytics, Azure SQL Database, Azure Data Lake, Azure Cosmos DB, or
whichever analytics engine your organization references from its business intelligence tools.
Q23. What are variables in Azure Data Factory?
Variables in Azure Data Factory pipelines provide the ability to store values. They are used
for similar reasons as variables in any programming language and can be used within a
pipeline. Set Variable and Append Variable are two activities used to set or manipulate the
value of a variable. A data factory has two types of variables:
System variables:
These are fixed variables from the Azure pipeline.
User variables:
User variables are manually declared in your code based on your pipeline logic.
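A minimal sketch, assuming the azure-mgmt-datafactory Python SDK, of declaring a user variable and setting it from system variables; the variable and activity names are invented for illustration.

from azure.mgmt.datafactory.models import (
    PipelineResource, SetVariableActivity, VariableSpecification,
)

# Declare a user variable on the pipeline and set it from system variables.
pipeline = PipelineResource(
    variables={"runLabel": VariableSpecification(type="String")},
    activities=[
        SetVariableActivity(
            name="SetRunLabel",
            variable_name="runLabel",
            value="@concat(pipeline().Pipeline, '-', pipeline().RunId)",
        )
    ],
)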

Q24. What do you understand by copy activity in the azure data factory?
Copy Activity is one of the most popular and used activities in Azure Data Factory. This
is used for ETL or lift and shift to move data from one data source to another. You can
also perform transformations while copying data.
For example, let’s say you are reading data from a txt/csv file that contains 12 columns.
However, when writing to the target data source, only seven columns should be preserved.
You can transform this to send only the required number of columns to the target data
source.
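To make the "keep only some columns" idea concrete, here is a hedged sketch of a Copy activity with an explicit column mapping (Python SDK); the dataset and column names are illustrative assumptions, and the mappings format follows the copy activity schema-mapping convention.

from azure.mgmt.datafactory.models import (
    AzureSqlSink, CopyActivity, DatasetReference, DelimitedTextSource, TabularTranslator,
)

# Only the columns listed in the mapping reach the target; the rest are dropped.
copy_activity = CopyActivity(
    name="CopyCsvToSql",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SalesCsvDs")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="SalesTableDs")],
    source=DelimitedTextSource(),
    sink=AzureSqlSink(),
    translator=TabularTranslator(mappings=[
        {"source": {"name": "OrderId"},   "sink": {"name": "OrderId"}},
        {"source": {"name": "OrderDate"}, "sink": {"name": "OrderDate"}},
        {"source": {"name": "Amount"},    "sink": {"name": "Amount"}},
    ]),
)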
Q25. What are the different activities you have used in Azure Data Factory?
Here you can share some of the most important activities if you have used them in
your career, whether in your work or a university project. Here are some of the most
commonly used activities:
 Copy Data activity to copy data between datasets.
 ForEach activity for looping.
 Get Metadata activity, which can provide metadata about any data source.
 Set Variable activity to define and initialize variables within the pipeline.
 Lookup activity to look up values from a table or file.
 Wait activity to wait a certain amount of time before or between
pipeline operations.
 Validation activity to check the existence of files in a dataset.
 Web activity to call a custom REST endpoint from the ADF pipeline.
Q26. When would you choose to use Azure Data Factory?
When you need to manage a large number of data sources and data flows, Azure Data
Factory is the right choice for you. It can help you automate tasks like ETL, data processing,
data migration, data integration, and data preparation.
Q27. What is the purpose of Lookup activity in the Azure Data Factory?
In the ADF pipeline, the Lookup activity is commonly used for configuration lookup
purposes, where a source dataset or configuration file is available. It retrieves data
from the source dataset and returns it as the output of the activity. Generally, the
output of the Lookup activity is used further in the pipeline for taking decisions or
passing configuration values to later activities.
In simple terms, the Lookup activity is used for data fetching in the ADF pipeline. How
you use it depends entirely on your pipeline logic. You can retrieve only the first row, or
the complete result set, depending on your dataset or query.
Q28. Is it possible to calculate a value for a new column from the existing column from
mapping in ADF?
We can use the Derived Column transformation in the mapping data flow to generate a
new column based on our desired logic. We can either create a new derived column or
update an existing one. Enter the name of the new column in the Column textbox, or use
the column dropdown to override an existing column in your schema. Click the Enter
expression textbox to start building the derived column's expression, either by typing it or
by using the expression builder.
Q29. Can we define default values for the pipeline parameters?
Yes, we can easily define default values for the parameters in the pipelines.
Q30. Can an activity in a pipeline consume arguments that are passed to a pipeline run?
Every activity within the pipeline can consume a parameter value passed to the pipeline
run by using the @parameter construct.
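A small sketch (Python SDK; names and values assumed) of a pipeline parameter with a default value being consumed by an activity through the @pipeline().parameters construct:

from azure.mgmt.datafactory.models import (
    ParameterSpecification, PipelineResource, WaitActivity,
)

# 'waitSeconds' has a default value, so callers may omit it when starting a run.
pipeline = PipelineResource(
    parameters={"waitSeconds": ParameterSpecification(type="Int", default_value=30)},
    activities=[
        WaitActivity(
            name="WaitBeforeLoad",
            # Consumed through @pipeline().parameters and resolved to an
            # integer at run time.
            wait_time_in_seconds="@pipeline().parameters.waitSeconds",
        )
    ],
)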
Q31. Which Data Factory activity can be used to get the list of all source files in a specific
storage account and the properties of each file located in that storage?
Get Metadata activity.
Q32. Which Data Factory activities can be used to iterate through all files stored in a
specific storage account, making sure that the files smaller than 1KB will be deleted
from the source storage account?
 For Each activity for iteration
 Get Metadata to get the size of all files in the source storage
 If Condition to check the size of the files
 Delete activity to delete all files smaller than 1KB
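A hedged sketch of the pattern listed above, using the azure-mgmt-datafactory Python SDK. It assumes two datasets that are not part of the original answer: 'BlobFolderDs' pointing at the folder, and 'BlobFileDs' exposing a fileName parameter so each iteration can address a single file.

from azure.mgmt.datafactory.models import (
    ActivityDependency, DatasetReference, DeleteActivity, Expression,
    ForEachActivity, GetMetadataActivity, IfConditionActivity, PipelineResource,
)

# List the files in the folder.
list_files = GetMetadataActivity(
    name="ListFiles",
    dataset=DatasetReference(type="DatasetReference", reference_name="BlobFolderDs"),
    field_list=["childItems"],
)

# Dataset reference for a single file (assumes a 'fileName' dataset parameter).
file_ref = DatasetReference(
    type="DatasetReference",
    reference_name="BlobFileDs",
    parameters={"fileName": {"value": "@item().name", "type": "Expression"}},
)
get_size = GetMetadataActivity(name="GetFileSize", dataset=file_ref, field_list=["size"])

delete_if_small = IfConditionActivity(
    name="DeleteIfSmall",
    expression=Expression(type="Expression",
                          value="@less(activity('GetFileSize').output.size, 1024)"),
    if_true_activities=[DeleteActivity(name="DeleteFile", dataset=file_ref)],
    depends_on=[ActivityDependency(activity="GetFileSize",
                                   dependency_conditions=["Succeeded"])],
)

loop = ForEachActivity(
    name="ForEachFile",
    items=Expression(type="Expression",
                     value="@activity('ListFiles').output.childItems"),
    is_sequential=False,
    batch_count=10,
    activities=[get_size, delete_if_small],
    depends_on=[ActivityDependency(activity="ListFiles",
                                   dependency_conditions=["Succeeded"])],
)

pipeline = PipelineResource(activities=[list_files, loop])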
Q33. What are the three methods used for executing pipelines?
 Under Debug mode
 Manual execution using Trigger now
 Using an attached schedule, tumbling window, or event trigger.

Q34. Data Factory supports four types of execution dependencies between the ADF
activities. Which dependency guarantees that the next activity will be executed regardless
of the status of the previous activity?
Completion dependency.
Q35. Can we monitor the execution of a pipeline that is executed under the Debug mode?
Yes. A debug run can be monitored from the Output tab of the pipeline canvas; it does not
appear under Pipeline runs or Trigger runs in the ADF Monitor window.
Q36. Define datasets in ADF?
 A dataset refers to data that can potentially be used for pipeline activities as
outputs or inputs.
 A dataset is the structure of the data within linked stores of data such as
files, documents and folders.
 For example, a Microsoft Azure Blob Storage dataset specifies the container and folder
within Blob Storage from which a particular pipeline activity should read data as its input.
Q37. Do you require proper coding for ADF?
Not really. ADF offers more than 90 built-in connectors and a largely code-free, visual authoring experience (pipelines and mapping data flows), so most integration work can be done without writing code.
Q38. What is Azure Data Lake?
 Azure Data Lake streamlines processing tasks and data storage for
analysts, developers, and data scientists.
 It is an advanced mechanism that supports the mentioned tasks across
multiple platforms and languages.
 It helps remove the barriers linked with data storage, making it simpler to carry
out stream, batch, and interactive analytics.
 Features in Azure Data Lake resolve the challenges linked with productivity
and scalability and fulfil growing business requirements.
Q39. What is the data source in the ADF?
 The data source is the source that includes the data intended to be used or
executed. The data type can be binary, text, csv files, json files, etc.
 It can be in the form of image files, video, audio, or might be a proper database.
 Examples of data sources include Azure Blob Storage, Azure Data Lake Storage, Azure
SQL Database, or any other database such as MySQL.

Q40. What does the breakpoint in the ADF pipeline mean?
A breakpoint marks where a debug run of the pipeline should stop. If you wish to test the
pipeline only up to a specific activity, you can accomplish that with breakpoints.
Say you are using 3 activities in the pipeline and now you want to debug up to the second
activity only. This can be done by placing the breakpoint at the second activity. To add a
breakpoint, you can click the circle present at the top of the activity.
Q41. How to trigger an error notification email in Azure Data Factory?
 We can trigger email notifications using the logic app and Web activity.
 We can define the workflow in the logic app and then can provide the Logic App
URL in Web activity with other details.
 You can also use a Web activity to send a message after the failure or completion
of any event in the Data Factory pipeline.
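For illustration of the approach above, a Web activity wired to the failure path of a copy activity might look like the sketch below (the Logic App URL, activity names, and message body are placeholders, not part of the original answer):

from azure.mgmt.datafactory.models import ActivityDependency, WebActivity

# Assumes a Logic App with an HTTP request trigger that sends the email.
notify_on_failure = WebActivity(
    name="NotifyFailure",
    method="POST",
    url="https://<logic-app-http-trigger-url>",
    headers={"Content-Type": "application/json"},
    body={
        "pipeline": "@pipeline().Pipeline",
        "runId": "@pipeline().RunId",
        "message": "Copy activity failed",
    },
    # Wire this to the Failure output of the activity being watched.
    depends_on=[ActivityDependency(activity="CopySalesData",
                                   dependency_conditions=["Failed"])],
)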
Q42. How can we implement parallel processing in Azure Data Factory Pipeline?
 The ForEach activity in Azure Data Factory provides parallel
processing functionality.
 The ForEach activity has an isSequential property that specifies whether the inner
activities should be executed one at a time or in parallel.
 A maximum of 50 iterations can run in parallel (controlled by the batchCount
property).
Q43. What has changed from private preview to limited public preview in regard to data
flows?
 You will no longer have to bring your own Azure Databricks clusters.
 ADF will manage cluster creation and tear-down.
 Blob datasets and Azure Data Lake Storage Gen2 datasets are separated
into delimited text and Apache Parquet datasets.
 You can still use Data Lake Storage Gen2 and Blob storage to store those files. Use
the appropriate linked service for those storage engines.
Q44. How can we promote Data Factory code to higher environments?
At a high level, the following series of actions can help –
 Create a feature branch to hold our code changes.
 Create a pull request to merge the changes into the collaboration (Dev) branch once
the code is ready.
 Publish the development branch's code to generate ARM templates.
 As a result, code can be promoted to higher surroundings like Staging or
Production using an automated CI/CD DevOps pipeline.
Q45. What are the three tasks that Microsoft Azure Data Factory supports?
 Data Factory supports the following – data movement, transformation, and
control activities.
 Movement of data activities: These processes help in transfer of data.
 Activities for data transformation: These activities assist in data transformation as
the data is loaded into the target.
 Control flow activities: Control (flow) activities help in regulating any activity’s
flow through a pipeline.
Q46. What activity can be used when you want to use the results from running a query?
The Lookup activity returns the output of a query or stored-procedure execution.
The outcome can be a single value or an array of rows, and it can be consumed by
transformation or control flow activities such as ForEach, or by a subsequent Copy Data
activity.
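As a concrete (but assumed) example of consuming a Lookup's result set, the sketch below looks up a control table and fans out over the returned rows with a ForEach that calls a child pipeline; every name, table, and parameter is invented for illustration.

from azure.mgmt.datafactory.models import (
    ActivityDependency, AzureSqlSource, DatasetReference, ExecutePipelineActivity,
    Expression, ForEachActivity, LookupActivity, PipelineReference,
)

# Look up the list of tables to process, then fan out over the result set.
lookup_tables = LookupActivity(
    name="LookupTables",
    dataset=DatasetReference(type="DatasetReference", reference_name="ControlTableDs"),
    source=AzureSqlSource(sql_reader_query="SELECT TableName FROM dbo.TablesToCopy"),
    first_row_only=False,   # return the full result set, not just the first row
)

process_each = ForEachActivity(
    name="ForEachTable",
    items=Expression(type="Expression",
                     value="@activity('LookupTables').output.value"),
    is_sequential=False,
    batch_count=10,          # up to 10 iterations run in parallel
    activities=[
        ExecutePipelineActivity(
            name="CopyOneTable",
            pipeline=PipelineReference(type="PipelineReference",
                                       reference_name="CopySingleTable"),
            parameters={"tableName": "@item().TableName"},
        )
    ],
    depends_on=[ActivityDependency(activity="LookupTables",
                                   dependency_conditions=["Succeeded"])],
)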
Q47. What are the available ADF constructs and how are they useful?
 Parameter: with the @parameter construct, each activity in the pipeline can use
the parameter value that was passed to the pipeline run.
 Coalesce: The @coalesce construct can be used in the expressions to handle
null values.
 Activity: The @activity construct enables the consumption of an activity output in
a subsequent activity.
Q48. What do you mean by data flow maps?
 Mapping data flows are visually designed data transformations: data engineers can
create data transformation logic without having to write any code.
 The resulting data flows are executed as activities within Azure Data Factory
pipelines on scaled-out Apache Spark clusters.
 Data flow activities can be operationalized using the scheduling, control flow, and
monitoring capabilities already available in Azure Data Factory.
 Mapping data flows offer a completely visual experience without the need for
coding. Scaled-out data processing is carried out on execution clusters that are
managed by ADF. Azure Data Factory handles all of the code translation, path
optimization, and execution of the data flow jobs.

Q49. How are the remaining 90 dataset types in Data Factory used for data access?
 The mapping data flow feature supports Azure Synapse Analytics, Azure SQL
Database, and delimited text files from an Azure Storage account or Azure Data
Lake Storage Gen2 as source and sink data stores.
 Parquet files from Blob storage or Data Lake Storage Gen2 are also supported.
 Data from all other connectors should be staged using the Copy activity before
being transformed with a Data Flow activity.
Q50. Elaborate on ADF's Get Metadata activity?
 Any data in an Azure Data Factory or Synapse pipeline can have its metadata
retrieved using the Get Metadata activity.
 The Get Metadata activity's output can be used in conditional expressions to
perform validation, or the metadata can be consumed in subsequent activities.
 It receives a dataset as input and outputs metadata details.
 The returned metadata can only be up to 4 MB in size.
Q51. Can ADF pipeline be debugged? How?
Yes. Debugging is one of the most important parts of any development task and is
necessary to test a pipeline for potential bugs. ADF lets you run a pipeline in Debug mode
directly from the authoring canvas, without publishing it or attaching a trigger, and you
can place breakpoints to debug only up to a particular activity.
Q52. How can I copy data from multiple sheets in an Excel file?
 We must specify the name of the sheet from which we want to load data when
using an Excel connector inside of a data factory.
 This approach is manageable when dealing with data from a single sheet or a small
number of sheets. However, if we have many sheets (say, 10+), we can use a binary
dataset format and point it at the Excel file without having to specify which sheet(s)
to use.
 The Copy activity will then copy the data from each and every sheet in the file.
Q53. Does ADF facilitate nested looping?
 ADF does not directly support nested looping for any looping activity (ForEach / Until).
 However, a ForEach or Until activity can contain an Execute Pipeline activity, and the
child pipeline it calls can itself contain a loop activity. In this manner we can achieve
nested looping, because calling the loop activity indirectly invokes another loop
activity.

Q54. How can I move multiple tables from one datastore to another?
A smart way of accomplishing this would be –
 Have a lookup table or file that lists the tables that need to be copied along
with their sources.
 Scan the list using a Lookup activity and iterate over it with a ForEach activity.
 Inside the ForEach activity, use a Copy activity or a mapping data flow
to copy each table to the target datastore.
Q55. Name a few drawbacks of ADF?
 ADF offers very good data movement and transformation functions, but it also comes
with a few limitations.
 Nested looping activities are not supported directly in the data factory, so pipelines
with that structure require a workaround.
 A maximum of 5000 rows can be retrieved at once by the Lookup activity. To process
a larger result set, we must combine the lookup with another loop activity and a SQL
query that pages through the data.
Q56. Which runtime should be used to copy data from a local SQL Server instance while
using ADF?
We should install the self-hosted integration runtime on the on-premises machine (or a
machine in the same network) that can reach the SQL Server instance in order to copy data
from an on-premises SQL database using Azure Data Factory.
Q57. What is the role of Connected Services in ADF?
In ADF, Linked (or connected) Services are majorly used for –
 Representing a data store, such as a SQL Server instance, a file share, or an Azure
Blob storage account.
 Representing a compute resource whose underlying VM will carry out the activity
defined in the pipeline.
Q58. Where can I obtain additional information on the blob storage?
Blob Storage lets you store large amounts of data belonging to Azure Objects, like text or
binary data. You can retain the classification of the data associated with your application or
make it accessible to the public.
The following are some examples of applications of Blob Storage:
 Serving images or documents directly to a user's browser.
 Storing files for distributed access and preserving data by making it accessible from
remote locations.
 Streaming live audio and video content.
Q59. How can I utilize one of the other 80 dataset types that Data Factory provides to get
what I need?
Existing options for sources and sinks in Mapping Data Flow include –
 Azure Synapse Analytics (SQL Data Warehouse) and Azure SQL Database
 Delimited text files from Azure Blob storage or Azure Data Lake Storage Gen2
 Parquet files from either Blob storage or Data Lake Storage Gen2.
You can use the Copy activity to stage data from any of the additional connectors, and
after the staging step, run a Data Flow activity to transform the staged data.
Q60. Why should we use the Auto Resolve Integration Runtime?
With Auto Resolve, the integration runtime will make every effort to carry out the tasks in
the same region as the sink data store, or one that is as close to it as possible, which in
turn increases productivity.
Q61. What are the advantages of carrying out a lookup in the ADF?
The Lookup activity is often used for configuration lookup, as it can read a dataset in its
original form. The output of the activity can be used to retrieve data from the source
dataset.
In most cases, the outcome of a lookup operation is sent further down the pipeline to be
used as input for later steps.
Q62. Does Azure Data Factory offer connected services? If yes, how does it function?
Yes. The connection mechanism used to reach an external source is known as a linked (or
connected) service in ADF, and the two phrases are used interchangeably.
It not only holds the connection string but also stores the user authentication data.
It can be created in two different ways –
 The ARM template approach.
 The Azure Portal.

Q63. What are the three most important tasks that can be completed with Microsoft
ADF?
ADF makes it easier to carry out three major processes that are moving data, transforming
data, and exercising control.
The operation known as data movement does exactly what the name suggests, which is to
facilitate the flow of data from one point to another.
Ex- Information can be moved from one data store to another using Data Factory’s Copy
Activity.
Data transformation activities refer to operations that modify data as it is being loaded into
its final destination system.
Ex- Azure Functions, Stored Procedures, and U-SQL are a few examples.
Control (flow) activities help regulate the flow of activities through a pipeline.
Ex- The Wait activity makes the pipeline pause for the amount of time that
was specified.
Q64. What are the steps involved in an ETL procedure?
The ETL (Extract, Transform, Load) technique consists of 4 main steps.
 Connect and collect: the initial stage is establishing a link to the data source (or
sources). The data is then collected and moved to a centralized data store, either
on-premises or in the cloud.
 Transform and enrich: the collected data is transformed using compute services
such as HDInsight (Hadoop), Spark, Data Flows, and similar tools.
 Publish: the refined data is delivered to an Azure service such as a data lake, a data
warehouse, Azure Cosmos DB, or a SQL database; this can also be achieved through
an API for consumption by business applications.
 Monitor: Azure Data Factory uses Azure Monitor, APIs, PowerShell, Azure Monitor
logs, and health panels in the Azure portal to monitor the pipelines.
Q65. Did you experience any difficulty while migrating data from on-premises to the
Azure cloud via Data Factory?
Within the context of our ongoing transition from on-premises to cloud storage, throughput
and speed emerged as important obstacles. When we copy the data from on-premises
using the Copy activity, we often cannot achieve the throughput that we require.
The configuration options available on a copy activity make it possible to fine-tune the
process and achieve the desired results.
 If we load data from on-premises servers, we should first compress it using the
available compression option before writing it to cloud storage, where it is
decompressed afterwards.
 Once compression is enabled, our data should be moved quickly to a staging area.
The copied data can then be decompressed before being written to the target
cloud storage buckets.
 Degree of Copy Parallelism: using parallelism is another option that can make the
transfer more seamless. This is equivalent to employing a number of different
threads to process the data and can speed up the rate at which data is copied.
 Because there is no one size that fits all, we need to try out different values, such
as 8, 16, and 32, to see which one performs best.
 Increasing the Data Integration Unit, which is roughly comparable to the number of
central processing units used, may also speed up the copy process.
Q66. What are the limitations of ADF?
ADF offers rich tools for moving and transforming data, but it comes with a few limitations.
 The data factory does not allow nested looping activities; any pipeline with such a
structure requires a workaround in order to function properly. This applies to all the
iteration and conditional-container constructs: If Condition, ForEach, Switch, and
Until activities.
 The Lookup activity can retrieve a maximum of 5000 rows in a single operation. To
implement this kind of pipeline design for larger result sets, we need to combine it
with an additional loop activity and a SQL query that limits and pages the rows.
 A pipeline cannot have more than 40 activities in total, including inner activities and
containers. To work around this limit, pipelines should be modularized with regard
to the number of datasets, activities, and so on.
Q67. Define Azure SQL database? Can it be integrated with Data Factory?
 Azure SQL Database is an up-to-date, fully managed relational
database service that is built for the cloud, primarily for storing data.

 You can easily design data pipelines to read and write to SQL DB using the Azure
data factory.
Q68. What are the major differences between SSIS and Azure Data Factory?
Azure Data Factory (ADF):
 Primarily an Extract-Load (ELT) orchestration tool.
 A cloud-based PaaS service.
 Pay-as-you-go through an Azure subscription.
 Handles failures through activity retry policies and dependency conditions rather
than rich built-in error handling.
 Pipelines are defined as JSON and authored through a visual designer.
SQL Server Integration Services (SSIS):
 An Extract-Transform-Load (ETL) tool.
 A desktop tool (developed using SSDT).
 A licensed tool included with SQL Server.
 Has rich built-in error handling capabilities (event handlers, error outputs).
 Packages are built with drag-and-drop design (no coding).

Q69. What do you mean by parameterization in Azure?


 Parameterization allows us to provide the server name, database name,
credentials, and so on at the time the pipeline is executed.
 It allows us to reuse a single, generic pipeline rather than building one for each request.
 Parameterization in ADF is crucial for designing reusable solutions while reducing
maintenance costs.
Q70. What are the major benefits of Cloud Computing?
Some of the advantages of cloud computing are –
 Scalability
 Agility
 High availability
 Low latency through global reach
 Fault tolerance
 Moving from CapEx to OpEx (pay-as-you-go)
Q71. Do you think there is demand in ADF?
Azure Data Factory is a cloud-based Microsoft tool that collects raw business data and
transforms it into usable information. There is a considerable demand for Azure Data
Factory Engineers in the industry.
Q72. What is an azure storage key?
Azure storage keys are used for authentication, validating access to the Azure storage
services and controlling access to the data based on the project requirements.
There are two types of storage keys provided for authentication purposes –
 Primary access key
 Secondary access key
The main job of the secondary access key is to allow key rotation without downtime for
any website or application.
Q73. What is the service offered by Azure that lets you have a common file sharing
system between multiple virtual machines?
Azure provides a service called Azure Files, which is used as a common repository for
sharing data across virtual machines; the file shares are accessed using protocols such as
SMB, NFS, and the REST API.
Q74. What is cloud computing?
Cloud computing is basically the use of servers on the internet that helps store, manage and
process data. The only difference being, instead of using your own servers, you are using
somebody else’s servers to accomplish the task by paying them for the amount of time you
use it for.
Q75. What is the Azure Active Directory used for?
 Azure Active Directory is an Identity and Access Management system.
 It is majorly used to grant access to specific products and services in your network.
Q76. What strategies do you use to handle large data volumes in Azure Data Factory
pipelines?
Handling large data volumes requires a combination of partitioning data, using parallel
processing, and optimizing data movement. Candidates should discuss techniques such as
chunking large datasets, utilizing the PolyBase feature for efficient data loading, and
leveraging Azure's scalable resources to handle peak loads.
Strong candidates will also mention monitoring and adjusting performance metrics, as well
as implementing retry and error handling mechanisms to ensure data integrity. Look for
detailed examples of how they have managed large-scale data processing in their previous
roles.
Q77. How do you troubleshoot performance issues in an Azure Data Factory pipeline?
Troubleshooting performance issues involves several steps, including checking the
pipeline's activity logs, monitoring resource utilization, and identifying bottlenecks in data
movement or transformation activities. Candidates should mention tools like Azure Monitor
and Log Analytics for detailed insights.
An effective response will include specific strategies for isolating issues, such as testing
individual components, adjusting parallelism settings, or optimizing data source
configurations. Recruiters should look for candidates who demonstrate a methodical
approach to problem-solving and experience with real-world troubleshooting scenarios.
Q78. How do you handle schema evolution in Azure Data Factory pipelines?
Handling schema evolution involves strategies like using schema drift capabilities in
mapping data flows, leveraging flexible data formats such as JSON, and maintaining a
versioned schema registry. Candidates should discuss how they manage changes to data
structure without disrupting the pipeline operations.
Q79. Your pipeline has multiple dependencies. How do you ensure it runs efficiently?
 Use dependencies (success, failure, completion).
 Enable concurrent execution for independent activities.
 Optimize with partitioned or parallel copy operations.
Q80. How do you debug pipeline failures?
 Use the Monitor tab to check activity run details.
 Analyze logs in Output and error messages.
 Enable retry policies for transient issues.
 Integrate with Azure Log Analytics for deeper insights.
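Alongside the Monitor tab, run details can also be pulled programmatically. A hedged sketch with the azure-mgmt-datafactory Python SDK (the run ID and resource names are placeholders):

from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, df = "<resource-group>", "<data-factory-name>"

run = adf_client.pipeline_runs.get(rg, df, "<run-id>")
print(run.status)            # e.g. InProgress, Succeeded, Failed

# Drill into the individual activity runs of that pipeline run.
filters = RunFilterParameters(
    last_updated_after=datetime.now(timezone.utc) - timedelta(days=1),
    last_updated_before=datetime.now(timezone.utc),
)
activity_runs = adf_client.activity_runs.query_by_pipeline_run(rg, df, run.run_id, filters)
for act in activity_runs.value:
    print(act.activity_name, act.status, act.error)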
Q81. How would you monitor pipeline performance?
 Use built-in Pipeline Monitoring for run statistics.
 Track resource usage in Azure Monitor.
 Set alerts for performance bottlenecks or anomalies.
Q82. How does ADF secure data movement?
 Supports Azure Key Vault for managing sensitive information.
 Uses Managed Identity for authentication.
 Provides encrypted in-transit and at-rest data.
Q83. How does ADF integrate with Databricks?
 Use Databricks Notebook Activity for custom transformations.
 Pass parameters dynamically to notebooks.
 Handle large-scale data processing with Spark.

Q84. Scenario: How would you connect ADF with an on-premises SQL Server?
 Install a Self-hosted Integration Runtime on the on-premises environment.
 Configure a Linked Service to SQL Server using SHIR.
 Use Copy Activity to move data.
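A minimal sketch of the linked service step above, assuming the azure-mgmt-datafactory Python SDK and placeholder server, credential, and IR names:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeReference, LinkedServiceResource, SecureString, SqlServerLinkedService,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, df = "<resource-group>", "<data-factory-name>"

# Linked service that reaches the on-premises SQL Server through the SHIR.
onprem_sql_ls = LinkedServiceResource(properties=SqlServerLinkedService(
    connection_string="Server=onprem-sql01;Database=SalesDB;Integrated Security=False;",
    user_name="svc_adf",
    password=SecureString(value="<secret-or-key-vault-reference>"),
    connect_via=IntegrationRuntimeReference(type="IntegrationRuntimeReference",
                                            reference_name="OnPremSHIR"),
))
adf_client.linked_services.create_or_update(rg, df, "OnPremSqlServerLS", onprem_sql_ls)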
Q85. What are best practices for optimizing ADF pipelines?
 Minimize data movement.
 Use partitioning and parallelism for large datasets.
 Use reusable components like parameters and templates.
 Cache intermediate data for repeated operations.
Q86. Scenario: A pipeline is running slower than expected. How do you optimize it?
 Use parallel copy for faster data transfers.
 Reduce data movement by processing data in-place.
 Optimize source queries with filtering and indexing.
 Use Data Flows Debug Mode to identify transformation bottlenecks.
Q87. Scenario: ADF pipeline needs to call an external REST API and process data. How
would you achieve this?
 Use Web Activity to call the REST API.
 Store the API response in Azure Blob or Azure Data Lake.
 Process the data using Data Flows or Databricks.
Q88. What is the key difference between the Dataset and Linked Service in Azure Data
Factory?
Linked Service: This is like a set of keys that lets you connect to a data source, such as a
database or storage system. For example, if you want to connect to a SQL Server database,
the linked service will have the information (like the server's name and login details)
needed to unlock and access that database.
Dataset: This is like a map that tells you what specific data you want to work with once
you're connected. For example, when you're dealing with a SQL Server database, the
dataset tells you which table or set of data you want to pull out or send to. It might even
tell you to use a specific query if you want data from different tables.
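The "keys" versus "map" distinction can be seen in how the two objects are defined. A hedged sketch with the azure-mgmt-datafactory Python SDK; the connection string, table, and all names are placeholders:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlDatabaseLinkedService, AzureSqlTableDataset, DatasetResource,
    LinkedServiceReference, LinkedServiceResource, SecureString,
)

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, df = "<resource-group>", "<data-factory-name>"

# Linked service: the "keys" - how to reach the Azure SQL Database.
sql_ls = LinkedServiceResource(properties=AzureSqlDatabaseLinkedService(
    connection_string=SecureString(
        value="Server=tcp:<server>.database.windows.net;Database=SalesDB;<credentials>")
))
adf_client.linked_services.create_or_update(rg, df, "AzureSqlLS", sql_ls)

# Dataset: the "map" - which table inside that database to read from or write to.
sales_ds = DatasetResource(properties=AzureSqlTableDataset(
    linked_service_name=LinkedServiceReference(type="LinkedServiceReference",
                                               reference_name="AzureSqlLS"),
    table_name="dbo.Sales",
))
adf_client.datasets.create_or_update(rg, df, "SalesTableDs", sales_ds)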

Q89. What are the key differences between the Mapping data flow and Wrangling data
flow transformation activities in Azure Data Factory?
In Azure Data Factory, the main difference between the Mapping data flow and the
Wrangling data flow transformation activities is as follows.
The Mapping data flow activity is a visually designed data transformation activity that lets
users build graphical data transformation logic without needing to be expert developers.
It is executed as an activity within the ADF pipeline on an ADF-managed, scaled-out Spark
cluster.
The Wrangling data flow activity, on the other hand, is a code-free data preparation
activity. It integrates with Power Query Online to make the Power Query M functions
available for data wrangling, executed on Spark.
Q90. Can an activity in a pipeline consume arguments that are passed to a pipeline run?
Each activity within the pipeline can consume the parameter value that’s passed to the
pipeline and run with the @parameter construct.
Q91. Can an activity output property be consumed in another activity?
An activity output can be consumed in a subsequent activity with the @activity construct.
Q92. How does Azure Data Factory ensure data security?
Exp answer: Azure Data Factory ensures data security through several mechanisms.
First, it uses encryption for data both in transit and at rest, employing protocols like TLS and
AES to secure data transfers. ADF integrates with Azure Active Directory (AAD) for
authentication and uses Role-Based Access Control (RBAC) to restrict who can access and
manage the factory.
Additionally, Managed Identities allow ADF to securely access other Azure services without
exposing credentials. For network security, ADF supports Private Endpoints, ensuring that
data traffic stays within the Azure network and adding another layer of protection.
Q93. How can you implement error handling in Azure Data Factory pipelines?
Example answer: Error handling in Azure Data Factory can be implemented using Retry
Policies and Error Handling Activities. ADF offers built-in retry mechanisms, where you can
configure the number of retries and the interval between retries if an activity fails.
For example, if a Copy Activity fails due to a temporary network issue, you can configure the
activity to retry 3 times with a 10-minute interval between each attempt.

In addition, activity dependency conditions like Failure, Completion, and Skipped can
trigger specific actions depending on whether an activity succeeds or fails.
For instance, I could define a pipeline flow such that upon an activity's failure, a
custom error-handling activity, like sending an alert or executing a fallback process, is
executed.
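For example, the retry configuration described above might look like the sketch below in a pipeline definition (azure-mgmt-datafactory Python SDK; the dataset and activity names are assumptions):

from azure.mgmt.datafactory.models import (
    ActivityPolicy, BlobSink, BlobSource, CopyActivity, DatasetReference,
)

# Retry a transient failure 3 times, waiting 10 minutes (600 s) between attempts.
copy_with_retry = CopyActivity(
    name="CopySalesData",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SourceBlobDs")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="SinkBlobDs")],
    source=BlobSource(),
    sink=BlobSink(),
    policy=ActivityPolicy(retry=3, retry_interval_in_seconds=600, timeout="0.02:00:00"),
)

An error-handling activity can then be attached to the Failed output of this activity with an ActivityDependency whose dependency_conditions is ["Failed"], as in the Web activity sketch under Q41.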
Q94. How do you handle schema drift in Azure Data Factory?
Exp answer: Schema drift refers to changes in source data structure over time.
Azure Data Factory addresses schema drift by offering the Allow Schema Drift option in
Mapping Data Flows. This allows ADF to automatically adjust to changes in the schema of
incoming data, like new columns being added or removed, without redefining the entire
schema.
By enabling schema drift, I can configure a pipeline to dynamically map columns even if the
source schema changes.
Q95. How can you optimize the performance of an Azure Data Factory pipeline?
Exp answer: I typically follow several strategies to optimize the performance of an Azure
Data Factory pipeline.
First, I ensure that parallelism is leveraged by using Concurrent Pipeline Runs to process
data in parallel where possible. I also use Partitioning within the Copy Activity to split large
datasets and transfer smaller chunks concurrently.
Another important optimization is selecting the right Integration Runtime based on the
data source and transformation requirements. For example, using a Self-hosted IR for on-
premise data can speed up on-prem to cloud transfers.
Additionally, enabling Staging in the Copy Activity can improve performance by buffering
large datasets before final loading.
Q96. Can you describe a scenario where you optimized a data pipeline for better
performance in ADF?
Exp answer: In a project where we had to process large amounts of financial data from
multiple sources, the initial pipeline took too long to execute due to the volume of data. To
optimize this, I initially enabled parallelism by setting up multiple Copy Activities to run
concurrently, each handling a different dataset partition.
Next, I used the staging feature in the Copy Activity to temporarily buffer the data in Azure
Blob Storage before processing it further, significantly improving throughput. I
also used Data Flow optimizations by caching lookup tables used in
transformations.

These adjustments improved the pipeline's performance by 40%, reducing execution time.
Q97. How did you approach a situation where data quality issues affected the ADF
pipeline output?
Exp answer: In one case, I was working on a pipeline that extracted customer data from a
CRM system. However, the data contained missing values and duplicates, which affected
the final reporting. To address these data quality issues, I incorporated a Data Flow in the
pipeline that performed data cleansing operations.
I used filters to remove duplicates and a conditional split to handle missing values. I set up a
lookup for any missing or incorrect data to pull in default values from a reference dataset.
By the end of this process, the data quality was significantly improved, ensuring that the
downstream analytics were accurate and reliable.
Q98. Can you explain a time when you had to secure sensitive data in an Azure Data
Factory pipeline?
Exp answer: In one project, we were dealing with sensitive customer data that needed to
be securely transferred from an on-premise SQL Server to Azure SQL Database. I used
Azure Key Vault to store the database credentials and secure the data, ensuring that
sensitive
information like passwords was not hardcoded in the pipeline or Linked Services.
Additionally, I implemented Data Encryption during data movement by enabling SSL
connections between the on-premise SQL Server and Azure.
I also used role-based access control (RBAC) to restrict access to the ADF pipeline, ensuring
that only authorized users could trigger or modify it. This setup ensured both secure data
transfer and proper access management.
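A rough sketch of the Key Vault pattern described above (the vault URL, secret name, and linked service names are placeholders); the resulting objects would be pushed with linked_services.create_or_update as in the earlier sketches:

from azure.mgmt.datafactory.models import (
    AzureKeyVaultLinkedService, AzureKeyVaultSecretReference, AzureSqlDatabaseLinkedService,
    LinkedServiceReference, LinkedServiceResource,
)

# 1) A linked service pointing at the Key Vault itself (ADF's managed identity
#    must be granted access to the vault's secrets).
kv_ls = LinkedServiceResource(properties=AzureKeyVaultLinkedService(
    base_url="https://<vault-name>.vault.azure.net/"))

# 2) A linked service whose connection string is pulled from a Key Vault secret,
#    so no credential is hard-coded in the pipeline or linked service definition.
sql_ls = LinkedServiceResource(properties=AzureSqlDatabaseLinkedService(
    connection_string=AzureKeyVaultSecretReference(
        store=LinkedServiceReference(type="LinkedServiceReference",
                                     reference_name="KeyVaultLS"),
        secret_name="SalesDbConnectionString",
    )
))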
Q99. Have you used Execute Notebook activity in Data Factory? How to pass parameters
to a notebook activity?
We can use the Databricks Notebook activity to run notebook code on our Databricks
cluster. We can pass parameters to a notebook activity using the baseParameters property.
If the parameters are not defined/specified in the activity, the default values from the
notebook are used.
Q100. What do you mean by variables in the Azure Data Factory?
Variables in the Azure Data Factory pipeline provide the functionality to hold the values.
They are used for a similar reason as we use variables in any programming language and are
available inside the pipeline.

Set variables and append variables are two activities used for setting or manipulating the
values of the variables. There are two types of variables in a data factory: -
 System variables: These are fixed variables from the Azure pipeline. For
example, pipeline name, pipeline id, trigger name, etc. You need these to get the
system information required in your use case.
 User variable: A user variable is declared manually in your code based on
your pipeline logic.
Q101. What are the different activities you have used in Azure Data Factory?
Here you can share some of the significant activities if you have used them in your career,
whether your work or college project. Here are a few of the most used activities:
 Copy Data Activity to copy the data between datasets.
 ForEach Activity for looping.
 Get Metadata Activity that can provide metadata about any data source.
 Set Variable Activity to define and initiate variables within pipelines.
 Lookup Activity to do a lookup to get some values from a table/file.
 Wait Activity to wait for a specified amount of time before/in between the
pipeline run.
 Validation Activity will validate the presence of files within the dataset.
 Web Activity to call a custom REST endpoint from an ADF pipeline.
Q102. Can a value be calculated for a new column from the existing column from mapping
in ADF?
We can derive transformations in the mapping data flow to generate a new column based
on our desired logic. We can create a new derived column or update an existing one when
developing a derived one. Enter the name of the column you're making in the Column
textbox.
Q103. What does it mean by the breakpoint in the ADF pipeline?
To understand better, for example, you are using three activities in the pipeline, and now
you want to debug up to the second activity only. You can do this by placing the
breakpoint at the second activity. To add a breakpoint, click the circle present at the top of
the activity.
Q104. Can you share any difficulties you faced while getting data from on-premises to
Azure cloud using Data Factory?
One of the significant challenges we faced while migrating from on-premises to the
cloud was throughput and speed. When we copy the data from on-premises using the Copy
activity, the transfer rate is often slower than needed, and we have to tune the pipeline to
get the desired throughput.
There are some configuration options for a copy activity that can help in tuning this
process and can give the desired results.
 We should use the compression option to get the data in a compressed mode
while loading from on-prem servers, which is then de-compressed while writing on
the cloud storage.
 Staging area should be the first destination of our data after we have enabled the
compression. The copy activity can decompress before writing it to the final
cloud storage buckets.
 Degree of Copy Parallelism is another option to help improve the migration process.
This is identical to having multiple threads processing data and can speed up the
data copy process.
 There is no right fit-for-all here, so we must try different numbers like 8, 16, or 32
to see which performs well.
 Data Integration Unit is loosely the number of CPUs used, and increasing it
may improve the performance of the copy process.
Q105. Is it possible to have nested looping in Azure Data Factory?
There is no direct support for nested looping in the data factory for any looping activity (for
each / until). However, we can use one for each/until loop activity which will contain an
execute pipeline activity that can have a loop activity. This way, when we call the looping
activity, it will indirectly call another loop activity, and we'll be able to achieve nested
looping.
Q106. How to copy multiple tables from one datastore to another datastore?
An efficient approach to complete this task would be:
Maintain a lookup table/ file containing the list of tables and their source, which needs to
be copied.
Then, we can use the lookup activity and each loop activity to scan through the list.
Inside the for each loop activity, we can use a copy activity or a mapping dataflow to copy
multiple tables to the destination datastore.
Q107. What are some performance-tuning techniques for the Mapping Data Flow activity?
We could consider the below set of parameters for tuning the performance of a
Mapping Data Flow activity we have in a pipeline.

 We should leverage partitioning in the source, sink, or transformation whenever
possible. Microsoft, however, recommends using the default partition (size 128
MB) selected by the Data Factory as it intelligently chooses one based on our
pipeline configuration.
 Still, one should try out different partitions and see if they can have
improved performance.
 We should avoid using a data flow activity inside a ForEach loop. Instead, if we have
multiple files that are similar in structure and processing needs, we should use a
wildcard path inside the data flow activity, enabling the processing of all the files
within a folder.
 The recommended file format to use is .parquet, because the pipeline executes by
spinning up Spark clusters and Parquet is the native file format for Apache Spark;
thus, it will generally give good performance.
 Multiple logging modes are available: Basic, Verbose, and None.
 We should only use verbose mode if essential, as it will log all the details about
each operation the activity performs. e.g., It will log all the details of the operations
performed for all our partitions. This one is useful when troubleshooting issues with
the data flow.
 The basic mode will give out all the necessary basic details in the log, so try to
use this one whenever possible.
 Try to break down a complex data flow activity into multiple data flow activities.
Let’s say we have several transformations between source and sink, and by adding
more, we think the design has become complex. In this case, try to have it in
multiple such activities, which will give two advantages:
 All activities will run on separate spark clusters, decreasing the run time for the
whole task.
 The whole pipeline will be easy to understand and maintain in the future.
Q108. What are some of the limitations of ADF?
Azure Data Factory provides great functionalities for data movement and transformations.
However, there are some limitations as well.
 We can't have nested looping activities in the data factory, and we must use a
workaround if we have that sort of structure in our pipeline. This applies to all the
iteration and conditional activities: If Condition, ForEach, Switch, and Until.
 The lookup activity can retrieve only 5000 rows at a time and not more than that.
Again, we need to use some other loop activity along with SQL with the
limit to achieve this sort of structure in the pipeline.

 We can have 40 activities in a single pipeline, including inner activity, containers, etc.
To overcome this, we should modularize the pipelines regarding the number of
datasets, activities, etc.
Q109. How do you send email notifications on pipeline failure?
There are multiple ways to do this:
Using Logic Apps with Web/Webhook activity.
Configure a logic app that, upon getting an HTTP request, can send an email to the
required set of people for failure. In the pipeline, configure the failure option to hit the
URL
generated by the logic app.
Using Alerts and Metrics from pipeline options.
We can set up this from the pipeline itself, where we get numerous options for email on
any activity failure within the pipeline.
Q110. Imagine you must import data from many files stored in Azure Blob Storage into
an Azure Synapse Analytics data warehouse. How would you design a pipeline in Azure
Data Factory to efficiently process the files in parallel and minimize processing time?
Here is the list of steps that you can follow to create and design a pipeline in Azure Data
Factory to efficiently process the files in parallel and minimize the processing time:
Start by creating a Blob storage dataset in Azure Data Factory to define the files' source
location.
Create a Synapse Analytics dataset in Azure Data Factory to define the destination location
in Synapse Analytics where the data will be stored.
Create a pipeline in Azure Data Factory that includes a copy activity to transfer data from
the Blob Storage dataset to the Synapse Analytics dataset.
Configure the copy activity to use an efficient file format (such as Parquet or delimited
text) and enable parallelism by setting the "parallelCopies" property.
You can also use Azure Data Factory's built-in monitoring and logging capabilities to track
the pipeline's progress and diagnose any issues that may arise.
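A hedged sketch of such a copy activity with parallelism, extra Data Integration Units, and staged (PolyBase) loading enabled, using the azure-mgmt-datafactory Python SDK; all names and numbers are illustrative assumptions:

from azure.mgmt.datafactory.models import (
    BlobSource, CopyActivity, DatasetReference, LinkedServiceReference,
    SqlDWSink, StagingSettings,
)

# Parallel reads, extra Data Integration Units, and staged (PolyBase) loading.
load_to_synapse = CopyActivity(
    name="LoadFilesToSynapse",
    inputs=[DatasetReference(type="DatasetReference", reference_name="BlobFilesDs")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="SynapseTableDs")],
    source=BlobSource(),
    sink=SqlDWSink(allow_poly_base=True),
    parallel_copies=8,                 # "parallelCopies" in the JSON definition
    data_integration_units=16,
    enable_staging=True,
    staging_settings=StagingSettings(
        linked_service_name=LinkedServiceReference(type="LinkedServiceReference",
                                                   reference_name="StagingBlobLS"),
        path="staging",
    ),
)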
Q111. How can you insert folder name and file count from blob into SQL table?
You can follow these steps to insert a folder name and file count from blob into the SQL
table:

Create an ADF pipeline with a "Get Metadata" activity to retrieve the folder and file details
from the blob storage.
Add a "ForEach" activity to loop through each folder in the blob storage.
Inside the "ForEach" activity, add a "Get Metadata" activity to retrieve the file count for
each folder.
Add a "Copy Data" activity to insert the folder name and file count into the SQL table.
Configure the "Copy Data" activity to use the folder name and file count as source data and
insert them into the appropriate columns in the SQL table.
Run the ADF pipeline to insert the folder name and file count into the SQL table.
Q112. ADF Exercises
 Create variables using set variable activity
 How to use if condition using if condition activity
 Iterating files using for loop activity
 Creating linked services, Data sets
 Copy activity - blob to blob
 Copy activity - blob to azure SQL
 Copy activity - pattern matching files copy
 Copy activity - copy the filtered file formats
 Copy activity - copy multiple files from blob to another blob
 Copy activity - Delete source files after copy activity
 Copy activity - using parameterized data sets
 Copy activity - convert one file format to another file format
 Copy activity - add additional columns to the source columns
 Copy activity - filter files and copy from one blob to another
 Delete the files from blob with more than 100KB
 How to use the Get Metadata activity
 Bulk copy tables and files
 How to integrate keyvault in ADF
 How to set up integration run time
 Copy data from on premises to azure cloud
 How to use the Databricks Notebook activity and pass parameters to it
 How to use scheduling trigger
 How to use tumbling window trigger
 How to use event based trigger

 How to use with Activity
 How to use Until Activity
 Dataflows - select the rows
 Dataflows - Filter the rows
 Dataflows - join Transformations
 Dataflows - union Transformations
 Dataflows - look up Transformations
 Dataflows - window functions transformations
 Dataflows - pivot, unpivot transformations
 Dataflows - Alter rows transformations
 Dataflows - Removing Duplicates transformations
 How to pass parameters to the pipeline
 How to create alerts and rules
 How to set global parameters
 How to import and export ARM templates
 How to integrate ADF with Devops
 How to use Azure devops Repos
 How to send mail notifications using logic apps
 How to monitor the pipelines
 How to debug the pipelines
 How to schedule pipeline using triggers
 How to create trigger dependency
 How to run one pipeline from another pipeline (Execute Pipeline activity)
 Handle incremental dataload
 How to run the databricks notebook
 How to connect to onprem sqlserver from ADF
