Data Engineer Interview Question

Uploaded by

Harish Pininti

To walk you through an end-to-end solution for migrating data from an on-premises SQL Server DB to
an Azure SQL DB using Azure Data Factory (ADF), I'll break it down into clear steps. This includes
understanding why we need a self-hosted integration runtime (IR) instead of the auto-resolve IR, and
how to configure it all correctly.

Overview:

The goal is to create a data pipeline in ADF that moves data from an on-prem SQL Server to an Azure
SQL Database. The main challenge is that the auto-resolve integration runtime won’t work for
on-premises data sources because it runs in Azure and has no access to your on-prem network. Instead,
you will use a self-hosted integration runtime, which you install on a machine within your on-premises
environment.

Here’s how you can approach this process:

Step 1: Set Up the Azure SQL Database

First, you need to have your Azure SQL Database set up in your Azure environment. If it's not created
yet, follow these steps:

1. In the Azure portal, navigate to SQL Databases and click + Create (+ Add in older versions of
the portal) to create a new database.

2. Follow the prompts to create your database and server. Make sure to note down the server
name, username, and password.

Step 2: Create an Azure Data Factory (ADF) Instance

1. In the Azure portal, go to Create a resource and search for Data Factory.

2. Follow the wizard to create your ADF instance, select the region where your resources are
located, and click Create.

3. Once created, open the Data Factory UI (Azure Data Factory Studio) from the resource's overview
page. Older versions of the portal label this entry point Author & Monitor.

Step 3: Install and Configure the Self-Hosted Integration Runtime (IR)

Since you’re dealing with an on-premises SQL Server DB, you need the Self-Hosted IR to facilitate the
connection between ADF (in the cloud) and your on-prem server (on your network).

Install the Self-Hosted IR:

1. In Azure Data Factory, go to Manage (gear icon on the left) > Integration Runtimes.

2. Click + New and select Self-hosted.

3. Download the Self-Hosted IR installer from the prompt.

4. Install the software on a machine that has access to your on-prem SQL Server. This machine
should be able to connect to your on-prem SQL Server database (TCP port 1433 by default) and
have outbound HTTPS access (port 443) to Azure.

Configure the Self-Hosted IR:

1. During the installation, the wizard will ask for a key (provided in the ADF portal) to authenticate
and link the IR to your ADF instance.

2. After installation, verify the IR is running by checking the status in the ADF portal (under
Integration Runtimes). The status should be "Running."

The Self-hosted IR allows secure communication between your on-prem systems and the cloud. It
initiates outbound, encrypted connections to Azure, so you do not need to open inbound ports in your
on-prem firewall.
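
Before you register the IR, it can help to confirm that the machine really can reach both sides. Here is a minimal sketch using only the Python standard library. The function names are my own, and the ports assume SQL Server on its default TCP port 1433; adjust them if your instance listens elsewhere.

```python
import socket
from typing import Dict, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_endpoints(endpoints: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Probe each named (host, port) pair and report which ones are reachable."""
    return {name: can_connect(host, port) for name, (host, port) in endpoints.items()}
```

Run on the IR machine, you could call it as `check_endpoints({"on-prem SQL Server": ("<sql-server-host>", 1433), "Azure SQL Database": ("<server>.database.windows.net", 1433)})`, replacing the placeholders with your own host names. A `False` entry usually points to a firewall rule or a DNS problem rather than an ADF issue.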

Step 4: Set Up Linked Services in ADF

Linked Services define the connections to your data sources and destinations.

1. Create a Linked Service for On-Prem SQL Server:

o Go to Manage > Linked Services > + New.

o Choose SQL Server as the connector.

o Under Connect via integration runtime, select the Self-hosted IR you installed.

o Provide the necessary connection information for your on-prem SQL Server: server
name, database name, and authentication details.

2. Create a Linked Service for Azure SQL Database:

o Similarly, create another Linked Service for your Azure SQL Database.

o Choose Azure SQL Database and enter the server, database, username, and password.
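
Behind the UI, each linked service is stored as a JSON definition. The sketch below builds the two definitions as Python dictionaries so you can see how the on-prem one is tied to the Self-hosted IR through `connectVia`. The linked service names, the IR name `SelfHostedIR`, and every value in angle brackets are placeholders I made up for illustration. The inline password is also only a placeholder; in practice a Key Vault reference is usually the safer choice.

```python
import json

# Hypothetical IR name; use the name you gave the IR in Step 3.
SELF_HOSTED_IR = "SelfHostedIR"


def on_prem_sql_linked_service(name, server, database, user):
    """Linked service for on-prem SQL Server, routed through the self-hosted IR."""
    return {
        "name": name,
        "properties": {
            "type": "SqlServer",
            "typeProperties": {
                "connectionString": (
                    f"Data Source={server};Initial Catalog={database};"
                    f"Integrated Security=False;User ID={user};"
                ),
                # Placeholder only; never commit a real password.
                "password": {"type": "SecureString", "value": "<password>"},
            },
            # This reference is what sends traffic through the on-prem IR.
            "connectVia": {
                "referenceName": SELF_HOSTED_IR,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


def azure_sql_linked_service(name, server, database, user):
    """Linked service for Azure SQL Database; without connectVia the default Azure IR is used."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {
                "connectionString": (
                    f"Server=tcp:{server}.database.windows.net,1433;"
                    f"Database={database};User ID={user};"
                    "Encrypt=True;Connection Timeout=30;"
                ),
                "password": {"type": "SecureString", "value": "<password>"},
            },
        },
    }


source_ls = on_prem_sql_linked_service("OnPremSqlServerLS", "<server>", "<database>", "<user>")
sink_ls = azure_sql_linked_service("AzureSqlDatabaseLS", "<server>", "<database>", "<user>")
print(json.dumps(source_ls, indent=2))
```

The only structural difference between the two is the `connectVia` block, which is exactly the choice you make in the integration runtime dropdown.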

Step 5: Create the Data Pipeline in ADF

Once the linked services are set up, you can create a data pipeline to move data.

1. In ADF, go to the Author tab (pencil icon).

2. Click + New pipeline.

3. Add a Copy Data activity:

o Under Source, select or create a dataset that uses the on-prem SQL Server linked service you
created.

o Configure the source dataset to point to the table(s) or data you want to move.

o Under Sink, select the Azure SQL Database linked service and configure the target
dataset (your Azure SQL database tables).

4. Mapping: You can map columns between the source (on-prem SQL Server) and the sink (Azure
SQL Database) if they have different column names or data types.

5. Set any additional options you need (like fault tolerance, data transformation, etc.).
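
To make the pieces from this step concrete, here is a rough sketch of the JSON that the two datasets and the Copy Data pipeline boil down to, again built as Python dictionaries. The dataset, pipeline, table, and column names are all hypothetical, and the linked service names assume the ones you chose in Step 4.

```python
import json


def table_dataset(name, dataset_type, linked_service, schema, table):
    """Dataset pointing at one table through the given linked service."""
    return {
        "name": name,
        "properties": {
            "type": dataset_type,
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {"schema": schema, "table": table},
        },
    }


def copy_pipeline(name, source_dataset, sink_dataset, column_map):
    """Pipeline with a single Copy activity and an explicit column mapping."""
    mappings = [
        {"source": {"name": src}, "sink": {"name": dst}}
        for src, dst in column_map.items()
    ]
    copy_activity = {
        "name": "CopyOnPremToAzureSql",
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "SqlServerSource"},
            "sink": {"type": "AzureSqlSink"},
            "translator": {"type": "TabularTranslator", "mappings": mappings},
        },
    }
    return {"name": name, "properties": {"activities": [copy_activity]}}


# Hypothetical table and column names, for illustration only.
source_ds = table_dataset("OnPremCustomers", "SqlServerTable", "OnPremSqlServerLS", "dbo", "Customers")
sink_ds = table_dataset("AzureCustomers", "AzureSqlTable", "AzureSqlDatabaseLS", "dbo", "Customers")
pipeline = copy_pipeline(
    "MigrateCustomers", "OnPremCustomers", "AzureCustomers",
    {"CustID": "CustomerId", "CustName": "CustomerName"},
)
print(json.dumps(pipeline, indent=2))
```

If the source and sink columns already match by name, the `translator` block can be left out and the Copy activity maps them by default.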

Step 6: Test the Pipeline and Monitor the Data Transfer

1. Debug the pipeline: Before running the pipeline, you can test it by clicking Debug to ensure
everything is working properly.

2. Run the pipeline: Once you’re ready, you can trigger the pipeline manually, or you can schedule
it to run at specific intervals.

3. Monitor the pipeline: After the pipeline runs, you can check the Monitor tab in ADF to view logs
and monitor the success or failure of your data transfer.
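
If you would rather watch a run from a script than from the Monitor tab, the usual pattern is a polling loop. The sketch below shows only that loop. `get_status` is a hypothetical callable that you would back with whatever client you use to read the run status, and the status strings are the ones I would expect ADF to report for pipeline runs, so treat them as an assumption.

```python
import time
from typing import Callable

# Assumed terminal states of a pipeline run.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Cancelled"}


def wait_for_run(get_status: Callable[[], str],
                 poll_seconds: float = 15.0,
                 max_polls: int = 240) -> str:
    """Poll get_status() until the run reaches a terminal state, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        # Still queued or in progress; wait before asking again.
        time.sleep(poll_seconds)
    raise TimeoutError(f"Run did not finish after {max_polls} polls")
```

Passing the status lookup in as a function keeps the loop independent of any particular SDK and makes it easy to test with canned statuses.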

Step 7: Automate and Schedule the Pipeline (Optional)

If you want the pipeline to run on a regular schedule (for example, daily, weekly, or at specific times), you
can:

1. In the pipeline editor, click Add trigger.

2. Click New/Edit to create a schedule or event-based trigger.
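
For reference, a schedule trigger created this way is also just a JSON definition. Here is a small sketch of a daily one, with a hypothetical trigger name, pipeline name, and start time.

```python
import json


def daily_schedule_trigger(name, pipeline_name, start_time_utc):
    """Schedule trigger that starts the given pipeline once a day."""
    return {
        "name": name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Day",
                    "interval": 1,
                    "startTime": start_time_utc,
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    }
                }
            ],
        },
    }


# Hypothetical names and start time.
trigger = daily_schedule_trigger("DailyMigrationTrigger", "MigrateCustomers", "2025-01-01T02:00:00Z")
print(json.dumps(trigger, indent=2))
```

Changing `frequency` to `Week` or `Hour`, or raising `interval`, gives you the other recurring schedules mentioned above. Keep in mind that a new trigger does nothing until you publish and start it.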

Why Use a Self-Hosted Integration Runtime?

To clarify your follow-up question: Why isn’t the auto-resolve IR sufficient for on-prem to cloud
migrations?

1. Auto-Resolve IR: This is the default Azure-hosted IR, used for cloud-to-cloud data movement. It
picks its region automatically and runs in Azure, so it can reach only publicly accessible
endpoints and cannot connect to systems inside your on-premises network.

2. Self-Hosted IR: This is required for on-premises data sources because it enables ADF to securely
interact with on-prem SQL Server or other on-prem data stores. It acts as a bridge between the
cloud and on-premises environments. The Self-hosted IR runs on a machine within your on-prem
network, which allows it to access local data sources and securely transfer the data to Azure.

Without the Self-hosted IR, Azure Data Factory wouldn’t be able to access your on-prem resources due
to networking restrictions and security concerns (firewalls, VPNs, etc.).
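
As a rule of thumb, you can express this decision in a few lines of code. The sketch below is only a heuristic of my own, not an ADF feature: it suggests the Self-hosted IR whenever the data store sits on a private (non-routable) IP address. A server with a public IP behind a strict firewall may still need the Self-hosted IR, so use the result as a starting point.

```python
import ipaddress


def suggest_integration_runtime(host_ip: str) -> str:
    """Suggest an IR type from the data store's IP address (heuristic only)."""
    address = ipaddress.ip_address(host_ip)
    if address.is_private:
        # Private ranges such as 10.0.0.0/8 or 192.168.0.0/16 are not
        # reachable from the Azure-hosted auto-resolve IR.
        return "self-hosted"
    return "auto-resolve"
```

For example, `suggest_integration_runtime("10.20.30.40")` returns `"self-hosted"`, which matches the on-prem SQL Server case in this walkthrough.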

Summary of Key Steps:

1. Set up Azure SQL Database.

2. Create an ADF instance.

3. Install and configure the Self-hosted Integration Runtime on an on-prem server.

4. Set up linked services for both on-prem SQL Server and Azure SQL Database.

5. Create and configure a Copy Data pipeline in ADF.

6. Run and monitor the pipeline to ensure data is transferred.

7. Optionally, schedule the pipeline for automated, recurring data transfers.

This approach will help you securely move data from an on-prem SQL Server to Azure SQL Database
using Azure Data Factory.
