Data Engineering Interview QA

The document provides a comprehensive list of interview questions and answers for data engineering roles, covering topics in Python, PySpark, SQL, and AWS. It includes both intermediate and advanced questions, addressing key concepts such as memory management in Python, transformations in PySpark, and AWS services. The format includes code examples to illustrate answers effectively.

Uploaded by tejaswini6299

Top Data Engineering Interview Questions with Answers

## Python Interview Questions

### Intermediate

1. What are Python’s key features?

**Answer:** Python is interpreted, dynamically typed, object-oriented, and portable, and it has extensive libraries.

2. Explain list comprehension with an example.

```python
nums = [x for x in range(10) if x % 2 == 0]
```

3. What is the difference between `is` and `==`?

- `==` checks value equality.
- `is` checks object identity.
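A minimal sketch of the distinction above, using two lists with equal contents. The variable names are illustrative only.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the same object as a

print(a == b)   # True: the values are equal
print(a is b)   # False: two distinct objects
print(a is c)   # True: both names refer to one object

# Identity is the idiomatic test for the None singleton.
value = None
print(value is None)  # True
```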

4. What are `*args` and `**kwargs`?

```python
def example(*args, **kwargs):
    print(args, kwargs)
```
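As a hedged usage sketch, the variant below returns its arguments instead of printing them, so the collected tuple and dict are easy to inspect. The argument names and values are made up for illustration.

```python
def example(*args, **kwargs):
    # args is a tuple of the positional arguments;
    # kwargs is a dict of the keyword arguments.
    return args, kwargs

positional, keyword = example(1, 2, name="spark", retries=3)
print(positional)  # (1, 2)
print(keyword)     # {'name': 'spark', 'retries': 3}

# The same operators unpack a sequence or a mapping at the call site.
values = [10, 20]
options = {"mode": "append"}
print(example(*values, **options))  # ((10, 20), {'mode': 'append'})
```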

5. How is memory managed in Python?

- CPython frees most objects through reference counting; a cyclic garbage collector (the `gc` module) reclaims objects caught in reference cycles.
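The two mechanisms can be observed with weak references. This is a sketch that assumes CPython, where an object is freed as soon as its reference count reaches zero; other interpreters may behave differently. The `Node` class is a hypothetical helper.

```python
import gc
import weakref

class Node:
    """Tiny object that can point at another Node."""
    def __init__(self):
        self.other = None

# Reference counting: the object goes away when its last reference does.
n = Node()
probe = weakref.ref(n)          # a weak reference does not keep n alive
del n
freed_by_refcount = probe() is None

# A reference cycle keeps both counts above zero,
# so only the cyclic garbage collector can reclaim it.
gc.disable()                    # no automatic collection during the demo
a, b = Node(), Node()
a.other, b.other = b, a
cycle_probe = weakref.ref(a)
del a, b
alive_before_gc = cycle_probe() is not None
gc.collect()                    # explicit collection breaks the cycle
freed_by_gc = cycle_probe() is None
gc.enable()
```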

### Advanced

1. Explain Python's GIL.

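**Answer:** In standard CPython builds, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so threads help with I/O-bound work but do not speed up CPU-bound work.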

2. What are decorators?

```python
def decorator(func):
    def wrapper():
        print("Before")
        func()
        print("After")
    return wrapper

@decorator
def greet():
    print("Hello")
```
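A slightly more general sketch: a decorator that forwards arbitrary arguments, returns the wrapped function's result, and preserves its metadata with `functools.wraps`. The names `log_calls`, `calls`, and `add` are hypothetical.

```python
import functools

calls = []  # records the name of each decorated function that is called

def log_calls(func):
    @functools.wraps(func)              # keep func's __name__ and docstring
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)    # pass the result through
    return wrapper

@log_calls
def add(a, b):
    """Return a + b."""
    return a + b

result = add(2, 3)
```

Without `functools.wraps`, `add.__name__` would report `wrapper`, which makes logs and debugging harder to read.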

3. Difference between deep copy and shallow copy.

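- `copy.copy()` creates a shallow copy: a new container whose elements are the same objects as the original's.
- `copy.deepcopy()` recursively copies nested objects, producing a fully independent copy.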
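The difference shows up as soon as a nested object is mutated. A minimal sketch with an illustrative dictionary:

```python
import copy

original = {"tags": ["raw", "daily"], "rows": 100}

shallow = copy.copy(original)     # new dict, but the inner list is shared
deep = copy.deepcopy(original)    # new dict and a new inner list

original["tags"].append("late")

print(shallow["tags"])  # ['raw', 'daily', 'late']
print(deep["tags"])     # ['raw', 'daily']
```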

4. Python OOP concepts: Inheritance, polymorphism, encapsulation, abstraction.
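One way to show all four concepts in a few lines is sketched below. The `Source`, `CsvSource`, and `ApiSource` classes are hypothetical examples, not part of any library.

```python
from abc import ABC, abstractmethod

class Source(ABC):                       # abstraction: defines the interface
    def __init__(self, name):
        self._name = name                # encapsulation by convention

    @property
    def name(self):
        return self._name                # read-only access to the state

    @abstractmethod
    def read(self):
        """Return a list of records."""

class CsvSource(Source):                 # inheritance
    def read(self):
        return [f"{self.name}.csv row"]

class ApiSource(Source):                 # inheritance
    def read(self):
        return [f"{self.name} api record"]

# Polymorphism: the same call works on any Source subclass.
sources = (CsvSource("orders"), ApiSource("users"))
records = [src.read()[0] for src in sources]
```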

5. **Generators vs Iterators.**
```python
def gen():
    yield 1
    yield 2
```
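For comparison, here is the same sequence produced by a hand-written iterator class and by a generator function. Both names (`CountUp`, `count_up`) are illustrative.

```python
class CountUp:
    """Iterator written by hand: implements __iter__ and __next__."""
    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

def count_up(limit):
    """Generator: Python supplies __iter__ and __next__ automatically."""
    current = 0
    while current < limit:
        current += 1
        yield current

from_class = list(CountUp(3))
from_generator = list(count_up(3))
```

Every generator is an iterator, but an iterator does not have to be a generator; the generator form is usually shorter because the state lives in local variables.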

## PySpark Interview Questions

### Intermediate

1. **Transformations vs Actions?**
```python
df.filter(df.age > 30) # Transformation
df.show() # Action
```

2. Wide vs Narrow Transformations:

- Narrow: `filter`, `map` (no shuffling)
- Wide: `groupBy`, `join` (requires shuffle)
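The sketch below is a plain-Python model of the idea, not Spark code: each inner list stands for one partition, and the toy partitioner is an assumption made for the demo. It shows that a narrow operation works on each partition in isolation, while a wide operation has to move rows between partitions.

```python
from collections import defaultdict

# Each inner list stands for one partition of (key, value) rows.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
]

def narrow_filter(parts, predicate):
    # Narrow: each output partition depends on exactly one input partition.
    return [[row for row in part if predicate(row)] for part in parts]

def wide_group_sum(parts, num_partitions):
    # Wide: rows sharing a key must meet in one output partition (a shuffle).
    shuffled = [defaultdict(int) for _ in range(num_partitions)]
    for part in parts:
        for key, value in part:
            target = sum(key.encode()) % num_partitions  # toy partitioner
            shuffled[target][key] += value
    return [dict(bucket) for bucket in shuffled]

filtered = narrow_filter(partitions, lambda row: row[1] > 1)
grouped = wide_group_sum(partitions, 2)
```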

3. **Joins in PySpark:**
```python
df1.join(df2, df1.id == df2.id, 'inner')
```

4. **Using UDFs:**
```python
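from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
udf_func = udf(lambda x: x.upper() if x is not None else None, StringType())
df.withColumn("upper", udf_func(df.name))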
```

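5. **Broadcast variables and accumulators:**
```python
broadcast_var = sc.broadcast([1, 2])  # read-only value shipped once to each executor
acc = sc.accumulator(0)               # counter that tasks add to and the driver reads
```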

### Advanced

1. Catalyst Optimizer & Tungsten Engine

2. **Coalesce vs Repartition**
```python
df.coalesce(1)      # reduce the partition count without a full shuffle
df.repartition(10)  # full shuffle into 10 partitions
```
3. **Skew handling:** Salt keys, broadcast small tables
4. **Performance tuning:** Cache, partitioning, pruning
5. **Delta Lake + Structured Streaming**
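The key-salting idea from the skew-handling item can be modeled in plain Python. This is a sketch under stated assumptions, not Spark code: the partitioner, the key names, and the bucket count are all invented for the demo.

```python
import random
from collections import Counter

def partition_of(key, num_partitions):
    # Deterministic toy partitioner (stands in for a hash partitioner).
    return sum(str(key).encode()) % num_partitions

def salt(key, buckets, rng):
    # Append a random suffix so one hot key becomes several distinct keys.
    return f"{key}_{rng.randrange(buckets)}"

rng = random.Random(0)
keys = ["hot"] * 1000 + ["cold"] * 10   # heavily skewed key distribution

plain = Counter(partition_of(k, 4) for k in keys)
salted = Counter(partition_of(salt(k, 4, rng), 4) for k in keys)

print(max(plain.values()))    # the hot key lands entirely on one partition
print(max(salted.values()))   # the load is spread across partitions
```

Salting changes the keys, so an aggregation needs a second pass that strips the suffix and combines the partial results, and a join needs the other side replicated once per salt value.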

## SQL Interview Questions

### Intermediate

1. **CTEs:**
```sql
WITH temp AS (SELECT * FROM employees)
SELECT * FROM temp;
```

2. **Recursive Queries**
3. **Pivot/Unpivot**
4. **Indexes:** Speed up search queries
5. **Explain/Analyze** for query plans
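A recursive query can be tried locally with Python's built-in `sqlite3` module. The sketch below assumes a hypothetical `employees` table with a `manager_id` column and walks the reporting chain from the top; syntax for recursive CTEs varies slightly between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Asha', NULL),
        (2, 'Ben', 1),
        (3, 'Chen', 2),
        (4, 'Dee', 1);
""")

rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        -- anchor member: employees with no manager
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        -- recursive member: direct reports of rows already in the chain
        SELECT e.id, e.name, c.depth + 1
        FROM employees AS e
        JOIN chain AS c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name;
""").fetchall()
conn.close()

print(rows)  # [('Asha', 0), ('Ben', 1), ('Dee', 1), ('Chen', 2)]
```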

### Advanced

1. Second Highest Salary:

```sql
SELECT MAX(salary) FROM emp WHERE salary < (SELECT MAX(salary) FROM emp);
```
2. **OLAP vs OLTP**
3. **Temporal Queries**
```sql
SELECT id, LAG(salary) OVER (...) FROM table;
```
4. **Star vs Snowflake Schema**
5. **Materialized Views**
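The second-highest-salary query can be checked against sample data with Python's built-in `sqlite3` module. The rows below are invented, and the `LIMIT ... OFFSET` alternative is one common variant rather than the only answer; note that both queries treat tied top salaries as a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, salary INTEGER);
    INSERT INTO emp (salary) VALUES (100), (300), (300), (200);
""")

# Subquery form: the largest salary strictly below the maximum.
second = conn.execute(
    "SELECT MAX(salary) FROM emp WHERE salary < (SELECT MAX(salary) FROM emp)"
).fetchone()[0]

# Alternative: rank the distinct salaries and skip the first one.
alternative = conn.execute(
    "SELECT DISTINCT salary FROM emp ORDER BY salary DESC LIMIT 1 OFFSET 1"
).fetchone()[0]
conn.close()

print(second, alternative)  # 200 200
```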

## AWS Interview Questions

### Intermediate

1. **S3 vs EBS**
2. **IAM concepts**
3. **Glue vs EMR**
4. **Athena SQL on S3**
5. **Security Best Practices**

### Advanced

1. Data Pipeline using S3, Kinesis, Glue, Athena

2. **Redshift Tuning**
3. **AWS Lake Formation**
4. **CloudWatch vs CloudTrail**
5. **Kinesis Real-Time Example:**
```python
boto3.client('kinesis').put_record(...)
```
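A hedged sketch of how such a call might be wrapped. `StreamName`, `Data`, and `PartitionKey` are the required `put_record` parameters; the stream name, the event fields, and the `FakeKinesis` stand-in are assumptions so that the example runs without AWS credentials. In real use, `client` would be `boto3.client("kinesis")`.

```python
import json

def send_event(client, stream_name, event, key_field="user_id"):
    """Serialize one event as JSON and send it with put_record."""
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),   # payload as bytes
        PartitionKey=str(event[key_field]),       # determines the target shard
    )

class FakeKinesis:
    """Stand-in client that records calls instead of contacting AWS."""
    def __init__(self):
        self.records = []

    def put_record(self, **kwargs):
        self.records.append(kwargs)
        return {"ShardId": "shardId-000000000000",
                "SequenceNumber": str(len(self.records))}

fake = FakeKinesis()
response = send_event(fake, "clickstream", {"user_id": 42, "action": "view"})
```

Passing the client in as a parameter keeps the function testable and leaves credential and region configuration to the caller.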
