1. Are you working in a D2C product/env in your current role? If so,
what is your contribution and impact?
Ans: As a software engineer on the Enterprise Data Lake (EDL) team,
my role spans a wide range of responsibilities. I design scalable
big data ETL pipelines using Java and Apache Spark, perform data
modeling, and optimize data frameworks exposed through REST APIs. I
also manage cross-property integrations for platforms such as Venmo,
Braintree, and Hyperwallet, each with its own Enterprise Data Stack,
and I construct data pipelines that generate precise reports for
high-profile clients such as Uber, Meta, and Intuit.
2. What is your relevant experience in the role?
Ans: 3 years of experience.
3. What is your experience in building scalable (RPS?) and resilient
systems?
Ans: We are currently developing a Unified Intelligence Platform
designed to replace legacy Python scripts. The platform optimizes
database interactions through connection pooling and scales to handle
high-volume request loads, sustaining up to 3,300 jobs per second.
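The connection-pooling idea mentioned above can be sketched in plain Java. This is a minimal, hypothetical illustration (the `Connection` type is a stand-in, not a real JDBC connection, and the class name is my own): a fixed set of reusable connections is handed out from a queue instead of creating a new one per request.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal connection-pool sketch: reusable "connections" are borrowed
// from and returned to a bounded queue, so high request volumes don't
// pay the cost of opening a fresh connection each time.
public class ConnectionPool {
    public static class Connection {
        final int id;
        Connection(int id) { this.id = id; }
    }

    private final BlockingQueue<Connection> pool;

    public ConnectionPool(int size) {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) pool.add(new Connection(i));
    }

    // Borrow a connection, blocking until one is free.
    public Connection acquire() throws InterruptedException {
        return pool.take();
    }

    // Return the connection for reuse instead of closing it.
    public void release(Connection c) {
        pool.offer(c);
    }

    public int available() {
        return pool.size();
    }
}
```

A production pool would also handle connection validation and timeouts; libraries like HikariCP provide this out of the box.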
4. What programming languages are you proficient with? How
familiar are you with functional reactive programming concepts?
Ans: Java, Spark, Python, C++ (for DSA).
5. Are you an expert with modern CI/CD tools, and which cloud are you
working on?
Ans: Yes; I work on GCP (Google Cloud Platform).
6. Have you worked with distributed system challenges? If so, what
are those and how did you
solve them?
Ans: We introduced a framework that automatically notifies us
whenever jobs get stuck because of source-data issues or hit
out-of-memory errors. Previously, we had to check for these issues
manually and apply the fixes by hand.
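The core of such an alerting framework can be sketched as a heartbeat monitor. This is a hypothetical simplification (class and method names are my own, not the actual framework): each running job records a heartbeat timestamp, and any job whose heartbeat is older than a threshold is flagged for notification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a stuck-job monitor: jobs report heartbeats; jobs whose
// last heartbeat is too old are flagged so an alert can be raised
// (e.g. after a source-data stall or an out-of-memory failure).
public class StuckJobMonitor {
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    // Record that a job was alive at the given time (epoch millis).
    public void heartbeat(String jobId, long epochMillis) {
        lastHeartbeat.put(jobId, epochMillis);
    }

    // Return the ids of jobs whose last heartbeat is older than maxAgeMillis.
    public List<String> stuckJobs(long nowMillis, long maxAgeMillis) {
        List<String> stuck = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMillis - e.getValue() > maxAgeMillis) {
                stuck.add(e.getKey());
            }
        }
        return stuck;
    }
}
```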
7. Have you worked with asynchronous message queue systems?
If so, what are those?
Ans: Yes, I have worked with asynchronous message queue systems,
specifically using the publish-subscribe (pub/sub) model. I built daemon
jobs that notify the system when data is ready to be read and porter jobs
that process the data. This setup generates near real-time reports for
Shopify merchants.
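The daemon/porter split described above can be illustrated with a toy in-process pub/sub model. This is an assumption-laden sketch (names are mine, and a real deployment would use a managed broker such as Google Cloud Pub/Sub rather than a local queue): the "daemon" publishes a readiness message when data lands, and the "porter" consumes the message and processes the batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy pub/sub illustration of the daemon/porter pattern: a daemon
// publishes "data ready" messages to a topic, and a porter drains
// the topic and processes each ready dataset into a report.
public class DaemonPorterDemo {
    private final BlockingQueue<String> topic = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();

    // Daemon side: announce that a dataset is ready to be read.
    public void publishReady(String datasetId) {
        topic.add(datasetId);
    }

    // Porter side: drain pending messages and "process" each dataset.
    public List<String> drainAndProcess() {
        String datasetId;
        while ((datasetId = topic.poll()) != null) {
            processed.add("report:" + datasetId);
        }
        return processed;
    }
}
```

Decoupling the publisher from the consumer this way is what enables near real-time reporting: porters scale independently of the daemons that detect data arrival.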
8. Have you worked with modern service-to-service communication
protocols? If so, what are those?
Ans: I work with Pub/Sub (publish-subscribe) messaging, and my data
frameworks also expose REST APIs.
9. What modern databases / persistence solutions / caching
components have you worked with? What are their advantages and what
are the right use cases for them?
Ans: I work with Oracle and BigQuery (BQ).
We use a connection pooling mechanism to optimize database
interactions. While I haven't directly implemented caching, I have worked
with 'Change Tables' that store differential data for 30-minute batches.
This approach reduces the need for costly lookup operations on the
main tables.
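The change-table approach can be sketched as a merge of differential rows onto a snapshot. This is a hypothetical simplification (table contents are modeled as key-to-value maps, and the class name is my own): downstream jobs read only the rows that changed in the last batch window and overlay them on their current view, avoiding a full scan of the main table.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the "change table" idea: rather than re-reading the main
// table, apply only the rows that changed in the last batch window
// (e.g. 30 minutes) on top of the existing snapshot.
public class ChangeTableMerge {
    // Overlay a batch of differential rows onto the current snapshot;
    // changed or new keys overwrite stale values, untouched keys remain.
    public static Map<String, String> applyChanges(
            Map<String, String> snapshot, Map<String, String> changes) {
        Map<String, String> merged = new HashMap<>(snapshot);
        merged.putAll(changes);
        return merged;
    }
}
```

The cost of each refresh is then proportional to the size of the delta rather than the size of the main table, which is what makes the lookup savings significant.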