Welcome! 🎉
We're excited to see how you approach this Data Engineering Coding Challenge. In our production environment, we use BigQuery, but for this challenge, you'll work locally with PostgreSQL to ensure a fully self-contained setup.
This document will guide you through setting up your environment so you can focus on solving the challenge. Let's get started! 🚀
Before diving in, make sure you have the following installed:
1️⃣ **Docker Compose**
You'll need Docker Compose to run PostgreSQL locally.
Follow the official installation guide: Docker Compose Installation
2️⃣ **Python Virtual Environment**
We use Pipenv to manage dependencies.
- Install Pipenv:
pip install --user pipenv
- Install dependencies:
pipenv sync
- Activate the virtual environment:
pipenv shell
Run the following command to spin up a local PostgreSQL instance:
docker compose up
We’ve set up Adminer so you can explore the database with a user-friendly UI.
- Open Adminer in your browser:
http://localhost:8080/?pgsql=db&username=coding_challenge&db=coding_challenge&ns=dbt_dev
- Log in with username coding_challenge and password example.
- Switch to the coding_challenge database and the dbt_dev schema.
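The Adminer link above encodes the connection details as query parameters. The following minimal Python sketch rebuilds that link, assuming that the `pgsql` parameter carries the server host (here `db`, presumably the Compose service name) and that `ns` selects the schema. The helper name `adminer_url` is hypothetical and used only for illustration.

```python
from urllib.parse import urlencode


def adminer_url(server="db", username="coding_challenge",
                database="coding_challenge", schema="dbt_dev",
                base="http://localhost:8080/"):
    """Build an Adminer deep link from the connection details in this guide."""
    query = urlencode({
        "pgsql": server,       # assumption: driver name as key, server host as value
        "username": username,
        "db": database,
        "ns": schema,          # assumption: "ns" selects the schema (namespace)
    })
    return f"{base}?{query}"


print(adminer_url())
```

Changing the `schema` argument should produce a link that opens a different schema directly, which can be handy if your models write outside dbt_dev.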
✅ **1. Check if PostgreSQL is Running**
Run:
docker ps
If PostgreSQL is running, you should see a container listed.
If it's not running, start it:
docker compose up
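If you would rather script this check, the sketch below parses the output of `docker ps` and filters for PostgreSQL containers. It assumes the container's image name contains "postgres", which this guide does not confirm, so adjust the filter to match your docker-compose.yml. The function names are hypothetical.

```python
import subprocess

# Go-template format understood by `docker ps --format`: one tab-separated row per container.
PS_FORMAT = "{{.ID}}\t{{.Image}}\t{{.Status}}"


def parse_docker_ps(output):
    """Turn tab-separated `docker ps --format` output into a list of dicts."""
    containers = []
    for line in output.splitlines():
        if not line.strip():
            continue
        container_id, image, status = line.split("\t", 2)
        containers.append({"id": container_id, "image": image, "status": status})
    return containers


def running_postgres_containers():
    """Return running containers whose image name mentions 'postgres' (an assumption)."""
    result = subprocess.run(
        ["docker", "ps", "--format", PS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return [c for c in parse_docker_ps(result.stdout)
            if "postgres" in c["image"].lower()]
```

Calling `running_postgres_containers()` with Docker running should return an empty list when the database is down. Otherwise, each entry's `id` is the value to use wherever this guide asks for a container ID.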
✅ **2. Connect to the Database**
To manually test the database connection, run:
docker exec -it <container_id> psql -U coding_challenge -d coding_challenge
Replace <container_id> with the actual container ID from docker ps.
Alternatively, open Adminer and check if the coding_challenge database is accessible.
✅ **3. Run dbt Debug**
Verify that dbt is installed and correctly configured:
dbt debug
If everything is set up correctly, you should see a success message.
❌ PostgreSQL Container Won't Start
Error:
ERROR: database system is in recovery mode
✅ Solution:
Try restarting the container:
docker compose down
docker compose up --force-recreate
❌ Cannot Connect to Database
Error:
psql: could not connect to server: Connection refused
✅ Solution:
Ensure the database container is running:
docker ps
If it's not listed, start it:
docker compose up -d
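A "Connection refused" error usually means nothing is listening on the target port yet. The sketch below is a quick way to tell a stopped container apart from a credentials problem. It assumes the Compose file publishes PostgreSQL on localhost:5432, which this guide does not state, so check the ports entry in docker-compose.yml first. The helper name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (assumed host and port, adjust to your docker-compose.yml):
# port_is_open("localhost", 5432)
```

If the port is open but psql or dbt still fails, the cause is more likely the credentials or database name than the container.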
❌ dbt Debug Fails
Error:
Database connection failed
✅ Solution:
Check your profiles.yml file for incorrect credentials. Ensure that dbt is using the correct database, schema, and user.
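One way to make this check concrete is to compare the active dbt target against the values used elsewhere in this guide. The sketch below works on an already-parsed target (a plain dict) and assumes the standard dbt-postgres keys `type`, `user`, `dbname`, and `schema`. Host, port, and the profile name are not given in this document, so they are left unchecked. `EXPECTED` and `profile_mismatches` are hypothetical names.

```python
# Values taken from this guide; host and port are intentionally not checked.
EXPECTED = {
    "type": "postgres",
    "user": "coding_challenge",
    "dbname": "coding_challenge",
    "schema": "dbt_dev",
}


def profile_mismatches(target, expected=EXPECTED):
    """Return human-readable differences between a dbt target and the expected values."""
    problems = []
    for key, want in expected.items():
        got = target.get(key)
        # Assumption: some profiles spell the database key "database" instead of "dbname".
        if key == "dbname" and got is None:
            got = target.get("database")
        if got != want:
            problems.append(f"{key}: expected {want!r}, found {got!r}")
    return problems


target = {"type": "postgres", "user": "coding_challenge",
          "dbname": "coding_challenge", "schema": "public"}
print(profile_mismatches(target))
```

An empty list means the checked keys agree with this guide. Any entry names the key to fix in profiles.yml.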
❌ Pipenv or venv Activation Issues
Error:
Command 'pipenv' not found
✅ Solution:
Ensure Pipenv is installed (pip install pipenv) and try reactivating:
pipenv shell
For venv, re-run:
source env/bin/activate # macOS/Linux
env\Scripts\activate # Windows
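When Pipenv was installed with `pip install --user`, the pipenv script can end up in a directory that is not on your PATH, so the shell reports it as missing even though the package is present. The sketch below looks for the executable and otherwise falls back to running Pipenv as a module through the current interpreter. The function name `pipenv_invocation` is hypothetical.

```python
import shutil
import sys


def pipenv_invocation():
    """Return a command prefix for Pipenv: the executable on PATH, else `python -m pipenv`."""
    path = shutil.which("pipenv")
    if path:
        return [path]
    # Fallback: run the installed package as a module with the current interpreter.
    return [sys.executable, "-m", "pipenv"]


print(pipenv_invocation() + ["shell"])
```

If the fallback form works while the bare pipenv command does not, adding your user-level scripts directory to PATH is likely the lasting fix.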
1️⃣ **Ensure Your Code is Clean and Documented**
- Format your code properly.
- Add comments where necessary to explain key logic.
- If applicable, include a README.md inside your submission with additional details on your approach.
2️⃣ **Package Your Work**
- If submitting via ZIP file:
  - Remove unnecessary files (e.g., __pycache__, .venv).
  - Create a zip archive of your project:
zip -r data-eng-challenge.zip .
  - Send the zip file by email.
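As an alternative to deleting files by hand before zipping, the sketch below builds the archive with Python's standard zipfile module and skips the unwanted directories while walking the project. The exclusion set covers only the two examples named above. Extend it if your project contains other generated folders. The function name `make_archive` is hypothetical.

```python
import os
import zipfile

# Directories to leave out of the archive (the two examples named in this guide).
EXCLUDED_DIRS = {"__pycache__", ".venv"}


def make_archive(project_dir, archive_path):
    """Zip project_dir into archive_path, skipping excluded directories."""
    archive_abs = os.path.abspath(archive_path)
    with zipfile.ZipFile(archive_abs, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(project_dir):
            # Prune in place so os.walk never descends into excluded directories.
            dirs[:] = [d for d in dirs if d not in EXCLUDED_DIRS]
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == archive_abs:
                    continue  # do not add the archive to itself
                zf.write(full, os.path.relpath(full, project_dir))
    return archive_abs


# Example call from the project root:
# make_archive(".", "data-eng-challenge.zip")
```

This leaves your working directory untouched, so the virtual environment stays usable after you package the submission.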
3️⃣ **Provide Additional Context (Optional but Recommended)**
If you made specific design choices, encountered challenges, or have insights into possible optimizations, include them in a NOTES.md file or a comment in your submission email/message.
Once you’ve submitted your work, let us know, and we’ll review it! 🎯🚀
Good luck, and happy coding! 😊