ETL pipeline that connects the data warehouse to Google Sheets for Student Conversion tracking. The pipeline has one job:
- Conversion Tracker Refresh - Pulls the latest enrollment data snapshot and refreshes the "Waitlist + Registration Tracker" tab in the Student Conversion Tracker tool for all of our schools.
https://github.com/kippnorcal/Student-Conversion-Tracker.git
# Database connection
DB_SERVER=
DB=
DB_USER=
DB_PWD=
DB_SCHEMA=
# Google Developer Service Account
ACCOUNT_EMAIL=
# Mailgun variables
MG_API_KEY=
MG_API_URL=
MG_DOMAIN=
# Email Notification Variables
SENDER_EMAIL=
RECIPIENT_EMAIL=
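A minimal sketch of checking that the variables above are set before a run. The variable names come from the template above; the helper name missing_env_vars is hypothetical and is not part of this repository.

```python
import os

# Variable names copied from the environment template above.
REQUIRED_VARS = [
    "DB_SERVER", "DB", "DB_USER", "DB_PWD", "DB_SCHEMA",
    "ACCOUNT_EMAIL",
    "MG_API_KEY", "MG_API_URL", "MG_DOMAIN",
    "SENDER_EMAIL", "RECIPIENT_EMAIL",
]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Running a check like this first surfaces a misconfigured environment before any database or Google Sheets connection is attempted.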
A list of dictionaries that represent each school. List needs to be named SCHOOL_META_DATA. Each dictionary needs to contain the following key, value pairs:
school_name = String: School name from data warehouse for use in much of the Python code
school_id = Int: School ID of record
short_name = String: Used for runtime arguments. Do not use spaces in names.
sheets_key = String: Google Sheets key for school's attendance tracking worksheet
status = String: Used at the beginning of the school year as schools come online. A school is filtered out unless the value is 'ACTIVE'.
The sheets_key values are the same keys used for the attendance_response_letters job.
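As an illustration of the structure described above, here is a sketch of a SCHOOL_META_DATA list with two entries. All school names, IDs, and keys are placeholders, and the active_schools helper is hypothetical; it only shows the kind of 'ACTIVE' filtering described for the status field, not the pipeline's actual code.

```python
# Placeholder values only; replace with real data warehouse names, IDs, and sheet keys.
SCHOOL_META_DATA = [
    {
        "school_name": "Example Elementary School",
        "school_id": 101,
        "short_name": "example_es",  # no spaces, since it is used as a runtime argument
        "sheets_key": "<google-sheets-key>",
        "status": "ACTIVE",
    },
    {
        "school_name": "Sample Middle School",
        "school_id": 102,
        "short_name": "sample_ms",
        "sheets_key": "<google-sheets-key>",
        "status": "INACTIVE",  # any value other than 'ACTIVE' is filtered out
    },
]


def active_schools(schools):
    """Hypothetical helper: keep only schools whose status is exactly 'ACTIVE'."""
    return [school for school in schools if school.get("status") == "ACTIVE"]


if __name__ == "__main__":
    for school in active_schools(SCHOOL_META_DATA):
        print(school["short_name"])
```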
Use the same credentials as the attendance_response_letters job.
docker build -t student_conversion_tracker .
If running locally on a Mac with an M1 chip, add --platform linux/amd64:
docker build -t student_conversion_tracker . --platform linux/amd64
For all jobs and all schools:
docker run --rm -t student_conversion_tracker