
SAITEJA SIRIKONDA

Senior Data Scientist


saitejasirikonda5@gmail.com|972-998-5268

PROFESSIONAL SUMMARY:
 Over 10 years of experience in developing and deploying advanced machine learning models and data-driven
solutions across industries including Retail, Banking, Healthcare, and Education.
 Experienced in software engineering and its intricacies: unit testing, CI/CD pipelines, integration testing, code reviews, on-call shifts, customer interaction, back-end API development (REST and SOAP), privacy and security concerns, scrum master duties, and bug fixes.
 Domain knowledge and experience in Retail, Banking, Healthcare & Education industries.
 Expertise in transforming business resources and requirements into manageable data formats and analytical
models, designing algorithms, building models, developing data mining and reporting solutions that scale across
a massive volume of structured and unstructured data.
 Proficient in managing the entire data science project life cycle, actively involved in all phases including data acquisition, data cleaning, data engineering, feature scaling, feature engineering, statistical modeling, testing and validation, and data visualization.
 Familiar with MLOps practices that automate and simplify ML workflows using AWS SageMaker to facilitate experiments, model retraining, and fast deployments to all environments; also used GitLab runners and actions for the same purpose.
 Familiar with various cloud environments such as AWS (EC2, S3, Lambda, Kinesis, SageMaker, QuickSight, DynamoDB, etc.) and GCP (BigQuery, Cloud Storage, Dataprep, Cloud Composer (managed Apache Airflow), Data Studio).
 Extensive expertise in designing end-to-end machine learning pipelines and implementing MLOps frameworks for
scalable and automated workflows.
 Hands-on experience with Natural Language Processing (NLP), Generative AI, and Large Language Models (LLMs),
including transformer architectures and fine-tuning techniques.
 Proficient in Python and related ML/DL libraries such as PyTorch, TensorFlow, and Hugging Face.
 Skilled in building Retrieval-Augmented Generation (RAG) systems and integrating semantic search solutions.
 Familiar with deploying models on cloud-native environments using Docker, Kubernetes, and CI/CD practices.
 Experience with advanced AI workflows using LangChain, vector databases (Pinecone), and orchestrators.
 Demonstrated knowledge in cloud platforms, including AWS (SageMaker, Kendra, Bedrock, Redshift, Glue) and
GCP (Vertex AI, BigQuery).
 Spent significant time learning and applying Generative AI and LLM technologies, including OpenAI, DALL-E,
Claude, LangChain, and LLM optimization techniques.
 Tested various newly built and existing models for performance (ROC AUC, F1 score, MSE, R²), hyperparameters (using GridSearchCV), robustness, bias and fairness, outliers, cross-browser and platform compatibility (using Selenium), stress, and scalability; used AWS SageMaker and Python libraries such as LIME, Locust, and AIF360 to perform and automate these tasks (a minimal tuning and evaluation sketch follows this summary).
 Proficient in Python 2.x/3.x with the SciPy/ML/DL stack, including NumPy, Pandas, SciPy, Matplotlib, TensorFlow, Keras, NetworkX, Seaborn, PyTorch, and NLTK.
 Strong experience in provisioning virtual clusters on AWS using services such as EC2, S3, ECR, API Gateway, Lambda, and EKS; also worked with Azure SQL and Cosmos DB.
 Proficient in building data pipelines and automations for both batch and streaming data, performing complex transformations with PySpark and AWS tools such as S3, Glue, Fargate, Lambda, Kinesis, Kinesis Firehose, Aurora, and Redshift, and providing the necessary dashboards using Tableau and QuickSight.
 Proficient in data visualization tools such as Tableau, Python Matplotlib, and R Shiny for creating visually powerful and actionable interactive reports and dashboards, and in OLAP tools such as AWS Redshift and Databricks.
 Experienced Tableau developer with expertise in building and publishing customized interactive reports and dashboards with custom parameters and user filters using Tableau 9.x/10.x.
 Strong business sense and the ability to communicate data insights to both technical and non-technical clients.
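
A minimal, hypothetical illustration of the hyperparameter tuning and evaluation workflow described above, using a synthetic dataset and a generic scikit-learn classifier (the estimator, grid, and data are placeholders, not production code):

# Hypothetical sketch: GridSearchCV tuning with ROC AUC scoring on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import roc_auc_score, f1_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

param_grid = {"n_estimators": [100, 300], "max_depth": [5, 10, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid,
                      scoring="roc_auc", cv=5, n_jobs=-1)
search.fit(X_train, y_train)

proba = search.predict_proba(X_test)[:, 1]          # probabilities from the best model found
print("best params:", search.best_params_)
print("ROC AUC:", roc_auc_score(y_test, proba))
print("F1:", f1_score(y_test, (proba > 0.5).astype(int)))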

EDUCATION:
 Bachelor's in Computer Science, Indian Institute of Technology Mandi, 2010–2014
 Master's in Computer Science, Arizona State University, 2016–2018

TECHNICAL SKILLS:

 Languages/Scripts: Python (TensorFlow, PyTorch, NLTK, NetworkX, Pandas, Flask, Django, SpaCy), Java, C#, Kotlin,
Scala
 Databases: MySQL, PostgreSQL, Oracle, AWS RDS, DynamoDB, MongoDB
 Cloud Platforms: AWS (S3, EC2, SageMaker, Lambda, Redshift, Glue, API Gateway), GCP (BigQuery, Cloud Storage,
Vertex AI)
 Tools: Docker, Kubernetes, Jenkins, Apache Kafka, Apache Airflow, Tableau, QuickSight
 MLOps Tools: Kubeflow, MLflow, Vertex Pipelines, AWS SageMaker Pipelines

PROFESSIONAL EXPERIENCE:

Southwest Airlines Oct 2023 – Present


Senior Data Scientist

Responsibilities:
 Spearheaded the development of SWA-GPT, a cutting-edge Generative AI solution aimed at transforming the
customer support process by aggregating flight data from multiple sources using AWS Kendra and Bedrock.
Implemented advanced RAG techniques to enhance information retrieval and response accuracy.
 Designed and deployed scalable machine learning models for customer engagement, incorporating LLM fine-
tuning, prompt engineering, and semantic search using embeddings.
 Led MLOps initiatives, ensuring seamless deployment of models into production environments using Kubeflow,
Vertex AI, and CI/CD pipelines.
 Built a RAG-based knowledge retrieval system using LangChain and vector databases like Pinecone, enabling real-time query resolution with enhanced accuracy (a minimal sketch of this pattern follows this role's environment list).
 Collaborated with engineering teams to consolidate various data sources, perform exploratory data analysis
(EDA), and implement data transformations while adhering to data privacy regulations.
 Built and maintained a Model Accuracy and Metric Deviation Framework to monitor performance, data drift, and
retraining needs, integrating alerts and dashboards for real-time monitoring.
 Automated ETL pipelines using AWS Glue, Kinesis, Lambda, and Step Functions to process large volumes of
streaming flight data, enabling downstream ML applications.
 Designed and deployed a real-time dynamic pricing system for ticket sales, leveraging multiple data sources such
as booking trends, seasonality, and competitor pricing. Built scalable machine learning models using Python,
AWS SageMaker, and SQL, optimizing revenue and customer acquisition. Integrated the system with internal
booking platforms via AWS Lambda and API Gateway, improving Revenue Per Available Seat Mile (RASM) by 15%
and increasing Load Factor through precise price adjustments.
 Developed a seasonality and demand forecasting module using LSTM-based models in AWS SageMaker, processing real-time data streams via Apache Kafka and AWS Glue. Integrated predictions into a dynamic pricing framework, leading to a 20% increase in seasonal profitability and reducing overbooking and underbooking incidents.
 Spent time learning advanced concepts in Generative AI, LLM optimization techniques, and vector databases to
enhance solution efficiency and scalability.
Environment Tools – Python (Beautiful Soup, NLTK, SpaCy, Scikit-learn, Flask, PyTorch, PyTorch Lightning, Pandas, TensorFlow, NumPy, Matplotlib, Seaborn, Plotly, PySpark, boto3), SQL, Apache Spark (MLlib, MLflow), AWS tools (S3, EC2, DynamoDB, SQS, Lambda, EKS, API Gateway, DocumentDB, CloudWatch, Kibana, SageMaker, Redshift, QuickSight, EMR, Kinesis, Step Functions, SNS, Aurora)
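
A minimal, hypothetical sketch of the RAG pattern referenced above, assuming the pre-1.0 LangChain API, the v2 pinecone-client, OpenAI embeddings, and a pre-populated index; the index name, environment, and query are placeholders, not details of SWA-GPT:

# Hypothetical RAG query path with LangChain + Pinecone (placeholder names throughout).
import os
import pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

pinecone.init(api_key=os.environ["PINECONE_API_KEY"], environment="us-east-1-aws")

embeddings = OpenAIEmbeddings()                                        # embeds the user query
vectorstore = Pinecone.from_existing_index("flight-kb", embeddings)    # existing vector index

qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),        # top-4 relevant chunks
)

print(qa.run("What is the checked baggage policy for international flights?"))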

Amazon Fulfillment Technologies, Amazon Jun 2022 – Jul 2023


Data Scientist

Responsibilities:
 Part of a high-functioning team that worked on HR solutions (coaching, performance feedback & scheduling) for Amazon warehouses serving millions of customers across the world.
 Coordinated with the engineering team to design, validate, and evaluate model deployment methodologies.
 Experience in building, editing, testing, and deploying large-scale machine learning models using AWS SageMaker, Lambda, API Gateway, and CloudWatch.
 Part of the team that performed MLOps: built and tested existing models for performance, provided support for model deployments, tuned hyperparameters, conducted bias and fairness testing, watched for outliers, monitored scalability via metrics on AWS CloudWatch, and evaluated constraints for model retraining by gathering feedback on model performance and communicating findings to data scientists.
 Other responsibilities included communicating with stakeholders, serving as scrum master, resolving customer tickets, back-end and front-end software development, unit and integration testing, code deployment, and bug fixes.
 Designed and implemented the backend of a tool that creates and stores feedback based on employees' productivity and quantity metrics, used in warehouses across the world.
 Automated and scheduled computations that determine feedback eligibility among warehouse employees, saving 3 hours of manual work per month across all warehouses globally.
 Implemented MLOps pipelines using Vertex AI, Kubeflow, and CI/CD tools, ensuring robust model deployment,
monitoring, and retraining processes.
 Developed two key MLOps projects:
o Predictive Demand Forecasting: Built and deployed a machine learning model to forecast product
demand across warehouses, reducing stockouts and excess inventory. Integrated with AWS Lambda, API
Gateway, and CloudWatch for serverless operations.
o Dynamic Routing Optimization: Created an AI-powered solution for real-time route optimization,
leveraging streaming data and integrating with Kinesis and Fargate.
 Developed and deployed a Kafka-based data streaming platform to enable real-time data ingestion and processing for ML models, improving decision-making speed and accuracy (a minimal ingestion sketch follows this role's environment list).
 Led a Data Engineering Project to create a centralized data lake using AWS Glue, S3, and Redshift Spectrum to
unify warehouse data from disparate sources, enabling advanced analytics and machine learning.
 Collaborated with cross-functional teams to gather requirements, design scalable solutions, and deliver high-
impact insights through data visualization tools like Tableau.
 Enhanced model robustness through hyperparameter tuning, bias mitigation, and fairness testing, ensuring
compliance with Amazon’s internal guidelines.
 Developed end-to-end machine learning models, ETLs, Dashboards and scaled them to run in serverless
production environments using AWS cloud services.
 Environment Tools – Python (Beautiful Soup, NLTK, SpaCy, Scikit-learn, Flask, PyTorch, Pandas, NumPy, Matplotlib, Seaborn, Plotly), SQL, Java, React, Kotlin, Apache Kafka, AWS tools (S3, EC2, Elasticsearch, DynamoDB, SQS, Lambda, CloudWatch, Kibana, SageMaker, Redshift, QuickSight, Amazon EMR, Kinesis)
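
A minimal, hypothetical kafka-python sketch of the streaming-ingestion pattern referenced above; the broker address, topic name, and event schema are placeholders:

# Hypothetical sketch: produce warehouse events and consume them for downstream scoring.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),   # JSON-encode each event
)
producer.send("warehouse-events", {"associate_id": 123, "units_per_hour": 41.5})
producer.flush()

consumer = KafkaConsumer(
    "warehouse-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for message in consumer:
    event = message.value
    print("scoring event:", event)   # placeholder for a model-scoring call
    break                            # stop after one message in this sketch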

ASU Decision Theatre for Educational Excellence, Tempe, AZ Mar 2020 – Apr 2022
Sr. Data Scientist

Responsibilities:
 Led a team that performed MLOps and built multiple visualizations for high schools to visualize and assess their performance, providing feedback on how their course structure translates to college enrollment.
 Built dashboards to analyze high school student transcripts and performance data before and during the COVID-19 pandemic to assess the impact of pandemic-related disruptions on university enrollment.
 Used NLP techniques to develop ML models that predicted the likelihood of students succeeding in a remote learning environment based on their past performance, course selection, and engagement data, so that additional support could be provided.
 Migrated all of the cleaned, transformed, and encoded 1000+ school course tables from Oracle to AWS Redshift.
 Built an ensemble of ensemble ML models to label the school course data and to evaluate and automate students' eligibility criteria for the University Admissions office (a minimal ensemble sketch follows this role's environment list).
 Engineered ETL pipelines to analyze data from multiple sources using AWS Glue, enumerating the school
resources, factors influencing the graduation rates, population demographics, community factors and funding in
Arizona.
 Built dashboards that visualize shifts in course selections, extracurricular activities, and student-led initiatives to evaluate the effects of the BLM movement on schools and colleges.
 Developed various ML models to identify students who may require additional support for successful remote
learning, considering factors such as access to technology and home environments.
 Created a dashboard that visualizes changes in student performance, attendance, and grades over time,
highlighting specific courses or demographics that were most affected.
 Environment Tools – Python (cx_Oracle, Beautiful Soup, NLTK, SpaCy, Scikit-learn, Flask, PyTorch, Pandas, NumPy, Matplotlib, Seaborn, Plotly), Apache Kafka, AWS (S3, Lambda, Redshift, QuickSight, DynamoDB, SageMaker, Glue)
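
A minimal, hypothetical scikit-learn ensemble sketch of the labeling approach referenced above; the features and labels are synthetic stand-ins, not actual course or eligibility data:

# Hypothetical sketch: a soft-voting ensemble evaluated with cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1500, n_features=12, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",   # average predicted probabilities across the base models
)

print("cross-validated F1:", cross_val_score(ensemble, X, y, cv=5, scoring="f1").mean())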

AGORDIA, AZ Jun 2018 – Mar 2020


Data Scientist

Responsibilities:
 Sole data scientist on a team of energetic, fast-paced developers aiming to revolutionize the US healthcare space.
 Designed multiple pipelines to filter, clean, and process large datasets (~200 GB), producing valuable insights, metrics, data aggregations, and products for the company.
 Designed, implemented, and deployed the Agordia Store Score, a metric used to rank ~1 million pages on the Agordia website.
 Directed the building, storage, optimization, analysis, and visualization of Agordia's proprietary US Healthcare Professional social interaction graph, with ~800,000 nodes and ~40 million edges, yielding novel insights and growth strategies (a minimal graph-analysis sketch follows this role's environment list).
 Built custom NLP models (sentiment analysis, event extraction, keyword extraction, synonym nets) to analyze and leverage the medical channels in the app; these models increased task throughput 10-fold compared to manual labor.
 Designed, implemented, and deployed a novel Identity Authentication API to protect ~2 million physicians and Agordia from HIPAA violations.
 Provided support for multiple growth-strategy pipelines and campaigns from a social network analysis point of view, e.g., using the social graph to determine to whom, or to which specialty, a given physician is most strongly connected.
 Handled CI/CD pipelines for rapid releases using Jenkins, Apache2, Docker images, AWS EC2, and Selenium test suites.
 Environment Tools – Python (NetworkX, SpaCy, TensorFlow, Flask, PyTorch, NLTK, Stanford NLP, Matplotlib, Seaborn, Pandas, NumPy, Beautiful Soup, Selenium, scikit-learn, PyMongo), Neo4j, Gephi, VS Code, Tableau, AWS EC2, S3 & SageMaker, Apache Kafka, MongoDB, Jupyter Notebook, Confluence, Docker, CI/CD with Jenkins & GitKraken
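
A minimal, hypothetical NetworkX sketch of the connectivity analysis referenced above; the toy graph and node attributes are made up and unrelated to the proprietary Agordia data:

# Hypothetical sketch: rank the specialties a given physician is most strongly connected to.
from collections import defaultdict
import networkx as nx

G = nx.Graph()
G.add_node("dr_a", specialty="cardiology")
G.add_node("dr_b", specialty="oncology")
G.add_node("dr_c", specialty="cardiology")
G.add_node("dr_d", specialty="pediatrics")
G.add_weighted_edges_from([("dr_a", "dr_b", 3.0), ("dr_a", "dr_c", 5.0), ("dr_a", "dr_d", 1.0)])

def specialty_strength(graph, physician):
    # Sum edge weights from `physician` to its neighbors, grouped by neighbor specialty.
    strength = defaultdict(float)
    for neighbor in graph.neighbors(physician):
        strength[graph.nodes[neighbor]["specialty"]] += graph[physician][neighbor]["weight"]
    return sorted(strength.items(), key=lambda kv: kv[1], reverse=True)

print(specialty_strength(G, "dr_a"))   # strongest specialty connections for one physician
print(nx.degree_centrality(G))         # simple centrality as a growth-strategy signal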

Genzeon May 2016 – Apr 2018


Data Scientist Co-Op

Responsibilities:
 Participated in all phases of research including data collection, data cleaning, data mining, developing models
and visualizations.
 Collaborated with data engineers and operation team to collect data from internal system to fit the analytical
requirements.
 Redefined many attributes and relationships and cleansed unwanted tables/columns using SQL queries.
 Utilized the Spark SQL API in PySpark to extract and load data and perform SQL queries; also used a C# connector to run SQL queries by creating and connecting to a SQL engine (a minimal PySpark sketch follows this role's responsibilities).
 Performed data imputation using Scikit-learn package in Python.
 Performed data processing using Python libraries like NumPy and Pandas.
 Performed data analysis using the ggplot2 library in R to build visualizations for a better understanding of customer behaviors.
 Implemented statistical modeling with the XGBoost machine learning package in R to determine each model's predicted probabilities.
 Delivered the results to the operations team to support better decisions.
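
A minimal, hypothetical PySpark sketch of the Spark SQL usage referenced above; the file path, columns, and query are placeholders, not internal data:

# Hypothetical sketch: load a CSV, register a temp view, and query it with Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("example-etl").getOrCreate()

claims = spark.read.csv("data/claims.csv", header=True, inferSchema=True)
claims.createOrReplaceTempView("claims")

summary = spark.sql("""
    SELECT provider_id, COUNT(*) AS claim_count, AVG(amount) AS avg_amount
    FROM claims
    GROUP BY provider_id
    ORDER BY claim_count DESC
""")
summary.show(10)

spark.stop()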

Indian Institute of Technology, Hyderabad, India Jun 2014 – Apr 2016


Research Scientist

Responsibilities:
 Developed and migrated the entire Twitter API integration from Java to Python using the Tweepy library to facilitate the creation of NLP models (a minimal Tweepy sketch follows this role's environment list).
 Provided support for OAuth and various query types – timeframe, user ID, geolocation, and keyword – and stored the results in Oracle.
 Implemented various text mining algorithms, sentiment analysis, and NLP classification algorithms using Python.
 Worked on opioid abuse detection through Twitter, an effort to curb opioid abuse (model accuracy – 84.8%).
 Developed and deployed APIs for ensemble models (SVM, Naive Bayes, and LSTM with word2vec) to predict opioid abuse.
 Environment Tools – Python (Tweepy, cx_Oracle, scikit-learn, NLTK, Pandas, re, string, NumPy, json, TensorFlow), GitKraken
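
A minimal, hypothetical Tweepy sketch of the Twitter data collection referenced above, assuming Tweepy v4 and the v1.1 search endpoint; the credentials and keyword query are placeholders, not the original research setup:

# Hypothetical sketch: keyword search over recent tweets with Tweepy.
import os
import tweepy

auth = tweepy.OAuth1UserHandler(
    os.environ["CONSUMER_KEY"], os.environ["CONSUMER_SECRET"],
    os.environ["ACCESS_TOKEN"], os.environ["ACCESS_TOKEN_SECRET"],
)
api = tweepy.API(auth, wait_on_rate_limit=True)

# Pull a small batch of recent English tweets matching a keyword query.
for tweet in tweepy.Cursor(api.search_tweets, q="painkiller", lang="en",
                           tweet_mode="extended").items(50):
    print(tweet.id, tweet.full_text[:80])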

DRDO, Bangalore, India Dec 2012 – May 2013


Research Scientist Co-Op
Responsibilities:
 Built a Prescription database, which is still in use to handle the medical records and transactions of the company.
 Technology used – Java, SQL, Tomcat, PHP, HTML, CSS, Microsoft SQL Server
