Vijay Yeruva

Neha@sptecinc.com
972-752-2039

SUMMARY
 8 years as a Data Engineer with expertise in Cloud, Visualizations, KPIs, and ETL/ELT Data Pipelines, with Statistical Modelling, Data Mining, and Data Warehousing methodologies.
 Gained progressive working experience in the Food & Beverage, Asset Management, Retail, Manufacturing, Hospitality, and Supply Chain industries.
 Extensive experience in implementing ETL, Tableau dashboards, and Microsoft BI/Azure BI solutions such as Azure Data Factory, Power BI, Azure Databricks, Azure Analysis Services, and SQL Server Reporting Services.
 Strong data analysis experience and proficiency in languages and tools – Power BI, R, SQL, Python, PySpark, Scala, Hive, Shell, Microsoft Excel, and Teradata SQL.
 Built business cases for changes to current practices, programs, or procedures; conducted current-state assessments, opportunity assessments, cost/benefit analyses, feasibility assessments, etc.
 Physical and logical data modeling using Erwin, and data warehousing concepts including metadata, schemas, data types, partitioning, SAP BW cubes, indexes, RDBMS, and constraints.
 Experience in developing applications using AWS services – Analytics, EC2, S3, RDS, IAM, Glue, SNS & SQS, QuickSight, DynamoDB, Lambda, Kinesis, Route 53, CloudFront, and CloudFormation.
 Gathered technical requirements with an understanding of APIs, mapping, and event-driven architecture.
 Developed complex DAX and MDX queries, using stored procedures, common table expressions (CTEs), and temporary tables to support Power BI, SSRS, and SSIS reports.
 Created interactive data visualizations in Power BI, using relational and aggregate data sources.
 Good knowledge of and certified in Google Cloud Platform components – BigQuery, Data Studio, Dataflow, Dataproc, Cloud SQL, Bigtable, Cloud Pub/Sub, Google Cloud Storage, Google AutoML, Cloud Vision, Natural Language, and Dialogflow APIs.
 Designed and implemented PowerShell scripts to automate cloud infrastructure deployment and management in Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP).
 Proficient in using SSIS tools and features, including Control Flow, Data Flow, and Script Tasks.
 Strong understanding of SQL Server database concepts, including indexing, performance tuning, and query
optimization.
 Familiarity with various SSIS data sources and destinations, including SQL Server, Oracle, Excel, and Flat Files.
 Reporting dashboard experience using Python, VBA & Power Query development on Azure & AWS clouds.
 Proficient in using SnapLogic Designer to create and manage complex integrations between various systems and
applications.
 Skilled in working with SnapLogic Snaplexes to optimize the performance and scalability of integration solutions.
 Familiarity with various SnapLogic connectors, including REST, SOAP, JDBC, and file-based connectors.
 Collaborated with development and operations teams to integrate the CI/CD pipelines with source control,
continuous integration, and other tools to ensure consistent and reliable builds & deployments.
 Strong communication skills and a demonstrated ability to work across both business users and IT teams in highly demanding environments, with the ability to lead a project's vision.
 Proficient in using SnapLogic APIs to automate and streamline integration processes.
 Developed and maintained PowerShell-based CI/CD pipelines to deploy and test cloud applications.
 Recommended, developed, and reviewed QA standards, policies, and procedures for all functions involved with or related to quality and testing, in accordance with company standards.
 Understanding of partitioning and bucketing concepts in Hive, and performance improvement in Hive by modifying joins, groups, and aggregations (see the sketch after this list).
 Proficiency with RPNL, PNL, AML, Bloomberg, Equities, FX, ACH, and Treasury-related products such as QRISK, Quantum, and Swift.
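
As a hedged illustration of the Hive partitioning and bucketing point above: a minimal sketch from PySpark, where the table and column names (sales, region, customer_id) are illustrative assumptions rather than tables from any project below.

    # Minimal sketch: a partitioned, bucketed Hive table created via PySpark.
    # All table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partitioning-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Partition by a low-cardinality column so queries can prune directories;
    # bucket by the join key so joins on it can avoid a full shuffle.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS sales (
            customer_id BIGINT,
            amount      DOUBLE
        )
        PARTITIONED BY (region STRING)
        CLUSTERED BY (customer_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

Joining two tables bucketed on the same key with the same bucket count lets the engine use a bucketed (shuffle-free) join, which is one of the performance improvements the bullet refers to.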

PROFESSIONAL EXPERIENCE
Insurance Client - Mouritech, Irving, TX, USA May 2022 to Present
Data Engineer – Amazon Web Services
Environment: Agile, JIRA, BI Analytics, AWS, SQL, AWS DocumentDB, EMR, Python notebooks, PowerShell, MongoDB
Responsibilities:

 Owned report development for dashboards of high complexity. Developed, modified, and distributed standard and ad hoc management dashboards to help C-level executives understand the overall business unit. Developed presentations and shared KPIs as well as recommendations.
 Designed and implemented PowerShell scripts to automate cloud infrastructure deployment and management in Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP) environments.
 Exposure to Amazon Web Services including Analytics, DocumentDB, S3, EMR, Elastic Compute Cloud, and classifiers; relational databases such as MySQL, MSSQL, and PostgreSQL; and non-relational MongoDB and Cassandra.
 Configured and managed PowerShell-based monitoring and alerting solutions for cloud resources, such as Azure Monitor and AWS CloudWatch.
 Used analytics in Power BI, Python, PySpark, SQL, and NoSQL, collaborating on the project to create cloud-first data ingestion that improved the speed of data preprocessing using autoscaling.
 Performed administration of Salesforce and its security settings, as well as managing users as related to loans.
 Created and managed AWS resources such as IAM roles, VPCs, EC2 instances, and RDS databases to support data
engineering tasks and workflows.
 Extensive experience with designing, developing, and implementing ETL pipelines using SnapLogic.
 Strong understanding of SnapLogic's core concepts, including Snaps, pipelines, and triggers.
 Designed and implemented scalable data pipelines using AWS services such as S3, Glue, EMR, and Lambda to
support data processing and analytics workflows.
 Designed and implemented a data cleaning process that reduced data errors by 50%.
 Developed and maintained real-time data streaming applications using Kafka and its ecosystem tools (e.g., Kafka Connect, Kafka Streams, Confluent Schema Registry); a producer sketch follows this list.
 Developed SQL queries to extract data from databases and created reports for senior management, improving
decision-making processes.
 Utilized PowerShell to automate cloud-related tasks such as scaling, load balancing, and auto-healing.
 Created and maintained data pipelines for real-time streaming data using Apache Kafka and Spark Streaming.
 Managed cloud infrastructure and services using PowerShell alongside tools like Terraform, Ansible, or Puppet.
 Conducted a cost-benefit analysis of using new approaches and provided recommendations to management.
 Created automated tests in PowerShell to validate the functionality and performance of cloud applications and
services, reducing the need for manual testing and improving the overall quality of the software.
 Proficiency in creating complex data transformation logic using SnapLogic's built-in functions and expressions.
 Knowledge of SnapLogic's monitoring and error handling capabilities, including the ability to troubleshoot issues.
 Troubleshot and resolved issues related to the CI/CD pipelines, such as failures in the build or deployment
process, and provided timely support to development and operations teams.
 Monitored and analyzed the performance of the CI/CD pipelines using PowerShell and other tools to identify
areas for optimization and improvement.
 Proficient in using SSIS expressions and variables to configure package behavior dynamically.
 Familiarity with SSIS logging and error handling to ensure package reliability and fault tolerance.
 Experience in troubleshooting and debugging SSIS packages to ensure data accuracy and integrity.
 Developed and maintained custom reports, dashboards, and processes to continuously improve data quality, process integrity, and productivity.
 Optimized data processing and query performance by tuning AWS services such as EMR, Glue, and Athena, and
monitoring system metrics and logs using CloudWatch.
 Ability to continuously learn and adapt to new technologies and trends in data integration and management, and to apply them using SnapLogic.
 Integrated Kafka with various data sources and sinks (e.g., databases, message queues, Hadoop).
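
The Kafka bullets above describe producing events into topics; a minimal, hedged producer sketch using the confluent-kafka Python client is below. The broker address, topic name, and event fields are placeholders, not values from the actual project.

    # Hypothetical Kafka producer: serializes a JSON event and confirms delivery.
    import json
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": "localhost:9092"})  # placeholder broker

    def delivery_report(err, msg):
        # Invoked once per message to confirm delivery or surface errors.
        if err is not None:
            print(f"Delivery failed: {err}")

    event = {"policy_id": 123, "event_type": "claim_opened"}  # illustrative payload
    producer.produce(
        "claims-events",                      # hypothetical topic
        key=str(event["policy_id"]),          # key by policy ID for per-key ordering
        value=json.dumps(event).encode("utf-8"),
        callback=delivery_report,
    )
    producer.flush()  # block until all queued messages are delivered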

Corteva Agriscience, Paris, France April 2021 - May 2022


Senior Data Engineer
Environment: Visio, Project, Python, Analytics, SAP, SQL, Hadoop, Spark, HDFS, PowerShell, Oracle & Azure Data Factory
Project: Finance reports for capital processes and systems for the manufacturing and supply chain industry
Responsibilities:

 Conducted analyses based on operational, economic, and/or financial data to quantify the competitive performance of business segments, evaluate potential operational changes, and design new approaches.
 Received, evaluated, and responded to complex data-related inquiries by applying knowledge of data and
business operations and obtaining information from various sources.
 Created and managed AWS resources such as IAM roles, VPCs, EC2 instances, and RDS databases to support
data engineering tasks and workflows.
 Collaborated with DevOps and cloud engineering teams to troubleshoot PowerShell-based automation
solutions and provide technical support to cloud operations.
 Designed and implemented PowerShell scripts to automate routine tasks such as cloud infrastructure
backups, disaster recovery, or security audits.
 Developed complex SQL queries using stored procedures, common table expressions (CTEs), and temporary tables to support Power BI and SSRS reports.
 Developed an Environmental Protection dashboard and a multiple-incidents employee report.
 Analyzed, integrated, and migrated multiple source systems and created the data models required for BI.
 Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from various RDBMSs and APIs into HDFS.
 Mastery of the Python programming language and significant experience using Python's data analysis, machine learning, and NLP libraries such as pandas, NumPy, Keras, TensorFlow, and scikit-learn.
 Hands-on experience in Azure Databricks for Extract, Transform & Load (ETL) development using SQL Server Integration Services (SSIS), creating jobs, alerts, and SSIS packages.
 Built pipelines & DataFrames with Python & PySpark, involving aggregations, filters, lookups, and transformations using functions, joiners, insert-updates, CDC, transpose, etc. (a sketch follows this list).
 Familiarity with SnapLogic Enterprise Integration Platform for large-scale data integration and management.
 Knowledge of SnapLogic Data Science for integrating machine learning models into data pipelines.
 Utilized PowerShell to automate cloud-related tasks such as scaling, load balancing, and auto-healing of applications.
 Implemented complex business logic through T-SQL stored procedures, Functions, Views, and queries.
 Experienced in writing live real-time data processing queries and core jobs using SQL & EMR Streaming from SAP HANA in a data pipeline system.
 Extensively used SSIS transformations such as Lookup, Derived column, Data conversion, Aggregate,
Conditional split, SQL task, Script task and Send Mail task etc.
 Developed complex calculated measures using Data Analysis Expressions (DAX); well versed in data warehousing ETL concepts using Informatica PowerCenter, OLAP, and OLTP.
 Created weekly development trackers and risk trackers to highlight ETL development progress, roadblocks, risks, and mitigations for the weekly status updates.
 Experience in troubleshooting and debugging ETL processes and performance tuning using Informatica
PowerCenter.
 Experience in using Informatica Metadata Manager for data lineage and impact analysis.
 Utilized PowerShell to automate cloud-related tasks such as scaling and load balancing.
 Managed cloud infrastructure and services using PowerShell alongside tools like Terraform.
 Documented the CI/CD pipelines and related processes in detail, including the PowerShell scripts and
modules, to ensure easy maintenance and knowledge transfer to other team members.
 Developed PowerShell scripts to provision and configure Clusters, Instances, and storage resources in AWS.
 Designed and implemented Kafka monitoring and alerting solutions (e.g., Prometheus, Grafana)
 Conducted periodic reviews of data architecture and infrastructure to identify opportunities for
optimization, cost reduction, and process improvement using AWS services.
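
As referenced in the pipelines bullet above, a hedged PySpark sketch of the filter/lookup/aggregate pattern with a CDC-style dedup is below. Paths, table names, and columns (orders, plants, load_ts) are illustrative assumptions only.

    # Hypothetical pipeline: dedup change-data-capture rows, join a lookup
    # dimension, and aggregate for reporting.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("pipeline-sketch").getOrCreate()

    orders = spark.read.parquet("/data/orders")   # placeholder source path
    plants = spark.read.parquet("/data/plants")   # placeholder lookup dimension

    # CDC-style dedup: keep only the newest version of each order by load time.
    w = Window.partitionBy("order_id").orderBy(F.col("load_ts").desc())
    current = (orders
               .withColumn("rn", F.row_number().over(w))
               .filter("rn = 1")
               .drop("rn"))

    # Lookup join plus aggregation, mirroring the bullet above.
    summary = (current
               .join(plants, "plant_id", "left")
               .groupBy("plant_id", "fiscal_period")
               .agg(F.sum("amount").alias("total_amount")))
    summary.write.mode("overwrite").parquet("/data/finance/summary")
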
VRDL, Paris, France December 2019 - March 2021
Consulting - Data Engineer
Environment: AWS QuickSight, Python, SAP, SQL, AWS Analytics & Tableau, Spark, Lambda, JIRA, Agile
Project: Reports on trends in the banking sector across various devices and commodities
Responsibilities:

 Participated in the execution and validation of all testing cycles (enhancement and regression testing, support pack testing, etc.), integrating with other streams as necessary.

 Knowledge of Loan-to-Cost, Order-to-Cash, and RTR accounts receivable & cash remittance.

 Created AWS Analytics dashboards using stacked bars, bar graphs, scatter plots, geographical maps, Gantt charts, etc., using the Show Me functionality.
 Created interactive data visualizations in Analytics from Amazon BI tools, using relational and aggregate data sources, and developed BI data lake POCs using AWS services including S3, EC2, and QuickSight.
 Complete understanding of financial, manufacturing, sales, marketing, and HR analytics tables – created dimension and fact tables in star and snowflake schemas.
 Automated the deployment of web applications and databases using PowerShell scripts in Azure and AWS.
 Created PowerShell scripts to automate AWS resources, such as scaling, monitoring, and backup.
 Worked closely with developers and operations teams to identify automation opportunities and improve efficiency using PowerShell.
 Utilized the Analytics gateway to keep dashboards and reports in sync with on-premises data sources.
 Utilized Power BI (Power Pivot/View) to design multiple scorecards and dashboards to display information
required by different departments and upper-level management.
 Skilled in integrating various data sources and targets such as databases, flat files, and web services using
Informatica PowerCenter.
 Experience in using Informatica PowerExchange for real-time data integration with external systems.
 Generated dashboards in Power BI Desktop and Power BI Service using latest data visualizations.
 Integrated custom visuals based on business requirements using Power BI Desktop for commission-based models and charged equities.
 Performed as a data analyst with ETL, reporting & statistical modelling, designing a new model of demand forecasting and inventory optimization for healthcare products across chains, boutiques & drug stores.

 Backfilled for other Loan-to-Cost (LTC) team members as required.

 Developed Tableau reports for daily inventory status, 2-week forecast & production planning, and past 1-month service levels & logistics reports to chains & regional distribution centers.
 Performed Power BI Server admin duties, creating users/groups and scheduling instances in BI Services for daily inventory and forecast results.
 IICS (Informatica Intelligent Cloud Services) product suite, esp. Cloud Application Integration and Data Integration, with experience in handling APIs.
 Built interactive dashboards and published visualization reports utilizing parameters, calculated fields, table calculations, user filters, action filters, and sets to handle views more efficiently.
 Designed and implemented business NLP solutions alongside pre-built solutions (KPIs & dashboards/reports).

 Provided stakeholder-facing on-call support as needed (working capital related tickets and issues).

 Modeled data to generate reports comparing business process evaluations, assisting C-level staff with appropriate KPIs based on improved operations.
 Organized, extrapolated, and disseminated pandas DataFrames across departments, to be used for drawing conclusions about the success of current methods.
 Implemented several DAX functions for various fact calculations for efficient data visualizations.
 Wrote full Query DSL (Domain Specific Language) queries in JSON to retrieve data from an Elasticsearch cluster (a sketch follows this list).
 Monitored and scheduled daily backup jobs for SharePoint and databases using SQL Management Studio.
 Generated ad-hoc reports in Excel Power Pivot and shared them via Power BI with the decision makers.
 Tweaked the forecasting model, adjusting seasonality and inconsistency for slow-moving goods & exceptions in sales & distribution methods for certain tires that led to sudden spikes of certain products.
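
As referenced in the Query DSL bullet above, a hedged sketch of a JSON Query DSL search sent to an Elasticsearch cluster from Python is below. The host, index, and field names are placeholders.

    # Hypothetical Query DSL search: filter by a keyword field and a date range.
    import json
    import requests

    query = {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"region": "EU"}},                       # placeholder field
                    {"range": {"order_date": {"gte": "now-30d/d"}}},  # last 30 days
                ]
            }
        },
        "size": 100,
    }

    resp = requests.post(
        "http://localhost:9200/inventory/_search",  # placeholder host and index
        headers={"Content-Type": "application/json"},
        data=json.dumps(query),
        timeout=30,
    )
    hits = resp.json()["hits"]["hits"]  # matching documents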

KEA Medicals, Paris, France August 2019 - December 2019


Data Engineer – Retail Supply Chain
Environment: BI, GCP, Python Notebooks, Python Pandas, SAP BW, Graph DB & SQL
Project: Financial visualization reports for the healthcare industry and diagnosis
Worked as a reporting and statistical modeler on the KEA Medicals team for diagnosis campaigns and finance analysis on invoice data and the impact of diseases on age groups.
Responsibilities:
 Identified operating improvements from internal data with SQL, which reduced hours by 8%
 Presented predictive modeling insights to the C-level suite and stakeholders; participating in decisions surrounding policy packages saved the company $3.2M in legal costs over FY 19-20.
 Scheduled data loads to BW from SAP R/3; monitored data loads running on a daily, weekly, and monthly basis and resolved errors when they occurred.
 Involved in Creating Generic Extractors using Table, View, and Function module.
 Responsible for designing ETL (extract, transform, load) for the data, building queries on data, and scheduling data loads for daily, weekly, and monthly decision-making reports.
 Designed and implemented custom business intelligence solutions along with the pre-built solutions
(Metadata and Dashboards/Reports).
 Integrated Jenkins with various tools and services (e.g., GitHub, Slack, JIRA, SonarQube)
 Troubleshot and resolved issues related to Jenkins performance, scalability, and security
 End-to-end experience in designing and deploying data through different environments.
 Created and modified the schema objects like Attributes, Facts, Views, and Logical Tables.
 Experience in building data integration pipelines using SnapLogic Integration Cloud.
 Proficient in creating and configuring SnapLogic components such as snaps, pipelines, and tasks.
 Skilled in integrating various data sources and targets such as databases, cloud applications, and APIs using
SnapLogic Integration Cloud.
 Built many different types of visualizations using histograms, scatter plots, box-and-whisker plots, bar charts, heat maps, area charts, and geographic maps in Tableau Desktop, based on the requirements.
 Experienced in using various dashboard components like Horizontal, Vertical containers to display multiple
chart types and filters in a single dashboard.
 Published extracts to Tableau Server with custom extract refresh schedules and tracked and maintained the refresh timelines.
 Developed Spark SQL code to extract data from Teradata and the Data Lake and push it to an Elastic cluster (a sketch follows this list).
 Built prototypes, and deployed machine learning models in production environments.
 Strong understanding of normalization (1NF/2NF/3NF) and de-normalization techniques in relational/dimensional database environments.
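
As referenced in the Spark SQL bullet above, a hedged sketch of the Teradata-to-Elasticsearch push is below. It assumes the Teradata JDBC driver and the elasticsearch-hadoop connector are on the Spark classpath; the URLs, credentials, and table/index names are placeholders.

    # Hypothetical extract-and-push: read a Teradata table over JDBC,
    # then write the rows into an Elasticsearch index.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("teradata-to-es").getOrCreate()

    invoices = (spark.read.format("jdbc")
                .option("url", "jdbc:teradata://td-host/DATABASE=finance")  # placeholder
                .option("driver", "com.teradata.jdbc.TeraDriver")
                .option("dbtable", "invoices")                              # placeholder
                .option("user", "svc_user")
                .option("password", "***")
                .load())

    (invoices.write.format("org.elasticsearch.spark.sql")
     .option("es.nodes", "es-host:9200")  # placeholder cluster address
     .mode("append")
     .save("invoices"))                   # target index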

Bank of Queensland, India Mar 2017 - August 2019
Data Engineer

Environment: AWS, Swift, QRISK, Quantum, RPA, Excel VBA, Python & Pandas, Tableau, Cognos, Microsoft SQL
Project: Fraud Analytics reports in Banking Industry, Treasury and Commodities
Responsibilities:
 Generated ad-hoc reports in Excel Power Pivot and shared them via Power BI with the decision makers for strategic planning.
 Experience with managing data warehouse models – dimensional modeling, star schemas, snowflake schemas, and de-normalized models.
 Experience in designing and developing MPP (Massively Parallel Processing) based physical data models using distribution and replication methodology.
 Implemented Data Archiving strategies to handle the problems with large volumes of data by moving
inactive data to another storage location that can be accessed easily.
 Created an AWS Redshift cluster and performed operations such as starting and stopping the cluster and adding or removing nodes (a sketch follows this list).
 Deployed and managed containerized applications using Kubernetes and its ecosystem tools (e.g., kubectl,
Helm, Istio)
 Configured and optimized Kubernetes clusters for high availability, scalability, and resilience
 Implemented Kubernetes security features (e.g., RBAC, network policies, secrets management)
 Designed and implemented Kubernetes monitoring and logging solutions (e.g., Prometheus, ELK stack)
 Multiple reporting dashboards experience using Python, VBA and Power Queries development, Cloud
technologies (Microsoft Azure & AWS)
 Skilled in deploying and applying machine learning methods (distribution analysis, regression, classification, clustering, etc.).
 Made configuration changes to Airflow, such as the authentication backend and database backend.
 Launched an AWS EC2 instance and configured it with ELB and Auto Scaling Groups to handle the
unexpected spike in traffic to the website.
 Organized, extrapolated, and disseminated pandas DataFrames across departments, to be used for drawing conclusions about the success of current methods.
 Completed 17 SQL database design projects, optimizing queries and developing stored procedures, triggers,
tables, views, and functions.
 Involved in logical and physical database design & development, normalization, and data modeling.
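
As referenced in the Redshift bullet above, a hedged boto3 sketch of those cluster operations is below. The region, cluster identifier, and node count are illustrative assumptions.

    # Hypothetical Redshift operations: resize a cluster, then pause/resume it
    # (the managed equivalent of stopping and starting).
    import boto3

    redshift = boto3.client("redshift", region_name="us-east-1")  # placeholder region

    # Add or remove nodes by resizing the cluster.
    redshift.modify_cluster(
        ClusterIdentifier="analytics-cluster",  # placeholder name
        NumberOfNodes=4,
    )

    redshift.pause_cluster(ClusterIdentifier="analytics-cluster")
    redshift.resume_cluster(ClusterIdentifier="analytics-cluster")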

Aero Technologies, India Aug 2015 - Feb 2017


Python Engineer - Analytics
Environment: AWS, JavaScript, Excel VBA, Python & Pandas, Tableau, MongoDB, S3
Project: Analytic reports in Healthcare Industry
Responsibilities:
 Experience with Amazon SNS, S3, Kafka, Kinesis, MongoDB, Kubernetes, Unix shell scripts, and Git.
 Experience with JavaScript, Python, Pandas, Apache Airflow, Apache Spark, Apache Beam
 Good hands-on experience in developing dashboard reports using Key Performance Indicators (KPIs) for top management, for quick decision-making to drive the business.
 Extensive experience in using Level of Detail expressions with FIXED, INCLUDE, and EXCLUDE.
 Analyzed and interpreted customer survey data, resulting in a 10% increase in customer satisfaction ratings.
 Experience in using Informatica PowerExchange for real-time data integration with external systems.
 Familiarity with Informatica MDM for master data management.
 Knowledge of Informatica Cloud for cloud-based data integration and management.
 Experience in troubleshooting and debugging ETL processes and performance tuning using Informatica
PowerCenter.
 Developed a predictive model for customer churn that reduced the churn rate by 15%.
 Created a dashboard using Tableau to track sales performance and identify opportunities for improvement,
resulting in a 20% increase in sales revenue.
 Conducted A/B testing on website design changes and identified a new design that improved conversion
rates by 25%.
 Collaborated with marketing team to analyze social media data and develop targeted marketing campaigns,
resulting in a 30% increase in website traffic.
 Provided Production Support to Tableau users and wrote Custom SQL to support business.
 Experience in streaming technologies like Kafka, Kinesis, Spark Streaming.
 Experience in designing and developing serverless MPP (Massively Parallel Processing) based physical data models using distribution and replication methodology with AWS Lambda.
 Optimize database performance through SQL tuning, index optimization, or architectural changes.
 Develop testing strategies and scenarios for review with both project and client teams to ensure alignment
to test strategy and completeness
 Made configuration changes to Airflow, such as the authentication backend and database backend.
 Launched an AWS EC2 instance and configured it with ELB and Auto Scaling Groups to handle unexpected spikes in traffic to the website (a sketch follows this list).
 Knowledge of REST APIs and JSON for building and consuming web services.
 Familiarity with cloud computing concepts and platforms such as AWS, Azure, and GCP.
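
As referenced in the EC2/ELB bullet above, a hedged boto3 sketch of wiring an Auto Scaling group to a load balancer target group is below. The group name, launch template, subnets, and ARN are placeholders, and the launch template is assumed to exist already.

    # Hypothetical Auto Scaling setup: create a group behind a target group,
    # then add a CPU-based target-tracking policy to absorb traffic spikes.
    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")  # placeholder

    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-asg",                   # placeholder name
        LaunchTemplate={"LaunchTemplateName": "web-lt"},  # pre-created template
        MinSize=2,
        MaxSize=10,
        DesiredCapacity=2,
        VPCZoneIdentifier="subnet-aaa,subnet-bbb",        # placeholder subnets
        TargetGroupARNs=[
            "arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/web/abc123"  # placeholder ARN
        ],
    )

    # Scale out when average CPU exceeds the target; scale in when it drops.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",
        PolicyName="cpu-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 60.0,
        },
    )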

EDUCATION

 Bachelor of Science, JNTU, Kakinada, India – Computer Science and Engineering, April 2015

TECHNICAL SKILLS

Tools: Visio, MS Project, R Studio, MS Power BI, Tableau, ASPEN, Matplotlib, GGPlot, Minitab
Frameworks: AWS, Apache Spark, Databricks, GCP, Azure, Hadoop, HDFS
Programming Languages: Python, PL/SQL, JavaScript, C, PySpark, Scala, Hive
Database Tools: Microsoft SQL Server, MySQL, Oracle DB, DB2, MS Access, MongoDB
Packages: AWS (EC2, ES, S3, VPC), SAP (S/4HANA, BW), and Microsoft Azure (DB)
ETL Tools: Informatica, SSIS, Snowflake, Excel, Oracle Warehouse Builder
Data Libraries: Pandas, NumPy, PyTorch, Keras, TensorFlow, Neural Networks
ML Algorithms: Linear/Logistic Regression, RNN, CNN, LSTM, NLP, RF, SVM, Sentiment Analysis, K-Means Clustering, Gradient Descent
Other Tools: Jira, Docker, Jenkins, MATLAB, VS Code, Postman, Aspen PE
Misc.: Git, Jupyter, Visual Studio, IIS, Visio
