Associate Data Scientist - Intern
Department: Data Science
Location: Pune, India
Company Details
Innoplexus is a leading global AI-based company at the forefront of drug discovery and
development. With over 350 employees and a strong commitment to innovation, we
hold 140+ patent applications, including 69 grants, in artificial intelligence, machine
learning, and blockchain technologies. Our cutting-edge solutions generate real-time
insights from massive volumes of structured and unstructured data, enabling
continuous, informed decision-making for our diverse customer base at unprecedented
speed. Founded in 2015, Innoplexus AG is headquartered in Eschborn, Germany, with
additional offices in Pune, India, and Iselin and San Francisco, United States.
About the Role
Are you ready to build impactful solutions with cutting-edge technology and drive
innovation in data science? We are seeking talented Machine Learning and Deep
Learning enthusiasts with a passion for data analysis to join our Data Science
team as an Associate Data Scientist Intern. In this role, you will apply your skills in
Natural Language Processing (NLP) and Computer Vision to transform complex data
into actionable insights, contributing directly to our advanced AI solutions.
Key Responsibilities
• Problem Solving:
o Implement and fine-tune neural network architectures, including Large
Language Models (LLMs) and transformer-based models.
o Optimize model performance, scalability, and efficiency for production
environments.
o Conduct rigorous experiments to evaluate model performance,
robustness, and generalization capabilities.
o Work with large-scale datasets, preprocess them, and create appropriate
data representations.
o Select relevant features and ensure data quality, applying the evaluation
metrics agreed with product and project managers.
o Deploy trained models into production environments and monitor their
performance.
o Troubleshoot issues that arise post-deployment and iterate on
improvements.
• Collaboration:
o Work closely with cross-functional teams, including researchers,
software engineers, and product managers.
o Communicate complex technical findings and insights effectively to
diverse audiences.
• Innovation:
o Stay up to date with the latest advancements in NLP, deep learning,
Generative AI, LLMs, and broader AI research.
o Explore novel techniques and approaches to enhance model capabilities
and drive innovation.
Qualifications
• Education: B.Tech/B.E. in Computer Science, or an Integrated B.Tech+M.Tech in
Computer Science, Statistics, Electrical Engineering, Electronics, Applied
Mathematics, or a related field relevant to machine learning.
• Coding Proficiency: Hands-on experience coding in at least one programming
language, with Python strongly preferred.
• Quantitative Skills: Solid knowledge of basic concepts in mathematics and
statistics.
• Core AI/ML: Strong background in machine learning, deep learning, and natural
language processing.
• Data Science Expertise: Demonstrated expertise in data science and
Generative AI.
• Technical Proficiency: Proficiency in Python, Pandas, and core ML libraries
(e.g., Scikit-learn, PyTorch).
• Theoretical Foundation: Solid understanding of statistics, linear algebra, and
probability theory.
• Data Handling: Experience working with both structured and unstructured
datasets.
• Problem-Solving: Excellent problem-solving skills and the ability to work both
independently and collaboratively.
• Advantageous: Experience with transformer-based models (e.g., BERT, GPT, T5,
Llama).
• Good to have: Familiarity with MLOps practices.
What We Offer (Perks & Benefits)
• A challenging and rewarding internship experience at a leading AI company.
• Exposure to cutting-edge technologies and real-world drug discovery problems.
• Mentorship from experienced data scientists and AI experts.
• A dynamic, collaborative, and innovative work environment.