deepvineet-prog/ai-engineer-roadmap

 
 

AI Engineer Roadmap

The fastest, most comprehensive way to become an AI Engineer in 2024

Welcome to the AI Engineer Roadmap! This guide offers a project-based approach to mastering AI engineering, whether you're a beginner or looking to expand your skills. Each section includes practical projects to apply your knowledge, build real-world AI applications, and develop crucial problem-solving skills ᕙ( •̀ ᗜ •́ )ᕗ

Table of Contents

  1. Web/App Development
  2. Beginner Text Generation
  3. Advanced Text Generation
  4. Speech
  5. Image Generation
  6. Computer Vision

Web/App Development


It helps to be able to code your own interfaces, but it's also 100% possible to build AI products without knowing how to program. It's up to you if you wanna go down the coding (full-stack) route or the no-code (Webflow, Zapier, etc.) route.

Full-stack Route (recommended)

  • Front-end: Learn React for building interactive user interfaces
  • Back-end: Master Node.js/Next.js for server-side development
  • Database: Understand and implement Postgres for data storage

There are tons of roadmaps out there for learning web development. One of my favorites is Scrimba. I also have a bootcamp on YouTube that covers full-stack web dev + building AI apps.

No-code Route

  • Website Builder: Explore Webflow for creating professional websites without coding
  • Workflow Builder: Use Zapier to automate processes and integrate applications
  • Database: Leverage Firebase or Airtable for easy-to-use, scalable data storage solutions

Beginner Text Generation

  1. Understanding Large Language Models (LLMs)

  2. Proprietary LLMs

    • OpenAI's GPT models
    • Anthropic's Claude 3 family
    • Google's Gemini
  3. Open-source LLMs

    • Meta's Llama 3
    • Cohere's Command-R
  4. Prompt Engineering

  5. Basic Chatbots

  6. Handling Structured Output

    • Learn techniques for generating and parsing structured data from LLMs
    • Check out Instructor or use string parsing
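
The "string parsing" option above can be sketched in a few lines. This is a minimal sketch, assuming the model was asked to reply with a JSON object and may surround it with extra prose. The helpers `extract_json` and `validate_fields` and the sample reply are hypothetical, made up for illustration. A library like Instructor would replace the manual checks with schema validation.

```python
import json


def extract_json(reply: str) -> dict:
    """Return the first JSON object found in an LLM reply (hypothetical helper)."""
    decoder = json.JSONDecoder()
    # Try every '{' as a possible start; raw_decode tolerates trailing text.
    for start, char in enumerate(reply):
        if char != "{":
            continue
        try:
            obj, _end = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in reply")


def validate_fields(data: dict, required: dict) -> dict:
    """Check that each required key exists and has the expected type."""
    for key, expected_type in required.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key!r} should be {expected_type.__name__}")
    return data


# Made-up reply standing in for real model output.
sample_reply = 'Sure! Here is the data: {"name": "Ada", "age": 36} Let me know if you need more.'
person = validate_fields(extract_json(sample_reply), {"name": str, "age": int})
print(person)
```

Scanning for the first decodable object keeps the parser tolerant of chatty replies, while the type check catches the common case where the model returns the right keys with the wrong value types.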

Advanced Text Generation

  1. Function Calling and Tool Usage

    • Implement LLM-powered tools and integrate external functions
    • Project: Build a personal assistant that can interact with your calendar, email, and task list
  2. Web-browsing Capabilities

    • Learn about techniques for scraping and summarizing web content
    • Project: Build an open-source version of Perplexity (like morph.so)
  3. Fine-tuning LLMs

    • Techniques for adapting pre-trained models to specific tasks
    • Project: Fine-tune a model on a specific domain (e.g., medical terminology, legal jargon)
  4. Embeddings and Vector Databases

    • Understand and implement vector representations of text
    • Explore vector database solutions for efficient similarity search (e.g. Chroma, Supabase, Weaviate)
    • Project: Build a semantic search engine for a large corpus of documents
  5. Retrieval Augmented Generation (RAG)

    • Learn about different RAG architectures and when to use them
    • Project: Develop a "Chat with PDF" application
  6. AI Agents

    • Study projects like OpenDevin to understand autonomous AI systems
    • Project: Autonomous research agent
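
Items 4 and 5 above (embeddings, similarity search, and RAG) fit together roughly as in the sketch below. It rests on loud assumptions: `embed` is a toy word-count vector standing in for a real embedding model, the in-memory list stands in for a vector database such as Chroma, and the three documents are made up. The final prompt would normally be sent to an LLM; that call is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Retrieve context, then place it in front of the question."""
    context = "\n".join(f"- {chunk}" for chunk in search(question, documents, top_k))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Made-up corpus for illustration.
documents = [
    "Postgres is a relational database used for data storage.",
    "Whisper is a speech-to-text model for transcription.",
    "Stable Diffusion is an open-source image generation model.",
]
print(build_rag_prompt("Which model handles speech transcription?", documents, top_k=1))
```

Swapping `embed` for a real embedding model and the list for a vector database changes the quality and scale of retrieval, but the retrieve-then-prompt shape stays the same.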

Speech

  1. Text-to-Speech (TTS)

    • Implement TTS using services like ElevenLabs and OpenAI
    • Project: Create an audiobook generator from text input
  2. Speech-to-Text (STT)

    • Utilize models like OpenAI's Whisper for transcription
    • Project: Create a job interview coach application
  3. Speech Analysis

    • Explore emotion and intent analysis using tools like Hume AI or Google Gemini 1.5 Pro
    • Project: Create an AI Therapist with emotion detection
    • Learn about prosody analysis and its applications in understanding speaker intent
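
For the audiobook project in item 1, long text usually has to be split before it is sent to a TTS service, since hosted endpoints tend to cap the input length per request. The sketch below shows one way to do that split at sentence boundaries. The helper `chunk_text` and the 35-character limit are assumptions for illustration only; check your provider's documentation for the real limit. The actual TTS call is omitted.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-split by characters.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit: close the current chunk, start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


# Made-up text; each chunk would become one TTS request.
text = "Chapter one. It was a quiet night! Nobody spoke. Then the door opened?"
for number, chunk in enumerate(chunk_text(text, max_chars=35), start=1):
    print(number, chunk)
```

Breaking at sentence ends rather than at a fixed character offset avoids audible cuts in the middle of a phrase when the audio segments are joined.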

Image Generation

  1. Prompt Engineering for Image Generation

    • Read up on art history and photography terminology to craft effective prompts
    • Join the Midjourney Discord to study how experts prompt image models
    • Project: Create a series of images that tell a story, using consistent style and characters
  2. Proprietary Image Generation Models

    • Explore the capabilities of models like GPT-4o, Midjourney, and Gemini
    • Project: Children's coloring/story book generator
    • Learn about image-to-image transformations (style transfer, inpainting, outpainting)
  3. Open-source Image Generation Models

    • Experiment with Stable Diffusion and other accessible models
    • Project: Build a custom image generation UI with fine-grained controls
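
For the storytelling project in item 1, one simple way to aim for a consistent style and character is to repeat the same descriptors word for word in every prompt and vary only the scene. The sketch below is a plain prompt template under that assumption. `StoryStyle`, `build_scene_prompt`, and all descriptor values are hypothetical, and how well this works depends on the image model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoryStyle:
    """Descriptors repeated in every prompt (all values are illustrative)."""
    character: str
    art_style: str
    lighting: str


def build_scene_prompt(style: StoryStyle, scene: str) -> str:
    """Combine the fixed descriptors with one scene description."""
    return f"{style.character}, {scene}, {style.art_style}, {style.lighting}"


style = StoryStyle(
    character="a small red fox wearing a blue scarf",
    art_style="watercolor children's book illustration",
    lighting="soft morning light",
)
scenes = ["walking through a snowy forest", "crossing a wooden bridge"]
prompts = [build_scene_prompt(style, scene) for scene in scenes]
for prompt in prompts:
    print(prompt)
```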

Computer Vision

  1. Image Analysis

    • Leverage models like Claude or GPT-4o for comprehensive image understanding
    • Project: Develop an app that can analyze and describe the contents of photos
    • Learn about object detection, segmentation, and classification techniques
  2. Video Analysis

    • Explore advanced capabilities with models like Google Gemini 1.5 Pro
    • Project: Video narration
    • Study techniques for tracking objects and analyzing motion in videos
    • Project: Create a sports analysis tool that can break down player movements and tactics
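
Object detection and tracking both lean on a way to measure how much two bounding boxes overlap. The sketch below computes intersection over union (IoU) and uses it for a greedy frame-to-frame match, which is one simple form of tracking. The `(x_min, y_min, x_max, y_max)` box format, the 0.3 threshold, and the sample boxes are assumptions for illustration; a real pipeline would get boxes from a detection model.

```python
Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def match_boxes(previous: list[Box], current: list[Box], threshold: float = 0.3) -> dict[int, int]:
    """Greedily map each previous box index to its best unused current box index."""
    matches: dict[int, int] = {}
    used: set[int] = set()
    for i, prev_box in enumerate(previous):
        best_j, best_score = None, threshold
        for j, cur_box in enumerate(current):
            if j in used:
                continue
            score = iou(prev_box, cur_box)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matches[i] = best_j
            used.add(best_j)
    return matches


# Made-up detections from two consecutive frames.
frame_1 = [(0, 0, 10, 10), (50, 50, 60, 60)]
frame_2 = [(51, 50, 61, 60), (1, 0, 11, 10)]
print(match_boxes(frame_1, frame_2))
```

Greedy matching is easy to follow but can pick a poor pairing when objects cross paths; optimal assignment methods are the usual next step for the sports analysis project.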

Happy learning and building!

- Zack
