This is a series of computer vision foundational projects that anyone diving into the field must tackle.
Updated Nov 1, 2024 - Jupyter Notebook
Enhance your skills in prompt engineering for vision models. Learn to effectively prompt, fine-tune, and track experiments for models like SAM, OWL-ViT, and Stable Diffusion 2.0 to achieve precise image generation, segmentation, and object detection.
Vision-based swarms in the presence of occlusions
An implementation of gated MLPs in tinygrad, as an alternative to transformers.
A simple-to-use package for calling various model providers, such as OpenAI, Anthropic, and others, with utmost reliability, security, and performance.
Testing the Moondream tiny vision model
Implementation of MiDaS from the paper "Towards Robust Monocular Depth Estimation" in PyTorch and Zeta
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
A comprehensive repository for research, code, and insights on convolutional neural networks and deep vision models
Diffusion models crash course with PyTorch from DeepLearning.AI
Building AVA from Ex Machina: a lightweight multimodal system from scratch, just for learning and experimentation
A framework to compute threshold sensitivity of deep networks to visual stimuli.
In this repo, I fine-tuned a pretrained ResNet18 model from the PyTorch library.
These notes and resources are compiled from the crash course Prompt Engineering for Vision Models offered by DeepLearning.AI.
We generate captions for user-supplied images using prompt engineering and generative AI.