This project uses the Flickr8k dataset from Kaggle.
To run the full code and experiments, download the dataset from Kaggle and place it in your local environment.
This repository contains code for image captioning and segmentation using deep learning techniques.
It combines CNN-based feature extraction and LSTM-based sequence modeling to generate captions for images.
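As a rough illustration of how a CNN + LSTM captioner produces text at inference time, the sketch below shows a greedy decoding loop in plain Python. It is not taken from main.py: the function `greedy_caption`, the `next_word_scores` callable (a stand-in for the trained model's prediction step), the toy vocabulary, and the `<start>`/`<end>` tokens are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence


def greedy_caption(
    image_features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[int]], Sequence[float]],
    index_to_word: Dict[int, str],
    start_id: int,
    end_id: int,
    max_length: int = 20,
) -> str:
    """Build a caption word by word, always taking the highest-scoring word."""
    sequence = [start_id]
    for _ in range(max_length):
        # Hypothetical model call: scores for every vocabulary entry,
        # given the image features and the words generated so far.
        scores = next_word_scores(image_features, sequence)
        best_id = max(range(len(scores)), key=scores.__getitem__)
        if best_id == end_id:
            break
        sequence.append(best_id)
    # Drop the start token before joining the words.
    return " ".join(index_to_word[i] for i in sequence[1:])


# Toy stand-in for a trained model (assumption, for demonstration only):
# it always scores "the next vocabulary entry" highest.
VOCAB = {0: "<start>", 1: "a", 2: "dog", 3: "runs", 4: "<end>"}


def toy_scores(features: Sequence[float], sequence: List[int]) -> List[float]:
    scores = [0.0] * len(VOCAB)
    scores[min(sequence[-1] + 1, 4)] = 1.0
    return scores


caption = greedy_caption([0.1, 0.2], toy_scores, VOCAB, start_id=0, end_id=4)
print(caption)
```

In a real setup, `next_word_scores` would wrap the trained model's prediction call and `index_to_word` would come from the saved tokenizer; beam search is a common alternative to the greedy choice used here.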
- main.py: Main training and evaluation script.
- .ipynb files: Notebooks with experiments and visualizations.
- model.keras: Trained Keras model.
- tokenizer.pkl: Tokenizer used for captions.
- Clone this repository.
- Download the dataset from Kaggle and place images and captions in your project folder.
- Install requirements (pip install -r requirements.txt, if you have one).
- Run main.py or open notebooks to start experimenting.
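Once the dataset is in place, the captions typically need to be grouped by image before training. The sketch below is one possible way to do that, assuming the captions file is a CSV with `image` and `caption` columns; the actual file layout, the helper name `load_captions`, and the `startseq`/`endseq` markers are assumptions, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List


def load_captions(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group captions by image file name.

    Assumes CSV input with a header row containing 'image' and 'caption'.
    """
    captions = defaultdict(list)
    for row in csv.DictReader(lines):
        text = row["caption"].strip().lower()
        # Wrap each caption in start/end markers (an assumed convention)
        # so a sequence model can learn where captions begin and end.
        captions[row["image"]].append(f"startseq {text} endseq")
    return dict(captions)


# Made-up sample rows standing in for the real captions file.
sample = io.StringIO(
    "image,caption\n"
    "img_001.jpg,A dog runs on grass\n"
    "img_001.jpg,A brown dog is running\n"
    "img_002.jpg,Two children play\n"
)

captions_by_image = load_captions(sample)
print(captions_by_image["img_001.jpg"])
```

With a real file, the same function can be given an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the captions file was placed.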
This project is for academic and research purposes only.