
INTERNSHIP PROGRESS

CHAPTER 1

INTRODUCTION
1.1 Generative AI

Generative Artificial Intelligence (AI) is a subset of machine learning that focuses on creating new data instances that resemble existing ones. Unlike traditional AI models, which
are primarily designed for classification or prediction tasks, generative models aim to learn the
underlying patterns and distributions in data to produce new, synthetic samples. These models
have gained significant attention due to their ability to generate high-quality images, text,
audio, and even code. The rise of deep learning techniques, particularly neural networks, has
accelerated advancements in generative AI, leading to the development of models like
Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Diffusion
Models.

One of the most widely used generative models is the Generative Adversarial Network (GAN), which consists of two neural networks—a generator and a discriminator—competing against each other in a zero-sum game. The generator produces synthetic data, while the discriminator evaluates the authenticity of the generated data. Over time, the generator improves its ability to create realistic outputs. Another powerful approach is the Variational Autoencoder (VAE), which encodes input data into a lower-dimensional latent space and then reconstructs new samples from that space. More recently, Diffusion Models have emerged as state-of-the-art generative models, especially in image synthesis, enabling high-fidelity outputs with improved realism.

Generative AI has found applications in various industries, including healthcare, finance, entertainment, and robotics. In the medical field, it is particularly valuable for data
augmentation, anomaly detection, and automated medical report generation. Many medical
datasets are small or imbalanced due to the complexity and cost of data collection. Generative
models help bridge this gap by creating additional training samples, thereby improving the
robustness of machine learning models used in diagnosis and treatment planning. For example,
GANs have been used to generate synthetic MRI or CT scans, allowing AI models to learn
from a more diverse dataset while preserving patient privacy.


1.2 Medical image processing

Medical Image Processing is a specialized field of artificial intelligence and computer vision that focuses on analyzing, enhancing, and interpreting medical images for diagnosis,
treatment planning, and research. It plays a crucial role in modern healthcare, enabling doctors
and radiologists to detect diseases, segment anatomical structures, and classify abnormalities
with higher accuracy and efficiency. Medical imaging techniques such as X-ray, MRI
(Magnetic Resonance Imaging), CT (Computed Tomography), ultrasound, and histopathology
slides provide valuable insights into a patient’s health condition. However, interpreting these
images manually is time-consuming, subject to human error, and requires extensive expertise.
AI-driven image processing techniques aim to automate and augment this process, making it
faster, more reliable, and scalable.

The fundamental steps in medical image processing include image acquisition, preprocessing, feature extraction, segmentation, classification, and visualization.
Preprocessing involves tasks such as noise reduction, contrast enhancement, and normalization
to improve the quality of the images. Feature extraction techniques help in identifying key
patterns, such as tumor boundaries, organ structures, or abnormalities, which are then fed into
machine learning or deep learning models for classification. Segmentation is another crucial
aspect, where algorithms such as U-Net, Mask R-CNN, and level-set methods are used to
separate different regions in an image, such as a tumor from healthy tissue.
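
The short sketch below illustrates the preprocessing steps mentioned above (noise reduction, contrast enhancement, and normalization) using OpenCV. It is a minimal example rather than the exact pipeline used during the internship; the file name "scan.png" and the parameter values are placeholders.

    # Illustrative preprocessing sketch; "scan.png" and parameter values are assumptions.
    import cv2
    import numpy as np

    img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)           # image acquisition
    denoised = cv2.fastNlMeansDenoising(img, h=10)                # noise reduction
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(denoised)                              # contrast enhancement
    normalized = enhanced.astype(np.float32) / 255.0              # normalization to [0, 1]

The normalized array can then be fed to a feature-extraction or segmentation model in later stages of the pipeline.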

Deep learning models, especially Convolutional Neural Networks (CNNs), have significantly improved the accuracy of medical image analysis. CNN-based architectures, such
as ResNet, EfficientNet, and VGG, have been trained on large-scale medical datasets to detect
diseases like breast cancer, lung nodules, and brain tumors with expert-level accuracy. Apart
from classification, object detection models like Faster R-CNN and YOLO are used to locate
abnormalities in medical scans, while segmentation models like U-Net help in precise
localization of affected areas.

One of the biggest challenges in medical image processing is the availability of labeled
data, as annotating medical images requires skilled radiologists and pathologists. This is where
Generative AI is being integrated into medical image analysis. By generating synthetic medical
images, AI can augment datasets, allowing for better model training without the need for
additional manual annotations.


1.3 Objectives of internship

To understand and implement Generative AI models:


The goal was to explore and apply Generative AI techniques such as GANs and diffusion
models for medical image synthesis. These models were used to generate realistic synthetic
data to augment training datasets. The objective included learning model architecture,
training methods, and evaluating image quality.

To develop classification and segmentation models for medical imaging:


This involved building deep learning and machine learning models such as CNNs, U-Net, ResNet, and XGBoost for classifying and segmenting breast cancer images. The models were trained on preprocessed datasets and evaluated using metrics such as accuracy and IoU, with the aim of automating tumor detection tasks.

To explore hybrid AI techniques, including classical and quantum models:


The internship included experimentation with quantum-classical models using Qiskit and
EstimatorQNN. These hybrid approaches were compared against traditional models to
evaluate their performance in classification tasks. It provided exposure to emerging Quantum
AI concepts in healthcare.

To preprocess, annotate, and manage medical imaging datasets:


This objective involved working with structured (WDBC) and unstructured image datasets.
Tasks included data cleaning, normalization, label encoding, and generating segmentation
mask pairs. Proper dataset handling ensured high-quality inputs for model training and
testing.

To document and report the research progress and findings:


Throughout the internship, project progress, experimental results, and technical insights were
regularly documented. Reports, charts, and performance summaries were maintained to track
development. This helped improve scientific communication and structured reporting skills.


CHAPTER 2

COMPANY PROFILE
2.1 About the organization

Snipe Tech Pvt Ltd is a technology-driven company focused on delivering innovative solutions in the fields of Artificial Intelligence, Data Science, and Software Development.
Established with the mission to bridge the gap between research and real-world
implementation, the company offers end-to-end services in AI product development, cloud
computing, web application design, and data analytics.

The organization emphasizes emerging technologies such as Generative AI, Quantum Computing, Natural Language Processing (NLP), and Computer Vision to build advanced and
scalable solutions. Snipe Tech collaborates with startups, enterprises, and research institutions
to co-develop intelligent systems across domains like healthcare, finance, education, and
sustainability.

2.2 Nature of work

Snipe Tech operates primarily in the domain of AI research and custom software development,
offering consulting and product engineering services. The company engages in:

• Research and development of AI-powered applications

• Implementation of deep learning and machine learning pipelines

• Development of AI-based automation tools

• Building web and mobile platforms integrated with backend intelligence

• Experimentation with Generative AI, Quantum AI, and Edge AI technologies

During my internship, I was part of an AI research team working on the intersection of Generative AI and Medical Image Analysis, particularly focusing on breast cancer
classification and segmentation using hybrid models.

2.3 Departments involved

The core departments that drive Snipe Tech’s operations include:


• AI Research and Development (R&D): Focused on algorithm development, model training, generative modeling, and exploratory data analysis.

• Software Engineering: Responsible for developing the front-end and back-end of AI-
integrated applications and dashboards.

• Quality Assurance (QA): Ensures that developed models and applications are stable,
reliable, and accurate.

• Data Engineering: Manages data pipelines, preprocessing, and large-scale data handling
for training and inference.

• Product Management: Bridges communication between clients and development teams, defining project goals and deliverables.

2.4 Technologies used by the organization

Snipe Tech actively works with modern and cutting-edge technologies to ensure efficiency,
scalability, and innovation. Some of the key technologies and tools used include:

• Programming Languages: Python, JavaScript, TypeScript, SQL

• Machine Learning & AI: PyTorch, TensorFlow, Scikit-learn, XGBoost, Hugging Face
Transformers

• Generative AI: GANs, Diffusion Models, VAE, DALL·E, Stable Diffusion

• Medical Imaging: MONAI, OpenCV, Nibabel, DICOM/NIfTI processing tools

• Web & App Development: React.js, Node.js, Flask, FastAPI

• Databases: MySQL, MongoDB, PostgreSQL

• Deployment & DevOps: Docker, Firebase, GitHub Actions, AWS, GCP

The organization ensures that interns are provided with the opportunity to explore real-world
challenges, apply their theoretical knowledge, and contribute meaningfully to high-impact
projects.


CHAPTER 3

THEORETICAL BACKGROUND

3.1 Backpropagation:

Backpropagation is a fundamental algorithm used to train artificial neural networks. It is a supervised learning technique that efficiently computes the gradient of the loss function with
respect to the model’s weights. This gradient is then used to update the weights during training,
allowing the network to minimize the error between predicted and actual outputs.

The backpropagation algorithm works by propagating the error backward from the output layer
to the input layer. It involves two main phases:

1. Forward Pass: The input data is passed through the network, and an output is generated.
The loss (error) is then calculated using a loss function such as Mean Squared Error or
Cross-Entropy Loss.

2. Backward Pass: The gradients of the loss with respect to each weight are computed
using the chain rule of calculus. These gradients indicate how much each weight
contributed to the loss.

The computed gradients are used by an optimizer (such as SGD or Adam) to adjust the weights
in the direction that reduces the loss. Backpropagation is the backbone of training deep learning
models, including Convolutional Neural Networks (CNNs) used in image classification and
segmentation tasks.
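
The following minimal PyTorch sketch shows the two phases described above: a forward pass that produces predictions and a loss, and a backward pass that fills in gradients for every weight via the chain rule. The toy data and layer sizes are illustrative, not taken from the internship code.

    # Minimal sketch of forward and backward passes; data and sizes are toy values.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
    loss_fn = nn.MSELoss()

    x = torch.randn(16, 4)               # a batch of 16 samples with 4 features
    y = torch.randn(16, 1)               # target values

    pred = model(x)                      # forward pass: compute predictions
    loss = loss_fn(pred, y)              # loss (Mean Squared Error)

    loss.backward()                      # backward pass: chain rule fills .grad for each weight
    print(model[0].weight.grad.shape)    # gradients are now available to an optimizer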

3.2 Optimizers:

Optimizers are algorithms that update the weights and biases of a neural network to minimize
the loss function. They determine how fast and how accurately a model learns during training.
Different optimizers offer different trade-offs between speed, stability, and convergence
quality.

Common Types of Optimizers:

• Stochastic Gradient Descent (SGD): Updates weights using the gradient of the loss with
respect to a single batch. It's simple but may converge slowly.


• Momentum: An extension of SGD that adds a fraction of the previous update to the
current one to accelerate learning.

• RMSProp: Adapts the learning rate for each parameter individually using a moving
average of squared gradients.

• Adam (Adaptive Moment Estimation): Combines Momentum and RMSProp. It’s one
of the most widely used optimizers for training deep learning models due to its
efficiency and effectiveness.

In my internship, the Adam optimizer was used to train classification and segmentation models effectively, especially for optimizing complex loss functions in image-based tasks.
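
A minimal training-loop sketch with Adam is shown below. It uses a tiny synthetic classification problem purely for illustration; the network, data, and learning rate are assumptions rather than the internship's actual configuration.

    # Illustrative Adam training loop; model, data, and hyperparameters are toy values.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # adaptive per-parameter updates

    x = torch.randn(64, 10)                  # toy features
    y = torch.randint(0, 2, (64,))           # toy binary labels

    for epoch in range(5):
        optimizer.zero_grad()                # clear gradients from the previous step
        loss = loss_fn(model(x), y)
        loss.backward()                      # backpropagation computes gradients
        optimizer.step()                     # Adam applies the weight update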

3.3 Types of Generative models

Generative models learn the underlying distribution of input data and are capable of generating
new data instances that resemble the original data. They play a crucial role in fields such as
image synthesis, data augmentation, anomaly detection, and unsupervised learning.

1. Generative Adversarial Networks (GANs)

GANs consist of two neural networks—a generator and a discriminator—trained simultaneously in a competitive setting. The generator tries to create realistic data, while the
discriminator attempts to distinguish between real and generated samples. Over time, the
generator learns to produce highly realistic outputs. GANs are widely used in medical image
synthesis and augmentation.

2. Variational Autoencoders (VAEs)

VAEs are probabilistic models that encode input data into a latent space and then decode it
back to reconstruct the original input. Unlike standard autoencoders, VAEs introduce a
probabilistic component that allows them to generate new, varied outputs. VAEs are useful in
generating medical images with controlled features (e.g., tumor size, shape).
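
The sketch below illustrates the probabilistic component that distinguishes a VAE from a standard autoencoder: the encoder outputs a mean and log-variance, and a latent sample is drawn with the reparameterization trick so the sampling step remains differentiable. Layer sizes and input dimensions are illustrative assumptions.

    # Sketch of a VAE encode/sample/decode flow; sizes are illustrative.
    import torch
    import torch.nn as nn

    class TinyVAE(nn.Module):
        def __init__(self, in_dim=784, latent_dim=16):
            super().__init__()
            self.enc = nn.Linear(in_dim, 128)
            self.mu = nn.Linear(128, latent_dim)        # mean of the latent distribution
            self.logvar = nn.Linear(128, latent_dim)    # log-variance of the latent distribution
            self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))

        def forward(self, x):
            h = torch.relu(self.enc(x))
            mu, logvar = self.mu(h), self.logvar(h)
            eps = torch.randn_like(mu)                   # random noise
            z = mu + eps * torch.exp(0.5 * logvar)       # reparameterization: z ~ N(mu, sigma^2)
            return self.dec(z), mu, logvar

    recon, mu, logvar = TinyVAE()(torch.randn(4, 784))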

3. Diffusion Models

These are relatively newer models that iteratively add noise to data and then learn to reverse
the process, generating new data samples from pure noise. They have achieved state-of-the-art
results in image generation tasks (e.g., DALL·E 2, Stable Diffusion). In healthcare, they can
be used for high-resolution image generation and de-noising.
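
The forward (noising) half of a diffusion model can be written in a few lines, as sketched below: data is progressively corrupted with Gaussian noise according to a schedule, and a separate network (not shown) is trained to reverse the process. The linear schedule, number of steps, and tensor shapes are assumptions for illustration only.

    # Sketch of the forward diffusion (noising) process; schedule values are illustrative.
    import torch

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)              # noise schedule
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)      # cumulative product of (1 - beta_t)

    def add_noise(x0, t):
        # Sample x_t directly from x_0 using the closed-form forward process.
        eps = torch.randn_like(x0)
        return torch.sqrt(alpha_bar[t]) * x0 + torch.sqrt(1.0 - alpha_bar[t]) * eps, eps

    x0 = torch.randn(1, 1, 64, 64)                     # a toy "image"
    x_t, eps = add_noise(x0, t=500)                    # heavily noised sample at step 500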


3.4 AI techniques in medical imaging:

Artificial Intelligence (AI) has become a powerful tool in medical imaging, enabling automated
analysis, improved diagnosis, and efficient clinical decision-making. With the increasing
availability of imaging data from modalities such as MRI, CT, X-ray, PET, and histopathology,
AI techniques—particularly machine learning and deep learning—have been extensively
applied to extract meaningful insights from medical images.

Image Classification

Image classification involves assigning a label to a whole image based on its content, such as
determining whether a tumor is benign or malignant. Deep learning models, especially
Convolutional Neural Networks (CNNs), are widely used for this task due to their ability to
learn spatial hierarchies from pixel data.

• Use Case: Breast cancer classification, skin lesion detection

• Models Used: CNN, ResNet, EfficientNet, XGBoost (on extracted features)

Image Segmentation

Segmentation is the process of dividing an image into regions or segments, often used to isolate
organs, lesions, or tumors in medical scans. Semantic segmentation assigns a class label to each
pixel, while instance segmentation distinguishes between multiple objects of the same class.

• Use Case: Tumor boundary detection, organ segmentation in MRI

• Models Used: U-Net, SegNet, Mask R-CNN, DeepLabV3

Image Reconstruction and Enhancement

AI is used to enhance low-quality or noisy images, reconstruct missing data, and improve the
resolution of scans. These techniques are especially useful in reducing scan time and radiation
exposure while maintaining image quality.

• Use Case: Super-resolution MRI, de-noising low-dose CT

• Models Used: GANs, Autoencoders, Denoising CNNs


CHAPTER 4

MODELS
4.1 Convolutional Neural Networks (CNN):

Convolutional Neural Networks (CNNs) are a class of deep learning models specifically
designed to process grid-like data such as images. They are composed of multiple layers,
including convolutional layers, pooling layers, and fully connected layers. The convolution
operation allows the network to extract spatial features such as edges, textures, and patterns
from the image, making CNNs highly effective in image classification tasks.

In medical imaging, CNNs are widely used to classify X-rays, MRI scans, and
histopathological images. For example, in breast cancer detection, CNNs can be trained to
distinguish between benign and malignant tumors by learning discriminative features from
labeled datasets. Their ability to learn hierarchical representations makes them ideal for
diagnostic applications where fine-grained image features are critical.

Figure 4.1: CNN Architecture
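
A small illustrative CNN classifier is sketched below, showing how convolution and pooling layers extract spatial features before a fully connected layer assigns one of two labels (e.g., benign or malignant). The layer sizes, grayscale input, and 64x64 resolution are assumptions, not the architecture used in the internship.

    # Minimal CNN classifier sketch; layer sizes and input resolution are assumptions.
    import torch
    import torch.nn as nn

    class SmallCNN(nn.Module):
        def __init__(self, num_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            )
            self.classifier = nn.Linear(32 * 16 * 16, num_classes)

        def forward(self, x):
            x = self.features(x)                  # convolution + pooling extract spatial features
            return self.classifier(x.flatten(1))  # fully connected layer produces class logits

    logits = SmallCNN()(torch.randn(8, 1, 64, 64))   # batch of 8 grayscale 64x64 images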

4.2 Generative Adversarial Networks (GANs):

Generative Adversarial Networks (GANs) are a class of generative models consisting of two
neural networks — a generator and a discriminator — engaged in a competitive game. The
generator attempts to create realistic synthetic data, while the discriminator tries to distinguish
between real and fake samples. Through this adversarial process, the generator gradually
improves its ability to produce convincing outputs.


GANs have found significant applications in medical imaging, such as data augmentation,
domain translation (e.g., MRI to CT), image enhancement, and anonymization. In this
internship, GANs were used to generate synthetic breast cancer images to augment training
data and improve model generalization. This was especially useful in addressing class
imbalance and increasing the diversity of samples for better training of classification and
segmentation models.

Figure 4.2: GAN Architecture
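
The adversarial training loop described above can be sketched as follows. The tiny fully connected generator and discriminator, the random stand-in for "real" images, and the hyperparameters are placeholders for illustration; they are not the image models used during the internship.

    # Hedged sketch of a GAN training loop; networks and data are toy placeholders.
    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())     # generator
    D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))         # discriminator
    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

    for step in range(100):
        real = torch.randn(32, 784)                  # stand-in for a batch of real images
        fake = G(torch.randn(32, 64))                # generator maps noise to synthetic samples

        # Discriminator learns to label real samples 1 and fake samples 0.
        d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # Generator learns to fool the discriminator into labeling fakes as 1.
        g_loss = bce(D(fake), torch.ones(32, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()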

4.3 U-Net:

U-Net is a convolutional neural network architecture developed specifically for biomedical image segmentation. It follows an encoder-decoder structure with skip connections that help
retain spatial information during the decoding process. The encoder compresses the input
image into a latent representation, while the decoder reconstructs a pixel-wise segmented
output.

U-Net is especially effective when dealing with limited training data and is widely used for
tasks such as tumor boundary detection, organ segmentation, and lesion masking. In this
internship, U-Net was applied to perform semantic segmentation of breast cancer regions in
histopathology images, producing binary masks to highlight tumor areas. Its architecture
ensures accurate localization while maintaining contextual understanding.


Figure 4.3: U-Net Architecture
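
A simplified U-Net-style module with a single resolution level is sketched below to illustrate the encoder-decoder structure and the skip connection that preserves spatial detail; a real U-Net stacks several such levels. The channel counts and input size are illustrative assumptions.

    # Simplified U-Net-style sketch with one skip connection; sizes are illustrative.
    import torch
    import torch.nn as nn

    class MiniUNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
            self.down = nn.MaxPool2d(2)
            self.bottleneck = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
            self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
            self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(16, 1, 1))           # 1-channel mask logits

        def forward(self, x):
            e = self.enc(x)                              # encoder features
            b = self.bottleneck(self.down(e))            # compressed latent representation
            u = self.up(b)                               # upsample back to input resolution
            return self.dec(torch.cat([u, e], dim=1))    # skip connection preserves detail

    mask_logits = MiniUNet()(torch.randn(2, 1, 64, 64))  # per-pixel tumor/background logits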

4.4 Residual Network (ResNet):

ResNet, short for Residual Network, introduced the concept of skip connections or residual
connections that allow gradients to flow through a deep network without vanishing. Traditional
deep networks suffer from performance degradation as the depth increases, but ResNet
addresses this by learning residual mappings instead of direct mappings.

ResNet is highly suitable for image classification in medical imaging due to its depth,
robustness, and transfer learning capabilities. Pretrained ResNet variants such as ResNet-50 or
ResNet-101 can be fine-tuned on medical datasets, reducing the need for large training data.
During the internship, ResNet was explored for improving classification accuracy in breast
cancer detection by capturing high-level semantic features.

Figure 4.4: ResNet architecture
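
The transfer-learning workflow described above can be sketched with torchvision as follows: load a pretrained ResNet-50, freeze the backbone, and replace the final fully connected layer for a two-class task. The weights enum, class count, and learning rate are assumptions; the internship's exact fine-tuning setup may differ.

    # Sketch of fine-tuning a pretrained ResNet-50; configuration values are assumptions.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)  # transfer learning
    for p in model.parameters():
        p.requires_grad = False                       # freeze the pretrained backbone

    model.fc = nn.Linear(model.fc.in_features, 2)     # new head for benign/malignant classes
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)

    logits = model(torch.randn(4, 3, 224, 224))       # batch of 4 RGB images at 224x224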
