A main project thesis submitted in partial fulfilment of the requirements for the award of the
degree for the VIII semester.
BACHELOR OF TECHNOLOGY
CERTIFICATE
This is to certify that the main project entitled “AI-Powered Interview Assistant” being submitted by
in partial fulfilment for the award of the degree “Bachelor of Technology” in Computer Science and
Engineering to the Jawaharlal Nehru Technological University, Kakinada, is a record of bona fide work
done under my guidance and supervision during the VIII semester of the academic year 2022-2023.
The results embodied in this record have not been submitted to any other university or institution
for the award of any degree or diploma.
DECLARATION
We hereby declare that this project entitled “AI-Powered Interview Assistant” is a
bona fide work done by us and submitted to the Department of Computer Science and Engineering,
G. V. P. College of Engineering (Autonomous), Visakhapatnam, in partial fulfilment for the award
of the degree of B.Tech. It is our own work and has not been submitted to any other university or
published anywhere before.
ACKNOWLEDGEMENT
We would like to express our deep sense of gratitude to our esteemed institute, Gayatri
Vidya Parishad College of Engineering (Autonomous), which has provided us with an opportunity
to fulfil our cherished desire.
We express our sincere thanks to our principal, Dr. A. B. KOTESWARA RAO, Gayatri
Vidya Parishad College of Engineering (Autonomous), for his encouragement during this
project and for giving us a chance to explore and learn new technologies in the form of mini projects.
We express our profound gratitude and deep indebtedness to our guide, Dr. R.
Seeta Sireesha, Associate Professor, Department of Computer Science and Engineering,
whose valuable suggestions, guidance and comprehensive assessments helped us greatly in
realizing our project.
We also thank our coordinator, Dr. CH. SITA KUMARI, Associate Professor,
Department of Computer Science and Engineering, for the kind suggestions and guidance for
the successful completion of our project work.
ABSTRACT
In today's rapidly evolving educational landscape, where students pursuing technical degrees are
constantly balancing coursework, internships, and part-time jobs, the allocation of time for crucial skill
assessments becomes increasingly challenging. As graduation looms, the absence of adequate
interview preparation can instill fear and undermine confidence levels.
Inflexible schedules and fixed interview timelines make it difficult for students to connect theoretical
knowledge with the practical skills demanded by the competitive tech industry. It is crucial for
students to continuously adapt and acquire advanced skills to stay relevant in the face of rapid
technological advancement.
While existing skill-development models contribute to learning, they often fall short of providing
personalized one-on-one interview experiences, leaving students without adequate guidance to excel.
Assessing interview performance and identifying areas for improvement remains equally challenging.
Our proposed approach leverages the OpenAI API and LangChain to revolutionize interview
preparation by generating tailored questions from user resumes, enhancing the learning experience.
Streamlit facilitates seamless interaction, while OpenAI integration enhances simulation sophistication,
bridging the gap between theory and practice. This comprehensive solution represents a paradigm
shift, empowering students to excel confidently in job interviews.
1. INTRODUCTION
In the realm of career advancement, meticulous interview preparation emerges as a cornerstone of
success. This innovative application, powered by cutting-edge technologies such as Streamlit,
LangChain, and OpenAI, epitomizes a sophisticated solution tailored to meet the evolving needs of
today's job seekers. By seamlessly integrating advanced language processing mechanisms, it offers a
refined approach to interview readiness, empowering individuals with personalized insights and
guidance.
At its essence, the application embodies efficiency and efficacy, leveraging the prowess of
Streamlit for intuitive user interaction and LangChain for seamless text processing. Through the lens of
OpenAI's language models, it navigates the complexities of resume parsing, swiftly distilling pertinent
information and crafting tailored interview questions. This symbiotic integration of technology not only
optimizes the preparation journey but also ensures that candidates are equipped with a comprehensive
understanding of the topics they may encounter during interviews.
Furthermore, the project objective transcends mere question generation; it aspires to foster a
culture of continuous improvement and empowerment. By providing audio recording and
speech-to-text capabilities, it enables candidates to articulate their responses with clarity and
precision. Through iterative analysis and feedback loops, the application empowers individuals to refine
their communication skills, ultimately enhancing their confidence and competitiveness in the job
market.
1.1 OBJECTIVE
The primary objective of this application is to revolutionize the interview preparation process, offering
a holistic solution that combines technological innovation with strategic foresight. Through the
seamless integration of Streamlit, LangChain, and OpenAI, it endeavors to provide individuals with a
tailored and immersive experience that transcends traditional methods of preparation.
Firstly, the application aims to streamline the preparation journey by leveraging Streamlit's
interactive interface, ensuring user-friendly navigation and engagement. Additionally, through
LangChain's text processing capabilities, it seeks to automate the extraction of relevant information
from resumes, facilitating the generation of personalized interview questions that align with the
candidate's skills and experiences.
Moreover, the application aspires to enhance candidates' communication proficiency through the
integration of audio recording functionality. By converting speech to text with the Google Speech
Recognition API, it enables individuals to articulate their responses verbally, fostering a more dynamic
and immersive preparation experience. Through iterative analysis and feedback mechanisms, the
application empowers candidates to refine their responses and elevate their interview performance.
Overall, the objective of this application is to empower individuals with the tools and insights
necessary to confidently navigate the interview process and secure their desired career opportunities.
By harnessing the combined capabilities of Streamlit, LangChain, and OpenAI, it aims to redefine the
paradigm of interview preparation, setting a new standard for efficiency, effectiveness, and
empowerment.
In our algorithm, we aim to develop a robust system for efficient document retrieval and processing,
leveraging advanced techniques such as document loaders, text splitting, embedding models, vector
stores, retrievers, and indexing. This algorithmic framework is crucial for enabling streamlined access
to information, enhancing search capabilities, and facilitating seamless integration with user interfaces.
Document Loaders:
Document loaders act as the primary entry point for bringing data into our system. They provide the
initial step in the data ingestion process, facilitating the seamless integration of textual content from
various sources.
Text Loader:
The Text Loader component serves as a foundational element in our system, responsible for sourcing
textual documents from various data repositories. By seamlessly interfacing with diverse sources
including local files and cloud-based storage solutions, Text Loader ensures the reliable acquisition of
data essential for subsequent processing and analysis.
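As an illustration, the following minimal sketch shows how a LangChain TextLoader might bring a
local file into the pipeline. The file name "resume.txt" is an assumed placeholder, and the import
path follows the classic LangChain package layout used throughout this report.

from langchain.document_loaders import TextLoader

# Load a local text file into LangChain Document objects
# ("resume.txt" is an illustrative placeholder path)
loader = TextLoader("resume.txt", encoding="utf-8")
documents = loader.load()

print(documents[0].page_content[:200])  # preview the extracted text
print(documents[0].metadata)            # e.g. {'source': 'resume.txt'}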
Text Splitters:
Text Splitter efficiently breaks down large documents into manageable chunks, enhancing processing
efficiency and enabling targeted analysis.
Coherent Chunking: advanced algorithms ensure that text chunks maintain coherence and relevance,
preserving the contextual integrity of the original document.
Optimized Processing: by segmenting text into smaller units, Text Splitter optimizes subsequent
retrieval and analysis processes, facilitating faster and more accurate information extraction.
At the core of our data preprocessing pipeline, the Character Text Splitter module plays a pivotal role
in segmenting large textual documents into manageable fragments. Utilizing sophisticated character-
based splitting algorithms, this component optimizes data processing efficiency and enhances retrieval
performance by isolating relevant sections of text.
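A minimal sketch of character-based splitting is given below; the chunk size and overlap values are
illustrative, and raw_text stands in for the text of a loaded document.

from langchain.text_splitter import CharacterTextSplitter

# raw_text stands in for the text of a loaded resume
raw_text = "Skills: Python, SQL.\nProjects: resume parser.\n" * 100

splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=1000,   # maximum characters per chunk
    chunk_overlap=200, # shared context between neighbouring chunks
    length_function=len,
)
chunks = splitter.split_text(raw_text)
print(f"{len(chunks)} chunks produced")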
Vector Database:
In the ever-evolving landscape of artificial intelligence, vector databases stand as pivotal solutions,
indexing and storing vector embeddings to enable swift retrieval and similarity searches. As we
navigate through the AI revolution, these databases emerge as indispensable tools, addressing the
escalating complexity and scale of modern data processing. By harnessing the semantic richness
embedded within vector representations, they empower applications reliant on large language models
and generative AI, facilitating efficient knowledge retrieval and long-term memory maintenance.
Through seamless integration with embedding models, these databases augment AI capabilities,
facilitating tasks such as semantic information retrieval with unparalleled efficiency. Thus, they play a
pivotal role in enhancing the effectiveness of AI-driven applications, embodying the synergy between
advanced data management and transformative AI innovation.
FAISS:
FAISS (Facebook AI Similarity Search) is a cutting-edge library designed for efficient similarity
search and clustering of high-dimensional vector data. Developed by Facebook AI Research, FAISS
offers optimized algorithms tailored for large-scale datasets encountered in AI applications. Its
advanced indexing techniques, such as Product Quantization (PQ) and Hierarchical Navigable Small
World (HNSW), ensure rapid and accurate nearest neighbor search operations.
FAISS supports essential index-management operations such as adding, searching, and (for some index
types) removing vectors, and it can be paired with external metadata stores for filtering. FAISS indexes
can also be sharded at the application level across multiple machines for enhanced performance and
scalability. As a cornerstone technology, FAISS empowers AI systems with swift and precise retrieval
of semantic information.
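The sketch below shows one way the project's FAISS store could be built and queried through
LangChain's wrapper, assuming an OPENAI_API_KEY is set in the environment; the chunk strings are
illustrative stand-ins for splitter output.

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Illustrative text chunks; in the application these come from the splitter
chunks = [
    "Skills: Python, LangChain, Streamlit.",
    "Project: resume-based interview question generator.",
]

embeddings = OpenAIEmbeddings()  # requires OPENAI_API_KEY in the environment
vectorstore = FAISS.from_texts(chunks, embedding=embeddings)

# Nearest-neighbour search over the embedded chunks
hits = vectorstore.similarity_search("machine learning projects", k=1)
print(hits[0].page_content)

vectorstore.save_local("faiss_store_openai")  # persist the index to disk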
Retrieval:
Retrieval mechanisms orchestrate the process of fetching relevant data based on user queries, bridging
the gap between raw data and actionable insights. The RetrievalQAWithSourcesChain leverages
sophisticated algorithms to identify and retrieve pertinent information, taking into account multiple
data sources and query types. By employing techniques such as semantic search and ensemble
retrieval, it enhances the precision and comprehensiveness of search results, empowering users with
actionable knowledge.
The RetrievalQAWithSourcesChain module represents the pinnacle of our system's retrieval
capabilities. Incorporating advanced algorithms, this component enables users to pose complex queries
and retrieve relevant documents with exceptional efficiency. By integrating multiple data sources and
leveraging semantic understanding, RetrievalQAWithSourcesChain empowers users to extract
actionable insights from vast repositories of textual data with unparalleled accuracy and speed.
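A minimal sketch of wiring a vector store into this chain is given below, assuming vectorstore is the
FAISS store built in the previous sketch; the question text is illustrative.

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.llms import OpenAI

# Connect the FAISS retriever to a question-answering chain with sources
# (vectorstore is assumed from the FAISS sketch above)
chain = RetrievalQAWithSourcesChain.from_llm(
    llm=OpenAI(temperature=0),
    retriever=vectorstore.as_retriever(),
)
result = chain({"question": "What are the candidate's main technical skills?"})
print(result["answer"])
print(result["sources"])  # the documents the answer was drawn from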
Fig-4 Retrieval
Streamlit UI:
The Streamlit UI component serves as the user-facing interface of our system, providing intuitive
access to its functionalities. Designed for simplicity and ease of use, Streamlit UI enables users to
explore, query, and visualize data effortlessly. By offering a seamless and interactive experience, the
UI enhances user engagement and ensures efficient utilization of our system's capabilities across
diverse applications and use cases.
Built upon Streamlit's framework, the UI offers a user-friendly experience, enabling effortless access
to various functionalities and insights. Concurrently, project coding encompasses the implementation
of underlying algorithms and logic, ensuring the robustness and functionality of our system. Through
meticulous coding practices and adherence to best practices, we uphold the integrity and reliability of
our solution.
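To illustrate, a stripped-down sketch of such an interface follows; the widget labels are illustrative,
and the question-generation step is a placeholder for the pipeline described above. Saved as app.py,
it runs with: streamlit run app.py

import streamlit as st

st.title("AI-Powered Interview Assistant")

# File uploader widget for the candidate's resume
pdf = st.file_uploader("Upload your resume", type="pdf")
if pdf is not None:
    st.write(f"Loaded: {pdf.name}")
    if st.button("Generate questions"):
        # Placeholder for the retrieval and question-generation pipeline
        st.write("Generating tailored interview questions...")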
1.3 PURPOSE
The purpose of the provided code and application is to streamline and enhance the interview
preparation process for job seekers. By leveraging advanced technologies such as Streamlit,
LangChain, and OpenAI, the application offers a sophisticated platform for generating personalized
technical interview questions based on the content of uploaded resumes.
Through seamless integration with document loaders and text splitters, the application
efficiently extracts relevant information from resumes, ensuring that generated questions are tailored to
each candidate's unique skills and experiences.
Overall, the code and application aim to revolutionize interview preparation by providing a
user-friendly interface, intelligent question generation capabilities, and interactive features for audio-
based responses.
At its core, the application seeks to empower individuals with a strategic advantage in their
career pursuits. Through intelligent question generation and personalized feedback mechanisms, it
fosters a deeper understanding of one's strengths and areas for improvement, enabling candidates to
showcase their capabilities with confidence and precision during interviews.
1.4 SCOPE
The scope of our project encompasses the development of a comprehensive platform tailored to
streamline the interview process through the integration of advanced AI technologies. By leveraging
natural language processing and machine learning algorithms, our application aims to analyze
candidate resumes, generate personalized technical questions, and facilitate efficient evaluation of their
skills and experiences. With a focus on enhancing both the candidate and recruiter experience, our
platform seeks to revolutionize traditional hiring practices by providing a data-driven approach to
talent assessment and selection. Here are some potential areas of focus (a condensed sketch of how
they fit together follows the list):
Document Loaders: Retrieve documents from diverse sources including private S3 buckets and
public websites. Integrates with major providers such as AirByte and Unstructured.
Text Splitters: Segment large documents into manageable chunks using specialized algorithms for
different document types like code, markdown, etc.
Embedding Models: Generate embeddings to capture semantic meaning of text, offering over 25
integrations with diverse providers from open source to proprietary API models.
Vector Stores: Facilitate efficient storage and retrieval of embeddings with integrations with over
50 vector stores, ranging from open-source local options to cloud-hosted proprietary solutions.
Retrievers: Retrieve relevant documents from the database using various algorithms including
basic semantic search and advanced techniques like Parent Document Retriever, Self Query
Retriever, and Ensemble Retriever.
Indexing: Sync data from any source into a vector store for efficient retrieval, preventing
duplication, re-computation of unchanged content, and minimizing resource utilization while
improving search results.
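The condensed sketch below shows how these components might chain together end to end, assuming a
local "resume.txt" file and an OPENAI_API_KEY in the environment; names follow the classic
LangChain API used elsewhere in this report.

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

docs = TextLoader("resume.txt").load()                      # 1. load
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)                     # 2. split
store = FAISS.from_documents(chunks, OpenAIEmbeddings())    # 3. embed + index
retriever = store.as_retriever(search_kwargs={"k": 3})      # 4. retrieve
for doc in retriever.get_relevant_documents("technical skills and projects"):
    print(doc.page_content[:80])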
2. SRS DOCUMENT
A software requirements specification (SRS) is a document that describes what the software will do and
how it will be expected to perform.
• The system will seamlessly load resumes/documents from various sources, including local
files and web URLs. It will analyze document content to identify technical skills and project
experiences, generating personalized questions accordingly.
• The system will offer a user-friendly interface for candidates to interact with generated
questions, supporting both audio recording and text input. It will analyze candidate responses
using natural language processing techniques, evaluating relevance, coherence, and depth of
knowledge, and providing feedback to candidates and recruiters accordingly.
• Usability : The system should offer a simple, accessible interface that supports
candidates of diverse technical backgrounds throughout the interview process.
• Efficiency : Once users have learned about the system through interaction, they can perform
tasks easily.
• Understandability : Because of its user-friendly interfaces, the system is easy for users to understand.
3. ANALYSIS
3.1 EXISTING SYSTEMS
• OpenAI's high API usage costs and ethical concerns regarding biases in question generation
may hinder its suitability for large-scale interview preparation. Similarly, platforms such as
Gemini focus on scheduling and may lack customization, contrasting with the project's
objectives.
• Hugging Face's models might require complex integration and lack specialized capabilities,
while Brad.ai's focus on coaching may not align with automated question generation goals;
1:1 mock interviews also lack the scalability of automated systems.
3.2 PROPOSED SYSTEM
In our proposed system, tailored questions are generated from the candidate's resume, and the
inclusion of audio recording functionality enables dynamic responses to these
questions, fostering immersive preparation sessions. With intuitive user interface design powered by
Streamlit, our application aims to elevate interview readiness to new heights, empowering candidates
with confidence and proficiency.
For training our interview question generation model, we employ a combination of advanced
techniques:
• Contextual Analysis: Utilizing LangChain's OpenAI API, we capture nuanced patterns within
resume content to generate contextually relevant questions.
• Semantic Understanding: Leveraging LangChain's language processing capabilities, we assess
the semantic relevance of questions generated from resume data.
• Fluency Optimization: Fine-tuning OpenAI's GPT models ensures the fluency and coherence
of interview questions, enhancing their natural language generation capabilities.
• Personalization Strategies: Implementing LangChain's adaptive algorithms, we tailor question
generation based on individual user feedback and preferences.
• Interactive Learning: Integrating user interaction mechanisms, we employ ensemble learning
approaches to refine question generation processes, incorporating user input for continual
enhancement.
• Iterative Improvement: Through iterative training and model refinement using LangChain's
resources, we continuously optimize the question generation process, ensuring the highest quality output.
The advantages of the proposed system include:
• It is very time-saving
• Dynamic Question Generation
• Accurate results
• Automated Resume Parsing
• User- friendly graphical interface
• Highly reliable
• Cost effective
3.3 FEASIBILITY STUDY
The main aim of the feasibility study activity is to determine whether the product is technically and
financially feasible to develop. A feasibility study should provide management with enough
information to decide whether the software meets organizational requirements. There are various
types of feasibility that can be determined:
• Operational : Defines the urgency of the problem and the acceptability of any solution. It
includes people-oriented and social issues: internal issues, such as manpower problems, labor
objections, manager resistance, organizational conflicts, and policies; also, external issues,
including social acceptability, legal aspects, and government regulations.
• Technical : Is the solution feasible within the limits of current technology? Does the technology
exist at all? Is it available within the given resources?
• Economic : Is the project possible, given resource constraints? Are the benefits that will
accrue from the new system worth the costs? What are the savings that will result from the
system, including tangible and intangible ones? What are the development and operational
costs?
• Schedule : Constraints on the project schedule and whether they can reasonably be met.
ECONOMIC FEASIBILITY:
Economic analysis could also be referred to as cost/benefit analysis. It is the most frequently used
method for evaluating the effectiveness of a new system. In economic analysis the procedure is to
determine the benefits and savings that are expected from a candidate system and compare them with
costs. The economic feasibility study relates to the price and all kinds of expenditure associated with
the scheme before the project starts. This study also improves project reliability. It is also helpful for
the decision-makers to decide whether to proceed with the planned scheme now or later, depending on
the financial condition of
the organization. This evaluation process also studies the price benefits of the proposed scheme.
Economic Feasibility also performs the following tasks.
• Cost of packaged software.
• Cost of doing a full system study.
• Is the system cost-effective?
TECHNICAL FEASIBILITY:
A large part of determining resources has to do with assessing technical feasibility. It considers the
technical requirements of the proposed project. The technical requirements are then compared to the
technical capability of the organization. The systems project is considered technically feasible if the
internal technical capability is sufficient to support the project requirements. The analyst must find out
whether current technical resources are sufficient for the project or can be upgraded; this is where the
expertise of system analysts is beneficial, since, using their own experience and their contacts with
vendors, they will be able to answer the question of technical feasibility.
OPERATIONAL FEASIBILITY:
Operational feasibility is a measure of how well a proposed system solves the problems and takes
advantage of the opportunities identified during scope definition and how it satisfies the requirements
identified in the requirements analysis phase of system development. Operational feasibility refers
to the availability of the operational resources needed to extend results beyond the environment in
which they were developed, where the operational requirements are minimal and easily accommodated.
In addition, operational feasibility includes any rational compromises users make in adjusting the
technology to the limited operational resources available to them. Operational feasibility also
addresses questions such as the following:
• Does the current mode of operation provide adequate response time?
• Does the current mode of operation make maximum use of resources?
• Is the solution suggested by the software development team acceptable?
• Does the operation offer an effective way to control the data?
• Our project runs on a standard processor, and the installed packages are supported by the system.
4. SOFTWARE DESCRIPTION
4.2 LangChain
LangChain is an open-source framework for developing applications powered by large language
models (LLMs). It provides modular building blocks (document loaders, text splitters, embedding
models, vector stores, retrievers, and chains) that connect LLMs to external data sources and
orchestrate multi-step workflows. By composing these components, developers can build context-aware
applications such as question answering over documents, chatbots, and agents. In this project,
LangChain handles resume loading, chunking, embedding, retrieval, and the question-generation and
answer-analysis chains.
4.3 Python
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it
very attractive for Rapid Application Development, as well as for use as a scripting or glue language to
connect existing components together. Python's simple, easy to learn syntax emphasizes readability
and therefore reduces the cost of program maintenance. Python supports modules and packages, which
encourages program modularity and code reuse. The Python interpreter and the extensive standard
library are available in source or binary form without charge for all major platforms, and can be freely
distributed.
4.4 OpenAI
OpenAI stands as a pioneering research organization at the forefront of artificial intelligence
development, dedicated to advancing the boundaries of AI technology and its accessibility. Renowned
for its groundbreaking research and innovative solutions, OpenAI strives to democratize AI through its
APIs, tools, and research findings, empowering developers, businesses, and researchers worldwide.
With a focus on responsible AI deployment, OpenAI fosters collaborations, conducts cutting-edge
research, and promotes ethical AI practices. Its contributions span various domains, from natural
language processing and computer vision to reinforcement learning and robotics. Through its
commitment to transparency and collaboration, OpenAI continues to shape the future of AI, driving
impactful advancements that benefit society as a whole.
4.5 PyCharm
PyCharm stands as a premier integrated development environment (IDE) meticulously crafted for
Python programming, renowned for its robust features and user-friendly interface. Developed by
JetBrains, PyCharm offers a comprehensive suite of tools designed to enhance the productivity and
efficiency of Python developers. Its intelligent code completion, advanced debugging capabilities, and
seamless integration with version control systems streamline the development workflow. PyCharm
provides support for various Python frameworks and libraries, facilitating the creation of diverse
applications ranging from web development to data analysis and machine learning. With its extensive
plugin ecosystem and customizable settings, PyCharm caters to the unique needs of developers,
enabling them to build high-quality software with ease. Whether working on personal projects or large-
scale enterprise applications, PyCharm remains a preferred choice for Python developers seeking a
feature-rich and intuitive development environment.
4.6 Streamlit
Streamlit is a Python library that simplifies the creation of interactive web applications for data science
and machine learning projects. It offers a straightforward and intuitive way to build user-friendly
interfaces without the need for extensive web development experience. With Streamlit, developers can
seamlessly integrate data visualizations, input widgets, and text elements to create dynamic
applications that enable users to explore and interact with data in real-time. Its declarative syntax and
automatic widget rendering make prototyping and deploying applications quick and efficient.
Streamlit's seamless integration with popular data science libraries like Pandas, Matplotlib, and
TensorFlow further enhances its capabilities, allowing developers to leverage their existing knowledge
and tools. Overall, Streamlit empowers data scientists and machine learning engineers to share
insights, prototypes, and models with stakeholders effectively, accelerating the development and
deployment of data-driven applications.
5. PROBLEM DESCRIPTION
We used OpenAI and LangChain to bridge the gap between theoretical knowledge and
practical skills. The main objective is to develop a reliable system for interview preparation that
provides accurate results.
We employed LangChain and OpenAI to analyze the content of uploaded PDF resumes and generate
personalized interview questions. The system adapts its approach based on the type of input provided,
whether it's textual information from resumes or audio recordings of user responses. This
comprehensive solution empowers users to practice interview scenarios effectively, bridging the gap
between theoretical knowledge and practical skills required in the competitive tech industry.
Fig-5 Overview
The steps involved in the project are described below.
The output of our project is a user-friendly interface where technical students can upload their
resumes; the final result is a score between 0 and 100 for each question, along with the
areas for improvement.
Python Streamlit is a powerful open-source framework that allows developers to create interactive web
applications with ease. With its simple and intuitive syntax, developers can quickly build data-driven
applications using Python code. Streamlit provides various components and widgets for creating
interactive elements such as buttons, sliders, and charts, making it ideal for building user-friendly
interfaces.
Upon PDF upload and question generation, the interface prompts users to commence answering
questions sequentially. Each question triggers the initiation of audio recording upon user interaction
with a designated button, leveraging the device's microphone. Recorded audio is subsequently
transcribed into text format using the Google Speech Recognition API. Captured responses are stored
in session state variables for subsequent analysis. Upon completion of all questions, the application
proceeds to analyze user responses. A formulated query facilitates scrutiny of questions and
corresponding answers, yielding scores and areas for improvement for each question-answer pair.
LangChain's question-answering capabilities process the query, presenting findings to the user via the
application interface.
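A minimal sketch of this record-and-transcribe step is given below; the sample rate mirrors the
application's value, a shorter duration is used for illustration, and the calls follow the sounddevice,
soundfile, and SpeechRecognition APIs.

import tempfile
import sounddevice as sd
import soundfile as sf
import speech_recognition as sr

fs, duration = 44100, 10  # sample rate (Hz) and recording length (seconds)

# Record from the default microphone and save to a temporary WAV file
audio = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype="int16")
sd.wait()
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
    sf.write(tmp.name, audio, fs)
    wav_path = tmp.name

# Transcribe the recording with the Google Speech Recognition API
recognizer = sr.Recognizer()
with sr.AudioFile(wav_path) as source:
    text = recognizer.recognize_google(recognizer.record(source))
print(text)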
Fig- Module control flow
In the project, LangChain's LLM (Large Language Model) interface plays a crucial role in generating
tailored interview questions based on the content of uploaded resumes. Leveraging advanced natural
language processing techniques, the LLM comprehensively analyzes the textual data to identify
relevant skills and experiences. It then formulates personalized questions to simulate real-world
interview scenarios. Additionally, the LLM evaluates user responses, providing constructive feedback
and areas for improvement. By harnessing the power of the LLM, the project enhances interview
preparation by offering dynamic and targeted question-answering interactions, ultimately empowering
users to refine their technical communication skills.
Fig- Implementations
• The application uses the Google Speech Recognition API to convert recorded audio files to text
format. This conversion enables seamless integration of spoken responses into the
question-answering workflow.
• The RecursiveCharacterTextSplitter() class initializes a text splitter object designed to break
down text into smaller, manageable chunks. This functionality aids in processing large
volumes of text efficiently, particularly when dealing with lengthy documents such as PDF
files.
• Furthermore, the OpenAIEmbeddings() class initializes an embeddings object used for
handling text embeddings, which are essential for various natural language processing tasks
such as semantic similarity analysis.
• The FAISS.from_texts() method is employed to create a FAISS vector store from the text data
extracted from documents. This vector store facilitates efficient similarity searches and other
vector-based operations, enhancing the performance of the question-answering system.
• Within the Streamlit framework, several functions and widgets are utilized to create the user
interface and manage application state. Functions such as st.file_uploader(), st.header(),
st.write(), st.button(), st.title(), st.empty(), and st.error() are employed to display text, widgets,
and interactive elements on the Streamlit app.
• Additionally, the st.session_state attribute is utilized to access and manage session state
variables, enabling data persistence and user interaction tracking within the application (a small
sketch of this mechanism follows this list). Overall, these functionalities contribute to the
creation of a user-friendly and interactive interview preparation tool.
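As a small illustration of the session-state mechanism, the sketch below persists typed answers
across Streamlit reruns; the widget labels are illustrative.

import streamlit as st

# Initialise the container once; st.session_state survives script reruns
if "recorded_answers" not in st.session_state:
    st.session_state.recorded_answers = {}

answer = st.text_input("Type a practice answer")
if st.button("Save answer"):
    idx = len(st.session_state.recorded_answers)
    st.session_state.recorded_answers[idx] = answer

st.write(st.session_state.recorded_answers)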
6. SYSTEM DESIGN
The Object Management Group (OMG) adopted the Unified Modeling Language (UML) as a standard
in 1997 and has managed it ever since. The International Organization for Standardization (ISO)
published UML as an approved standard in 2005. UML has been revised over the years and is
reviewed periodically.
UML is linked with object-oriented design and analysis. UML makes the use of elements and forms
associations between them to form diagrams. Diagrams in UML can be broadly classified as:
• Structural Diagrams – Capture static aspects or structure of a system. Structural Diagrams
include Component Diagrams, Object Diagrams, Class Diagrams and Deployment Diagrams.
• Behaviour Diagrams – Capture dynamic aspects or behaviour of the system. Behaviour
diagrams include Use Case Diagrams, State Diagrams, Activity Diagrams and Interaction
Diagrams.
Fig-Building Blocks of UML
Things are the abstractions that are first-class citizens in a model; relationships tie these things
together; diagrams group interesting collections of things.
These things are the basic object-oriented building blocks of the UML. You use them to write well-
formed models.
Structural Things
Structural things are the nouns of UML models. These are the mostly static parts of a model,
representing elements that are either conceptual or physical. Collectively, the structural things are
called classifiers.
A class is a description of a set of objects that share the same attributes, operations, relationships, and
semantics. A class implements one or more interfaces. Graphically, a class is rendered as a rectangle,
usually including its name, attributes, and operations
Class - A class is a set of similar things that outlines the functionality and properties of an object. It
can also represent an abstract class whose functionalities are not defined.
Interface - A collection of functions that specify a service of a class or component, i.e., Externally
visible behavior of that class.
Collaboration - A larger pattern of behaviors and actions. Example: All classes and behaviors that
create the modeling of a moving tank in a simulation.
Use Case - A sequence of actions that a system performs that yields an observable result. Used to
structure behavior in a model. Is realized by collaboration.
Component - A physical and replaceable part of a system that implements a number of interfaces.
Example: a set of classes, interfaces, and collaborations.
Node - A physical element existing at run time that represents a resource.
Behavioral Things
Behavioral things are the dynamic parts of UML models. These are the verbs of a model, representing
behavior over time and space. There are two primary kinds of behavioral things:
• Interaction
• State machine
Interaction
It is a behavior that comprises a set of messages exchanged among a set of objects or roles within a
particular context to accomplish a specific purpose. The behavior of a society of objects or of an
individual operation may be specified with an interaction. An interaction involves a number of other
elements, including messages, actions, and connectors (the connection between objects). Graphically, a
message is rendered as a directed line, almost always including the name of its operation.
State machine
State machine is a behaviour that specifies the sequences of states an object or an interaction goes
through during its lifetime in response to events, together with its responses to those events. The
behaviour of an individual class or a collaboration of classes may be specified with a state machine. A
state machine involves a number of other elements, including states, transitions (the flow from state to
state), events (things that trigger a transition), and activities (the response to a transition). Graphically,
a state is rendered as a rounded rectangle, usually including its name and its substates.
Grouping Things
Grouping things can be defined as a mechanism to group elements of a UML model together. There is
only one grouping thing available.
Package − A package is the only grouping thing available for gathering structural and behavioural
things.
Annotational Things
Annotational things are the explanatory parts of UML models. These are the comments you may apply
to describe, illuminate, and remark about any element in a model. There is one primary kind of
annotation thing, called a note. A note is simply a symbol for rendering constraints and comments
attached to an element or a collection of elements.
Dependency
It is an element (the independent one) that may affect the semantics of the other element (the
dependent one). Graphically, a dependency is rendered as a dashed line, possibly directed, and
occasionally including a label.
Association
Association is basically a set of links that connects the elements of a UML model. It also describes
how many objects are taking part in that relationship.
Generalization
It is a specialization/generalization relationship in which the specialized element (the child) builds on
the specification of the generalized element (the parent). The child shares the structure and the
behavior of the parent. Graphically, a generalization relationship is rendered as a solid line with a
hollow arrowhead pointing to the parent.
Realization
Realization can be defined as a relationship in which two elements are connected: one element
describes some responsibility which is not implemented, and the other one implements it. This
relationship exists in the case of interfaces.
UML defines the following diagram types:
• Class diagram
• Object diagram
• Component diagram
• Composite structure diagram
• Use case diagram
• Sequence diagram
• Communication diagram
• State diagram
• Activity diagram
7. DEVELOPMENT
Fig- Resume pdf file
7.2 SAMPLE CODE (app.py)
import streamlit as st
import pickle
import os
import time
import sounddevice as sd
import soundfile as sf
import tempfile
import speech_recognition as sr
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback
# Initialise session-state containers once per session
if 'questions' not in st.session_state:
    st.session_state.questions = []
if 'recorded_answers' not in st.session_state:
    st.session_state.recorded_answers = {}
if 'n' not in st.session_state:
    # number of questions to generate
    st.session_state.n = 5
if 'analysis' not in st.session_state:
    # whether questions have been generated and analysis can begin
    st.session_state.analysis = False
# Sidebar contents
with st.sidebar:
    st.markdown('''
    ## About
    - [Streamlit](https://streamlit.io/)
    - [LangChain](https://python.langchain.com/)
    ''')
# Read the API key from the environment; never hard-code secrets in source
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', '')
file_path = "faiss_store_openai.pkl"
main_placeholder = st.empty()
def record_audio(duration, fs):
    # Record from the default microphone and persist to a temporary WAV file
    recording = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype='int16')
    sd.wait()
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmpfile:
        sf.write(tmpfile.name, recording, fs)
        return tmpfile.name
def convert_audio_to_text(audio_file):
    # Transcribe a WAV file via the Google Speech Recognition API
    r = sr.Recognizer()
    try:
        with sr.AudioFile(audio_file) as source:
            audio_data = r.record(source)
        text = r.recognize_google(audio_data)
        return text
    except sr.UnknownValueError:
        # Speech was unintelligible
        return ""
    except sr.RequestError as e:
        return f"Could not request results from Google Speech Recognition service; {e}"
# Sample rate (Hz)
fs = 44100
# Recording duration per answer (seconds)
duration = 60
# Main function
def main():
    pdf = st.file_uploader("Upload your PDF", type='pdf')
    if pdf is None:
        # Wait until the user uploads a file
        return
    main_placeholder.text("Data Loading...Started...✅✅✅")
    try:
        if not st.session_state.analysis:
            pdf_reader = PdfReader(pdf)
            if len(pdf_reader.pages) == 0:
                st.error("The uploaded PDF has no pages.")
                return
            text = ""
            # Concatenate the extracted text of every page
            for page_num in range(len(pdf_reader.pages)):
                page_text = pdf_reader.pages[page_num].extract_text()
                if page_text:
                    text += page_text
                else:
                    # Break out of the loop if the page is empty, indicating the end of the file
                    break
            if not text.strip():
                st.error("No text could be extracted from the PDF.")
                return
            # Split text into chunks
            text_splitter = RecursiveCharacterTextSplitter(
                chunk_size=1000,
                chunk_overlap=200,
                length_function=len
            )
            chunks = text_splitter.split_text(text=text)

            # Embed the chunks and build the FAISS vector store
            embeddings = OpenAIEmbeddings()
            st.session_state.vectorstore = FAISS.from_texts(chunks, embedding=embeddings)
            st.session_state.vectorstore.save_local("faiss_store_openai")

            # Generate questions
            query = (f"Give {st.session_state.n} technical questions on the "
                     f"skills and projects from the above pdf")
            if query:
                # Retrieve relevant chunks and ask the LLM for questions
                docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
                chain = load_qa_chain(llm=OpenAI(), chain_type="stuff")
                with get_openai_callback() as cb:
                    response = chain.run(input_documents=docs, question=query)
                if not st.session_state.questions:
                    # Split the response on '?' and drop the trailing fragment
                    st.session_state.questions = list(response.split('?'))[0:-1]
                if not st.session_state.recorded_answers:
                    for i in range(st.session_state.n):
                        curr_qns_ans = {}
                        curr_qns_ans["Question"] = st.session_state.questions[i]
                        st.session_state.recorded_answers[i] = curr_qns_ans
                st.session_state.analysis = True

        st.header("Questions")
        for i, question in enumerate(st.session_state.questions):
            st.write(f"{question}")
            start_recording = st.button("Record answer", key=f"record_{i}")
            if start_recording:
                st.write("Listening...")
                audio_file = record_audio(duration, fs)
                st.write("Time's up!")
                text = convert_audio_to_text(audio_file)
                curr_qns_ans = {}
                curr_qns_ans["Question"] = question
                curr_qns_ans["Answer"] = text
                st.session_state.recorded_answers[i] = curr_qns_ans

        if st.session_state.recorded_answers:
            for ele in st.session_state.recorded_answers.values():
                st.write(f'{ele["Question"]}')
                st.write(f'Answer: {ele.get("Answer", "")}')
query = f"""Analyze all the above questions and corresponding answers and give a score between
0 to 100 and also provide the areas of improvement for betterment of the candidate. The list of questions
and answers are as follows, providing a review only for answered questions:
{str(st.session_state.recorded_answers)}. Give analysis for every question and corresponding answer. The
format of the review is '[Question number] : [score]/100 Areas of improvement: [suggestions to
improve]'. Every question's response should be separated by '###'. For example:
Question 1: Score - [score]/100 Areas of improvement: The candidate provided a brief answer, but
it could be improved by providing more specific details about the methods used to fine-tune the VGG16
model and the results achieved ###
Question 2: Score - N/A Areas of improvement: The candidate did not provide an answer for this
question, so no score or areas of improvement can be given
46 | P a g e
and question number starts from 1.Please give each answer in a newline"""
            # Ask the LLM to review the answers and render each review
            docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
            chain = load_qa_chain(llm=OpenAI(), chain_type="stuff")
            with get_openai_callback() as cb:
                response = chain.run(input_documents=docs, question=query)
            reviews = response.split("###")
            for review in reviews:
                st.write(review)
    except Exception as e:
        st.error(f"An error occurred: {e}")

if __name__ == '__main__':
    main()
Fig- Resume Uploading
Fig- Analyzing Resume
Fig- Interview Questions generation from Resume
Fig- Answer Completion and speech to text conversion and analysis
Fig- Analysis and suggestions after all questions have been answered.
8. TESTING
Importance of Testing
Software testing is imperative. This process is often skipped, and as a result the
product and the business might suffer. To understand the importance of testing, here are some key
points:
• Software Testing saves money
• Provides Security
• Improves Product Quality
• Customer satisfaction
Testing can be done in different ways. The main idea behind testing is to reduce errors with
minimum time and effort.
Benefits of Testing
• Cost-Effective: This is one of the important advantages of software testing. Testing any IT
project on time helps you save money in the long term. If bugs are caught in
an earlier stage of software testing, they cost less to fix.
• Security: This addresses the most vulnerable and sensitive aspects of a product. People look
for trusted products, and testing helps remove risks and problems early.
• Product quality: It is an essential requirement of any software product. Testing ensures a
quality product is delivered to customers.
• Customer Satisfaction: The main aim of any product is to give satisfaction to their customers.
UI/UX Testing ensures the best user experience.
Different types of Testing
Unit Testing: Unit tests are very low level, close to the source of your application. They consist of
testing individual methods and functions of the classes, components or modules used by your software.
Unit tests are in general quite cheap to automate and can be run very quickly by a continuous
integration server.
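For instance, a minimal pytest-style unit test is sketched below; split_questions is a hypothetical
extraction of the inline question-splitting rule used in app.py, isolated here so it can be tested
without any network or audio dependencies. It runs with: pytest test_app.py

# split_questions mirrors the rule in app.py: split the model response on '?'
# and drop the trailing fragment (a hypothetical helper for testing)
def split_questions(response):
    return list(response.split('?'))[0:-1]

def test_split_questions_drops_trailing_fragment():
    response = "What is FAISS? Explain embeddings? "
    assert split_questions(response) == ["What is FAISS", " Explain embeddings"]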
Integration Testing: Integration tests verify that different modules or services used by your
application work well together. For example, it can be testing the interaction with the database or
making sure that microservices work together as expected. These types of tests are more expensive to
run as they require multiple parts of the application to be up and running.
Functional Tests: Functional tests focus on the business requirements of an application. They only
verify the output of an action and do not check the intermediate states of the system when performing
that action. There is sometimes confusion between integration tests and functional tests, as they both
require multiple components to interact with each other. The difference is that an integration test may
simply verify that you can query the database while a functional test would expect to get a specific
value from the database as defined by the product requirements.
Regression Testing: Regression testing is a crucial stage for the product and very useful for
developers to identify the stability of the product under changing requirements. Regression testing
is done to verify that a code change in the software does not impact the existing
functionality of the product.
System Testing: System testing of software or hardware is testing conducted on a complete integrated
system to evaluate the system’s compliance with its specified requirements. System testing is a series
of different tests whose primary purpose is to fully exercise the computer-based system.
Performance Testing: It checks the speed, response time, reliability, resource usage, scalability of a
software program under their expected workload. The purpose of Performance Testing is not to find
functional defects but to eliminate performance bottlenecks in the software or device.
Alpha Testing: This is a form of internal acceptance testing performed mainly by the in-house
software QA and testing teams. Alpha testing is the last testing done by the test teams at the
development site after the acceptance testing and before releasing the software for the beta test. It can
also be done by the potential users or customers of the application, but it remains a form of in-house
acceptance testing.
Beta Testing: This is the testing stage that follows the internal full alpha test cycle. It is the final
testing phase where the companies release the software to a few external user groups outside the
company test teams or employees. This initial software version is known as the beta version. Most
companies gather user feedback in this release.
Black Box Testing: Also known as behavioural testing, this is a software testing method in which the
internal structure/design/implementation of the item being tested is not known to the tester. These tests
can be functional or non-functional, though usually functional.
This method is named so because the software program, in the eyes of the tester, is like a black box;
inside which one cannot see. This method attempts to find errors in the following categories:
• Incorrect or missing functions
• Interface errors
• Errors in data structures or external database access
• Behaviour or performance errors
• Initialization and termination errors
White Box Testing: White box testing (also known as Clear Box Testing, Open Box Testing, Glass
Box Testing, Transparent Box Testing, Code-Based Testing or Structural Testing) is a software testing
method in which the internal structure/design/implementation of the item being tested is known to the
tester. The tester chooses inputs to exercise paths through the code and determines the appropriate
outputs. Programming know-how and the implementation knowledge is essential. White box testing is
testing beyond the user interface and into the nitty-gritty of a system. This method is named so because
the software program, in the eyes of the tester, is like a white/transparent box; inside which one clearly
sees.
9. CONCLUSION
Our project leverages advanced technologies like LangChain and OpenAI to revolutionize the interview
preparation process. By automating question generation based on resume content and enabling
personalized audio responses, we offer a dynamic and efficient platform for candidates to hone their
interview skills. The seamless integration of machine learning algorithms ensures objectivity, fairness,
and real-time feedback, enhancing the overall interview experience.
Both benefits and drawbacks exist with our project. On the positive side, it automates question
generation and response recording, streamlining the interview preparation process. Additionally, it
provides personalized feedback and analysis, enhancing candidate performance and confidence.
However, reliance on machine learning algorithms may introduce biases or inaccuracies in question
generation, impacting the quality of interview practice. Our system may not fully replicate the nuances
of human interaction in interview scenarios, and users should supplement their preparation with real-
world practice and feedback.
The main challenge we faced while working on this project is that the need for internet connectivity
and API access may limit accessibility and usability in certain environments.
10. FUTURE SCOPE
The future scope of our project is expansive, driven by our overarching objective of
revolutionizing interview preparation processes.
As we continue to refine our system, we aim to leverage cutting-edge technologies to enhance user
experience and effectiveness.
This includes exploring advanced natural language processing techniques to generate more
contextually relevant and diverse interview questions.
Additionally, we envision integrating machine learning algorithms to provide personalized
feedback and performance analytics to users.
Moreover, we plan to expand the application's capabilities by incorporating features such as mock
interview simulations and industry-specific question sets.
These enhancements will ensure that our platform remains at the forefront of interview preparation
innovation, catering to diverse user needs and preferences.
Therefore, these are some upcoming upgrades or enhancements that we intend to make.