
AI-Powered Interview Assistant

A main project thesis submitted in partial fulfillment of the requirements of the VIII semester
for the award of the degree of

BACHELOR OF TECHNOLOGY

COMPUTER SCIENCE AND ENGINEERING

(AI and ML)


by

J.BHAVYA SRI (20131A4245)


RAMEEZ AHMAD (20131A4245)
NUZHATH TAHSEEN (20131A4245)
P. VIJAY SIMHA REDDY (20131A4245)

Under the esteemed guidance of


Dr. R. Seeta Sireesha,
Assistant Professor,
Department of Computer Science and Engineering

GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING (AUTONOMOUS)


(Affiliated to JNTU-K, Kakinada)
VISAKHAPATNAM
2023-2024

CERTIFICATE

This is to certify that the main project entitled “AI-Powered Interview Assistant” being submitted by

J.BHAVYA SRI (20131A4245)


RAMEEZ AHMAD (20131A4245)
NUZHATH TAHSEEN (20131A4245)
P. VIJAY SIMHA REDDY (20131A4245)

in partial fulfilment for the award of the degree “Bachelor of Technology” in Computer Science and
Engineering to the Jawaharlal Nehru Technological University, Kakinada is a record of bonafide work
done under my guidance and supervision during the VIII semester of the academic year 2023-2024.

The results embodied in this record have not been submitted to any other university or institution
for the award of any Degree or Diploma.

Guide Head of the Department


Dr. R. Seeta Sireesha Dr. R. Seeta Sireesha
Assistant Professor Associate Professor and H.O.D.
Department of CSE Department of CSE
GVPCE(A) GVPCE(A)

DECLARATION

We hereby declare that this project entitled “AI-Powered Interview Assistant” is a
bonafide work done by us and submitted to the Department of Computer Science and Engineering,
G. V. P. College of Engineering (Autonomous), Visakhapatnam, in partial fulfilment for the award
of the degree of B. Tech. It is our own work and has not been submitted to any other university or
published anywhere before.

PLACE : VISAKHAPATNAM J.BHAVYA SRI (20131A4245)

DATE : RAMEEZ AHMAD (20131A4245)


NUZHATH TAHSEEN (20131A4245)
P. VIJAY SIMHA REDDY (20131A4245)

ACKNOWLEDGEMENT

We would like to express our deep sense of gratitude to our esteemed institute Gayatri
Vidya Parishad College of Engineering (Autonomous), which has provided us an opportunity
to fulfill our cherished desire.

We express our sincere thanks to our principal Dr. A. B. KOTESWARA RAO, Gayatri
Vidya Parishad College of Engineering (Autonomous) for his encouragement to us during this
project, giving us a chance to explore and learn new technologies in the form of mini projects.

We express our deep sense of gratitude to Dr. D. N. D. HARINI, Associate Professor
and Head of the Department of Computer Science and Engineering, Gayatri Vidya
Parishad College of Engineering (Autonomous), for giving us an opportunity to do the project
in college.

We express our profound gratitude and our deep indebtedness to our guide Dr. R.
Seeta Sireesha, Associate Professor, Department of Computer Science and Engineering,
whose valuable suggestions, guidance and comprehensive assessments helped us a lot in
realizing our project.

We also thank our coordinator, Dr. CH. SITA KUMARI, Associate Professor,
Department of Computer Science and Engineering, for the kind suggestions and guidance for
the successful completion of our project work.

J.BHAVYA SRI (20131A4245)


RAMEEZ AHMAD (20131A4245)
NUZHATH TAHSEEN (20131A4245)
P. VIJAY SIMHA REDDY (20131A4245)

ABSTRACT

In today's rapidly evolving educational landscape, where students pursuing technical degrees are
constantly balancing coursework, internships, and part-time jobs, the allocation of time for crucial skill
assessments becomes increasingly challenging. As graduation looms, the absence of adequate
interview preparation can instill fear and undermine confidence levels.

The challenge is posed by inflexible schedules and set interview timelines, which make it difficult for
students to connect theoretical knowledge with the practical skills necessary in the competitive tech
industry. With technology evolving rapidly, students must continually adapt and acquire advanced
skills to remain competitive.

While existing skill development models contribute to skill growth, they often fail to provide
personalized one-on-one interview experiences, leaving students without adequate guidance to excel.
Moreover, assessing interview performance and identifying areas for improvement remains challenging.

Our proposed approach leverages the OpenAI API and LangChain to revolutionize interview
preparation by generating tailored questions from user resumes, enhancing the learning experience.
Streamlit facilitates seamless interaction, while OpenAI integration enhances simulation sophistication,
bridging the gap between theory and practice. This comprehensive solution represents a paradigm
shift, empowering students to excel confidently in job interviews.

KEYWORDS: LangChain, OpenAI, Streamlit.

INDEX

Module 1: Cloud Concepts Overview — Page 12
1.1 Introduction to cloud computing

Module 2: Cloud Economics and Billing — Page 13
2.1 What is AWS?
2.2 How do we pay for the resources used in AWS?
2.3 What is TCO?
2.4 Why use TCO?
2.5 Uses of AWS Pricing Calculator
2.6 AWS Organizations
2.7 How to access AWS Organizations?

Module 3: AWS Global Infrastructure Overview — Page 15
3.1 AWS Infrastructure Features

Module 4: AWS Cloud Security — Page 16
4.1 AWS Cloud Security
4.2 AWS Shared Responsibility Model
4.3 AWS IAM
4.4 Securing a new AWS account
4.5 Securing accounts
4.6 Securing data on AWS

Module 5: Networking and Content Delivery — Page 18
5.1 Networking basics
5.2 Amazon VPC
5.3 VPC Networking
5.4 Amazon Route 53
5.5 Amazon CloudFront benefits

Module 6: Compute — Page 20
6.1 Compute services overview

Module 7: Storage — Page 21
7.1 Amazon Elastic Block Store (Amazon EBS)
7.2 Amazon S3
7.3 Amazon Elastic File System (Amazon EFS)
7.4 Amazon S3 Glacier

Module 8: Databases — Page 23
8.1 Amazon Relational Database Service
8.2 Challenges of relational databases

Module 9: Cloud Architecture — Page 24
9.1 Cloud Architects
9.2 AWS Trusted Advisor

Module 10: Automatic Scaling and Monitoring — Page 25
10.1 Elastic Load Balancing
10.2 Types of load balancers
10.3 Elastic Load Balancing use cases
10.4 Amazon CloudWatch
1. INTRODUCTION
In the realm of career advancement, meticulous interview preparation emerges as a cornerstone of
success. This innovative application, powered by cutting-edge technologies such as Streamlit,
LangChain, and OpenAI, epitomizes a sophisticated solution tailored to meet the evolving needs of
today's job seekers. By seamlessly integrating advanced language processing mechanisms, it offers a
refined approach to interview readiness, empowering individuals with personalized insights and
guidance.
At its essence, the application embodies efficiency and efficacy, leveraging the prowess of
Streamlit for intuitive user interaction and LangChain for seamless text processing. Through the lens of
OpenAI's language models, it navigates the complexities of resume parsing, swiftly distilling pertinent
information and crafting tailored interview questions. This symbiotic integration of technology not only
optimizes the preparation journey but also ensures that candidates are equipped with a comprehensive
understanding of the topics they may encounter during interviews.
Furthermore, the project objective transcends mere question generation; it aspires to foster a
culture of continuous improvement and empowerment. By facilitating audio recording capabilities and
leveraging LangChain's capabilities, it enables candidates to articulate their responses with clarity and
precision. Through iterative analysis and feedback loops, the application empowers individuals to refine
their communication skills, ultimately enhancing their confidence and competitiveness in the job
market.

1.1 OBJECTIVE
The primary objective of this application is to revolutionize the interview preparation process, offering
a holistic solution that combines technological innovation with strategic foresight. Through the
seamless integration of Streamlit, LangChain, and OpenAI, it endeavors to provide individuals with a
tailored and immersive experience that transcends traditional methods of preparation.

Firstly, the application aims to streamline the preparation journey by leveraging Streamlit's
interactive interface, ensuring user-friendly navigation and engagement. Additionally, through
LangChain's text processing capabilities, it seeks to automate the extraction of relevant information
from resumes, facilitating the generation of personalized interview questions that align with the
candidate's skills and experiences.

Moreover, the application aspires to enhance candidates' communication proficiency through the
integration of audio recording functionality. By leveraging LangChain's speech-to-text conversion

capabilities, it enables individuals to articulate their responses verbally, fostering a more dynamic and
immersive preparation experience. Through iterative analysis and feedback mechanisms, the
application empowers candidates to refine their responses and elevate their interview performance.

Overall, the objective of this application is to empower individuals with the tools and insights
necessary to confidently navigate the interview process and secure their desired career opportunities.
By harnessing the combined capabilities of Streamlit, LangChain, and OpenAI, it aims to redefine the
paradigm of interview preparation, setting a new standard for efficiency, effectiveness, and
empowerment.

1.2 ABOUT THE ALGORITHM

In our algorithm, we aim to develop a robust system for efficient document retrieval and processing,
leveraging advanced techniques such as document loaders, text splitting, embedding models, vector
stores, retrievers, and indexing. This algorithmic framework is crucial for enabling streamlined access
to information, enhancing search capabilities, and facilitating seamless integration with user interfaces.

Fig-1 Langchain Architecture

Document Loaders:
Document loaders act as the primary entry point for bringing data into our system. They provide the
initial step in the data ingestion process, facilitating the seamless integration of textual content from
various sources.

Text Loader:
The Text Loader component serves as a foundational element in our system, responsible for sourcing
textual documents from various data repositories. By seamlessly interfacing with diverse sources
including local files and cloud-based storage solutions, Text Loader ensures the reliable acquisition of
data essential for subsequent processing and analysis.

Unstructured URL Loader:


The Unstructured URL Loader expands our system's capabilities by enabling the retrieval of
unstructured data from web sources. Through sophisticated web scraping techniques, this component
facilitates the extraction of information from publicly accessible URLs, enriching our dataset with
external content for comprehensive analysis and insight generation.
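As a rough illustration of how these two loaders are typically invoked, here is a minimal sketch using the classic langchain import paths that appear in the sample code of Section 7.2; the file name and URL are placeholders, and the Unstructured URL Loader additionally assumes the unstructured package is installed:

from langchain.document_loaders import TextLoader, UnstructuredURLLoader

# Text Loader: read a local file into a list of Document objects.
local_docs = TextLoader("resume_notes.txt").load()

# Unstructured URL Loader: scrape publicly accessible pages into Documents.
web_docs = UnstructuredURLLoader(urls=["https://example.com/job-posting"]).load()

# Each Document carries the raw text in .page_content and source info in .metadata.
print(len(local_docs), len(web_docs))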

Text Splitters
Text Splitter efficiently breaks down large documents into manageable chunks, enhancing processing
efficiency and enabling targeted analysis.
• Coherent Chunking: Utilizes advanced algorithms to ensure that text chunks maintain coherence
and relevance, preserving the contextual integrity of the original document.
• Optimized Processing: By segmenting text into smaller units, Text Splitter optimizes subsequent
retrieval and analysis processes, facilitating faster and more accurate information extraction.

Fig- 2 Text Splitters

Character Text Splitter:

At the core of our data preprocessing pipeline, the Character Text Splitter module plays a pivotal role
in segmenting large textual documents into manageable fragments. Utilizing sophisticated character-
based splitting algorithms, this component optimizes data processing efficiency and enhances retrieval
performance by isolating relevant sections of text.

Recursive Character Text Splitter:


Building upon the capabilities of its predecessor, the Recursive Character Text Splitter further refines
the text segmentation process through recursive parsing techniques. This advanced algorithm ensures
precise extraction of meaningful content from complex documents, facilitating accurate representation
across diverse formats and structures.
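A minimal sketch of how these two splitters are typically configured follows; the chunk sizes, separators, and the resume_text variable are illustrative, not values mandated by the project:

from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter

# Character splitter: cut on a single separator at a roughly fixed size.
char_splitter = CharacterTextSplitter(separator="\n", chunk_size=1000, chunk_overlap=200)

# Recursive splitter: try progressively finer separators to keep chunks coherent.
recursive_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, separators=["\n\n", "\n", " ", ""]
)
chunks = recursive_splitter.split_text(resume_text)  # resume_text: text extracted from the PDF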

Vector Database:
In the ever-evolving landscape of artificial intelligence, vector databases stand as pivotal solutions,
indexing and storing vector embeddings to enable swift retrieval and similarity searches. As we
navigate through the AI revolution, these databases emerge as indispensable tools, addressing the
escalating complexity and scale of modern data processing. By harnessing the semantic richness
embedded within vector representations, they empower applications reliant on large language models
and generative AI, facilitating efficient knowledge retrieval and long-term memory maintenance.

Through seamless integration with embedding models, these databases augment AI capabilities,
facilitating tasks such as semantic information retrieval with unparalleled efficiency. Thus, they play a
pivotal role in enhancing the effectiveness of AI-driven applications, embodying the synergy between
advanced data management and transformative AI innovation.

Fig-3 Vector Database

FAISS:
FAISS (Facebook AI Similarity Search) is a cutting-edge library designed for efficient similarity
search and clustering of high-dimensional vector data. Developed by Facebook AI Research, FAISS
offers optimized algorithms tailored for large-scale datasets encountered in AI applications. Its
advanced indexing techniques, such as Product Quantization (PQ) and Hierarchical Navigable Small
World (HNSW), ensure rapid and accurate nearest neighbor search operations.

FAISS supports essential functionalities like CRUD operations and metadata filtering, simplifying data
management. Additionally, FAISS enables horizontal scaling, distributing index structures across
multiple machines for enhanced performance and scalability. As a cornerstone technology, FAISS
empowers AI systems with swift and precise retrieval of semantic information.

Fig-4 FAISS Indexing
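The sample code in Section 7.2 builds its FAISS store through LangChain's wrapper; a condensed sketch of that pattern is shown below. The chunks variable is assumed to come from the text splitter above, and the query string is illustrative:

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Embed the resume chunks and index them in a FAISS vector store.
store = FAISS.from_texts(chunks, embedding=OpenAIEmbeddings())

# Nearest-neighbour search over the embedded chunks.
similar_chunks = store.similarity_search("machine learning projects", k=3)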

Retrieval:
Retrieval mechanisms orchestrate the process of fetching relevant data based on user queries, bridging
the gap between raw data and actionable insights. The RetrievalQAWithSourcesChain leverages
sophisticated algorithms to identify and retrieve pertinent information, taking into account multiple
data sources and query types. By employing techniques such as semantic search and ensemble
retrieval, it enhances the precision and comprehensiveness of search results, empowering users with
actionable knowledge.

Retrieval Questions and Answers With Sources Chain:

The RetrievalQAWithSourcesChain module represents the pinnacle of our system's retrieval
capabilities. Incorporating advanced algorithms, this component enables users to pose complex queries
and retrieve relevant documents with exceptional efficiency. By integrating multiple data sources and
leveraging semantic understanding, RetrievalQAWithSourcesChain empowers users to extract
actionable insights from vast repositories of textual data with unparalleled accuracy and speed.

Fig-4 Retrieval
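A minimal sketch of wiring this chain to the FAISS store from the previous section follows; the question text is a placeholder and the store variable is assumed from the earlier sketch:

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.llms import OpenAI

chain = RetrievalQAWithSourcesChain.from_llm(
    llm=OpenAI(temperature=0),
    retriever=store.as_retriever(),
)
# Returns an answer along with the source chunks it was drawn from.
result = chain({"question": "Which projects on this resume involve Python?"},
               return_only_outputs=True)
print(result["answer"], result["sources"])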

Streamlit UI:
The Streamlit UI component serves as the user-facing interface of our system, providing intuitive
access to its functionalities. Designed for simplicity and ease of use, Streamlit UI enables users to
explore, query, and visualize data effortlessly. By offering a seamless and interactive experience, the
UI enhances user engagement and ensures efficient utilization of our system's capabilities across
diverse applications and use cases.

Built upon Streamlit's framework, the UI offers a user-friendly experience, enabling effortless access
to various functionalities and insights. Concurrently, project coding encompasses the implementation
of underlying algorithms and logic, ensuring the robustness and functionality of our system. Through
meticulous coding practices and adherence to best practices, we uphold the integrity and reliability of
our solution.
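As a flavour of how little code Streamlit needs for this kind of interface, here is a sketch; the widget labels and placeholder action are illustrative, not the project's exact UI:

import streamlit as st

st.title("AI-Powered Interview Assistant")
uploaded = st.file_uploader("Upload your resume (PDF)", type="pdf")
if uploaded is not None:
    st.write("Resume received:", uploaded.name)
    if st.button("Generate questions"):
        st.write("Questions would be generated here...")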

1.3 PURPOSE

The purpose of the provided code and application is to streamline and enhance the interview
preparation process for job seekers. By leveraging advanced technologies such as Streamlit,
LangChain, and OpenAI, the application offers a sophisticated platform for generating personalized
technical interview questions based on the content of uploaded resumes.

Through seamless integration with document loaders and text splitters, the application
efficiently extracts relevant information from resumes, ensuring that generated questions are tailored to
each candidate's unique skills and experiences.

Additionally, the incorporation of audio recording functionality allows candidates to verbally
respond to interview questions, fostering dynamic and immersive preparation sessions. The
application's objective is to empower job seekers with the tools and resources needed to confidently
navigate the interview process and secure their desired career opportunities.

Overall, the code and application aim to revolutionize interview preparation by providing a
user-friendly interface, intelligent question generation capabilities, and interactive features for audio-
based responses.

By combining cutting-edge technologies with a focus on user-centric design, the application
strives to enhance the efficiency, effectiveness, and confidence of job seekers as they prepare for
interviews. With its comprehensive approach and innovative features, the application sets out to
redefine the standard for interview preparation in the modern job market.

At its core, the application seeks to empower individuals with a strategic advantage in their
career pursuits. Through intelligent question generation and personalized feedback mechanisms, it
fosters a deeper understanding of one's strengths and areas for improvement, enabling candidates to
showcase their capabilities with confidence and precision during interviews.

1.4 SCOPE
The scope of our project encompasses the development of a comprehensive platform tailored to
streamline the interview process through the integration of advanced AI technologies. By leveraging
natural language processing and machine learning algorithms, our application aims to analyze
candidate resumes, generate personalized technical questions, and facilitate efficient evaluation of their
skills and experiences. With a focus on enhancing both the candidate and recruiter experience, our
platform seeks to revolutionize traditional hiring practices by providing a data-driven approach to
talent assessment and selection. Here are some potential areas of focus:
 Document Loaders: Retrieve documents from diverse sources including private S3 buckets and
public websites. Integrates with major providers such as AirByte and Unstructured.
 Text Splitters: Segment large documents into manageable chunks using specialized algorithms for
different document types like code, markdown, etc.
 Embedding Models: Generate embeddings to capture the semantic meaning of text, offering over 25
integrations with diverse providers, from open source to proprietary API models (see the sketch
after this list).
 Vector Stores: Facilitate efficient storage and retrieval of embeddings with integrations with over
50 vector stores, ranging from open-source local options to cloud-hosted proprietary solutions.
 Retrievers: Retrieve relevant documents from the database using various algorithms including
basic semantic search and advanced techniques like Parent Document Retriever, Self Query
Retriever, and Ensemble Retriever.
 Indexing: Sync data from any source into a vector store for efficient retrieval, preventing
duplication, re-computation of unchanged content, and minimizing resource utilization while
improving search results.
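A minimal sketch of the embedding step referenced above follows; the example sentence is illustrative, and the vector length shown is typical of OpenAI's text-embedding models:

from langchain.embeddings.openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings()
vector = embedder.embed_query("Built a CNN image classifier in PyTorch")
print(len(vector))  # e.g. 1536 dimensions for text-embedding-ada-002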

2. SRS DOCUMENT

A software requirements specification (SRS) is a document that describes what the software will do and
how it will be expected to perform.

2.1 FUNCTIONAL REQUIREMENTS


A Functional Requirement (FR) is a description of the service that the software must offer. It describes
a software system or its component. A function consists of inputs to the software system, its
behavior, and outputs. It can be a calculation, data manipulation, business process, user interaction, or
any other specific functionality which defines what function a system is likely to perform. Functional
Requirements are also called the Functional Specification.

• The system will seamlessly load resumes/documents from various sources, including local
files and web URLs. It will analyze document content to identify technical skills and project
experiences, generating personalized questions accordingly.
• The system will offer a user-friendly interface for candidates to interact with generated
questions, supporting both audio recording and text input. It will analyze candidate responses
using natural language processing techniques, evaluating relevance, coherence, and depth of
knowledge, and providing feedback to candidates and recruiters accordingly.

2.2 NON-FUNCTIONAL REQUIREMENTS


A NON-FUNCTIONAL REQUIREMENT (NFR) specifies a quality attribute of a software system.
NFRs judge the software system based on responsiveness, usability, security, and portability. Non-
functional requirements are often called the qualities of a system; they are as follows:

• Performance : The system will offer real-time feedback during question generation and answer
submission while maintaining high performance under concurrent user interactions.
• Reliability : The system will ensure data integrity and confidentiality throughout the
interview process. It shall have mechanisms in place to recover gracefully from unexpected
errors or failures.
• Operability : The interface of the system will be consistent.
• Usability : The user interface must be intuitive and provide clear instructions, accommodating
candidates of diverse technical backgrounds throughout the interview process.
• Efficiency : Once users have learned the system through interaction, they can perform their
tasks easily.
• Understandability : Because of its user-friendly interfaces, the system is easy for users to
understand.

2.3 MINIMUM HARDWARE REQUIREMENTS


• Processor -Intel Core i3 or above
• Hard Disk – 256GB
• RAM – 8GB
• Operating System – Windows 10

2.4 MINIMUM SOFTWARE REQUIREMENTS


Python-based language processing and deep learning libraries will be used for the development and
experimentation of the project.

• Programming Language – PYTHON 3.8 or above


• IDE – Visual Studio Code
• Langchain
• OpenAI API
• Google API
• Streamlit
• PyCharm

3. ANALYSIS

3.1 EXISTING SYSTEMS
• OpenAI's high API usage costs and ethical concerns regarding biases in question generation
may hinder its suitability for large-scale interview preparation.
• Platforms like Gemini focus on scheduling and may lack customization, while Hugging Face's
models might require complex integration and lack the specialized capabilities this project
targets.
• Brad.ai's focus on coaching may not align with automated question generation goals, and 1:1
mock interviews lack the scalability of automated systems.

DRAWBACKS OF EXISTING ALGORITHM

• Limited Information Extraction: Existing systems struggle to extract comprehensive
information from PDF resumes and are limited in parsing complex formats and layouts.
• Time-Consuming Manual Review: Manually reviewing resumes requires significant time and
effort and is inefficient when handling a large volume of resumes.
• Subjective Evaluation: Human-based evaluations are subjective, with potential for bias and
inconsistency in decision-making.

3.2 PROPOSED ALGORITHM


We envisioned a pioneering solution aimed at revolutionizing interview preparation. Our platform
seamlessly integrates cutting-edge technologies to offer a comprehensive suite of features. Users can
upload their resumes, triggering advanced algorithms to generate personalized interview questions
tailored to their skills and experiences.

Additionally, the inclusion of audio recording functionality enables dynamic responses to these
questions, fostering immersive preparation sessions. With intuitive user interface design powered by
Streamlit, our application aims to elevate interview readiness to new heights, empowering candidates
with confidence and proficiency.

For training our interview question generation model, we employ a combination of advanced
techniques:
• Contextual Analysis: Utilizing LangChain's OpenAI API, we capture nuanced patterns within
resume content to generate contextually relevant questions.
• Semantic Understanding: Leveraging LangChain's language processing capabilities, we assess
the semantic relevance of questions generated from resume data.
• Fluency Optimization: Fine-tuning OpenAI's GPT models ensures the fluency and coherence
of interview questions, enhancing their natural language generation capabilities.
• Personalization Strategies: Implementing LangChain's adaptive algorithms, we tailor question
generation based on individual user feedback and preferences.
• Interactive Learning: Integrating user interaction mechanisms, we employ ensemble learning
approaches to refine question generation processes, incorporating user input for continual
enhancement.
• Iterative Improvement: Through iterative training and model refinement using LangChain's
resources, we continuously optimize the question generation process, ensuring the highest
quality output.

ADVANTAGES OF PROPOSED MODEL

• It is very time-saving
• Dynamic question generation
• Accurate results
• Automated resume parsing
• User-friendly graphical interface
• Highly reliable
• Cost-effective

3.3 FEASIBILITY STUDY


A feasibility study is an analysis that takes all of a project's relevant factors into account, including
economic, technical, legal, and scheduling considerations, to ascertain the likelihood of completing the
project successfully. A feasibility study is important and essential to evaluate whether any proposed
project is feasible or not. A feasibility study is simply an assessment of the practicality of a proposed
plan or project.

The main objectives of a feasibility study are mentioned below:

The main aim of the feasibility study activity is to determine whether it is technically and financially
feasible to develop the product. A feasibility study should provide management with enough
information to decide:

• Whether the project can be done.


• To determine how successful your proposed action will be.
• Whether the final product will benefit its intended users.
• To describe the nature and complexity of the project.
• What are the alternatives among which a solution will be chosen (During subsequent phases)

To analyze if the software meets organizational requirements. There are various types of feasibility
that can be determined. They are:

• Operational : Define the urgency of the problem and the acceptability of any solution,
includes people-oriented and social issues: internal issues, such as manpower problems, labor
objections, manager resistance, organizational conflicts, and policies; also, external issues,
including social acceptability, legal aspects, and government regulations.
• Technical : Is the feasibility within the limits of current technology? Does the technology
exist at all? Is it available within a given resource?
• Economic : Is the project possible, given resource constraints? Are the benefits that will
accrue from the new system worth the costs? What are the savings that will result from the
system, including tangible and intangible ones? What are the development and operational
costs?
• Schedule : Constraints on the project schedule and whether they could be reasonably met.

ECONOMIC FEASIBILITY:
Economic analysis could also be referred to as cost/benefit analysis. It is the most frequently used
method for evaluating the effectiveness of a new system. In economic analysis the procedure is to
determine the benefits and savings that are expected from a candidate system and compare them with
costs. An economic feasibility study considers the price and all kinds of expenditure related to the
scheme before the project starts. This study also improves project reliability. It also helps decision-
makers decide whether the planned scheme should be processed now or later, depending on the
financial condition of the organization. This evaluation process also studies the price benefits of the
proposed scheme. Economic feasibility also covers the following tasks.
• Cost of packaged software.
• Cost of doing full system study.
• Is the system cost Effective?

TECHNICAL FEASIBILITY:
A large part of determining resources has to do with assessing technical feasibility. It considers the
technical requirements of the proposed project. The technical requirements are then compared to the
technical capability of the organization. The systems project is considered technically feasible if the
internal technical capability is sufficient to support the project requirements. The analyst must find out
whether current technical resources are sufficient; this is where the expertise of system analysts is
beneficial, since using their own experience and their contacts with vendors they will be able to answer
the question of technical feasibility. Technical feasibility also covers the following tasks.

• Is the technology available within the given resource constraints?
• Does the technology have the capacity to handle the solution?
• Determines whether the relevant technology is stable and established.
• Does the technology chosen for software development have a large number of users, so that they
can be consulted when problems arise or improvements are required?

OPERATIONAL FEASIBILITY:
Operational feasibility is a measure of how well a proposed system solves the problems and takes
advantage of the opportunities identified during scope definition and how it satisfies the requirements
identified in the requirements analysis phase of system development. The operational feasibility refers
to the availability of the operational resources needed to extend research results beyond the setting in
which they were developed, where all the operational requirements are minimal and easily
accommodated. In addition, operational feasibility includes any rational compromises users make in
adjusting the technology to the limited operational resources available to them. Operational
feasibility also covers tasks such as:
• Does the current mode of operation provide adequate response time?
• Does the current mode of operation make maximum use of resources?
• Determines whether the solution suggested by the software development team is acceptable.

• Does the operation offer an effective way to control the data?
• Our project operates on a standard processor, and the packages installed are supported by the system.

3.4 COST BENEFIT ANALYSIS


The financial and the economic questions during the preliminary investigation are verified to estimate
the following:
• The cost of the hardware and software for the class of application being considered.
• The benefits in the form of reduced cost.
• The proposed system will give the minute information, as a result.
• Performance is improved which in turn may be expected to provide increased profits.
• This feasibility checks whether the system can be developed with the available funds.
• This can be done economically if planned judicially, so it is economically feasible.
• The cost of the project depends upon the number of man-hours required.

4. SOFTWARE DESCRIPTION

4.1 Visual Studio Code


Visual Studio Code (famously known as VS Code) is a free, open-source text editor by Microsoft. VS
Code is available for Windows, Linux, and macOS. VS Code supports a wide array of programming
languages, from Java, C++, and Python to CSS, Go, and Dockerfile. Moreover, VS Code allows you to
add and even create new extensions, including code linters, debuggers, and cloud and web
development support. The VS Code user interface allows for a lot of interaction compared to other text
editors.

4.2 LangChain
LangChain is an open-source framework for developing applications powered by large language
models (LLMs). It provides composable building blocks, such as document loaders, text splitters,
embedding models, vector stores, retrievers, and chains, that make it straightforward to connect
language models to external data sources. By standardizing these components, LangChain enables
developers to assemble context-aware pipelines such as question answering over documents, chatbots,
and agents while remaining model-agnostic. In this project, LangChain supplies the text processing,
vector storage, and question-answering chains that power resume analysis and interview question
generation.

4.3 Python
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it
very attractive for Rapid Application Development, as well as for use as a scripting or glue language to
connect existing components together. Python's simple, easy to learn syntax emphasizes readability
and therefore reduces the cost of program maintenance. Python supports modules and packages, which
encourages program modularity and code reuse. The Python interpreter and the extensive standard
library are available in source or binary form without charge for all major platforms, and can be freely
distributed.

4.4 OpenAI
OpenAI stands as a pioneering research organization at the forefront of artificial intelligence
development, dedicated to advancing the boundaries of AI technology and its accessibility. Renowned
for its groundbreaking research and innovative solutions, OpenAI strives to democratize AI through its
APIs, tools, and research findings, empowering developers, businesses, and researchers worldwide.
With a focus on responsible AI deployment, OpenAI fosters collaborations, conducts cutting-edge
research, and promotes ethical AI practices. Its contributions span various domains, from natural
language processing and computer vision to reinforcement learning and robotics. Through its
commitment to transparency and collaboration, OpenAI continues to shape the future of AI, driving
impactful advancements that benefit society as a whole.

4.5 PyCharm
PyCharm stands as a premier integrated development environment (IDE) meticulously crafted for
Python programming, renowned for its robust features and user-friendly interface. Developed by
JetBrains, PyCharm offers a comprehensive suite of tools designed to enhance the productivity and
efficiency of Python developers. Its intelligent code completion, advanced debugging capabilities, and
seamless integration with version control systems streamline the development workflow. PyCharm
provides support for various Python frameworks and libraries, facilitating the creation of diverse
applications ranging from web development to data analysis and machine learning. With its extensive
plugin ecosystem and customizable settings, PyCharm caters to the unique needs of developers,
enabling them to build high-quality software with ease. Whether working on personal projects or large-
scale enterprise applications, PyCharm remains a preferred choice for Python developers seeking a
feature-rich and intuitive development environment.

4.6 Streamlit
Streamlit is a Python library that simplifies the creation of interactive web applications for data science
and machine learning projects. It offers a straightforward and intuitive way to build user-friendly
interfaces without the need for extensive web development experience. With Streamlit, developers can
seamlessly integrate data visualizations, input widgets, and text elements to create dynamic
applications that enable users to explore and interact with data in real-time. Its declarative syntax and
automatic widget rendering make prototyping and deploying applications quick and efficient.
Streamlit's seamless integration with popular data science libraries like Pandas, Matplotlib, and
TensorFlow further enhances its capabilities, allowing developers to leverage their existing knowledge
and tools. Overall, Streamlit empowers data scientists and machine learning engineers to share
insights, prototypes, and models with stakeholders effectively, accelerating the development and
deployment of data-driven applications.

4.7 Jupyter Notebook


Jupyter Notebook is an open-source web application that allows users to create and share documents
containing live code, equations, visualizations, and narrative text. It supports various programming
languages, including Python, R, and Julia, making it a versatile tool for data analysis, scientific
computing, machine learning, and more. Users can write code in individual cells and execute them
independently, seeing the results directly within the notebook. This interactive nature facilitates
experimentation and rapid prototyping. Moreover, Jupyter Notebook integrates seamlessly with
libraries and frameworks commonly used in data science, such as NumPy, Pandas, Matplotlib, and
TensorFlow. One of the standout features of Jupyter Notebook is its support for Markdown, a
lightweight markup language, enabling users to create rich-text documents with formatted text, images,
links, and mathematical equations. This makes it an excellent platform for documenting workflows,
explaining code snippets, and presenting research findings. Furthermore, Jupyter Notebooks can be
easily shared and collaborated on through platforms like GitHub, allowing for reproducible research
and collaborative analysis.

5. PROBLEM DESCRIPTION

5.1 PROBLEM DEFINITION


The Personalized Interviewer application is designed to assist technical students in preparing
for job interviews by generating tailored interview questions based on their resumes. It analyzes the
content of uploaded PDF resumes, extracts relevant information, and utilizes machine learning models
to create personalized interview simulations. The objective is to enhance students' interview
preparation experience by offering a user-friendly interface and personalized feedback mechanism.

We used OpenAI and LangChain to bridge the gap between theoretical knowledge and
practical skills. The main objective is to develop a reliable system for interview preparation that
provides accurate results.

5.2 PROJECT OVERVIEW


Our project's primary goal is to assist users, particularly technical students, in preparing for job
interviews effectively. By allowing users to record their answers using audio input, the system aims to
facilitate practice and improvement, ultimately enhancing their interview performance and boosting
confidence levels. We utilized a combination of LangChain, OpenAI, and Streamlit to develop an
advanced interview preparation tool tailored for technical students. Leveraging LangChain's text
processing capabilities, OpenAI's language models, and Streamlit's user-friendly interface, our project
offers a seamless experience for users.

We employed LangChain and OpenAI to analyze the content of uploaded PDF resumes and generate
personalized interview questions. The system adapts its approach based on the type of input provided,
whether it's textual information from resumes or audio recordings of user responses. This
comprehensive solution empowers users to practice interview scenarios effectively, bridging the gap
between theoretical knowledge and practical skills required in the competitive tech industry.

Fig-5 Overview

The steps involved in the project are: -

• Upload a PDF file (resume)
• Extract text from the PDF
• Split the text into chunks
• Load or create the vector store
• Generate questions
• Record answers
• Convert audio to text (answers)
• Analyze questions and corresponding answers after all recordings
• Display a score between 0 and 100, along with the areas of improvement for the betterment of
the candidate.

The output of our project is a user-friendly interface where technical students can upload their
resumes; the final result is a score between 0 and 100 for each question, along with the areas of
improvement.

5.3. MODULE DESCRIPTION

PYTHON AND STREAMLIT FRAMEWORK


PyCharm is a powerful integrated development environment (IDE) specifically designed for Python
development. Developed by JetBrains, it offers advanced code editing tools such as syntax highlighting,
code completion, and refactoring. Integrated with version control systems like Git, PyCharm facilitates
efficient project management. Its intelligent code analysis and debugging capabilities enable quick
error identification and resolution, ensuring application reliability. PyCharm serves as an essential tool
for Python developers, offering a robust environment for writing, testing, and debugging Python code.

Python, as described in Section 4.3, is an interpreted, object-oriented, high-level programming
language whose simple, readable syntax, dynamic typing, and extensive standard library make it
well suited for rapid application development and for gluing the components of this project together.

Python Streamlit is a powerful open-source framework that allows developers to create interactive web
applications with ease. With its simple and intuitive syntax, developers can quickly build data-driven
applications using Python code. Streamlit provides various components and widgets for creating
interactive elements such as buttons, sliders, and charts, making it ideal for building user-friendly
interfaces.

Streamlit App Setup and PDF Processing


The script commences with the importation of essential libraries, including Streamlit for web
application development, PyPDF2 for PDF manipulation, and various modules from the LangChain
library tailored for natural language processing endeavors. Configurations for the Streamlit sidebar are
established to furnish users with pertinent information regarding the application's functionality. The
OpenAI API key is designated as an environment variable to authorize subsequent API requests.
Subsequently, functions for audio recording and conversion are defined, setting the groundwork for
capturing user responses. The main() function orchestrates the setup of the Streamlit application
interface. Within this function, users upload PDF files, which are then parsed and analyzed utilizing
PyPDF2. Text extraction ensues, followed by segmentation into manageable chunks, culminating in
the creation of a vector store using LangChain's FAISS module. Questions pertinent to the PDF
content are generated via an OpenAI language model. Finally, the user interface is updated to
showcase the generated questions, awaiting user input.
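A condensed sketch of the flow just described, using the classic LangChain APIs from the sample code in Section 7.2, follows; the prompt wording, chunk sizes, and variable names such as uploaded_file are illustrative:

from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI

# uploaded_file comes from st.file_uploader in the Streamlit app.
reader = PdfReader(uploaded_file)
text = "".join(page.extract_text() or "" for page in reader.pages)

# Segment the resume text and index it in a FAISS vector store.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_text(text)
store = FAISS.from_texts(chunks, embedding=OpenAIEmbeddings())

# Ask the LLM for tailored questions grounded in the resume content.
llm = OpenAI(temperature=0.7)
questions = llm("Generate five technical interview questions "
                "based on this resume:\n" + text[:4000]).strip().split("\n")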

Recording and Analyzing User Responses

Upon PDF upload and question generation, the interface prompts users to commence answering
questions sequentially. Each question triggers the initiation of audio recording upon user interaction
with a designated button, leveraging the device's microphone. Recorded audio is subsequently
transcribed into text format using the Google Speech Recognition API. Captured responses are stored
in session state variables for subsequent analysis. Upon completion of all questions, the application
proceeds to analyze user responses. A formulated query facilitates scrutiny of questions and
corresponding answers, yielding scores and areas for improvement for each question-answer pair.
LangChain's question-answering capabilities process the query, presenting findings to the user via the
application interface.
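A condensed sketch of the two helpers this paragraph describes is shown below, assuming the sounddevice, soundfile, and SpeechRecognition libraries imported in the sample code; the duration and sample rate are illustrative defaults:

import tempfile
import sounddevice as sd
import soundfile as sf
import speech_recognition as sr

def record_audio(duration=30, fs=44100):
    # Record from the default microphone, then save to a temporary WAV file.
    recording = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype='int16')
    sd.wait()  # block until the recording finishes
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix='.wav')
    sf.write(tmp.name, recording, fs)
    return tmp.name

def convert_audio_to_text(wav_path):
    # Transcribe the WAV file via the Google Speech Recognition API.
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)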

Fig- Module control flow

Error Handling and Overall Application Workflow


The script encompasses robust error handling mechanisms to gracefully navigate exceptions that may
arise during execution. Instances of errors, such as file upload or audio recording mishaps, prompt the
display of informative error messages, preserving a seamless user experience. Throughout the
codebase, various control flow structures, including conditional statements and loops, orchestrate the
application's workflow and handle diverse scenarios adeptly. Modular code architecture enhances
maintainability and readability, facilitating comprehension and modification endeavors. In sum, the
script embodies adept utilization of libraries and tools, such as Streamlit, PyPDF2, and LangChain,
culminating in the development of an interactive and user-centric application tailored for personalized
interview preparation.

In the project, LangChain's LLM (Large Language Model) integration plays a crucial role in generating
tailored interview questions based on the content of uploaded resumes. Leveraging advanced natural
language processing techniques, the LLM comprehensively analyzes the textual data to identify
relevant skills and experiences. It then formulates personalized questions to simulate real-world
interview scenarios. Additionally, the LLM evaluates user responses, providing constructive feedback
and areas for improvement. By harnessing the power of the LLM, the project enhances interview
preparation by offering dynamic and targeted question-answering interactions, ultimately empowering
users to refine their technical communication skills.

Fig- Implementations

The implementation consists of following modules:


• The PdfReader() class initializes a PdfReader object, allowing the script to read the content of
PDF files.
• The extract_text() method is then utilized to extract text content from individual pages of the
PDF, enabling further processing and analysis
• The load_qa_chain() function is responsible for loading a question-answering chain for
processing documents. This chain is essential for generating responses to user queries based
on the content of the documents provided (see the sketch after this list).
• Similarly, the get_openai_callback() function retrieves an OpenAI callback function, which is
crucial for interacting with OpenAI's API during the question-answering process. These
functions encapsulate complex logic and functionality, enabling streamlined document
processing and response generation.
• The record_audio() function facilitates audio recording for a specified duration using the
sounddevice library and saves the recorded audio to a temporary WAV file. This functionality
is vital for allowing users to provide verbal responses to interview questions, adding an
interactive element to the application.
• Additionally, the convert_audio_to_text() function leverages the Google Speech Recognition
API to convert recorded audio files to text format. This conversion enables seamless
integration of spoken responses into the question-answering workflow.
• The RecursiveCharacterTextSplitter() class initializes a text splitter object designed to break
down text into smaller, manageable chunks. This functionality aids in processing large
volumes of text efficiently, particularly when dealing with lengthy documents such as PDF
files.
• Furthermore, the OpenAIEmbeddings() class initializes an embeddings object used for
handling text embeddings, which are essential for various natural language processing tasks
such as semantic similarity analysis.
• The FAISS.from_texts() method is employed to create a FAISS vector store from the text data
extracted from documents. This vector store facilitates efficient similarity searches and other
vector-based operations, enhancing the performance of the question-answering system.
• Within the Streamlit framework, several functions and widgets are utilized to create the user
interface and manage application state. Functions such as st.file_uploader(), st.header(),
st.write(), st.button(), st.title(), st.empty(), and st.error() are employed to display text, widgets,
and interactive elements on the Streamlit app.
• Additionally, the st.session_state attribute is utilized to access and manage session state
variables, enabling data persistence and user interaction tracking within the application.
Overall, these functionalities contribute to the creation of a user-friendly and interactive
interview preparation tool.
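A minimal sketch of the analysis step built from load_qa_chain() and get_openai_callback(), as referenced in the list above, follows; the store variable is the FAISS vector store built earlier, and the query wording is illustrative:

from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback

query = ("For each interview question and the candidate's answer, "
         "give a score from 0 to 100 and list areas of improvement.")
docs = store.similarity_search(query)  # fetch the most relevant resume chunks

chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
with get_openai_callback() as cb:  # tracks token usage and cost of the call
    analysis = chain.run(input_documents=docs, question=query)
print(analysis)
print("tokens used:", cb.total_tokens)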

Fig-Internal flow control

6. SYSTEM DESIGN

6.1 Introduction to UML


Unified Modeling Language (UML) is a general-purpose modeling language. The main aim of UML is
to define a standard way to visualize the way a system has been designed. It is quite like blueprints
used in other fields of engineering. UML is not a programming language; it is rather a visual language.
We use UML diagrams to portray the behavior and structure of a system. UML helps software
engineers, businessmen and system architects with modeling, design and analysis. The Object
Management Group (OMG) adopted the Unified Modelling Language as a standard in 1997. It has
been managed by OMG ever since. The International Organization for
Standardization (ISO) published UML as an approved standard in 2005. UML has been revised over
the years and is reviewed periodically.

Why we need UML


• Complex applications need collaboration and planning from multiple teams and hence require
a clear and concise way to communicate amongst them.
• Businessmen do not understand code, so UML becomes essential for communicating the
system's essential requirements, functionalities, and processes to non-programmers.
• A lot of time is saved down the line when teams can visualize processes, user interactions and
static structure of the system.

UML is linked with object-oriented design and analysis. UML makes the use of elements and forms
associations between them to form diagrams. Diagrams in UML can be broadly classified as:
• Structural Diagrams – Capture static aspects or structure of a system. Structural Diagrams
include Component Diagrams, Object Diagrams, Class Diagrams and Deployment Diagrams.
• Behaviour Diagrams – Capture dynamic aspects or behaviour of the system. Behaviour
diagrams include Use Case Diagrams, State Diagrams, Activity Diagrams and Interaction
Diagrams.

Fig-Building Blocks of UML

6.2 Building Block of the UML


The vocabulary of the UML encompasses three kinds of building blocks:
• Things
• Relationships
• Diagrams

Things are the abstractions that are first-class citizens in a model; relationships tie these things
together; diagrams group interesting collections of things.

Things in the UML


There are four kinds of things in the UML:
• Structural things
• Behavioural things
• Grouping things
• Annotational things

These things are the basic object-oriented building blocks of the UML. You use them to write well-
formed models.

Structural Things
Structural things are the nouns of UML models. These are the mostly static parts of a model,
representing elements that are either conceptual or physical. Collectively, the structural things are
called classifiers.

A class is a description of a set of objects that share the same attributes, operations, relationships, and
semantics. A class implements one or more interfaces. Graphically, a class is rendered as a rectangle,
usually including its name, attributes, and operations

Class - A Class is a set of identical things that outlines the functionality and properties of an object. It
can also represent an abstract class whose functionalities are not defined.

Interface - A collection of functions that specify a service of a class or component, i.e., Externally
visible behavior of that class.

Collaboration - A larger pattern of behaviors and actions. Example: All classes and behaviors that
create the modeling of a moving tank in a simulation.

Use Case - A sequence of actions that a system performs that yields an observable result. It is used to
structure behavior in a model and is realized by a collaboration.

Component - A physical and replaceable part of a system that implements a number of interfaces.
Example: a set of classes, interfaces, and collaborations.

Node - A physical element existing at run time that represents a resource.

Behavioral Things
Behavioral things are the dynamic parts of UML models. These are the verbs of a model, representing
behavior over time and space. In all, there are two primary kinds of behavioral things:
• Interaction
• State machine

Interaction
It is a behavior that comprises a set of messages exchanged among a set of objects or roles within a
particular context to accomplish a specific purpose. The behavior of a society of objects or of an
individual operation may be specified with an interaction. An interaction involves a number of other
elements, including messages, actions, and connectors (the connection between objects). Graphically, a
message is rendered as a directed line, almost always including the name of its operation.

State machine
State machine is a behaviour that specifies the sequences of states an object or an interaction goes
through during its lifetime in response to events, together with its responses to those events. The
behaviour of an individual class or a collaboration of classes may be specified with a state machine. A
state machine involves a number of other elements, including states, transitions (the flow from state to
state), events (things that trigger a transition), and activities (the response to a transition). Graphically,
a state is rendered as a rounded rectangle, usually including its name and its substates.

Grouping Things
Grouping things can be defined as a mechanism to group elements of a UML model together. There is
only one grouping thing available.

Package − Package is the only one grouping thing available for gathering structural and behavioural
things.


Annotational Things
Annotational things are the explanatory parts of UML models. These are the comments you may apply to describe, illuminate, and remark on any element in a model. There is one primary kind of annotational thing, called a note. A note is simply a symbol for rendering constraints and comments attached to an element or a collection of elements.

Relationships in the UML

Relationships are another important building block of the UML. A relationship shows how elements are associated with each other, and this association describes the functionality of an application.

There are four kinds of relationships in the UML:

• Dependency
• Association
• Generalization
• Realization

Dependency
A dependency is a relationship in which a change to one element (the independent one) may affect the semantics of the other element (the dependent one). Graphically, a dependency is rendered as a dashed line, possibly directed, and occasionally including a label.
Association
An association is basically a set of links that connects the elements of a UML model. It also describes how many objects are taking part in that relationship.

Generalization
Generalization is a specialization/generalization relationship in which the specialized element (the child) builds on the specification of the generalized element (the parent). The child shares the structure and the behavior of the parent. Graphically, a generalization relationship is rendered as a solid line with a hollow arrowhead pointing to the parent.

Realization
Realization can be defined as a relationship in which two elements are connected: one element describes some responsibility which it does not implement, and the other element implements it. This relationship typically exists between interfaces and the classes or components that realize them.
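
As a concrete illustration in code rather than notation, the following minimal Python sketch, written for this report with hypothetical class names, shows generalization as inheritance and realization as the implementation of an abstract interface:

from abc import ABC, abstractmethod

# Realization: Printable describes a responsibility without implementing it
class Printable(ABC):
    @abstractmethod
    def print_summary(self) -> None:
        ...

# Generalization: Employee is the parent (generalized element); it also
# realizes the Printable interface
class Employee(Printable):
    def __init__(self, name: str):
        self.name = name

    def print_summary(self) -> None:
        print(f"Employee: {self.name}")

# Manager is the child (specialized element); it shares the structure and
# behavior of Employee and specializes it
class Manager(Employee):
    def __init__(self, name: str, team_size: int):
        super().__init__(name)
        self.team_size = team_size

    def print_summary(self) -> None:
        print(f"Manager: {self.name}, team of {self.team_size}")

Manager("Asha", 4).print_summary()  # prints: Manager: Asha, team of 4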

6.3 UML DIAGRAMS

UML is a modern approach to modeling and documenting software, based on diagrammatic representations of software components. The diagrams are the final output of the modeling activity, and together they represent the system.

UML includes the following diagrams:

• Class diagram
• Object diagram
• Component diagram
• Composite structure diagram
• Use case diagram
• Sequence diagram
• Communication diagram
• State diagram
• Activity diagram

7. DEVELOPMENT

7.1 RAW DATA

Fig- Resume pdf file

Fig- Resume pdf file

Fig- Resume pdf file

7.2 SAMPLE CODE (app.py)
import streamlit as st
import pickle
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback
import os
import time
import sounddevice as sd
import soundfile as sf
import tempfile
import speech_recognition as sr

# Initialize session state variables
if 'questions' not in st.session_state:
    st.session_state.questions = []

if 'recorded_answers' not in st.session_state:
    st.session_state.recorded_answers = {}

if 'n' not in st.session_state:
    # Number of questions to generate
    st.session_state.n = 5

if 'analysis' not in st.session_state:
    # Whether the uploaded resume has already been analysed
    st.session_state.analysis = False

# Sidebar contents
with st.sidebar:
    st.title('🤗💬 LLM Application')
    st.markdown('''
## About
References used for building the APP:
- [Streamlit](https://streamlit.io/)
- [LangChain](https://python.langchain.com/)
- [OpenAI](https://platform.openai.com/docs/models) LLM model
''')

# Set OpenAI API key (redacted here; in practice, read it from the
# environment instead of hard-coding it in source)
os.environ['OPENAI_API_KEY'] = "sk-..."

file_path = "faiss_store_openai.pkl"

main_placeholder = st.empty()

def record_audio(duration, fs):
    # Record `duration` seconds of mono audio and write it to a temporary WAV file
    with tempfile.NamedTemporaryFile(delete=False, suffix='.wav') as tmpfile:
        recording = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype='int16')
        sd.wait()
        sf.write(tmpfile.name, recording, fs, format='wav')
        return tmpfile.name

def convert_audio_to_text(audio_file):
    # Transcribe the recorded WAV file using Google's speech recognition service
    r = sr.Recognizer()
    try:
        with sr.AudioFile(audio_file) as source:
            audio_data = r.record(source)
        text = r.recognize_google(audio_data)
        return text
    except sr.UnknownValueError:
        return ""
    except sr.RequestError as e:
        return f"Could not request results from Google Speech Recognition service; {e}"

# Sample rate (Hz)
fs = 44100

# Default recording duration (seconds)
duration = 60

# Main function
def main():
    st.header("Personalized Interviewer 💬")

    # Upload a PDF file
    pdf = st.file_uploader("Upload your PDF", type='pdf')
    main_placeholder.text("Data Loading...Started...✅✅✅")

    if pdf is not None:
        try:
            if not st.session_state.analysis:
                # Read PDF content
                pdf_reader = PdfReader(pdf)

                # Check if the PDF is empty
                if len(pdf_reader.pages) == 0:
                    st.error("The uploaded PDF is empty.")
                    return

                # Extract text from the PDF
                text = ""
                for page_num in range(len(pdf_reader.pages)):
                    page_text = pdf_reader.pages[page_num].extract_text()
                    if page_text:
                        text += page_text
                    else:
                        # Break out of the loop if the page is empty,
                        # treating it as the end of the file
                        break

                # Check if the extracted text is empty
                if not text.strip():
                    st.error("No text found in the PDF.")
                    return

                # Split text into overlapping chunks for embedding
                text_splitter = RecursiveCharacterTextSplitter(
                    chunk_size=1000,
                    chunk_overlap=200,
                    length_function=len
                )
                chunks = text_splitter.split_text(text=text)

                # Build the vector store from the resume chunks and persist it
                if not os.path.exists(file_path):
                    st.write("Analysing your resume...")
                embeddings = OpenAIEmbeddings()
                st.session_state.vectorstore = FAISS.from_texts(chunks, embedding=embeddings)
                st.session_state.vectorstore.save_local("faiss_store_openai")

                # Generate questions from the resume content
                query = f"Give {st.session_state.n} technical questions on the skills and projects from the above pdf"

                if query:
                    docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
                    llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.6, max_tokens=500)
                    chain = load_qa_chain(llm=llm, chain_type="stuff")
                    with get_openai_callback() as cb:
                        response = chain.run(input_documents=docs, question=query)

                    if not st.session_state.questions:
                        # Questions come back separated by '?'; drop the trailing fragment
                        st.session_state.questions = list(response.split('?'))[0:-1]

                    # Initialize recorded answers
                    if not st.session_state.recorded_answers:
                        for i in range(st.session_state.n):
                            curr_qns_ans = {}
                            curr_qns_ans["Question"] = st.session_state.questions[i]
                            curr_qns_ans["Answer"] = "Not Answered Yet"
                            st.session_state.recorded_answers[i] = curr_qns_ans

            # Display questions and record answers
            st.write("Analysis completed. Start answering Below Questions\n")
            st.session_state.analysis = True

            st.header("Questions")
            for i, question in enumerate(st.session_state.questions):
                st.write(f"{question}")
                start_recording = st.button(f"Start Answering {i+1}")
                if start_recording:
                    st.write("Listening...")
                    audio_file = record_audio(duration, fs)
                    st.write("Time's up!")

                    # Convert audio to text
                    text = convert_audio_to_text(audio_file)

                    # Store the recorded answer
                    curr_qns_ans = {}
                    curr_qns_ans["Question"] = question
                    curr_qns_ans["Answer"] = text
                    st.session_state.recorded_answers[i] = curr_qns_ans

            # Display all recorded answers
            if st.session_state.recorded_answers:
                st.header("All Your Answers")
                for qn_num, ele in st.session_state.recorded_answers.items():
                    st.write(f'{ele["Question"]}')
                    st.write(f'Answer: {ele["Answer"]}')

            # Analyze questions and corresponding answers after all recordings
            query = f"""Analyze all the above questions and corresponding answers and give a score between 0 to 100 and also provide the areas of improvement for betterment of the candidate. The list of questions and answers are as follows, providing a review only for answered questions: {str(st.session_state.recorded_answers)}. Give analysis for every question and corresponding answer. The format of the review is '[Question number] : [score]/100 Areas of improvement: [suggestions to improve]'. Every question's response should be separated by '###'. For example:

Question 1: Score - [score]/100 Areas of improvement: The candidate provided a brief answer, but it could be improved by providing more specific details about the methods used to fine-tune the VGG16 model and the results achieved ###

Question 2: Score - N/A Areas of improvement: The candidate did not provide an answer for this question, so no score or areas of improvement can be given

and question number starts from 1. Please give each answer in a newline"""

            # Only run the review once every question has a recorded answer
            count = 0
            for i, question in enumerate(st.session_state.questions):
                if st.session_state.recorded_answers[i]["Answer"] != "Not Answered Yet":
                    count += 1
            answered_all = True if count == st.session_state.n else False

            if query and answered_all:
                st.title("Analysis and Review")
                docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
                llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.6, max_tokens=1000)
                chain = load_qa_chain(llm=llm, chain_type="stuff")
                with get_openai_callback() as cb:
                    response = chain.run(input_documents=docs, question=query)

                # Each per-question review is separated by '###'
                reviews = response.split("###")
                for review in reviews:
                    st.write(review)

        except Exception as e:
            st.error(f"An error occurred: {str(e)}")

# Run the main function
if __name__ == '__main__':
    main()
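
The application is launched with Streamlit's command-line runner. A typical invocation, assuming the dependencies implied by the imports above (package names as of the time of writing), is:

pip install streamlit PyPDF2 langchain openai faiss-cpu sounddevice soundfile SpeechRecognition
streamlit run app.py

By default, Streamlit serves the application in the browser at http://localhost:8501.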

Fig- Resume Uploading

Fig- Analyzing Resume

Fig- Interview Questions generation from Resume

Fig- Recording Answer for Interview Question

Fig- Answer Completion and speech to text conversion and analysis

Fig- Display of all Questions and Answers

Fig- Analysis and suggestions after all questions have been answered

Fig- Error and Exception handling for invalid resumes

8. TESTING

8.1 INTRODUCTION TO TESTING

Software testing is defined as an activity to check whether the actual results match the expected results and to ensure that the software system is defect-free. It involves the execution of a software component or system component to evaluate one or more properties of interest. Testing is required for evaluating the system. This phase is a critical phase of software quality assurance and presents the ultimate review of coding.

Importance of Testing
The importance of software testing cannot be overstated. This process is often skipped, and as a result the product and the business might suffer. To understand the importance of testing, here are some key points:
• Software testing saves money
• Provides security
• Improves product quality
• Ensures customer satisfaction

Testing can be carried out in different ways. The main idea behind testing is to reduce errors with minimum time and effort.

Benefits of Testing
• Cost-Effective: This is one of the important advantages of software testing. Testing any IT project on time helps you save money in the long term. If bugs are caught in an earlier stage of software testing, they cost less to fix.
• Security: This is the most vulnerable and sensitive benefit of software testing. People look for trusted products, and testing helps in removing risks and problems earlier.
• Product Quality: Quality is an essential requirement of any software product. Testing ensures that a quality product is delivered to customers.
• Customer Satisfaction: The main aim of any product is to satisfy its customers, and UI/UX testing ensures the best user experience.

Different Types of Testing

Unit Testing: Unit tests are very low level, close to the source of your application. They consist of testing the individual methods and functions of the classes, components, or modules used by your software. Unit tests are in general quite cheap to automate and can be run very quickly by a continuous integration server.
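
For example, a unit test for this project could target one small piece of logic in isolation. The sketch below uses pytest and a hypothetical split_questions helper that mirrors how app.py splits the LLM response on '?'; it is an illustration written for this report, not part of the submitted code:

# test_questions.py
def split_questions(response: str) -> list:
    # Mirrors app.py: questions are separated by '?', and the trailing
    # fragment after the last '?' is discarded
    return response.split('?')[:-1]

def test_split_questions_drops_trailing_fragment():
    response = "What is FAISS? Explain transfer learning? "
    assert split_questions(response) == ["What is FAISS", " Explain transfer learning"]

def test_split_questions_empty_response():
    assert split_questions("") == []

Running pytest test_questions.py executes both tests in well under a second, which is what makes unit tests cheap to run on every commit.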

Integration Testing: Integration tests verify that the different modules or services used by your application work well together. For example, they can test the interaction with the database or make sure that microservices work together as expected. These types of tests are more expensive to run, as they require multiple parts of the application to be up and running.
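
An integration test for this project could exercise the split-embed-search pipeline end to end without calling OpenAI. The sketch below is illustrative only and assumes LangChain's FakeEmbeddings class (which generates random vectors locally), so it verifies the wiring between the text splitter, the FAISS index, and similarity search rather than embedding quality:

# test_pipeline.py - integration-test sketch (assumes langchain's FakeEmbeddings)
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import FakeEmbeddings
from langchain.vectorstores import FAISS

def test_chunks_are_indexed_and_searchable():
    text = "Skills: Python, machine learning. " * 100
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200, length_function=len)
    chunks = splitter.split_text(text=text)
    assert len(chunks) > 1  # the splitter produced multiple chunks

    store = FAISS.from_texts(chunks, embedding=FakeEmbeddings(size=64))
    docs = store.similarity_search("technical questions on skills", k=3)
    assert len(docs) == 3  # the index returns the requested number of documents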

Functional Tests: Functional tests focus on the business requirements of an application. They only verify the output of an action and do not check the intermediate states of the system when performing that action. There is sometimes confusion between integration tests and functional tests, as both require multiple components to interact with each other. The difference is that an integration test may simply verify that you can query the database, while a functional test would expect to get a specific value from the database, as defined by the product requirements.

Regression Testing: Regression testing is a crucial stage for the product and very useful for developers to identify the stability of the product under changing requirements. Regression testing is done to verify that a code change in the software does not impact the existing functionality of the product.

System Testing: System testing of software or hardware is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing is a series of different tests whose primary purpose is to fully exercise the computer-based system.

Performance Testing: Performance testing checks the speed, response time, reliability, resource usage, and scalability of a software program under its expected workload. The purpose of performance testing is not to find functional defects but to eliminate performance bottlenecks in the software or device.

Alpha Testing: This is a form of internal acceptance testing performed mainly by the in-house software QA and testing teams. Alpha testing is the last testing done by the test teams at the development site after the acceptance testing and before releasing the software for the beta test. It can also be done by the potential users or customers of the application, but it remains a form of in-house acceptance testing.

Beta Testing: This is the testing stage that follows the internal full alpha test cycle. It is the final testing phase, in which companies release the software to a few external user groups outside the company's test teams and employees. This initial software version is known as the beta version. Most companies gather user feedback on this release.

Black Box Testing: Also known as behavioral testing, this is a software testing method in which the internal structure/design/implementation of the item being tested is not known to the tester. These tests can be functional or non-functional, though they are usually functional.

Fig- BlackBox Testing

This method is named so because the software program, in the eyes of the tester, is like a black box, inside which one cannot see. This method attempts to find errors in the following categories:
• Incorrect or missing functions
• Interface errors
• Errors in data structures or external database access
• Behavior or performance errors
• Initialization and termination errors

White Box Testing: White box testing (also known as clear box testing, open box testing, glass box testing, transparent box testing, code-based testing, or structural testing) is a software testing method in which the internal structure/design/implementation of the item being tested is known to the tester. The tester chooses inputs to exercise paths through the code and determines the appropriate outputs. Programming know-how and implementation knowledge are essential. White box testing goes beyond the user interface and into the nitty-gritty of a system. The method is named so because the software program, in the eyes of the tester, is like a white/transparent box, inside which one clearly sees.

Fig- Whitebox Testing

9. CONCLUSION

Our project leverages advanced technologies like LangChain and OpenAI to revolutionize the interview preparation process. By automating question generation based on resume content and enabling personalized audio responses, we offer a dynamic and efficient platform for candidates to hone their interview skills. The seamless integration of machine learning algorithms ensures objectivity, fairness, and real-time feedback, enhancing the overall interview experience.

Both benefits and drawbacks exist with our project. On the positive side, it automates question generation and response recording, streamlining the interview preparation process. Additionally, it provides personalized feedback and analysis, enhancing candidate performance and confidence. However, reliance on machine learning algorithms may introduce biases or inaccuracies in question generation, impacting the quality of interview practice. Our system may not fully replicate the nuances of human interaction in interview scenarios, and users should supplement their preparation with real-world practice and feedback.

The main challenge that we faced while working on this project was the system's dependence on internet connectivity and API access, which may limit accessibility and usability in certain environments.

10. FUTURE SCOPE

• The future scope of our project is expansive, driven by our overarching objective of revolutionizing interview preparation processes.
• As we continue to refine our system, we aim to leverage cutting-edge technologies to enhance user experience and effectiveness.
• This includes exploring advanced natural language processing techniques to generate more contextually relevant and diverse interview questions.
• Additionally, we envision integrating machine learning algorithms to provide personalized feedback and performance analytics to users.
• Moreover, we plan to expand the application's capabilities by incorporating features such as mock interview simulations and industry-specific question sets.
• These enhancements will ensure that our platform remains at the forefront of interview preparation innovation, catering to diverse user needs and preferences.
• These are some of the upgrades and enhancements that we intend to make.

11. REFERENCE LINKS

1. Data extraction from PDF:

2. OpenAI docs: https://platform.openai.com/docs/introduction

3. Google Cloud Speech-to-Text API: https://cloud.google.com/speech-to-text?hl=en

4. Google Cloud Text-to-Speech API: https://cloud.google.com/text-to-speech?hl=en

5. LangChain docs: https://python.langchain.com/docs/get_started/introduction