SkillSync Midterm Presentation

MIDTERM PROGRESS PRESENTATION ON

SkillSync: NLP-BASED RESUME RANKING SYSTEM
PRESENTED BY:
SAUJANYA SHRESTHA [PAS077BCT036]
DIPAK POUDEL [PAS077BCT019]
ANISH DAHAL [PAS077BCT005]
SAROJ ADHIKARI [PAS077BCT035]
OVERVIEW
01 INTRODUCTION
02 METHODOLOGY
03 WORK PROGRESS
04 REMAINING TASKS
INTRODUCTION
RESUME RANKING

A resume ranking system lets users compare resumes with job posts and calculate a match score.

Resumes are then ranked by match score, which significantly reduces the time and labor required to screen hundreds or thousands of resumes.
PROBLEM STATEMENTS

Managing and analyzing a high volume of resumes is tedious

Traditional screening introduces human bias and subjectivity

Manual analysis and ranking are time-consuming and inefficient


OBJECTIVES

To build a custom NER model for parsing relevant information from resumes and job posts

To build a system that compares resumes and job posts, calculates a match score and ranks the resumes
NLP & NER

Natural Language Processing (NLP) is a field of artificial intelligence that enables machines to understand, interpret, and generate human language.

Named Entity Recognition (NER) is a subtask of NLP that focuses on identifying and classifying specific entities (e.g., names, dates, locations) in text.

spaCy is a fast and efficient open-source library for NLP in Python.

It provides tools for tasks like tokenization, part-of-speech tagging, dependency parsing, and NER.
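
To make the idea of entity extraction concrete, here is a minimal sketch using spaCy. It is not the project's trained model: it uses a blank English pipeline with a rule-based `entity_ruler` as a stand-in, so it runs without downloading or training anything. The labels `SKILL` and `DEGREE`, the patterns, and the sample sentence are assumptions chosen for illustration.

```python
import spacy

# Blank English pipeline: tokenizer only, no pretrained components.
nlp = spacy.blank("en")

# Rule-based stand-in for a trained NER component (illustrative labels).
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "SKILL", "pattern": "Python"},
    {"label": "SKILL", "pattern": [{"LOWER": "machine"}, {"LOWER": "learning"}]},
    {"label": "DEGREE", "pattern": [{"LOWER": "bachelor"}, {"LOWER": "of"},
                                    {"LOWER": "engineering"}]},
])

doc = nlp("Bachelor of Engineering graduate skilled in Python and machine learning.")

# Tokenization result and the entities found by the ruler.
tokens = [token.text for token in doc]
entities = [(ent.text, ent.label_) for ent in doc.ents]

print(tokens)
print(entities)
```

A trained NER model exposes its predictions through the same `doc.ents` attribute, so code that consumes the entities would not need to change when the ruler is swapped for a statistical component.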
METHODOLOGY
TOOLS & TECHNOLOGIES
● NLP Libraries: spaCy
● Programming Language: Python
● Machine Learning: Scikit-Learn, TensorFlow, PyTorch
● Database: MongoDB
● Deployment: AWS or Docker or Google Cloud
● Development Tools: Jupyter Notebook, GitHub, VS Code
● Web Framework: Django or Flask or Streamlit
● Other Libraries & Modules: Pandas, NumPy, Matplotlib,
Seaborn, Pytesseract, Pillow, PyMuPDF, python-docx…
DATA COLLECTION

A resume dataset containing 1014 resumes, uploaded to Google Drive by Mr. Roman Shilpakar

659 job descriptions collected from various online sources and compiled into a single text file
DATA PREPROCESSING

The text file containing the job descriptions is manually annotated using an online NER annotator called “arunmozhi”

After annotation, a dataset is obtained in JSON format with the custom annotations of various entities.
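
Before training, it can help to sanity-check the annotated JSON. The sketch below assumes the export has an `annotations` list of `[text, {"entities": [[start, end, label], ...]}]` pairs with character offsets; the slides do not show the actual file layout, so treat this structure, the sample text, and the labels `JOB_TITLE` and `SKILL` as assumptions.

```python
import json
from collections import Counter

# Hypothetical export containing one annotated job post (layout is assumed).
raw = json.dumps({
    "classes": ["JOB_TITLE", "SKILL"],
    "annotations": [
        ["Hiring a Data Analyst with SQL and Python skills.",
         {"entities": [[9, 21, "JOB_TITLE"], [27, 30, "SKILL"], [35, 41, "SKILL"]]}],
    ],
})


def load_examples(raw_json):
    """Parse the export and check that entity spans are in range and non-overlapping."""
    data = json.loads(raw_json)
    examples = []
    for item in data["annotations"]:
        if not item:  # tolerate empty entries in the export
            continue
        text, meta = item
        spans = sorted((start, end, label) for start, end, label in meta["entities"])
        prev_end = 0
        for start, end, label in spans:
            if not (prev_end <= start < end <= len(text)):
                raise ValueError(f"bad span {(start, end, label)} in {text!r}")
            prev_end = end
        examples.append((text, spans))
    return examples


examples = load_examples(raw)

# How many annotations exist per label, and which text each span covers.
label_counts = Counter(label for _, spans in examples for _, _, label in spans)
surface_forms = [text[start:end] for text, spans in examples for start, end, _ in spans]

print(label_counts)
print(surface_forms)
```

Checking label counts early shows whether some entity types are too rare to learn, and printing the covered text catches offsets that are off by a character.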
MODEL GENERATION

A custom NER model is trained on the dataset using spaCy’s command-line interface
MODEL EVALUATION

Training the custom NER model on the resume dataset (1014 resumes) took over 3 hours and reached a score of 85%

Training the custom NER model on the job post dataset (659 job posts) took over 2 hours and reached a score of 70%
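
The slides do not state which metric the 85% and 70% scores refer to. One common way to score an NER model is entity-level precision, recall, and F1 over exact span matches. The sketch below shows that calculation on made-up gold and predicted spans; the function name `ner_prf` and all values are hypothetical, and this is not necessarily how the reported scores were computed.

```python
def ner_prf(gold, predicted):
    """Entity-level precision, recall and F1 using exact (start, end, label) matches."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up spans for one document: two of three predictions match the gold spans.
gold = [(0, 12, "JOB_TITLE"), (18, 21, "SKILL"), (26, 32, "SKILL")]
pred = [(0, 12, "JOB_TITLE"), (18, 21, "SKILL"), (40, 45, "SKILL")]

precision, recall, f1 = ner_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Reporting precision and recall alongside a single score makes it easier to tell whether a model misses entities or invents them.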
MODEL DEPLOYMENT

The custom NER model was saved to a local directory for deployment in a Python app

This stage involves using the NER models to parse resumes and job descriptions; the parsed entities will later be used to compare the two and calculate a match score
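
The comparison algorithm is still a remaining task, so the following is only one possible sketch of how parsed entities could be turned into a match score. It assumes entities arrive as `(text, label)` pairs, scores each label by the fraction of the job post's entities that also appear in the resume, and combines labels with weights. The labels, weights, and function names are hypothetical.

```python
from collections import defaultdict


def group_entities(entities):
    """Group (text, label) pairs into {label: set of normalized texts}."""
    grouped = defaultdict(set)
    for text, label in entities:
        grouped[label].add(text.strip().lower())
    return dict(grouped)


def match_score(resume, job, weights):
    """Weighted share of the job post's entities that the resume covers (0 to 1)."""
    score = 0.0
    total_weight = 0.0
    for label, weight in weights.items():
        required = job.get(label, set())
        if not required:  # the job post asks for nothing under this label
            continue
        found = resume.get(label, set())
        score += weight * len(required & found) / len(required)
        total_weight += weight
    return score / total_weight if total_weight else 0.0


def rank_resumes(resumes, job, weights):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, match_score(profile, job, weights))
              for name, profile in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative weights and entities; none of these values come from the project.
weights = {"SKILL": 0.7, "DEGREE": 0.3}
job = group_entities([("Python", "SKILL"), ("SQL", "SKILL"), ("Bachelor", "DEGREE")])
resumes = {
    "resume_a": group_entities([("python", "SKILL"), ("SQL", "SKILL"),
                                ("Bachelor", "DEGREE")]),
    "resume_b": group_entities([("Python", "SKILL"), ("Java", "SKILL")]),
}

ranking = rank_resumes(resumes, job, weights)
for name, score in ranking:
    print(name, round(score, 2))
```

Exact string overlap is the simplest choice; it treats "JS" and "JavaScript" as different skills, so a real system would likely need normalization or a similarity measure on top of this.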
WORK PROGRESS
TASKS COMPLETED

We collected a text file containing 659 job descriptions

We manually annotated the job descriptions to build a training dataset

We trained custom NER models on the publicly available resume dataset and the self-annotated job post dataset to parse key entities from resumes and job posts
SAMPLE OUTPUT (RESUME)
SAMPLE OUTPUT (JOB POST)
REMAINING TASKS
TASKS REMAINING

Extending and refining the datasets and model

Developing a comparison algorithm

Developing a Python web application

Final detailed documentation


REFERENCES

https://spacy.io/usage/training

https://drive.google.com/file/d/1dduSVVa0QKXvVx4_OGQXr16FFsTFb7rJ/view
THANK YOU!!!
