
Smart-Hire Personality Prediction Using ML

The document presents a study on using machine learning to predict personality traits based on the Big Five personality model, aiming to enhance recruitment processes. It discusses the development of a system that automates candidate selection by analyzing personality traits through various algorithms like KNN and Logistic Regression. The results indicate a significant improvement in prediction accuracy, showcasing the potential of data science and AI in human resource management.

2023 International Conference on Disruptive Technologies (ICDT)

Smart-Hire Personality Prediction Using ML


Isha Gupta¹, Manasvi Jain², Dr. Prashant Johri³
School of Computer Science & Engineering, Galgotias University, Gr. Noida, India
¹ishagupta2103@gmail.com, ²manasvijain266@gmail.com, ³johri.prashant@gmail.com
2023 International Conference on Disruptive Technologies (ICDT) | 979-8-3503-2388-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICDT57929.2023.10151367

Abstract—Through technological changes, data science and artificial intelligence are altering the world. One of the most significant uses of machine learning is the classification of people based on their personality features. We can see many applications of machine learning in our daily lives. Each individual in the world has a distinct personality type. Targeting particular demographics has made it possible to make marketing campaigns more effective. This is made possible by the availability of high-dimensional data. Such personality-based promotions are quite effective at raising brand awareness and enhancing the appeal of goods and services. Using the Big Five personality traits, we created a system for predicting personality. Every day, a large number of students take competitive exams with a strong personality component. The primary goal of these tests is to evaluate the student's talents and personality. This initiative makes it easier to write the personality test and to assess the subject's personality. The person can view their personality type and make improvements to their personality depending on the results of the personality classification. In our paper, we tried to combine a phrase frequency algorithm, used to determine a person's talent, with personality prediction using ML algorithms like KNN, CNN, and Logistic Regression. With this system, users can quickly determine their personality and level of technical proficiency.

Keywords—Big Five Personality Model, Feature Analysis, Personality Prediction, Personality Traits

I. INTRODUCTION
The "big five" personality traits are openness, conscientiousness, extroversion, agreeableness, and neuroticism. They are frequently referred to as "OCEAN" and occasionally "CANOE." These five personality traits cover a wide range of individual nature and are responsible for decision-making as well as personality variations. The model is being used by HR specialists to evaluate new hires and by marketers to understand the target consumers for their products. In this paper, the OCEAN Model is being used to construct the algorithm.

Openness (receptivity to new things): This personality quality, sometimes known as intelligence and inventiveness, reflects the openness to explore new things and to think outside the box. Insightfulness, inventiveness, and curiosity are characteristics of this trait.

Conscientiousness: The urge to exercise self-control, be cautious, and work hard. This quality involves determination, self-control, reliability, and consistency.

Extroversion: The tendency to seek out human communication or connection rather than spending time alone. This quality involves being gregarious, enthusiastic, and self-assured.

Agreeableness: A person's capacity for empathy and cooperation, which are indicators of their ability to relate with others. Inherent in this quality are tact, gentleness, and loyalty.

Neuroticism: The tendency to have unfavorable personality traits, unstable emotions, and damaging thoughts. Pessimism, worry, insecurity, and fearfulness are characteristics of this trait.

Fig 1. Big five factors represent individual’s personality

Objective of the System: Automating the selection of candidates is the key goal. The idea is to create a system that will make it easier to recognize the personality traits displayed by the applicant and learn more about them without actually meeting them. The company will have a better understanding of the candidate and be in a better position to choose the best applicant for the open position.

II. LITERATURE SURVEY
“Multiple Social Networking source of data for Text-based Character Prediction”
In order to extract features for a personality prediction system, this work proposed a multi-modal deep learning architecture that uses the pre-trained language models BERT, RoBERTa, and XLNet along with additional NLP features (sentiment analysis, TF-IGM, and the NRC emotion lexicon database). In terms of predicting personality traits, this strategy outperforms the previous method.
“A Big-Five Model based Neural Network Approach to Personality Prediction”
This model was designed by Mayuri Pundlik Kalghatgi [3]. The model analytics examine the parallelism between a person's linguistic information and personality qualities. The OCEAN model’s use of linguistic information allows personality characteristics to be identified. This demonstrates personality characteristics that are applicable to several disciplines, including business intelligence, marketing, and psychology.

“The General Personality Factor”
The article was published by Dimitri van der Linden. Using the intercorrelations of the Big Five personality traits as a starting point, this study assessed criterion-related validity in order to ascertain whether a general factor of personality (GFP) exists. The paper concluded that the GFP plays a key role, since the meta-analyses support a GFP at the highest level and associate it with supervisor-rated work performance. [6]

Gayatri Vaidya: To develop an end-to-end personality prediction network that employs a discrete technique and can successfully predict self-reported personality traits from an image, this system would first be developed by assembling a dataset with photographs, quality criteria, IQ tests, and personality tests. The main goals of the suggested method were to distinguish between an individual's internal characteristics and outward behaviors and to use a graph or percentage to show the results. [7]

A Comprehensive Electronic Recruitment System for Automated Personality Mining and Candidate Ranking was proposed by Athanasios Tsakalidis and Evanthia Faliagka in 2012 [5]. Candidate ranking is automated by this system. It was built on objective criteria, with the candidate's information taken from their LinkedIn page. The recruiter controlled the weights of the selection criteria, which were then used to determine the applicant's rank using the Analytic Hierarchy Process (AHP).

III. PROPOSED METHOD
A- Data Collection
The data set was gathered through interactions with potential employees and a variety of websites. The questions and answers were entered into Google Forms and saved as a CSV file for speedy data retrieval and training. Fig. 2 shows some questions for one of the Big Five personality traits, Openness. There is a predetermined range of responses for each question, from strongly agree to strongly disagree.

Fig.2. Sample Questions

B- Data Analysis
We use the StandardScaler from the scikit-learn library to scale the test dataset after dividing it into x and y test sets.

Fig.3. Pre-processing of Dataset

We ran our model in a Jupyter notebook. The Python libraries we used include matplotlib, sklearn, numpy, re, seaborn, and pandas.

C- System Architecture
The suggested recruitment model's two primary components are the administrator page and the candidate page. There are numerous other parts inside these pages. Users must log in using legitimate credentials in order to access them. While applicants would use the Candidate Page, the recruitment agency would use the Admin Page.

a- Section-1 Admin Page

Fig.4. Data flow diagram of Admin Page

Login: To configure the various system settings and gain access to the Admin Page's sub-sections, the admin must first log in.
Manage Questions: The administrator may include aptitude questions on any subject of his or her choosing, each with a multiple-choice response. The administrator may use a personality-related question based on the OCEAN model in this subsection to predict the candidate's personality.


Manage Jobs & Options: The administrator can control the selections based on the specifications of the employment position and the available job postings.
View Candidates: The administrator has access to all of the candidates' data.
View Results: The evaluation results for the shortlisted candidates are visible to the admin.

Fig.5. Admin System Workflow

b- Section-2 Candidate Page

Fig.6. Data flow diagram of Candidate Page

Registration: To access the following sections, the candidate must first complete the registration form and create their login credentials. A CV form must be completed and submitted by the applicant as part of the registration process.
Login: By providing the necessary information, the candidate can access the sub-sections.
Test: A personality and aptitude test can be taken online following a successful login. If the applicant meets the requirements established by the Admin, they will be able to view the job specifics and select the relevant position.
View Results: The test taker can see the results after finishing the test.
Logout: The candidate may exit the portal after viewing the results.

D- Design Algorithm
Machine learning (ML) algorithms are used to deliver the outcomes of the entire process prior to the interview stage. The process requires input training data, most of which is made up of earlier candidate selection judgments made by subject-matter experts. The eligibility ratings from the analysis determine a candidate's suitability for the position. The feedback on the candidate's claimed attributes is passed along to a learning algorithm, which uses the data to generate the evaluation. The system then creates the final ranked list used to choose the applicants. In the proposed approach, a set of input employee selection patterns is used to represent the training data set. The KNN (K-Nearest Neighbors) and Logistic Regression learning algorithms used in the prediction model performed well in the evaluation tests.

a- KNN (K-Nearest Neighbors)
The KNN method can be used for both classification and regression problems. Based on "feature similarity," the KNN algorithm predicts the values of new data points: the value assigned to a new point depends on how closely it resembles the points in the training dataset. [20]
Predictions are produced by searching the entire training set for the K instances (neighbors) that are most similar to the new instance (x). The output variable of these K instances is then summarized: in classification this is the modal (most common) class value, and in regression it is the mean output value. A distance measure is used to determine which K instances in the training dataset are most similar to the new input. [20]

b- Logistic Regression
Logistic regression is a supervised learning classification algorithm used to predict the probability of a target variable. Because the dependent variable is dichotomous, there are only two possible classes. [19]
Put simply, the dependent variable is binary, with data coded as either 1 (success/yes) or 0 (failure/no). [19]
Mathematically, a logistic regression model predicts P(Y=1) as a function of X. It is one of the simplest ML techniques and can be used to solve a variety of classification problems. [19]
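The KNN procedure described in the Design Algorithm subsection (scan the training set, keep the K closest instances under a distance measure, return the modal class) can be illustrated with a minimal plain-Python sketch. This is not the authors' code: the Euclidean distance, the value k=3, the toy feature vectors, and the "introvert"/"extrovert" labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point (full scan of the training set).
    distances = [
        (math.dist(features, x), label)
        for features, label in zip(train_X, train_y)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Classification: return the modal (most common) label among the neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two questionnaire scores per respondent.
train_X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["introvert", "introvert", "introvert", "extrovert", "extrovert", "extrovert"]

print(knn_predict(train_X, train_y, (1.0, 1.0), k=3))  # introvert
print(knn_predict(train_X, train_y, (4.0, 4.0), k=3))  # extrovert
```

For regression, the last step would return the mean of the neighbours' output values instead of the most common label.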

Fig.7. Candidate System Workflow

Fig.8. Machine Learning Algorithm

Logistic regression is named after the logistic function, which forms the basis of the method. The logistic function, also known as the sigmoid function, was developed by statisticians to describe the characteristics of population growth in ecology, which increases quickly and levels off at the environment's carrying capacity. This S-shaped curve maps any real-valued number onto a value between 0 and 1, but never exactly onto those limits.


The logistic function is

$$\frac{1}{1 + e^{-\mathit{value}}}$$

where value is the numerical value to be transformed and e is the base of the natural logarithms (Euler's number, available as the EXP() function in a spreadsheet). The logistic function was used to map the numbers between -5 and 5 into the range between 0 and 1. The results are plotted in Fig. 9.
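The mapping of the numbers from -5 to 5 into the range between 0 and 1 can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code; the function name `logistic` and the choice of integer steps are assumptions.

```python
import math


def logistic(value):
    """Logistic (sigmoid) function: 1 / (1 + e^(-value))."""
    return 1.0 / (1.0 + math.exp(-value))


# Map the integers from -5 to 5 into the open interval (0, 1).
for value in range(-5, 6):
    print(f"{value:3d} -> {logistic(value):.4f}")
```

The output rises monotonically from about 0.0067 at -5, through exactly 0.5 at 0, to about 0.9933 at 5, which is the S-shaped curve described in the text.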

Fig.9. Logistic Regression Function
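Before either classifier is trained, the paper scales the data with scikit-learn's StandardScaler and reports a 70/30 train/test split. As a rough plain-Python stand-in for those two steps (not the authors' code), the sketch below shuffles and splits a small hypothetical table, then standardises each column as z = (x - mean) / std. The helper names, the random seed, the toy rows, and the choice to compute the statistics from the training rows only are all assumptions.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(rows):
    """Return per-column (mean, population standard deviation) pairs."""
    columns = list(zip(*rows))
    return [(statistics.fmean(col), statistics.pstdev(col)) for col in columns]


def transform(rows, params):
    """Standardise each column: z = (x - mean) / std (0.0 if std is zero)."""
    return [
        [(x - mean) / std if std else 0.0 for x, (mean, std) in zip(row, params)]
        for row in rows
    ]


# Hypothetical stand-in for the questionnaire data: 10 rows, 2 numeric columns.
rows = [[float(i), float(i % 5) * 2.0] for i in range(1, 11)]

train, test = train_test_split(rows, test_fraction=0.3)
params = fit_scaler(train)            # statistics come from the training rows only
train_scaled = transform(train, params)
test_scaled = transform(test, params)

print(len(train), len(test))          # 7 3
```

After scaling, every training column has mean 0 and unit variance. Reusing the training statistics for the test rows keeps information about the test set out of the fitted scaler.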

Fig.10. Logistic Regression Result

E- Implementation
The dataset is split into training and testing portions and then scaled with the StandardScaler from the scikit-learn library. 70% of the data is used for training and 30% for testing. The dataset has 972 rows and 8 columns. Each row holds one participant's data: the candidate's age and gender as well as one of the OCEAN Model's five personality qualities.

Fig.11. Survey

The data is subsequently transformed into an array and fed into the Logistic Regression algorithm. The K-Means clustering algorithm is then used to cluster each personality attribute. As a result, the algorithm predicts each survey respondent's personality and outputs the result.

Fig.12. Result of Survey

IV. RESULTS AND DISCUSSIONS
Experiments on the proposed model produced the results in Table 1. The accuracy of the Logistic Regression algorithm increased from 26% to 86%, outperforming the KNN approach we used previously.

We discuss the analysis of the results in this part. Below, we go over the findings with examples of the output.

Fig.13. Admin Panel

A website with three main modules (Admin Panel, Candidate Panel, and Analysis Section) was implemented. Fig. 13 shows the Admin Panel, where system administrators or recruiters can sign up and log in. After logging into the site, the admin can add the category of test to be conducted. Then, depending on the demands of the position, questions from each category, such as those based on aptitude, personality, or any technical category, may be added. The administrator can view every question in every category.


Fig.14. Candidate Panel

Registered candidates log in to the system by authenticating with their Candidate ID and password. After logging in, they can add their CV details by entering their resume information. The candidate may then attempt the test by selecting it from the menu on the left. At the conclusion of the exam, the results are shown. To raise their results, candidates can take more than one test.

Fig.15. Predicted Output Plot

V. CONCLUSION
This personality prediction model is applicable to e-commerce websites, competitive examinations, psychometric testing, matrimony websites, and government organizations like the army, navy, and air force. The system uses the data set provided at the back end to automatically classify the user's personality after they attempt the survey. Since personality analysis and prediction have increased recently, more personality traits may be introduced in the future. The algorithms and data collection can be improved further to increase accuracy and benefit the career advising module. This process would help the human resources division choose the best candidate for a particular job opening, giving the business a knowledgeable employee. By ranking each CV, this system would make it easier to choose which ones to use. Candidates' test results, level of experience, credentials, and other factors all affect where they rank. This plan would lessen the workload for the human resources department.

REFERENCES
[1] Prof. Waheeda Dhokley, Randeria Kaiwan Jehangir, Shaikh Nabeel Rashid, and Shaikh Almas Mohd Sarwar, "A Novel Approach to Predict Personality of a Person."
[2] Tanuj Shankarwar, Siddharth Thorat, and Atharva Kulkarni, "Using machine learning, Personality Prediction via CV Analysis."
[3] Mayuri Pundlik Kalghatgi, Manjula Ramannavar, and Dr. Nandini S. Sidnal, 2015. "A Neural Network Approach to Personality Prediction based on the Big-Five Model.", International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 8, Volume 2, August.
[4] Clemens Stachl, Quey Au, Ramona Schoedel, and Markus Buhner, "Predicting personality from patterns of behaviour recorded using smartphones."
[5] Athanasios Tsakalidis, Giannis Tzimas, and Evanthia Faliagka, "An integrated e-recruitment system for automated personality mining and applicant ranking", 2012.
[6] D. van der Linden, J. te Nijenhuis, et al., "The General Factor of Personality."
[7] Gayatri Vaidya, Pratima Yadav, Reena Yadav, and Prof. Chandana Nighut, "Personality Prediction By Discrete Methodology," IOSR Journal of Engineering (IOSRJEN), ISSN (e): 2250-3021, ISSN (p): 2278-8719 Volume 14, pp. 10-13.
[8] Aleksandar Kartelj, Vladimir Filipovic, Veljko Milutinovic, "Novel approaches to automated personality classifications ideas & their potential", 35th International conference MIPRO, IEEE, 2012.
[9] Vishnu M. Menon, Rahulnath H. A, "A Novel Approach to Evaluate and Rank Candidates in a Recruitment Process by Estimating Emotional Intelligence Through Social Media Data."
[10] J. Golbeck, C. Robles, K. Turner, "Predicting personality with social media", CHI’11 Extended Abstract on Human Factors in Computing Systems, pp. 253-262, 2011.
[11] Fazel Keshtkar, Candice Burkett, Haiying Li and Arthur C Graesser, "Using Data Mining Techniques to detect the personality of players in an Educational Game", Springer International Publishing, 2014.
[12] Randall Wald, Taghi M. Khoshgoftaar, Amri Napolitano, Chris Sumner, "Using Twitter Content to Predict Psychopathy", 11th International conference on Machine learning & Applications, IEEE, 2012.
[13] Yago Saez, Carlos Navarro, Asuncion Mochon, Pedro Isasi, 2014. "A system for personality & Happiness detection", International Journal of Interactive Multimedia & Artificial Intelligence.
[14] Yilun Wang, "Understand Personality through social media", International of computer Science Stanford university.
[15] Bayu Yudha Pratama, Riyanarto Sarno, "Personality classification based on twitter text", International Conference on Data & Software Testing, 2015.
[16] Manasi Ombhase, Prajakta Gogate, Tejas Aptil, Karan sNair, Prof. Gayatri Hedge, "automated Personality classification using Data Mining Techniques", International Conference on Data & Software Testing, April 2017.
[17] I. Cantador, I. Fernández-Tobías, A. Bellogín, "Relating Personality types with user preferences in multiple entertainment domains", EMPIRE 1st Workshop on Emotions and Personality in Personalised Services, 2013.
[18] C.D. Manning, P. Raghavan, H. Schütze, "Introduction to Information Retrieval", Cambridge University Press, ISBN: 978-0-521-86571-5, 2008.
[19] Tutorialspoint: Machine Learning - Logistic Regression.
[20] Machine Learning Mastery: https://machinelearningmastery.com/k-nearestneighbors-for-machine-learning/
[21] Jayashree Rout, Sudhir Bagade, Pooja Yede, Nirmiti Patil, 2019. "Personality Evaluation and CV Analysis using Machine Learning Algorithm.", International Journal of Computer Science & Engineering ISSN: 2347-2693 Issue 5, Volume 7, May.