Analysis of Campus Placements in India

Tejansh Sachdeva (2110110555), Computer Science and Engineering, Shiv Nadar University, ts879@[Link]
Chaitanya Tandon (2110110171), Computer Science and Engineering, Shiv Nadar University, ct765@[Link]
Mitaali Singhal (2110110883), Computer Science and Engineering, Shiv Nadar University, ms923@[Link]

I. INTRODUCTION

There was a time when the success of an educational institute was judged by the level of skills and knowledge its students held, but today, for most undergraduate and postgraduate courses, the success of education is measured by successful campus placements. Placing all students through campus recruitment is considered an institutional obligation and a mark of merit. The ranking of institutions is based on the number of students placed successfully and the average salary offered. Campus placement is a program organized by a university or educational institute in collaboration with various companies to provide job opportunities to students who are nearing the completion of their studies. It is a widespread practice in the education sector.

This pivotal phase not only influences the professional trajectory of students but also significantly affects the standing of colleges and universities. Nevertheless, forecasting whether a student will secure a placement in a coveted company is a formidable challenge, as it hinges on many variables, including academic achievements, personal background, and prior work experience.

The dataset used covers a range of attributes, including secondary and higher education percentages, the number of internships undertaken, projects completed, and workshops attended. Using data mining techniques and machine learning algorithms, this paper aims to predict the placement outcomes of students from these attributes.

II. LITERATURE REVIEW

The significance of campus recruitment for both educational institutions and corporations is well established in the literature. Research highlights a prevailing mismatch between students' skills and industry expectations. Beyond technical expertise and subject knowledge, soft skills are emphasized as key factors in the campus recruitment process. To bridge this gap, industries are encouraged to engage with campuses through internships, curriculum development, and student workshops. Studies describe the characteristics of the campus recruitment process and note that engineering students primarily base their career choices on intrinsic factors. Notably, software services companies in India play a prominent role in campus recruitment, seeking students with logical and problem-solving abilities. Building a positive brand image on campuses is recognized as a pivotal factor in attracting top talent, especially among non-computer science/IT students who have multiple career options.

In a recent study [2], random forest algorithms were employed to classify a dataset of campus-placed and non-placed students, achieving an 86% accuracy rate. The study introduced a recommendation framework capable of predicting five distinct placement statuses for students, to help enhance their technical and social skills.

This model serves placement cells within academic institutions by identifying and advancing students with potential, based on their academic performance in 10th, 12th, and graduation, along with their current backlog status. The evaluation criteria include accuracy scores, percentage accuracy scores, confusion matrices, heat maps, and classification reports covering precision, recall, f1-score, and support. Several classification algorithms, such as Support Vector Machine, Gaussian Naive Bayes, K-Nearest Neighbor, Random Forest, Stochastic Gradient Descent, Logistic Regression, and Neural Networks, were applied to develop these classifiers.

In another study by Pal and Pal [4], the Naïve Bayes classifier emerged as the most effective choice for placement predictions. Ramanathan, Swarnalatha, and Gopal [5] adopted a different approach, using the sum of difference method to predict student placements from attributes such as age, academic records, and achievements, offering insights that higher learning institutions can use to improve education quality.

III. METHODOLOGY

Drawing from the insights presented in the literature review, a diverse array of robust analysis methods, including K-Nearest Neighbor and Random Forest, have been widely adopted due to their credibility and high [Link] our research, we follow a structured methodology that begins with data preprocessing to ensure data quality and uniformity.

Subsequently, we carry out feature selection, assessing the relevance of each feature in relation to the target feature. To test the predictive accuracy of our model, we split the dataset into a training subset and a smaller testing subset in a 4:1 ratio (80% training, 20% testing). This partition allows us to measure the performance of the model on data it has not seen during training. We then proceed to the core of our analysis, applying data mining classification techniques such as Support Vector Machines and Decision Trees.

The final step is an evaluation of the predictive accuracy and performance of each classification technique. This evaluation underpins our analysis, showing how effective each method is at forecasting campus placements and guiding our conclusions.
The steps involved in this system are as follows.

A. Data Acquisition
The campus placement dataset is collected from Kaggle ([Link] kar/campus-placement-data-for-engineering-colleges/data).
The dataset consists of attributes such as CGPA, Internships, Projects, Workshops, Aptitude Score, SoftSkillsRating, Extracurricular Activities, Placement Training (taken or not), and High School marks.

B. Handling Categorical Data:
Since we cannot work with categorical values directly, they are mapped to numbers.
Attributes such as Extracurricular Activities and Placement Training take the values 'Yes' and 'No'. We replace these values with the binary numbers 1 and 0 using the map function in Python.

For example:
df['training'] = df['training'].map({'Yes': 1, 'No': 0})

Additional data preprocessing is not required, since the data is clean and has no null values in any row.

C. Feature Selection:
In this section we evaluate the features/attributes and their correlation with the target feature.
We analyse, for example, the number of students placed with respect to the internships they did, the number of projects they completed, and whether or not they took placement training. We then deduce whether the placement count of students depends on these features.
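
As a rough illustration of this kind of analysis, the sketch below computes the share of placed students for each value of a feature. The records, the field names (`internships`, `training`, `placed`), and the helper `placement_rate_by` are made up for the example and are not taken from the Kaggle dataset.

```python
from collections import defaultdict

# Hypothetical toy records; field names and values are illustrative only.
students = [
    {"internships": 0, "training": 0, "placed": 0},
    {"internships": 0, "training": 1, "placed": 0},
    {"internships": 1, "training": 0, "placed": 0},
    {"internships": 1, "training": 1, "placed": 1},
    {"internships": 2, "training": 1, "placed": 1},
    {"internships": 2, "training": 1, "placed": 1},
]

def placement_rate_by(records, feature):
    """Return {feature value: fraction of students with that value who were placed}."""
    placed = defaultdict(int)
    total = defaultdict(int)
    for row in records:
        total[row[feature]] += 1
        placed[row[feature]] += row["placed"]
    return {value: placed[value] / total[value] for value in sorted(total)}

print(placement_rate_by(students, "internships"))
print(placement_rate_by(students, "training"))
```

A rate that changes strongly across the values of a feature, as it does for the toy internships column here, suggests that the placement count depends on that feature.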

D. Split Data:
Here, the data is divided into two parts: training data and testing data. 80% of the data is used to train our machine learning algorithm, and the remaining 20% is used to test whether the trained model works correctly.
We can also use external datasets to test our model and measure the accuracy of its predictions of the target feature.
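
A minimal sketch of an 80/20 split in plain Python is given below. The helper name `split_train_test`, the fixed seed, and the stand-in records are assumptions made for the example; in practice a library helper such as scikit-learn's `train_test_split` is commonly used for this step.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))  # stand-in for 100 student records
train, test = split_train_test(rows)
print(len(train), len(test))
```

Shuffling before splitting avoids a test set made up only of the last rows of the file, which could differ systematically from the rest.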

E. Machine Learning Algorithm:

a) Logistic Regression:
Logistic regression is a statistical method used to estimate the probability of an outcome of a dependent variable (y) from the values of the independent variables (x).
In our problem, the dependent variable is the placement status and the independent variables are the features selected in the previous step.
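
The sketch below shows how a logistic regression model turns feature values into a placement probability. The weights, the bias, and the choice of two features (CGPA and internships) are hypothetical values picked for illustration; they are not fitted to the dataset.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_placement_probability(features, weights, bias):
    """Probability of 'placed' under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical weights for (CGPA, internships); not fitted to any data.
weights = [1.5, 0.8]
bias = -12.0

p = predict_placement_probability([8.0, 1], weights, bias)
label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
print(round(p, 3), label)
```

Training a real model means choosing the weights and bias that best fit the training data; the prediction step stays the same.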

b) Decision Tree:
A decision tree is a tree-like graph in which nodes represent the points where we select a feature and ask a question, edges represent the answers to that question, and leaves represent the final output or class label.
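
To make the node, edge, and leaf roles concrete, here is a small hand-written tree and a function that walks it. The features, thresholds, and labels are hypothetical and were not learned from the dataset.

```python
# A hand-written tree; the features, thresholds and labels are hypothetical.
tree = {
    "feature": "cgpa", "threshold": 7.5,          # node: question on CGPA
    "yes": {                                       # edge: cgpa >= 7.5
        "feature": "internships", "threshold": 1,  # node: question on internships
        "yes": {"label": "Placed"},                # leaf
        "no": {"label": "Not Placed"},             # leaf
    },
    "no": {"label": "Not Placed"},                 # edge: cgpa < 7.5, then leaf
}

def classify(node, student):
    """Walk from the root to a leaf, answering one question per node."""
    while "label" not in node:
        answer = "yes" if student[node["feature"]] >= node["threshold"] else "no"
        node = node[answer]
    return node["label"]

print(classify(tree, {"cgpa": 8.2, "internships": 2}))
print(classify(tree, {"cgpa": 8.2, "internships": 0}))
print(classify(tree, {"cgpa": 6.9, "internships": 3}))
```

A learning algorithm builds such a tree automatically by choosing, at each node, the question that best separates placed from non-placed students in the training data.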

c) K-Nearest Neighbor:
K-NN classifies a new data point by finding the k most similar labelled data points and assigning it the majority class among them. It can consider a wide array of student attributes, such as academic performance and internships, and makes predictions based on similarity between students. This gives practical insight into the factors influencing campus placements, making it a suitable choice for our task.
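
A minimal K-NN sketch using Euclidean distance and a majority vote follows. The data points, the two features (CGPA and internships), and k = 3 are assumptions chosen for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical (CGPA, internships) pairs with placement labels (1 = placed).
points = [(6.0, 0), (6.5, 0), (7.0, 1), (8.5, 2), (9.0, 1), (8.8, 2)]
labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(points, labels, (8.7, 2)))
print(knn_predict(points, labels, (6.2, 0)))
```

Because the method relies on distances, features measured on very different scales are usually normalized first so that no single feature dominates the comparison.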

d) Random Forest:
The Random Forest classifier consists of several decision trees, each trained on a different subset of the dataset. The outputs of all the trees are combined, by majority vote or by averaging their predicted probabilities, to improve the accuracy of the final prediction.
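
The sketch below is a heavily simplified illustration of this idea, not a full Random Forest: each "tree" is a one-question stump on a single feature, trained on a bootstrap sample, and the predictions are combined by majority vote. It omits the random feature selection that a full Random Forest performs at each split. The data and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a one-question tree: predict 1 when x >= threshold.

    Tries every observed value as the threshold and keeps the one with the
    fewest mistakes on this sample.
    """
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in sample}):
        errors = sum((x >= threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def fit_forest(data, n_trees=15, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(data, k=len(data))) for _ in range(n_trees)]

def forest_predict(thresholds, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical (CGPA, placed) pairs.
data = [(6.0, 0), (6.4, 0), (6.9, 0), (7.2, 0),
        (7.8, 1), (8.1, 1), (8.6, 1), (9.0, 1)]
forest = fit_forest(data)
print(forest_predict(forest, 9.2), forest_predict(forest, 5.5))
```

Each stump sees a slightly different sample and so may pick a different threshold; voting across them smooths out the mistakes of any single one.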

e) Naïve Bayes:
Naïve Bayes offers several advantages, including its simplicity, efficiency, and suitability for text and categorical data. It assumes that the features are conditionally independent given the class. In the context of campus placements, the algorithm's ability to calculate conditional probabilities allows us to make informed predictions based on a wide array of student attributes.
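
As a sketch of how these conditional probabilities lead to a prediction, the example below implements a small categorical Naïve Bayes classifier with add-one (Laplace) smoothing. The two binary features and the training rows are hypothetical.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts

def predict_naive_bayes(model, row, n_values=2):
    """Return the class with the largest P(class) * product of P(value | class).

    Add-one (Laplace) smoothing keeps unseen values from giving probability 0;
    n_values is the number of distinct values each feature can take.
    """
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        for i, value in enumerate(row):
            score *= (value_counts[(i, label)][value] + 1) / (count + n_values)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical binary features: (placement training taken, extracurriculars).
rows = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
labels = [1, 1, 1, 0, 0, 0]

model = fit_naive_bayes(rows, labels)
print(predict_naive_bayes(model, (1, 1)), predict_naive_bayes(model, (0, 0)))
```

Multiplying the per-feature probabilities is exactly where the independence assumption enters: each feature contributes to the score without regard to the others.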

f) Support Vector Machine:
SVM is known for its versatility in binary classification tasks and is particularly valuable when dealing with datasets that have diverse features and complex decision boundaries. The use of SVM in campus placement prediction involves feature selection and preprocessing, data splitting, model training, and evaluation.

F. Evaluation of Results
The evaluation of results focuses primarily on accuracy, a fundamental performance metric for classification models. The accuracy score is calculated using the formula: Accuracy = (True Positives + True Negatives) / (True Positives + False Positives + True Negatives + False Negatives). True Positives are correctly identified placements, and True Negatives are correctly identified non-placements. False Positives are students predicted as placed who were not placed, and False Negatives are students predicted as not placed who were placed. This measure shows how often the model predicts placement outcomes correctly and so gives a first assessment of its performance and reliability.
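
The accuracy formula above can be computed directly from the predicted and actual labels, as in the sketch below. The label lists are hypothetical test-set values used only to show the arithmetic.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = placed, 0 = not placed)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    return (tp + tn) / (tp + fp + tn + fn)

# Hypothetical test-set labels and model predictions.
actual    = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn, accuracy(tp, tn, fp, fn))
```

With 4 true positives, 4 true negatives, 1 false positive, and 1 false negative, the accuracy is (4 + 4) / 10 = 0.8.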

IV. REFERENCES

[1] [Link]
placement-data-for-engineering-colleges/data
[2] Laxmi Shanker Maurya, Md Shadab Hussain, and Sarita Singh, “Developing Classifiers through Machine Learning Algorithms for Student Placement Prediction Based on Academic Performance,” Applied Artificial Intelligence, vol. 35, no. 6, pp. 403-420, 2021, doi: 10.1080/08839514.2021.1901032.
[3] Oktariani Nurul Pratiwi, “Predicting student placement class using data mining,” Proceedings of the 2013 IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE), IEEE, 2013.
[4] A. K. Pal and S. Pal, “Analysis and Mining of Educational Data for Predicting the Performance of Students,” International Journal of Electronics Communication and Computer Engineering (IJECCE), vol. 4, no. 5, pp. 1560-1565, 2013, ISSN: 2278-4209.
[5] L. Ramanathan, P. Swarnalatha, and G. D. Gopal, “Mining Educational Data for Students’ Placement Prediction using Sum of Difference Method,” International Journal of Computer Applications, vol. 99, no. 18, pp. 36-39, 2014.
