Practical Work

The document outlines an approach to applying numerical linear algebra concepts to detect plagiarism in computer science through an algorithm using cosine similarity. It details the steps of identifying a practical problem, understanding relevant concepts, implementing an algorithm, and analyzing simulation results. The algorithm preprocesses documents, computes a similarity matrix, and flags potentially plagiarized documents based on a defined threshold.

Uploaded by

ekpehope19

GROUP 16

NUMERICAL LINEAR ALGEBRA

QUESTION
Apply the concepts of numerical linear algebra to solve a practical problem in computer science. Implement an algorithm
and analyze the simulation results.
APPROACH STYLE
01 IDENTIFICATION OF PRACTICAL PROBLEM IN COMPUTER SCIENCE
First, we need to find a practical problem where numerical linear algebra can
be applied. Candidates include image processing, machine learning models,
data compression, and solving the systems of equations that arise in
various computer simulations.

02 UNDERSTAND NUMERICAL LINEAR ALGEBRA CONCEPTS:

Numerical linear algebra involves the study of how matrix operations can be
used to create efficient and accurate computer algorithms for questions in
continuous mathematics. It includes understanding vectors, matrices,
matrix operations, eigenvalues, and eigenvectors.

03 IMPLEMENT AN ALGORITHM:

Once we had our problem and understood the necessary linear algebra
concepts, we implemented an algorithm. This involves writing a program that
uses SVD

04 SIMULATE AND ANALYZE RESULTS:

After implementing the algorithm, we run simulations to test its effectiveness.
We use a dataset of documents, run our feature extraction and
classification, and then analyze the results to see how well our algorithm
performs.
GLOBAL PLAGIARISM SURVEY

United States: Data_01: 36%, Data_02: 7%
China: Data_01: 70%, Data_02: 0%
Colombia: Data_01: 36%, Data_02: 0%
Australia: Data_01: 15%, Data_02: 0%

United States:
A survey conducted in the United States revealed that 36% of undergraduates
admit to paraphrasing/copying a few sentences without citation, while 7% admit
to submitting work done by someone else.

Colombia:
A survey in Colombia found that 36% of students admitted to plagiarizing,
highlighting a significant issue with academic integrity in the country's
educational institutions.

China:
In China, a study conducted at a leading university revealed that 70% of
students admitted to cheating in exams or assignments. This high prevalence of
academic dishonesty has raised concerns about the integrity of education in
the country.

Australia:
Research in Australia indicates that approximately 15% of students have
purchased assignments from online sources.

1.0
1.1 IDENTIFICATION OF PRACTICAL PROBLEM IN COMPUTER SCIENCE
You see, as computer scientists and tech enthusiasts, we're always working on
exciting projects, creating innovative solutions, and sharing our ideas with the
world. But there's a problem that's been popping up more and more often, and it's
something we need to address: plagiarism.
Plagiarism is like a sneaky ghost that haunts the world of computer science. It's
when someone takes the hard work, ideas, or creations of others and tries to pass
them off as their own. And unfortunately, it's becoming a bit of a problem in our
community.

Now, you might be wondering why this is such a big deal. Well, let me tell you. In
computer science, our ideas and creations are like building blocks. Each new discovery,
innovation, or program builds upon what came before it. But when someone plagiarizes,
they're not only being dishonest, but they're also hindering progress and undermining
the hard work of others.
So, as members of the computer science society, it's up to us to recognize this problem
and take action to prevent it. We need to promote integrity, honesty, and originality in
everything we do. And that's why today, we're going to dive deeper into the issue of
plagiarism in computer science, explore its consequences, and discuss how we can work
together to combat it.

Are you ready to tackle this challenge with me? Let's get started! 💻🔍
2.0 UNDERSTAND NUMERICAL LINEAR ALGEBRA CONCEPTS:
Preprocessing:
•Tokenize the text documents into words or phrases.
•Convert the documents into numerical representations, such as TF-IDF
vectors or word embeddings.
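
To make the preprocessing step concrete, here is a minimal sketch of tokenization and TF-IDF weighting in plain Python. The function names (tokenize, tfidf_vectors), the token pattern, and the smoothed IDF formula are assumptions chosen for illustration; a library vectorizer may differ in its details.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep word tokens of two or more characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_vectors(documents):
    """Return (vocabulary, vectors): one L2-normalized TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocabulary}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        raw = [counts[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        vectors.append([x / norm for x in raw])
    return vocabulary, vectors


vocabulary, vectors = tfidf_vectors(["the cat sat", "the dog sat down"])
print(vocabulary)  # ['cat', 'dog', 'down', 'sat', 'the']
```

Each document becomes one row of numbers over a shared vocabulary, which is what lets us treat a collection of texts as a matrix.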

Constructing Similarity Matrix:

Use cosine similarity to compute the similarity between each pair of
documents. Cosine similarity measures the cosine of the angle
between two vectors and is commonly used in text similarity tasks.
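
For two vectors u and v, the cosine of the angle between them is the dot product divided by the product of their lengths. The sketch below shows that computation in plain Python; the function name cosine and the convention of returning 0.0 for an all-zero vector are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


# Parallel vectors point the same way, so the cosine is (up to rounding) 1.
print(round(cosine([1, 2, 0], [2, 4, 0]), 6))
# Orthogonal vectors share nothing, so the cosine is 0.
print(cosine([1, 0], [0, 1]))
```

Because the score depends only on direction, a long document and a short one that use words in the same proportions still come out as highly similar.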

Identifying Copied Work:

•Analyze the similarity matrix to identify pairs of documents with
high similarity scores. This indicates potential instances of copied
work.
•Define a threshold above which documents are considered
plagiarized. Documents with similarity scores above this threshold
are flagged as potentially plagiarized.
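
The thresholding step can be sketched as a scan over the upper triangle of the similarity matrix. The helper name flag_pairs and the numbers in example_matrix are made up for illustration and are not results from our simulation.

```python
def flag_pairs(similarity_matrix, threshold=0.8):
    """Return (i, j, score) for every pair i < j whose score exceeds the threshold."""
    n = len(similarity_matrix)
    flagged = []
    for i in range(n):
        for j in range(i + 1, n):  # the matrix is symmetric, so visit each pair once
            score = similarity_matrix[i][j]
            if score > threshold:
                flagged.append((i, j, score))
    return flagged


# Illustrative 3x3 similarity matrix (hypothetical values).
example_matrix = [
    [1.0, 0.2, 0.9],
    [0.2, 1.0, 0.4],
    [0.9, 0.4, 1.0],
]
print(flag_pairs(example_matrix))  # [(0, 2, 0.9)]
```

Skipping the diagonal matters: every document has similarity 1 with itself, which would otherwise be flagged.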

Limitations:

•Numerical methods alone may not capture synonyms or
paraphrased content.
•Setting the threshold is subjective and depends on the desired level
of strictness.
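
Both limitations show up even in a tiny example. The sketch below uses a simple word-count cosine (a simplification of TF-IDF, assumed here to keep the arithmetic easy to follow) on two hypothetical sentences that mean the same thing but use different words.

```python
import math
from collections import Counter


def bag_of_words_cosine(a, b):
    """Cosine similarity between raw word-count vectors of two sentences."""
    counts_a = Counter(a.lower().split())
    counts_b = Counter(b.lower().split())
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm = (math.sqrt(sum(c * c for c in counts_a.values()))
            * math.sqrt(sum(c * c for c in counts_b.values())))
    return dot / norm if norm else 0.0


original = "the car is fast"
paraphrase = "the automobile is quick"  # same meaning, different words
score = bag_of_words_cosine(original, paraphrase)
print(score)  # 0.5

# Whether this pair is flagged depends entirely on the chosen threshold.
for threshold in (0.4, 0.8):
    print(threshold, score > threshold)
```

Only "the" and "is" overlap, so the paraphrase scores 0.5: it is missed at the default threshold of 0.8 but caught at 0.4, where many unrelated sentences would be caught as well.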
3.0 ALGORITHM:
PLAGIARISM DETECTION USING COSINE SIMILARITY

Input:
- List of documents (texts)
- Similarity threshold (default 0.8)

Steps:
1. PREPROCESS THE DOCUMENTS:
- Convert documents into TF-IDF vectors using TfidfVectorizer.
- Compute TF-IDF matrix representing the documents.
2. DETECT PLAGIARISM:
- Compute cosine similarity matrix between all pairs of documents using cosine_similarity function.
- For each pair of documents (i, j) where i < j:
3. COSINE SIMILARITY
- Calculate cosine similarity score between document i and document j.

Output:
4. SIMILARITY SCORE
- If the similarity score is greater than the threshold:
- Print "Documents i and j are potentially plagiarized with a similarity score of score".

Simulation:
5. EXAMPLE USAGE:
- Provide a list of example documents.
- Call the detect_plagiarism function with the list of documents.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def preprocess_documents(documents):
        # Convert the raw documents into a TF-IDF matrix (one row per document).
        vectorizer = TfidfVectorizer()
        return vectorizer.fit_transform(documents)

    def detect_plagiarism(documents, threshold=0.8):
        tfidf_matrix = preprocess_documents(documents)
        # Pairwise cosine similarity between all documents.
        similarity_matrix = cosine_similarity(tfidf_matrix)
        n = len(documents)
        for i in range(n):
            for j in range(i + 1, n):  # visit each pair once
                similarity_score = similarity_matrix[i][j]
                if similarity_score > threshold:
                    print(f"Documents {i+1} and {j+1} are potentially "
                          f"plagiarized with a similarity score of "
                          f"{similarity_score:.2f}")

    # Example usage:
    documents = [
        "This is the first document.",
        "This document is the second document.",
        "And this is the third one.",
        "Is this the first document?"
    ]

    detect_plagiarism(documents)

Preprocessing Documents:
•The preprocess_documents function takes a list of documents as input.
•It initializes a TfidfVectorizer object to convert the documents into TF-IDF vectors.
•The fit_transform method of the vectorizer computes the TF-IDF vectors for the
given documents and returns a matrix representation.

Detecting Plagiarism:
•The detect_plagiarism function takes the list of documents as input and calls
preprocess_documents to obtain their TF-IDF vectors.
•It computes the cosine similarity matrix between all pairs of documents using the
cosine_similarity function from sklearn.metrics.pairwise.
•The similarity score between each pair of documents is compared against a
threshold (default value is 0.8).
•If the similarity score between a pair of documents is greater than the threshold, it
indicates potential plagiarism.
•The function then prints out the numbers (counting from 1) of the potentially
plagiarized documents along with their similarity scores.

Example Usage:
•An example list of documents is provided.
•The detect_plagiarism function is called with this list of documents.

3.1
4.0 SIMULATE AND ANALYZE RESULTS:

1. TAKE FIRST INPUT
2. TAKE THE NEXT INPUTS (CAN BE MORE THAN ONE)
3. COMPUTE TF-IDF MATRIX REPRESENTING THE DOCUMENTS.
4. DETECT PLAGIARISM
- Compute cosine similarity matrix between all pairs of documents using cosine_similarity function.
- For each pair of documents (i, j) where i < j:
- Calculate cosine similarity score between document i and document j.
5. RUN CHECK WITH ALL THE INPUTS TAKEN
6. IF THE SIMILARITY SCORE IS GREATER THAN THE THRESHOLD
7. PRINTS POTENTIALLY PLAGIARIZED DOCUMENTS AND THEIR SIMILARITY SCORE
4.1 SIMULATE AND ANALYZE RESULTS:

INPUTS
"This is the first document."
"This document is the second document."
"And this is the third one."
"Is this the first document?"
PREPROCESSING DOCUMENTS:
• The preprocess_documents function takes a list of documents as input.
• It initializes a TfidfVectorizer object to convert the documents into
TF-IDF vectors.
• The fit_transform method of the vectorizer computes the TF-IDF vectors
for the given documents and returns a matrix representation.
THRESHOLDING AND ANALYSIS:
•Set a threshold for cosine similarity (e.g., 0.8). Documents with a
similarity above the threshold are flagged for further inspection.
•If the similarity score between a pair of documents is greater than
the threshold, it indicates potential plagiarism.
RESULTS
• The function then prints out the indices of the potentially
plagiarized documents along with their similarity scores.
• Remember, this is just an initial detection system. Further
human review is crucial to confirm plagiarism.
OUTPUTS
Documents 1 and 4 are potentially plagiarized with a similarity score of 1.00
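
A score of exactly 1.00 may look surprising, since documents 1 and 4 are not the same sentence. The sketch below offers one explanation, assuming a tokenizer that lowercases the text, drops punctuation, and keeps words of two or more characters (which is our understanding of the TfidfVectorizer defaults). The helper name token_counts is hypothetical.

```python
import re
from collections import Counter


def token_counts(text):
    """Count lowercase word tokens of two or more characters, ignoring order."""
    return Counter(re.findall(r"\b\w\w+\b", text.lower()))


doc_1 = "This is the first document."
doc_4 = "Is this the first document?"

# Same words with the same counts; only the order and punctuation differ.
print(token_counts(doc_1) == token_counts(doc_4))  # True
```

Since TF-IDF ignores word order, identical word counts give identical vectors, and the angle between identical vectors is zero. This also means the method cannot tell a statement from a question built from the same words, which is one more reason for human review.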
GROUP 16
Thank You
END OF PRESENTATION
