Final Project Report Crime Data 2
BY
KATTANKULATHUR-603203
CHENGALPATTU, TAMILNADU
APRIL 2024
BONAFIDE CERTIFICATE
This is to certify that the project titled “Crime Data Explorer: A Web Application
Masters of Applied Data Science. To my knowledge, the work reported here is the
INTERNAL EXTERNAL
ACKNOWLEDGEMENT
With profound gratitude to the ALMIGHTY, I take this chance to thank the people
who helped me to complete this project. I take this as the right opportunity to say THANKS
to my PARENTS who are there to stand with me always with the words “YOU CAN”.
Technology who gave us the platform to establish myself to reach greater heights. I am so
& Technology, whose unwavering support has been instrumental in my journey towards
excellence.
Humanities, SRM Institute of Science & Technology who always encourages me to do novel
things. I express my sincere thanks to Dr S. Albert Antony Raj, Ph.D., Professor and Head,
Department of Computer Applications for his valuable guidance and support in executing all
inclines in learning.
suggestions, and guidance throughout the development phases of the project. I convey my
gratitude to all the family members of the department who extended their support through
valuable comments and suggestions during the reviews. A great note of gratitude to friends
and people who are known and unknown to me who helped in carrying out this project work
successfully.
COMPANY LETTER
PLAGIARISM CERTIFICATE
TABLE OF CONTENTS
1. INTRODUCTION
2. LITERATURE STUDY
4. SYSTEM ANALYSIS
5. SYSTEM DESIGN
5.5 CLASS DIAGRAM
6. SYSTEM IMPLEMENTATION
7. TESTING
7.1 TEST CASES
9. APPENDICES
BOOK REFERENCES
WEB REFERENCES
ABSTRACT
The Crime Data Explorer project aims to develop a comprehensive web application utilizing
Python, Flask, MongoDB, and Chart.js to provide users with an intuitive platform for exploring
crime data. In today's data-driven world, understanding crime patterns and trends is crucial for
policymakers, law enforcement agencies, and researchers. This project addresses this need by
offering a user-friendly interface where users can dynamically analyze crime data by specifying
years and types of crimes. The backend of the application is built using Python and Flask,
which facilitate seamless communication between the user interface and the MongoDB
database. MongoDB serves as the repository for the vast amount of crime data, enabling
efficient storage and retrieval operations. The backend is responsible for processing user
requests, querying the database, and preparing the data for presentation. On the frontend, the
application provides a simple and intuitive form where users can input their search criteria,
including specific years and types of crimes they are interested in analyzing. The frontend
leverages Chart.js, a powerful JavaScript library, to visualize the retrieved data in various
interactive chart formats, such as line charts, bar charts, and pie charts. These visualizations
enable users to gain insights into crime trends, patterns, and correlations over time. Overall,
the Crime Data Explorer project bridges the gap between raw crime data and actionable
insights, empowering users to make informed decisions and contribute to the enhancement of
public safety strategies. Through its integration of modern web technologies and data
visualization techniques, this project serves as a valuable tool for exploring and understanding
crime dynamics.
INTRODUCTION
The "Crime Data Explorer and Analysis" project is a comprehensive web application designed
to provide users with an intuitive platform for exploring and analyzing crime data. In today's
data-driven world, understanding crime patterns and trends is essential for policymakers, law
enforcement agencies, and researchers. However, accessing and interpreting vast amounts of
crime data can be challenging. This project aims to address this challenge by offering a user-
friendly interface that allows users to dynamically analyze crime data by specifying years and
types of crimes.
Utilizing Python, Flask, MongoDB, and Chart.js, the project bridges the gap between raw crime
data and actionable insights. The backend of the application, powered by Python and Flask,
facilitates seamless communication with the MongoDB database, which serves as the
repository for the crime data. The frontend provides a simple and intuitive form where users
can input their search criteria, enabling them to specify the years and types of crimes they are
interested in analyzing.
The frontend leverages Chart.js, a powerful JavaScript library, to visualize the retrieved data
in various interactive chart formats, such as line charts, bar charts, and pie charts. These
visualizations enable users to gain insights into crime trends, patterns, and correlations over
time, empowering them to make informed decisions and contribute to the enhancement of
public safety strategies.
Overall, the Crime Data Explorer project serves as a valuable tool for exploring and
understanding crime dynamics in a user-friendly and accessible manner. By integrating modern
web technologies and data visualization techniques, this project aims to facilitate informed
decision-making and enhance public safety strategies.
1.3.SYSTEM REQUIREMENT:
The Crime Data Explorer and Analysis system requires certain hardware and software
configurations to ensure its proper functioning. Below is the recommended system
configuration:
1.3.1.HARDWARE SPECIFICATION:
- Processor: Intel Core i5 or higher
- Internet Connection: Required for accessing the web application and database
1.3.2.SOFTWARE SPECIFICATION:
- NLTK: Natural Language Toolkit for text processing (optional, for tokenization)
Ensure that the necessary Python packages and dependencies are installed and up-to-date to
run the web application smoothly.
SOFTWARE FEATURES:
The Crime Data Explorer and Analysis system is built using Python, Flask, MongoDB, and
Chart.js. It offers the following features:
- Visualization of crime trends using various chart formats (line charts, bar charts, pie charts)
- Automated report generation summarizing the crime data for the specified criteria
The system aims to bridge the gap between raw crime data and actionable insights, empowering
users to make informed decisions and contribute to public safety strategies. It provides a user-
friendly and accessible platform for exploring and understanding crime dynamics.
SOFTWARE DESCRIPTION:
FRONT END:
The provided HTML code represents the front end of the Crime Data Explorer and Analysis
web application. It creates a user interface where users can input search criteria to explore crime
data. Let's break down the HTML code and explain the terms used:
1. HTML Structure:
- The code starts with `<!DOCTYPE html>`, which defines the document type and version
of HTML being used.
- The `<head>` element contains meta-information about the HTML document, such as
character encoding, viewport settings, and the page title.
- The `<body>` element contains the visible content of the HTML document, including text,
form elements, and other elements.
2. Styling:
- The `<style>` element contains CSS rules that define the visual appearance of the webpage.
- CSS rules define properties like font family, background color, text color, padding, margins,
and border radius to style various elements.
3. Content:
- The `<h1>` element displays the main heading "CYBER CRIME DATA EXPLORER".
- The `<form>` element contains input fields where users can input search criteria.
- The "Additional Details" input field allows users to input additional information about the
crime.
- `<input type="submit">` and `<input type="reset">` elements create buttons for submitting
and resetting the form, respectively.
4. JavaScript:
- There's no JavaScript code in this HTML file. JavaScript might be used in other parts
of the application for dynamic functionality, but it's not included in this specific HTML file.
JavaScript plays a crucial role in web development, especially when it comes to enhancing
interactivity and responsiveness. It can be used to manipulate HTML elements, handle user
interactions, perform asynchronous tasks such as fetching data from servers, and dynamically
update the content of web pages without requiring a full reload. By leveraging JavaScript,
developers can create dynamic and engaging web applications that provide a seamless user
experience.
Advantages and Features:
- User-Friendly Interface: The HTML form provides a simple and intuitive interface for users
to input their search criteria.
- Clear Presentation: The form elements are well-organized and labeled, making it easy for
users to understand and input their search criteria.
- Interactive Functionality: While not directly implemented in this HTML file, JavaScript can
be used to add interactive features such as form validation or dynamic updates based on user
input.
Overall, this front-end design facilitates an efficient and user-friendly experience for exploring
crime data.
BACK END:
The provided Python code serves as the back end of the Crime Data Explorer and Analysis web
application. It handles data processing, analysis, and visualization tasks. Let's break down the
main components and functionalities of the code:
1. Data Loading and Filtering:
- The `load_dataset_from_csv()` function loads crime data from a CSV file into memory as
a list of dictionaries, where each dictionary represents a single entry in the dataset.
- The `filter_crime_data()` function filters the dataset based on user-specified criteria such as
month, year, and country.
2. Report Generation:
- The `generate_report()` function utilizes the Bart model from the Hugging Face
transformers library to automatically generate a summary report based on the filtered crime
data.
3. Data Visualization:
- The `visualize_data()` function uses the Plotly library to create line charts visualizing
cybercrime trends over the years.
4. Main Functionality:
- It prompts the user to input the month, year, and country for filtering the crime data.
- It then filters the dataset, generates a report, visualizes the filtered data, and calculates the
percentage of crime by country.
5. Execution:
- The `if __name__ == "__main__":` block ensures that the `main()` function is executed
when the script is run directly.
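The loading and filtering steps described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names (`month`, `year`, `country`) are taken from the fields printed in the sample code, while the function signatures and the case-insensitive matching are assumptions.

```python
import csv


def load_dataset_from_csv(file_path):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(file_path, newline="", encoding="utf-8") as csv_file:
        return list(csv.DictReader(csv_file))


def filter_crime_data(dataset, month, year, country):
    """Return the rows whose month, year, and country match the criteria.

    Matching ignores case and surrounding whitespace (an assumed rule).
    """
    def same(left, right):
        return str(left).strip().lower() == str(right).strip().lower()

    return [
        row for row in dataset
        if same(row.get("month", ""), month)
        and same(row.get("year", ""), year)
        and same(row.get("country", ""), country)
    ]
```

Keeping each row as a dictionary means the filter can refer to columns by name, so the column order in the CSV file does not matter.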
Advantages and Features:
- Efficient Data Processing: The code efficiently loads, filters, and processes large datasets of
crime data.
- Automated Report Generation: The system automatically generates summary reports based
on user-specified criteria, saving time and effort.
- Interactive Data Visualization: The system provides interactive visualizations, allowing users
to gain insights into cybercrime trends and patterns.
- User Input Handling: The code prompts users for the month, year, and country needed to
filter and analyze crime data.
- Scalability: The modular design of the code allows for scalability and easy integration of
additional features or data sources.
Overall, the back end code provides essential functionality for analyzing and visualizing crime
data, enabling users to make informed decisions based on the insights gained.
3. SYSTEM STUDY
The existing system for crime data analysis is rudimentary and lacks a comprehensive
approach. It primarily relies on manual execution of Python scripts by developers, limiting its
accessibility and usability. Moreover, the existing system suffers from the following
drawbacks:
Disadvantages:
- Limited Training Data: The machine learning model used in the existing system is trained on
a small dataset, resulting in suboptimal prediction accuracy.
- Manual Processing: Users need to manually execute Python scripts to analyze crime data,
leading to delays in processing information.
The proposed Crime Data Explorer and Analysis system aims to overcome the limitations of
the existing system by introducing a robust, user-friendly, and efficient solution. Key features
and improvements of the proposed system include:
- Large Dataset: The proposed system utilizes a large dataset for training, which enhances
prediction accuracy and reliability.
- Automated Processing: Users can interact with the system through a web interface,
eliminating the need for manual execution of scripts and reducing processing time.
- Machine Learning Algorithms: Various machine learning algorithms, such as Decision Tree
Classifier, KNN, Logistic Regression, Naive Bayes, and Random Forest Classifier, are
employed to analyze crime data and provide accurate predictions.
- Web Application: The proposed system is developed as a full-fledged web application,
making it accessible to a wider audience and facilitating ease of use.
- Improved User Experience: With a user-friendly interface, the system offers a seamless
experience for exploring crime data and gaining insights.
- Enhanced Security: The system incorporates security measures to safeguard sensitive crime
data and ensure user privacy.
- Reliability: By leveraging modern web technologies and robust machine learning algorithms,
the proposed system delivers reliable and accurate results, contributing to public safety
strategies effectively.
Overall, the proposed Crime Data Explorer and Analysis system represents a significant
advancement over the existing system, offering improved functionality, usability, and
reliability for analyzing crime data and making informed decisions.
4. SYSTEM DESIGN
A system flow diagram illustrates how data flows within a system and how decisions are made
to control events. It provides a visual representation of the flow of data, showing the path it
takes and the processing steps involved. Here's the system flow diagram for the Crime Data
Explorer.
[Fig 4.1 - System Flow Diagram for Crime Data Explorer and Analysis
Module](system_flow_diagram_crime_data.png)
A data flow diagram (DFD) visually represents the flow of data through an information system.
It illustrates how data moves from external sources or internal stores to processes, and then to
data stores or external destinations. The DFD helps in understanding the scope and boundaries
of the system and acts as a communication tool between stakeholders. Here's the data flow
diagram for the Crime Data Explorer and Analysis module:
(Diagram elements: User, Input details, Feed the values, Server, Match the values with dataset, Output)
[Fig 4.2 - Data Flow Diagram for Crime Data Explorer and Analysis
Module](data_flow_diagram_crime_data.png)
These diagrams provide a clear understanding of how data flows through the Crime Data
Explorer and Analysis system, from input to output, facilitating effective system design and
communication among stakeholders.
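The input, match, and output flow in Fig 4.2 could, for instance, end with the server turning the matched records into a structure that a Chart.js chart can consume. The sketch below is only an assumption about how that step might look: the function name `build_chart_payload`, the `year` field, and the `labels`/`data` keys are illustrative and are not taken from the project code.

```python
import json
from collections import Counter


def build_chart_payload(matched_records):
    """Count matched records per year and return parallel label/value lists.

    Years are compared as text, which matches chronological order as long as
    every year has four digits.
    """
    counts = Counter(str(record["year"]) for record in matched_records)
    labels = sorted(counts)
    return {"labels": labels, "data": [counts[label] for label in labels]}


if __name__ == "__main__":
    records = [
        {"year": "2021", "crime": "Phishing"},
        {"year": "2020", "crime": "Ransomware"},
        {"year": "2021", "crime": "Identity theft"},
    ]
    # Serialised form that a front-end script could pass to a chart.
    print(json.dumps(build_chart_payload(records)))
```

Returning labels and values as two parallel lists keeps the server output close to the shape a line or bar chart typically expects, so the front end needs little extra processing.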
4.3. INPUT DESIGN:
Fig 4.3
Description: This is the page where the user can input the details.
Fig 4.4
Description: This is the page where the user gets the output for the given data.
Fig 4.5
Description: This is the page where the user gets the output for cybercrime trends over the years.
Fig 4.6
Description: This is the page where the user gets the output for the complete crime analysis.
5. SYSTEM TESTING AND IMPLEMENTATION:
System Testing in the Crime Data Explorer and Analysis project is a crucial phase
aimed at ensuring the accuracy, effectiveness, and reliability of the system before it
is deployed for live operation. It involves comprehensive testing of the system's
design, behavior, and functionality against the system requirement specifications
(SRS) and functional requirement specifications.
- To verify that the system functions as expected and meets the requirements
specified by the stakeholders.
- To identify and correct any errors or defects in the system before it is deployed.
- To confirm that the system operates accurately and effectively under various
conditions and with different sets of data.
- Identifying and resolving any discrepancies between expected and actual system
behavior.
5.4 Importance of System Testing:
- Detecting and rectifying errors early in the development process, minimizing the
risk of issues arising later during live operation.
- Enhancing the reliability and quality of the system, thereby improving user
satisfaction and trust.
- Validating that the system performs as intended and delivers the expected
outcomes to stakeholders.
5.5 Implementation:
Implementation in the Crime Data Explorer and Analysis project involves the
systematic transition from the theoretical design of the system to its practical
execution. It encompasses several stages aimed at effectively integrating the
software-based solution into the workflow of users and organizations.
5.6.1. Planning:
5.6.2. Training:
- Providing training to users and stakeholders on how to use the system effectively.
- Ensuring that users are familiar with the system's functionalities and features.
- Addressing any concerns or questions raised by users during the training process.
- Verifying that the system operates accurately and produces expected results
under various conditions.
- Planning and executing the transition from the existing system to the new Crime
Data Explorer and Analysis system.
- Providing support and assistance to users during the transition period to facilitate
a smooth migration to the new system.
6.1 System Testing
System testing in the Crime Data Explorer project is conducted after the
development of the proposed system. The primary activity in system development
involves preparing the source code for each module separately. The source code for
master files and transaction files is developed, compiled, and corrected individually
before being combined into whole modules.
A comprehensive strategy for software testing must include both low-level tests to
verify small source code segments and high-level tests to validate major system
functions against customer requirements. Testing is the process of executing
programs with the intent of finding errors, and a good test case is one that has a high
probability of uncovering undiscovered errors.
Objectives of Testing:
The objectives of software testing in the Crime Data Explorer project are as follows:
- Ensuring that the system meets business and user requirements specified in the
Business Requirement Specification (BRS) and System Requirement Specification
(SRS).
Testing methodologies are the strategies and approaches used to test the Crime Data
Explorer project to ensure it meets its objectives effectively.
6.2.1 Unit Testing:
Unit testing is essential for verifying the code produced during the coding phase.
The goal is to test the internal logic of the modules, focusing on important paths to
uncover errors within the boundaries of the modules. These tests are conducted
during the programming stage.
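As an illustration of unit testing in this project, the sketch below checks a percentage-by-country helper with Python's built-in `unittest` module. The helper follows the logic of the `calculate_percentage_crime_by_country()` function in the sample code, rewritten with `collections.Counter`; the test class name and the test data are assumptions made for the example.

```python
import unittest
from collections import Counter


def calculate_percentage_crime_by_country(dataset):
    """Return each country's share of all records, as a percentage."""
    counts = Counter(entry["country"] for entry in dataset)
    total = len(dataset)
    return {country: count / total * 100 for country, count in counts.items()}


class PercentageByCountryTest(unittest.TestCase):
    def test_two_countries(self):
        # Three of four records are from India, one is from the USA.
        data = [{"country": "India"}] * 3 + [{"country": "USA"}]
        result = calculate_percentage_crime_by_country(data)
        self.assertEqual(result, {"India": 75.0, "USA": 25.0})

    def test_empty_dataset_gives_empty_result(self):
        # With no records there is nothing to divide, so the result is empty.
        self.assertEqual(calculate_percentage_crime_by_country([]), {})


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PercentageByCountryTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

Each test exercises one path through the module in isolation, which is the kind of internal-logic check that unit testing is meant to provide.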
Validation testing checks whether the conditions placed on the input fields work
correctly, for example that the name field accepts only characters and special
symbols, not numbers. Each module is tested with both correct and incorrect inputs,
such as checking that an employee name is made up of characters and that an age is
a number.
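Applied to this project's search form, a validation check of this kind might look like the sketch below. The specific rules (alphabetic month and country, a numeric year within a range) and the function name `validate_search_input` are assumptions chosen for illustration; the report does not state the actual rules.

```python
def validate_search_input(month, year, country, min_year=1990, max_year=2100):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    if not month.strip().isalpha():
        errors.append("Month must contain letters only.")
    year_text = year.strip()
    if not (year_text.isdecimal() and min_year <= int(year_text) <= max_year):
        errors.append(f"Year must be a number between {min_year} and {max_year}.")
    # Allow spaces inside country names such as "United States".
    if not country.strip() or not all(part.isalpha() for part in country.split()):
        errors.append("Country must contain letters and spaces only.")
    return errors
```

A call with correct input, such as `validate_search_input("March", "2021", "United States")`, returns an empty list, while incorrect input returns one message per failed field, which covers both cases that validation testing is meant to exercise.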
Functional testing in the Crime Data Explorer project includes unit testing,
integration testing, system testing, and acceptance testing. It verifies whether the
entire system is working properly, if specified path connections are correct, and if
the system is generating the expected output. This involves providing input values
to the system and comparing the actual output with the expected output.
CONCLUSION
1.SCOPE FOR FUTURE DEVELOPMENT
The Crime Data Explorer project is designed to be flexible, allowing for easy maintenance and
adaptation to changing environments and requirements. There is wide scope for future
extensions and enhancements. The main possibilities for the project are:
- Efficient Modification: Careful initial design allows for seamless future modifications with
minimal disruption.
- Expansion to Intranet Environment: Extending the system to intranets enables secure data
sharing within organizations.
Overall, the Crime Data Explorer project holds immense potential for future growth and
development. By embracing emerging technologies, refining predictive models, and enhancing
security measures, the project can continue to deliver valuable insights and support decision-
making in the field of crime prevention and public safety.
APPENDICES
A. SCREENSHOTS:
4.3. INPUT :
Fig 4.3
Description: This is the page where the user can input the details.
4.4. OUTPUT:
Fig 4.4
Description: This is the page where the user gets the output for the given data.
Fig 4.5
Description: This is the page where the user gets the output for cybercrime trends over the years.
Fig 4.6
Description: This is the page where the user gets the output for the complete crime analysis.
SAMPLE CODE:
# -*- coding: utf-8 -*-
"""Copy of finished final dheepak project .ipynb"""

import csv
import nltk
from nltk.tokenize import word_tokenize
from transformers import pipeline, BartTokenizer, BartForConditionalGeneration
import plotly.express as px
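def generate_report(summary_text):
    """Summarize the given text with the facebook/bart-large-cnn model."""
    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
    # The encoding and generation steps are absent from this listing; the three
    # calls below are an assumed reconstruction using the transformers API, and
    # the beam count and length limits are illustrative values.
    inputs = tokenizer(summary_text, return_tensors="pt", truncation=True, max_length=1024)
    summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=150, early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary
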
def visualize_data(filtered_data):
    """Plot the filtered records as a line chart of crime against year."""
    fig = px.line(filtered_data, x='year', y='crime', title='Cybercrime Trends Over Years')
    fig.show()

def calculate_percentage_crime_by_country(dataset):
    """Return each country's share of all records, as a percentage."""
    country_crime_percentage = {}
    # First pass: count the records for each country.
    for entry in dataset:
        country = entry['country']
        crime_count = country_crime_percentage.get(country, 0)
        country_crime_percentage[country] = crime_count + 1
    total_crimes = len(dataset)
    # Second pass: convert each count into a percentage of the total.
    for country, crime_count in country_crime_percentage.items():
        country_crime_percentage[country] = (crime_count / total_crimes) * 100
    return country_crime_percentage

def visualize_country_percentage(country_crime_percentage):
    """Plot the percentage of records per country."""
    fig = px.line(
        x=list(country_crime_percentage.keys()),
        y=list(country_crime_percentage.values()),
        title='Percentage of Cybercrimes by Country Over Years',
    )
    fig.update_layout(xaxis_title='Country', yaxis_title='Percentage of Crime')
    fig.show()

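def main():
    file_path = "/content/DHEEPAK CSV DATA S.csv"
    # load_dataset_from_csv() and filter_crime_data() are described in the
    # back-end section; their definitions are not reproduced in this listing.
    dataset = load_dataset_from_csv(file_path)

    # Prompt for the filter criteria. The prompt wording and the argument
    # order of filter_crime_data() are assumptions.
    month = input("Enter the month: ")
    year = input("Enter the year: ")
    country = input("Enter the country: ")
    filtered_data = filter_crime_data(dataset, month, year, country)

    if filtered_data:
        print("\nCrime Details:")
        for entry in filtered_data:
            print(f"Month: {entry['month']}, Year: {entry['year']}, Country: {entry['country']}")
            print(f"Type of Attack: {entry['type_of_attack']}")
            print(f"Crime: {entry['crime']}")
            print(f"Motive: {entry['motive']}")
            print("\n")

        # Generate report
        summary_text = (
            f"Crimes in {month} {year} in {country}: "
            f"{', '.join(entry['type_of_attack'] for entry in filtered_data)}."
        )
        report_summary = generate_report(summary_text)
        print("\nAutomated Report Summary:")
        print(report_summary)

        # Visualize the filtered records
        visualize_data(filtered_data)
    else:
        print("No matching records found.")

    # Calculate percentage of crime by country and visualize
    country_crime_percentage = calculate_percentage_crime_by_country(dataset)
    visualize_country_percentage(country_crime_percentage)


if __name__ == "__main__":
    main()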