Data Science
Libraries
Data
Data: Data refers to raw and unprocessed facts and figures. It can take the form of numbers, text,
images, or any other representation of facts. Data by itself does not carry any meaning and needs to be
processed to become useful.
Example: A list of temperatures recorded hourly (e.g., 25°C, 26°C, 24°C) is data. Each individual
temperature reading is a piece of data.
Information
Information: Information is data that has been processed, organized, or structured in a meaningful
way to convey a message or provide insights. It represents knowledge and has context, making it
valuable for decision-making.
Example: If we take the list of temperatures mentioned earlier and calculate the average temperature
for the day (e.g., 25.5°C), the average temperature becomes information. It provides a meaningful
summary of the data. Here is another example: a student rating chart.
[Figure: Student Rating chart. Students: Akash, Ahnaf, Sihan, Moajjim. Legend: Math, AI, CS. Rating axis from 0 to 6.]
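The step from data to information can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the first three readings come from the temperature example above, while the fourth reading and the variable names are assumptions added so that the list averages to 25.5°C.

```python
from statistics import mean

# Raw data: hourly temperature readings in °C.
# The first three values are from the example; the fourth is an assumed extra reading.
hourly_temperatures = [25, 26, 24, 27]

# Information: processed summaries that give the raw readings meaning.
average_temperature = mean(hourly_temperatures)
warmest_reading = max(hourly_temperatures)

print(f"Average temperature: {average_temperature}°C")
print(f"Warmest reading: {warmest_reading}°C")
```

The list on its own is data; the average and the maximum are information because they summarize the readings in a form that supports a decision.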
Database
Database: A database is a structured collection of data that is organized and stored in a way that
allows for efficient retrieval, updating, and management. It typically consists of tables, relationships,
and a management system to facilitate easy access and manipulation of data.
Example: Consider a database for a library. It might have tables for books, authors, and borrowers.
Each table contains data, such as book titles, author names, and borrower information. The database
structure allows for efficient querying and retrieval of information, such as finding all books by a
specific author or tracking borrowed books.
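The library example can be sketched with Python's built-in sqlite3 module. The table layout, column names, and sample rows below are assumptions made for illustration; a real library database would have more tables and fields.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(author_id)
    );
    CREATE TABLE borrowers (
        borrower_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
""")

# Placeholder rows, invented for illustration only.
conn.executemany("INSERT INTO authors VALUES (?, ?)",
                 [(1, "Author A"), (2, "Author B")])
conn.executemany("INSERT INTO books VALUES (?, ?, ?)",
                 [(1, "Book One", 1), (2, "Book Two", 1), (3, "Book Three", 2)])


def books_by_author(connection, author_name):
    """Return the titles of all books written by the given author."""
    rows = connection.execute(
        """
        SELECT books.title
        FROM books
        JOIN authors ON authors.author_id = books.author_id
        WHERE authors.name = ?
        ORDER BY books.title
        """,
        (author_name,),
    ).fetchall()
    return [title for (title,) in rows]


print(books_by_author(conn, "Author A"))
```

The relationship between the two tables (books.author_id pointing at authors.author_id) is what makes the "all books by a specific author" query possible without storing the author name in every book row.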
Database Types
1. Relational Databases
Description: Store data in tables with rows and columns. Use SQL (Structured Query Language)
for querying.
Examples:
• MySQL
• PostgreSQL
• SQLite
• Microsoft SQL Server
• Oracle
Microsoft Access is a desktop relational database system.
2. NoSQL Databases
Description: Non-relational databases designed for flexibility and scalability. Data is often
stored in key-value pairs, documents, or graphs.
Examples:
• MongoDB
• Firebase Realtime Database
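The key-value and document shapes described above can be imitated with plain Python structures. This is only a sketch of the data model, not the API of MongoDB, Firebase, or any other product; all keys, field names, and values are hypothetical.

```python
import json

# Key-value style: each key maps directly to a value.
key_value_store = {}
key_value_store["session:42"] = "logged_in"

# Document style: each record is a self-contained, nested document,
# and documents in the same collection need not share the same fields.
books_collection = [
    {"title": "Book One", "authors": ["Author A"], "tags": ["fiction"]},
    {"title": "Book Two", "authors": ["Author A", "Author B"], "pages": 320},
]

# Documents are commonly exchanged as JSON text.
serialized = json.dumps(books_collection[1])
restored = json.loads(serialized)

# A simple "query": titles of the documents that list Author B.
titles = [doc["title"] for doc in books_collection if "Author B" in doc["authors"]]
print(titles)
```

Unlike a relational table, the two documents here have different fields, which is the flexibility the description refers to.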
SQL (Structured Query Language)
SQL (Structured Query Language) is the standard query language used for managing and
manipulating relational databases. It allows users to query, insert, update, and delete data in a
database, as well as manage database structures such as tables and indexes.
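The operations named above (managing structures, then insert, update, delete, and query) can be tried out with Python's built-in sqlite3 module. The students table, its columns, and the sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Manage structure: create a table and an index on it.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, rating INTEGER)")
conn.execute("CREATE INDEX idx_students_name ON students(name)")

# Insert two rows (placeholder data).
conn.executemany(
    "INSERT INTO students (name, rating) VALUES (?, ?)",
    [("Student A", 4), ("Student B", 5)],
)

# Update one row.
conn.execute("UPDATE students SET rating = ? WHERE name = ?", (3, "Student A"))

# Delete one row.
conn.execute("DELETE FROM students WHERE name = ?", ("Student B",))

# Query what is left.
remaining = conn.execute("SELECT name, rating FROM students ORDER BY name").fetchall()
print(remaining)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid SQL injection when values come from user input.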
[Figure: Data Analysis Tools used in data science: Python, SQL, R, Jupyter Notebooks, Power BI, Tableau, Excel]
Python is a versatile programming language with a wide range of applications across various
domains. Some of the key application fields of Python include:
1. Web Development: Python is widely used for developing web applications and websites.
Frameworks like Django and Flask simplify the process of building robust and scalable web
applications.
2. Data Science and Machine Learning: Python is one of the most popular languages for data science
and machine learning. Libraries such as NumPy, Pandas, Matplotlib, and scikit-learn make it easy to
perform data analysis, visualization, and machine learning tasks.
5. Game Development: Python is used in game development, both for building full-fledged games
and for creating game-related tools and scripts. Libraries like Pygame provide a framework for game
development.
6. Desktop GUI Applications: Python can be used to develop graphical user interface (GUI)
applications using libraries like Tkinter, PyQt, and Kivy. This makes it suitable for creating desktop
applications with a graphical interface.
Python Libraries
Python has a vast ecosystem of libraries that cater to different domains and purposes. Here are
some important and commonly used libraries in Python:
1. NumPy: Fundamental package for scientific computing with support for large, multi-dimensional
arrays and matrices, along with a collection of mathematical functions.
2. Pandas: Provides high-performance, easy-to-use data structures and data analysis tools. It's
particularly useful for working with structured data and time series data.
3. Matplotlib: Comprehensive library for creating static, animated, and interactive visualizations
in Python. It is often used for plotting graphs, charts, histograms, and other types of
visualizations.
4. SciPy: Collection of mathematical algorithms and functions built on top of NumPy. It includes
modules for optimization, integration, interpolation, linear algebra, signal processing, and more.
5. scikit-learn: Simple and efficient tools for data mining and data analysis. It provides various
algorithms and tools for machine learning tasks such as classification, regression, clustering,
dimensionality reduction, and model selection.
6. TensorFlow: Open-source deep learning framework developed by Google. It provides a
comprehensive ecosystem of tools, libraries, and community resources for building and deploying
machine learning models, particularly deep neural networks.
7. PyTorch: Deep learning framework originally developed by Facebook's AI Research lab (now Meta
AI) and now governed by the PyTorch Foundation. It is known for its dynamic computational graph
and ease of use, making it popular among researchers and practitioners in the deep learning
community.
8. Keras: High-level neural networks API. Earlier versions ran on top of TensorFlow, Theano, or
Microsoft Cognitive Toolkit (CNTK); Keras 3 supports TensorFlow, JAX, and PyTorch as backends. It
provides a simple and consistent interface for building and training deep learning models.
9. NLTK (Natural Language Toolkit): Library for natural language processing (NLP) tasks such as
tokenization, stemming, part-of-speech tagging, parsing, and semantic reasoning.
10. OpenCV: Open-source computer vision library with a wide range of functions for image and video
processing, including object detection, face recognition, feature extraction, and more.
11. Requests: HTTP library for making HTTP requests in Python. It simplifies the process of sending
HTTP requests and handling responses, making it easy to work with web APIs and web services.
12. Django: High-level web framework for building web applications in Python. It follows the
model-template-view (MTV) pattern, Django's variant of model-view-controller (MVC), and provides
features such as ORM (Object-Relational Mapping), authentication, URL routing, and templating.
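As a small taste of the first library in the list, the sketch below uses NumPy's multi-dimensional arrays and a few of its mathematical functions. It assumes NumPy is installed (for example via `pip install numpy`); the ratings are made-up values, with rows standing for students and columns for subjects.

```python
import numpy as np

# Hypothetical ratings: rows are students, columns are subjects.
ratings = np.array([
    [4, 5, 3],
    [2, 3, 4],
])

per_student_mean = ratings.mean(axis=1)  # average across each row
per_subject_max = ratings.max(axis=0)    # highest value in each column
scaled = ratings / 5                     # element-wise division, no explicit loop

print(ratings.shape)
print(per_student_mean)
print(per_subject_max)
```

Operations such as `mean` and the division apply to the whole array at once, which is the main reason NumPy is the foundation for libraries like Pandas, SciPy, and scikit-learn.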
Cloud Computing