ENROLLMENT NO: 202203103510400: Utu/Cgpit/Ce/Sem-6/Machine Intelligence (Ce5008)
5. NumPy
Key Features:
• Multidimensional Arrays: N-dimensional array object (ndarray) to store large datasets
efficiently.
• Mathematical Functions: Operations like addition, multiplication, and trigonometric
functions are vectorized for fast computation.
• Linear Algebra Support: Functions for matrix multiplication, eigenvalue problems, and
solving linear equations.
• Broadcasting: Ability to perform arithmetic operations on arrays of different shapes
without explicit loops.
Uses:
• Vectorized Computation: High-performance mathematical operations on large datasets.
• Modeling: Performing matrix operations for machine learning algorithms like regression.
• Numerical Simulations: Used in physical and statistical simulations where performance
is critical.
• Fast Data Processing: Processes numerical data much faster than plain Python lists,
because operations run in compiled code over contiguous arrays.
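The features listed above (vectorized functions, broadcasting, and linear algebra) can be illustrated with a minimal sketch. The array values and variable names below are made up for illustration only.

```python
import numpy as np

# Vectorized math: the function is applied to every element without a Python loop.
angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)

# Broadcasting: a (2, 3) array plus a (3,) array; the row is reused for each row of `a`.
a = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])
broadcast_sum = a + row             # [[10, 21, 32], [13, 24, 35]]

# Linear algebra: solve 3x + y = 9 and x + 2y = 8.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = np.linalg.solve(A, b)    # x = 2, y = 3

# Eigenvalues of the symmetric matrix A.
eigenvalues = np.linalg.eigvalsh(A)

print(sines, broadcast_sum, solution, eigenvalues, sep="\n")
```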
6. Matplotlib
Key Features:
• Wide Range of Plots: Line, bar, scatter, histogram, heatmaps, and more.
• Customization: Full control over plot elements (titles, axis labels, colors, grids, etc.).
• Interactive Plots: Can integrate with Jupyter notebooks and other tools for interactive
visualizations.
• Subplots: Ability to plot multiple graphs in a single figure for comparison.
• Support for LaTeX-style math: Inline mathematical expressions in plots.
Uses:
• Data Exploration and Visualization: Plotting distributions, trends, and relationships
within datasets.
• Model Evaluation: Visualizing training curves, loss functions, and performance metrics.
• Scientific Publications: Generating high-quality plots for research papers or presentations.
• Custom Data Insights: Drawing insights from the data with tailored visualizations.
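As a minimal sketch of the subplot, customization, and LaTeX-style math features, the example below draws two plots in one figure. The data is synthetic, and the non-interactive "Agg" backend is an assumption made so that the script also runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(0)          # fixed seed so the histogram is reproducible
samples = rng.normal(size=500)          # synthetic data for illustration

# Two subplots side by side in a single figure.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with a LaTeX-style label, axis labels, grid, and legend.
ax1.plot(x, np.sin(x), color="tab:blue", label=r"$\sin(x)$")
ax1.set_title(r"Line plot of $\sin(x)$")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.grid(True)
ax1.legend()

# Histogram of the synthetic samples.
ax2.hist(samples, bins=20, color="tab:orange")
ax2.set_title("Histogram")

fig.tight_layout()

# Render to an in-memory PNG; passing a file name instead would write to disk.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=150)
```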
7. XGBoost
Key Features:
• Boosting Algorithm: Implements gradient boosting, in which each new tree is fitted to
correct the errors made by the trees before it.
• Regularization: Includes L1 (Lasso) and L2 (Ridge) regularization to avoid overfitting.
• Parallel Processing: Efficient use of multiple CPU cores to speed up training.
• Handling Missing Data: Learns a default split direction for missing values at each tree
node, so no separate imputation step is required.
Uses:
• Kaggle Competitions: One of the most popular algorithms for winning competitions
involving structured/tabular data.
• Classification/Regression: Efficient for tasks like customer churn prediction, fraud
detection, and other predictive modeling on tabular data.
• Ranking Tasks: Used for ranking tasks like search engines and recommendation systems.
• Feature Selection: Helps identify important features through its built-in feature
importance scores and L1 regularization.
8. LightGBM
Key Features:
• Histogram-based Algorithm: Reduces memory usage and speeds up training by grouping
continuous features into discrete bins.
• Categorical Feature Handling: Directly supports categorical variables without the need
for one-hot encoding.
• Leaf-wise Tree Growth: Grows trees leaf-wise (best-first) instead of level-wise, which
usually reaches a lower loss for the same number of leaves but can overfit small datasets.
• Multi-threading: Highly optimized for multi-threading, making it fast for large-scale
problems.
Uses:
• Large-Scale Machine Learning: Ideal for problems with massive datasets, like
click-through rate prediction or user recommendations.
• Fast Prototyping: Quickly build accurate models for structured data, often with little
hyperparameter tuning.
• Anomaly Detection: Well suited to anomaly detection framed as supervised
classification, such as fraud detection.
9. Seaborn
Key Features:
• High-level Interface for Matplotlib: Built on top of Matplotlib to simplify complex
visualizations with less code.
• Beautiful and Informative Visualizations: Automatically handles aesthetics, colors, and
layouts for clean and readable plots.
• Built-in Themes: Offers a set of pre-defined themes for consistent and visually appealing
plots.
• Statistical Plots: Direct support for advanced statistical visualizations like violin plots,
box plots, heatmaps, and pair plots.
Uses:
• Data Exploration: Visualizing distributions and relationships in datasets.
• Correlation Heatmaps: Visualizing correlations between multiple variables in a dataset.
• Model Comparison: Plotting different machine learning model performance metrics.
• Statistical Analysis: Visualizing statistical results like confidence intervals and
distributions.
10. PyCaret
Key Features:
• Low-code Machine Learning: A low-code library that simplifies machine learning
workflows, making it accessible to non-experts.
• End-to-End Pipeline: Supports the entire machine learning process, from data
preprocessing and feature engineering to model selection and deployment.
• Model Comparison: Allows easy comparison of different models using various evaluation
metrics.
Uses:
• Quick Prototyping: Build and evaluate machine learning models quickly, suitable for both
beginners and advanced users.
• End-to-End ML Workflows: Automates tasks such as data cleaning, feature engineering,
model selection, and hyperparameter tuning.
• Model Interpretability: Provides tools to understand and interpret model predictions,
enhancing model explainability.
12. Optuna
Key Features:
• Hyperparameter Optimization: A library that automates hyperparameter search using
efficient samplers such as the Tree-structured Parzen Estimator (TPE), its default.
• Pruning: Includes algorithms that monitor training to stop unpromising trials early, saving
resources and time.
• Multi-objective Optimization: Supports optimization of multiple objectives at once (e.g.,
optimizing both accuracy and inference speed).
Uses:
• Hyperparameter Tuning: Automatically tuning the hyperparameters of machine learning
models, leading to better performance.
• Model Optimization: Used to find optimal model architectures and configurations for
deep learning models.
• Resource Efficiency: Helps save computational resources by efficiently managing search
processes and stopping ineffective trials early.
PRACTICAL 2
AIM: Write a Python program to solve the following problems:
a) Find the probability of drawing two kings from a standard 52-card deck (without
replacement).
b) A math teacher gave her class two tests. 25% of the class passed both tests
and 42% of the class passed the first test. Find the probability (share of students)
that a student who passed the first test also passed the second test.
A]
Dynamic
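# Number of standard 52-card decks shuffled together
decks = int(input("Enter the number of decks: "))
total_cards = 52 * decks
kings = 4 * decks
draws = int(input("Enter the number of kings to draw: "))

# Multiply the chance of drawing a king on each successive draw (without replacement)
probability = 1.0
for i in range(draws):
    probability *= (kings - i) / (total_cards - i)
print("Probability:", probability)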
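The product computed by the loop above equals the ratio of combinations C(kings, draws) / C(cards, draws). The sketch below checks that closed form with exact fractions; the function name `prob_all_kings` is chosen here for illustration only.

```python
from fractions import Fraction
from math import comb


def prob_all_kings(draws, decks=1):
    """Probability that every one of `draws` cards drawn without replacement is a king."""
    kings = 4 * decks
    cards = 52 * decks
    # Ways to choose the kings, divided by ways to choose any cards.
    return Fraction(comb(kings, draws), comb(cards, draws))


p_two_kings = prob_all_kings(2)
print(p_two_kings, float(p_two_kings))  # 1/221, about 0.0045
```

For one deck and two draws this is (4/52) × (3/51) = 1/221, which matches the loop's result.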
B]
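# Inputs are percentages; dividing by 100 converts them to probabilities
passed_both = float(input("Enter the percentage of students who passed both tests: ")) / 100
passed_first = float(input("Enter the percentage of students who passed the first test: ")) / 100
# Conditional probability: P(second | first) = P(both) / P(first)
second_given_first = passed_both / passed_first
print(f"Probability of passing the second test, given a pass on the first: {second_given_first:.4f}")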
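With the values from the problem statement, the calculation is P(second | first) = P(both) / P(first) = 0.25 / 0.42 ≈ 0.595. The sketch below reproduces this without user input and converts it to head counts. The class size of 100 is a hypothetical value, since the problem does not state one.

```python
def conditional_probability(p_both, p_first):
    """Return P(second | first) = P(both) / P(first)."""
    if not 0 < p_first <= 1:
        raise ValueError("p_first must be in (0, 1]")
    if not 0 <= p_both <= p_first:
        raise ValueError("p_both must be between 0 and p_first")
    return p_both / p_first


p_second_given_first = conditional_probability(0.25, 0.42)

class_size = 100                                  # hypothetical, not given in the problem
passed_first_count = round(0.42 * class_size)     # students who passed the first test
passed_both_count = round(0.25 * class_size)      # of those, students who also passed the second

print(f"P(second | first) = {p_second_given_first:.4f}")
print(f"{passed_both_count} of {passed_first_count} first-test passers also passed the second test")
```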
TO IMPORT THE IRIS CSV FILE AND READ DATA FROM IT USING THE PANDAS
LIBRARY.