
Artificial Intelligence - Programming Tools

PRACTICAL FILE

Indira Gandhi Delhi Technical University for Women

Submitted By
Tulika Arun
02302102024
1st Sem, MTech CSE - AI
Submitted To
Prof. S.R.N. Reddy
INDEX
S.No. Topic (Date)
1. LISP Basics of Programming (30/09/24)
2. LISP Advanced Features (30/09/24)
3. Fibonacci Series and Pattern Matching - LISP (30/09/24)
4. Basics of Python (09/09/24)
5. Control Structures of Python (09/09/24)
6. Functions, Lambda Functions using Python (23/09/24)
7. Modules, Packages using Python (23/09/24)
8. OOP Concepts implementation using Python / various graphs using matplotlib (23/09/24)
9. Calculator using TensorFlow (23/09/24)
10. Files and File Handling using Python (07/10/24)
11. Train model with 6 hidden layers - TensorFlow (07/10/24)
12. MNIST Fashion dataset classification - PyTorch (28/10/24)
13. Database creation using Python and its operations (28/10/24)
14. Number of characters in string and longest word (04/11/24)
15. Celsius to Fahrenheit and Fahrenheit to Celsius (04/11/24)
16. First n rows of Pascal's Triangle (04/11/24)
17. List addition using map and lambda (11/11/24)
18. List of content to a file (11/11/24)
19. Web scraping and web page connection - R (18/11/24)
20. Database creation (18/11/24)
Experiment 1
Aim :
LISP Basics of Programming
Software Used :
Common Lisp, Visual Studio Code
Hardware Used :
Legion Pro 5
Code :
We show the basics of LISP via :
1. Printing a statement
2. Sum of Two Numbers
3. Finding the Maximum of Two Numbers

(format t "Tulika's experiment on LISP Basics~%")

(defun sum (a b)
(+ a b))
(format t "~%Sum of 3 and 5 is: ~A~%" (sum 3 5))

(defun max-of-two (a b)
(if (> a b)
a
b))

(format t "~%Max of 7 and 3 is: ~A~%" (max-of-two 7 3))

Output :

Conclusion :
We demonstrated basic fundamentals of LISP: printing a statement, computing the sum of two
numbers, and finding the maximum of two numbers.
Experiment 2
Aim :
LISP Advanced Features
Software Used :
Common Lisp, Visual Studio Code
Hardware Used :
Legion Pro 5
Code :
We show advanced features of LISP via :
1. Recursive Factorial Function
2. Fibonacci Series Using Recursion
(format t "Tulika's experiment on LISP Advanced~%")

(defun factorial (n)
  (if (<= n 1)
      1
      (* n (factorial (- n 1)))))

(format t "Factorial of 5 is: ~A~%" (factorial 5))

(defun fibonacci (n)
  (if (<= n 1)
      n
      (+ (fibonacci (- n 1)) (fibonacci (- n 2)))))

(format t "Fibonacci of 5 is: ~A~%" (fibonacci 5))
(format t "Fibonacci of 7 is: ~A~%" (fibonacci 7))

Output :

Conclusion :
We demonstrated advanced features of LISP: a recursive factorial function and the
Fibonacci series using recursion.
Experiment 3
Aim :
Fibonacci Series and Pattern Matching – LISP
Software Used :
Common Lisp, Visual Studio Code
Hardware Used :
Legion Pro 5
Code :

(defun fibonacci (n)
  (if (<= n 1)
      n
      (+ (fibonacci (- n 1)) (fibonacci (- n 2)))))

(format t "Fibonacci of 10 is: ~A~%" (fibonacci 10))
(format t "Fibonacci of 14 is: ~A~%" (fibonacci 14))
Output :

Conclusion :
We demonstrated the Fibonacci series using recursion in LISP.
Experiment 4
Aim :
Basics of Python
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion Pro 5
Code :

# Basics of Python
print("Tulika's AIPT Python Basics Experiment")

# ---------------- Sum of 2 numbers ------------------------------
a = 3
b = 5
sum_result = a + b
print(f"Sum of {a} and {b} is: {sum_result}")

# ----------------- Greater amongst two values ----------------------------
a = 7
b = 3

if a > b:
    max_value = a
else:
    max_value = b

print(f"Maximum of {a} and {b} is: {max_value}")

Output :

Conclusion :
We demonstrated Python basics by printing a statement, calculating the sum of two values,
and finding the greater of two values.
Experiment 5
Aim :
Control Structures of Python
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion Pro 5
Code :

# Control Structures in Python

1. Conditional Statements (WAP to check if a number is positive or not)

number = 10
if number > 0:
    print(f"{number} is positive")
elif number == 0:
    print(f"{number} is zero")
else:
    print(f"{number} is negative")

2. For Loop (WAP to get ranks less than or equal to 100)

ranks = [34, 222, 232, 100, 323, 21, 34, 4, 96]

for i in range(len(ranks)):
    if ranks[i] <= 100:
        print(ranks[i])

3. While Statement (WAP to print all even numbers between 0-10)

number = 0
while number < 11:
    if number % 2 == 0:
        print(number, end=" ")
    number += 1

4. Break Statement (Linear Search WAP)

number = 23
for i in range(1, 100):
    print('Number is : ', i)
    if i == number:
        print(f'Found {number}')
        break
Output :

Conclusion :
We demonstrated control structures in Python using if-else, for loops, while loops, and break.
Experiment 6
Aim :
Functions, Lambda Functions using Python
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion Pro 5
Code :

# Functions and Lambda in Python

1. Write a function to print your name

def callName(name):
    return name

name = input('Enter your name : ')
print(callName(name))

2. Lambda Functions

val_av = lambda a, b, c: (a + b + c) / 3
print(val_av(3, 2, 3))

Output :

Conclusion :
We created a function to print our name and a lambda function to calculate the average of
three numbers.
Experiment 7
Aim :
Modules and Packages in Python, demonstrated by writing code for a calculator.
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
Package Structure :
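A layout consistent with the imports in the main file below (a package named aipt containing the three modules; the __init__.py is an assumption, possibly empty, so that aipt is importable as a regular package):

aipt/
    __init__.py
    calc.py
    fact.py
    fib.py
main.py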

Module calc.py :

def calculator(num1, num2, operation):
    if operation == 'add':
        return num1 + num2
    elif operation == 'subtract':
        return num1 - num2
    elif operation == 'multiply':
        return num1 * num2
    elif operation == 'divide':
        return num1 / num2

Module fact.py :

def fact(num):
    factorial = 1
    for i in range(1, num + 1):
        factorial *= i
    return factorial
Module fib.py :

def fibonacci(n):
    a = 0
    b = 1
    if n < 0:
        print("Incorrect input")
    elif n == 0:
        return 0
    elif n == 1:
        return b
    else:
        for i in range(1, n):
            c = a + b
            a = b
            b = c
        return b

Main file :

from aipt.calc import calculator
from aipt.fact import fact
from aipt.fib import fibonacci

# calculator
num1 = int(input('Enter first number for calculator : '))
num2 = int(input('Enter second number for calculator : '))

operation = ['add', 'subtract', 'multiply', 'divide']

for op in operation:
    print(f'Answer for {op} is :')
    print(calculator(num1=num1, num2=num2, operation=op))

# factorial
num_for_fact = int(input('Enter the number for which factorial needs to be calculated : '))
print(f'Factorial of {num_for_fact} is : ')
print(fact(num_for_fact))

# fibonacci
num_for_fib = int(input('Enter the number for fibonacci sequence : '))
print(f'The {num_for_fib}-th Fibonacci sequence number is : ')
print(fibonacci(num_for_fib))
print(f'The resulted number 2 is {num2}')
Output :

Conclusion :
We demonstrate how modules and packages are created in Python by building a calculator
package whose modules are imported and executed from the main file.
Experiment 8
Aim :
OOP Concepts implementation using Python
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
1. Classes and Objects

class Person:
    def __init__(self, name, age):
        self.name = name  # Attribute
        self.age = age    # Attribute

    def greet(self):  # Method
        return f"Hello, my name is {self.name} and I am {self.age} years old."

# Creating an object
person1 = Person("Tulika", 25)
print(person1.greet())

2. Inheritance

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound."

# Inheriting from Animal
class Dog(Animal):
    def speak(self):
        return f"{self.name} barks."

# Creating an object of the subclass
dog = Dog("Pyaari")
print(dog.speak())
3. Polymorphism

class Bird:
    def speak(self):
        return "Chirp!"

class Cat:
    def speak(self):
        return "Meow!"

# Polymorphic behavior
def animal_sound(animal):
    print(animal.speak())

bird = Bird()
cat = Cat()

animal_sound(bird)
animal_sound(cat)

4. Encapsulation

class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner
        self.__balance = balance  # Private attribute

    def deposit(self, amount):
        if amount > 0:
            self.__balance += amount
            print(f"Deposited: {amount}")
        else:
            print("Invalid deposit amount.")

    def withdraw(self, amount):
        if amount > self.__balance:
            print("Insufficient balance!")
        else:
            self.__balance -= amount
            print(f"Withdrawn: {amount}")

    def get_balance(self):
        return self.__balance

# Using the class
account = BankAccount("Tulika", 1000)
account.deposit(500)
account.withdraw(300)
print(f"Balance: {account.get_balance()}")
5. Abstraction

from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        pass

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14 * self.radius * self.radius

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

# Using the classes
circle = Circle(5)
rectangle = Rectangle(4, 6)

print(f"Circle area: {circle.area()}")
print(f"Rectangle area: {rectangle.area()}")

6. Method Overloading and Overriding

# Overloading using default arguments
class Math:
    def add(self, a, b=0, c=0):
        return a + b + c

math = Math()
print(math.add(5, 10))      # Two arguments
print(math.add(5, 10, 15))  # Three arguments

# Overriding in inheritance
class Parent:
    def display(self):
        print("Parent class display.")

class Child(Parent):
    def display(self):
        print("Child class display.")

child = Child()
child.display()

Output :

Conclusion :
We demonstrate the concepts of OOP in Python via classes, objects, inheritance,
polymorphism, encapsulation, abstraction, overloading and overriding.
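The index entry for this experiment also mentions various graphs using matplotlib; the corresponding screenshots are not reproduced here. A minimal sketch of such plots (the data values below are illustrative assumptions, not from the original experiment):

import matplotlib.pyplot as plt

# Illustrative data (assumed values)
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# Line graph
plt.plot(x, y, marker='o')
plt.title("Line Graph")
plt.xlabel("x")
plt.ylabel("y")
plt.show()

# Bar graph
plt.bar(x, y)
plt.title("Bar Graph")
plt.xlabel("x")
plt.ylabel("y")
plt.show()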
Experiment 9
Aim :
Calculator using tensorflow
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
import tensorflow as tf

def add(a, b):
    return tf.add(a, b)

def subtract(a, b):
    return tf.subtract(a, b)

def multiply(a, b):
    return tf.multiply(a, b)

def divide(a, b):
    return tf.divide(a, b)

a = tf.constant(10)
b = tf.constant(5)

# Perform operations
addition = add(a, b)
subtraction = subtract(a, b)
multiplication = multiply(a, b)
division = divide(a, b)

print(f"Addition: {addition.numpy()}")
print(f"Subtraction: {subtraction.numpy()}")
print(f"Multiplication: {multiplication.numpy()}")
print(f"Division: {division.numpy()}")

Output :

Conclusion :
We demonstrated the implementation of a calculator using the TensorFlow API.
Experiment 10
Aim :
Files and File Handling using Python
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
def file_operations(filename):
    try:
        # Open the file in write mode and create it
        with open(filename, 'w') as file:
            file.write("Hello, this is Tulika Arun.\n")
            file.write("We are learning file operations in Python.\n")
            file.write("This is the third line of the file.\n")
        print("File created and written successfully.")

        # Open the file in append mode and add a line
        with open(filename, 'a') as file:
            file.write("Appending a new line to the file.\n")
        print("Content appended successfully.")

        # Open the file in read mode and display its content
        with open(filename, 'r') as file:
            content = file.read()
        print("\nReading the entire file content:\n")
        print(content)

        # Open the file in read mode and read line by line
        with open(filename, 'r') as file:
            print("\nReading the file line by line:\n")
            for line in file:
                print(line, end="")
    except Exception as e:
        print(f"An error occurred: {e}")

file_operations("sample.txt")
Output :

Conclusion :
We demonstrated operations performed on a file: writing, appending, reading the entire
file, and reading the file line by line.
Experiment 11
Aim :
Train model with 6 hidden layers – TensorFlow
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_classes=2, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# 6 hidden layers plus a sigmoid output layer
model = Sequential([
    Dense(64, input_dim=X_train.shape[1], activation='relu'),
    Dense(128, activation='relu'),
    Dense(256, activation='relu'),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer=Adam(learning_rate=0.001),
              loss='binary_crossentropy', metrics=['accuracy'])

model.summary()

history = model.fit(X_train, y_train, epochs=30, batch_size=32,
                    validation_data=(X_test, y_test))

test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_accuracy:.2f}")

Output :

Conclusion :
We create a sequential model with 6 hidden layers (plus a sigmoid output layer) via
TensorFlow, which gives an accuracy of 93% on a synthetic classification dataset made via sklearn.
Experiment 12
Aim :
MNIST Fashion dataset classification – PyTorch
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = datasets.FashionMNIST(root="./data", train=True,
                                      transform=transform, download=True)
test_dataset = datasets.FashionMNIST(root="./data", train=False,
                                     transform=transform, download=True)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

class FashionMNISTModel(nn.Module):
    def __init__(self):
        super(FashionMNISTModel, self).__init__()
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )

    def forward(self, x):
        return self.fc(x)

model = FashionMNISTModel()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
epochs = 10
for epoch in range(epochs):
    model.train()
    total_loss = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch+1}/{epochs}, Loss: {total_loss/len(train_loader):.4f}")

# Evaluation on the test set
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = correct / total
print(f"Test Accuracy: {accuracy:.4f}")
Output :

Conclusion :
We perform classification on the Fashion-MNIST dataset by creating a neural network via
PyTorch. The accuracy comes out to be 88.26%.
Experiment 13
Aim :
Database creation using Python and its operations
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

import sqlite3

conn = sqlite3.connect("example.db")
cursor = conn.cursor()

cursor.execute("""
CREATE TABLE IF NOT EXISTS students (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    age INTEGER NOT NULL,
    grade TEXT NOT NULL
)
""")

# Insert records (parameterized queries avoid SQL injection)
cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)",
               ("Tulika", 23, "A"))
cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)",
               ("Arun", 22, "B"))
cursor.execute("INSERT INTO students (name, age, grade) VALUES (?, ?, ?)",
               ("Jha", 21, "A"))
conn.commit()

cursor.execute("SELECT * FROM students")
rows = cursor.fetchall()
print("Initial Records:")
for row in rows:
    print(row)

# Update an existing record
cursor.execute("UPDATE students SET grade = ? WHERE name = ?", ("A+", "Arun"))
conn.commit()

cursor.execute("SELECT * FROM students")
rows = cursor.fetchall()
print("Updated Records:")
for row in rows:
    print(row)

# Delete an existing record
cursor.execute("DELETE FROM students WHERE name = ?", ("Jha",))
conn.commit()

cursor.execute("SELECT * FROM students")
rows = cursor.fetchall()
print("Final Records:")
for row in rows:
    print(row)

conn.close()

Output :

Conclusion :
A database named example.db is created via Python, and basic CRUD operations (create,
read, update, delete) are performed on it.
Experiment 14
Aim :
Number of characters in a string and the longest word
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

def analyze_string(input_string):
    num_characters = len(input_string.replace(" ", ""))

    words = input_string.split()
    longest_word = max(words, key=len) if words else ""

    return num_characters, longest_word

input_string = "Hi my name is Tulika Arun and I am a student at IGDTUW"

num_characters, longest_word = analyze_string(input_string)

print(f"Number of characters (excluding spaces): {num_characters}")
print(f"Longest word: '{longest_word}'")

Output :

Conclusion :
We counted the characters in the string (excluding spaces) and found the longest word using max() with key=len.
Experiment 15
Aim :
Celsius to Fahrenheit and Fahrenheit to Celsius
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

def celsius_to_fahrenheit(celsius):
    return (celsius * 9/5) + 32

def fahrenheit_to_celsius(fahrenheit):
    return (fahrenheit - 32) * 5/9

# Example conversions
celsius = 25
fahrenheit = 77

converted_fahrenheit = celsius_to_fahrenheit(celsius)
converted_celsius = fahrenheit_to_celsius(fahrenheit)

print(f"{celsius}°C is {converted_fahrenheit}°F")
print(f"{fahrenheit}°F is {converted_celsius}°C")

Output :

Conclusion :
Celsius is converted to Fahrenheit by (celsius * 9/5) + 32, while Fahrenheit is converted
to Celsius by (fahrenheit - 32) * 5/9; here 25°C is 77.0°F and 77°F is 25.0°C.
Experiment 16
Aim :
First n rows of Pascal's Triangle
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

def generate_pascals_triangle(n):
    triangle = []
    for row_num in range(n):
        row = [None] * (row_num + 1)
        row[0], row[-1] = 1, 1
        # Each interior entry is the sum of the two entries above it
        for j in range(1, len(row) - 1):
            row[j] = triangle[row_num - 1][j - 1] + triangle[row_num - 1][j]
        triangle.append(row)
    return triangle

def print_pascals_triangle(triangle):
    for row in triangle:
        print(" ".join(map(str, row)).center(len(triangle[-1]) * 3))

n = 10
triangle = generate_pascals_triangle(n)
print_pascals_triangle(triangle)

Output :

Conclusion :
The first 10 rows of Pascal's Triangle are generated as above; the last printed row is 1 9 36 84 126 126 84 36 9 1.
Experiment 17
Aim :
List addition using map and lambda
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
list1 = [23, 53, 55, 412]
list2 = [8, 67, 73, 85]

result = list(map(lambda x, y: x + y, list1, list2))

print("Result of list addition:", result)

Output :

Conclusion :
The lists are added element-wise via map and lambda.
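For comparison, an equivalent formulation (a sketch, not part of the original experiment) uses zip with a list comprehension and produces the same element-wise sums:

list1 = [23, 53, 55, 412]
list2 = [8, 67, 73, 85]

# zip pairs corresponding items; the comprehension sums each pair
result = [x + y for x, y in zip(list1, list2)]
print("Result of list addition:", result)  # [31, 120, 128, 497]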
Experiment 18
Aim :
List of content to a file
Software Used :
Python 3.12.5, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :

content = ['Hi I am Tulika Arun.', 'I study at IGDTUW.']

with open("output_exp18.txt", "w") as file:
    for line in content:
        file.write(line + "\n")

print("Content written to file successfully.")

Output :

Conclusion :
The content of the above list is stored in the file output_exp18.txt, one item per line.
Experiment 19
Aim :
Web scraping and web page connection in R
Software Used :
R, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
install.packages("rvest")
library(rvest)

webpage = read_html("https://www.geeksforgeeks.org/data-structures-in-r-programming")

# Scrape the page heading
heading = html_node(webpage, '.entry-title')
text = html_text(heading)
print(text)

# Scrape all paragraph nodes
paragraph = html_nodes(webpage, 'p')
pText = html_text(paragraph)
print(head(pText))

Output :

Conclusion :
Web scraping and web page connection are demonstrated via the R language using the rvest package.
Experiment 20
Aim :
Database creation in R
Software Used :
R, Visual Studio Code
Hardware Used :
Legion 5 Pro
Code :
# Install package
install.packages("RMySQL")
# Loading library
library("RMySQL")

# Create connection object (reused for all queries)
mysqlconn = dbConnect(MySQL(), user = 'root', password = 'welcome',
                      dbname = 'GFG', host = 'localhost')

# List existing tables
dbListTables(mysqlconn)

# Drop a table
dbSendQuery(mysqlconn, 'DROP TABLE mtcars')

# Insert a row (assumes the articles table already exists in GFG)
dbSendQuery(mysqlconn, "insert into articles(sno, type) values(1, 'R language')")
Output :

Conclusion :
A database connection is created via R using the RMySQL package, and drop and insert queries are executed against it.
Experiment 12
Aim :
Mini Project to implement any specific Machine Learning application.
PROJECT : Commodity price forecasting.

Dataset :
The dataset used for this project is sourced from Kaggle: Commodity Prices Dataset
This dataset contains daily prices of various commodities across categories like Energy,
Industrial Metals, Precious Metals, Grains, Livestock, and Softs.
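As a quick orientation, a minimal sketch for loading and inspecting the raw CSV (the file name and the DD-MM-YYYY Date format are taken from the preprocessing code below; the column list in the comment is an assumption based on the commodities named in the main program):

import pandas as pd

df = pd.read_csv("commodity_futures.csv")
df['Date'] = pd.to_datetime(df['Date'], format="%d-%m-%Y")

print(df.shape)             # (number of days, Date + one column per commodity)
print(df.columns.tolist())  # e.g. ['Date', 'NATURAL GAS', 'GOLD', 'SILVER', ...]
print(df.head())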

Algorithm :
• Data Preprocessing:
  o Fill missing values using forward-fill and backward-fill methods.
  o Filter data starting from 2018.
  o Remove rows with non-numeric values to ensure clean input for analysis.
• Exploratory Data Analysis (EDA):
  o Generate correlation heatmaps to understand relationships between features.
  o Plot histograms for categories of commodities.
  o Use PCA for dimensionality reduction and K-Means clustering for grouping commodities.
  o Perform detailed trend and seasonality analysis for the target commodity.
• Feature Selection:
  o Use multiple techniques (correlation, VIF, mutual information, and SelectKBest) to identify significant features.
  o Select features that perform well across at least three methods.
• Modeling and Forecasting:
  o Train predictive models on the selected features.
  o Perform future forecasting using the best-performing model.

Program :
The program is implemented in Python using libraries such as pandas, matplotlib, seaborn,
sklearn, and statsmodels. Key functionalities are modularized into classes for preprocessing,
EDA, feature selection, and modeling.
commodity_price_forecasting.py :
import pandas as pd
from data_preprocessing.preprocessing import Preprocessing
from eda.eda import EDA
from feature_selection.feature_selection import Feature_Selection
from modelling.modelling import Modelling

class Tulika_Predicomm:
    def predicomm(raw_data_path, commodity, category, category_commodity):
        raw_data = pd.read_csv(raw_data_path)

        # Preprocess, then run the EDA steps
        df = Preprocessing.preprocessing(raw_data=raw_data)
        EDA.overall_eda(df=df, category_commodity=category_commodity)
        print(df.columns)
        EDA.pca_clustering(df, category_commodity)
        df_summary = EDA.target_eda(df, commodity)

        # Feature selection: per-method results plus the combined selection
        method_features, selected_features = Feature_Selection.feature_analysis(commodity, df)
        print(' ')
        print('Features selected after every method : ')
        for method, features in method_features.items():
            print(method)
            print(features)
        print(selected_features)
        print(' ')

        # Model selection and 30-day future forecast
        best_model = Modelling.modelling(df, selected_features, commodity)
        Modelling.future_forecast(df, commodity, best_model)
        return df

category_commodity = {
    "ENERGY": [
        "NATURAL GAS",
        "LOW SULPHUR GAS OIL",
        "WTI CRUDE",
        "BRENT CRUDE",
        "ULS DIESEL",
        "GASOLINE",
    ],
    "INDUSTRIAL METALS": [
        "COPPER",
        "ALUMINIUM",
        "ZINC",
        "NICKEL",
    ],
    "PRECIOUS METALS": [
        "GOLD",
        "SILVER",
    ],
    "GRAINS": [
        "CORN",
        "SOYBEANS",
        "WHEAT",
        "SOYBEAN OIL",
        "SOYBEAN MEAL",
        "HRW WHEAT",
    ],
    "LIVESTOCK": [
        "LIVE CATTLE",
        "LEAN HOGS",
    ],
    "SOFTS": [
        "SUGAR",
        "COFFEE",
        "COTTON",
    ],
}

file_path = r'C:\Users\tulik\Desktop\IGDTUW\ML\ML Lab\predicomm\predicomm\dataset\commodity_futures.csv'
commodity = 'SILVER'
category = 'PRECIOUS METALS'
print(Tulika_Predicomm.predicomm(raw_data_path=file_path,
                                 commodity=commodity,
                                 category=category,
                                 category_commodity=category_commodity))

preprocessing.py :
import pandas as pd

class Preprocessing:

    def treatNA(df):
        # Forward-fill, then backward-fill any remaining gaps
        df = df.ffill()
        df = df.bfill()
        return df

    def is_numeric(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    def preprocessing(raw_data):
        df = raw_data.copy()
        df['Date'] = pd.to_datetime(df['Date'], format="%d-%m-%Y")

        # Reindex onto a continuous daily date range
        date_range = pd.date_range(start=df['Date'].min(), end=df['Date'].max())
        df = df.set_index('Date').reindex(date_range)
        df = df.reset_index()
        df.rename(columns={'index': 'Date'}, inplace=True)
        df = df.set_index('Date')

        df.to_csv(r'C:\Users\tulik\Desktop\IGDTUW\ML\ML Lab\predicomm\predicomm\dataset\commodity_futures_alldates.csv', index=False)

        # Keep data from 2018 onwards
        df = df[df.index.year >= 2018]

        df = Preprocessing.treatNA(df)

        # Drop rows containing non-numeric values
        df = df[df.applymap(Preprocessing.is_numeric).all(axis=1)]

        return df

eda.py :

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

def initialize_centroids(X, k):
    # Pick k distinct random points as the initial centroids
    indices = np.random.choice(X.shape[0], k, replace=False)
    return X[indices]

def assign_clusters(X, centroids):
    # Distance of every point to every centroid; assign each point to the nearest
    distances = np.linalg.norm(X[:, np.newaxis] - centroids, axis=2)
    return np.argmin(distances, axis=1)

def update_centroids(X, labels, k):
    return np.array([X[labels == i].mean(axis=0) for i in range(k)])

def compute_inertia(X, centroids, labels):
    return sum(np.linalg.norm(X[labels == i] - centroids[i], axis=1).sum()
               for i in range(len(centroids)))

def kmeans(X, k, max_iters=100, tol=1e-4):
    centroids = initialize_centroids(X, k)
    for iteration in range(max_iters):
        labels = assign_clusters(X, centroids)
        new_centroids = update_centroids(X, labels, k)
        # Stop when centroids have converged
        if np.all(np.abs(new_centroids - centroids) < tol):
            break
        centroids = new_centroids
    inertia = compute_inertia(X, centroids, labels)
    return centroids, labels, inertia

def elbow_method(X, max_k=10):
    inertias = []
    for k in range(1, max_k + 1):
        _, _, inertia = kmeans(X, k)
        inertias.append(inertia)
    return inertias

def plot_elbow(inertias, max_k):
    plt.figure(figsize=(8, 5))
    plt.plot(range(1, max_k + 1), inertias, marker='o')
    plt.title("Elbow Method for Optimal K")
    plt.xlabel("Number of Clusters (k)")
    plt.ylabel("Inertia")
    plt.grid(True)
    plt.show()

def plot_clusters(X, centroids, labels):
    plt.figure(figsize=(8, 5))
    plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis', alpha=0.6,
                label='Data Points')
    plt.scatter(centroids[:, 0], centroids[:, 1], c='red', marker='X', s=200,
                label='Centroids')
    plt.title("K-means Clustering")
    plt.xlabel("Feature 1")
    plt.ylabel("Feature 2")
    plt.legend()
    plt.grid(True)
    plt.show()

X, _ = make_blobs(n_samples=300, n_features=2, centers=4, cluster_std=1.0,
                  random_state=42)

plt.figure(figsize=(8, 5))
plt.scatter(X[:, 0], X[:, 1], c='blue', alpha=0.6)
plt.title("Original Data Points")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.grid(True)
plt.show()

max_k = 10
inertias = elbow_method(X, max_k)
plot_elbow(inertias, max_k)
optimal_k = 4

centroids, labels, _ = kmeans(X, optimal_k)
plot_clusters(X, centroids, labels)

feature_selection.py :
import pandas as pd
import numpy as np
from sklearn.feature_selection import mutual_info_regression, SelectKBest, f_regression
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.preprocessing import MinMaxScaler

class Feature_Selection:
    def feature_analysis(target_column, data):
        # Separate features and target
        X = data.drop(columns=[target_column])
        y = data[target_column]

        # Scale the features to [0, 1]
        scaler = MinMaxScaler()
        X_scaled = scaler.fit_transform(X)
        X_scaled_df = pd.DataFrame(X_scaled, columns=X.columns)

        # 1. Correlation with the target
        correlations = X.corrwith(y).abs()
        top_corr_features = correlations[correlations > 0.6].index.tolist()

        # 2. Multicollinearity (VIF): keep features with low VIF
        vif_data = pd.DataFrame()
        vif_data['Feature'] = X.columns
        vif_data['VIF'] = [variance_inflation_factor(X_scaled, i)
                           for i in range(X_scaled.shape[1])]
        top_vif_features = vif_data[vif_data['VIF'] < 5]['Feature'].tolist()

        # 3. Mutual Information Gain: keep features scoring above the mean
        mi_scores = mutual_info_regression(X_scaled, y)
        mi_features = X.columns[mi_scores > np.mean(mi_scores)].tolist()

        # 4. K Best Features (F-test)
        selector = SelectKBest(score_func=f_regression, k=5)
        selector.fit(X_scaled, y)
        k_best_features = X.columns[selector.get_support()].tolist()

        # Keep features selected by at least three of the four methods
        feature_counts = pd.Series(
            top_corr_features + top_vif_features + mi_features + k_best_features
        ).value_counts()
        selected_features = feature_counts[feature_counts >= 3].index.tolist()

        return {
            "Correlation": top_corr_features,
            "High Multicollinearity Features": top_vif_features,
            "Mutual Information": mi_features,
            "K Best Features": k_best_features,
            "Selected Features (>= 3 methods)": selected_features
        }, selected_features

modelling.py :
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor
from datetime import timedelta
from sklearn.metrics import mean_squared_error

class Modelling:
    def modelling(df, selected_features, target_column):
        df["Year"] = df.index.year

        # Chronological split: train up to 2021, validate on 2022, test after 2022
        train = df[df["Year"] <= 2021]
        validate = df[(df["Year"] > 2021) & (df["Year"] <= 2022)]
        test = df[df["Year"] > 2022]

        print("Train set size:", train.shape)
        print("Validation set size:", validate.shape)
        print("Test set size:", test.shape)

        X_train, y_train = train[selected_features], train[target_column]
        X_validate, y_validate = validate[selected_features], validate[target_column]
        X_test, y_test = test[selected_features], test[target_column]

        models = {
            "Linear Regression": LinearRegression(),
            "Random Forest": RandomForestRegressor(random_state=42),
            "XGBoost": XGBRegressor(random_state=42)
        }
        results = {}
        for model_name, model in models.items():
            model.fit(X_train, y_train)
            y_val_pred = model.predict(X_validate)
            y_test_pred = model.predict(X_test)

            rmse_val = np.sqrt(mean_squared_error(y_validate, y_val_pred))
            rmse_test = np.sqrt(mean_squared_error(y_test, y_test_pred))

            # "Accuracy" here is 1 minus the relative absolute error
            accuracy_val = 1 - (np.sum(np.abs(y_validate - y_val_pred)) /
                                np.sum(np.abs(y_validate)))
            accuracy_test = 1 - (np.sum(np.abs(y_test - y_test_pred)) /
                                 np.sum(np.abs(y_test)))

            results[model_name] = {
                "Validation RMSE": rmse_val,
                "Test RMSE": rmse_test,
                "Validation Accuracy": accuracy_val,
                "Test Accuracy": accuracy_test,
            }

        best_model_name = max(results, key=lambda x: results[x]["Test Accuracy"])
        best_model = models[best_model_name]

        print("Model Performance Metrics:")
        for model_name, metrics in results.items():
            print(f"{model_name}:")
            print(f"  Validation RMSE: {metrics['Validation RMSE']:.4f}")
            print(f"  Test RMSE: {metrics['Test RMSE']:.4f}")
            print(f"  Validation Accuracy: {metrics['Validation Accuracy']:.4f}")
            print(f"  Test Accuracy: {metrics['Test Accuracy']:.4f}\n")

        print(f"Best Model: {best_model_name}")

        best_model.fit(X_train, y_train)
        y_test_pred = best_model.predict(X_test)

        plt.figure(figsize=(12, 6))
        plt.plot(y_test.values, label="Actual Values", color="blue")
        plt.plot(y_test_pred, label=f"Predicted Values ({best_model_name})",
                 color="orange")
        plt.title(f"Actual vs Predicted Values ({best_model_name})")
        plt.xlabel("Sample Index")
        plt.ylabel("Target Value")
        plt.legend()
        plt.show()
        return best_model

    def future_forecast(df, target_column, best_model):
        forecast_period = 30
        # Shift the target back so each row predicts the price 30 days ahead
        df['Shifted_Target'] = df[target_column].shift(-forecast_period)

        train_data = df.dropna(subset=['Shifted_Target'])

        # Test data starts where the shifted target becomes NaN
        test_data = df[df['Shifted_Target'].isna()]

        # Prepare training and test data
        X_train = train_data.drop(columns=[target_column, 'Shifted_Target'])
        y_train = train_data['Shifted_Target']

        X_test = test_data.drop(columns=[target_column, 'Shifted_Target']).iloc[:forecast_period]
        last_date = df.index[-1]

        # Train a model
        model = best_model
        model.fit(X_train, y_train)

        # Make predictions for test data
        y_pred = model.predict(X_test)

        # Generate future dates starting from the last known date
        future_dates = [last_date + timedelta(days=i + 1) for i in range(len(y_pred))]

        # Combine results into a DataFrame
        results = pd.DataFrame({
            'Date': future_dates,
            f'Predicted {target_column}': y_pred
        })

        # Display results
        print("Future Predictions:")
        print(results)

        # Filter data for 2023
        df_2023 = df[df.index.year == 2023]

        plt.figure(figsize=(12, 6))

        # Plot historical values for 2023
        plt.plot(df_2023.index, df_2023[target_column],
                 label='Historical Values (2023)', color='blue')

        # Plot future predictions for 2023
        plt.plot(results['Date'], results[f'Predicted {target_column}'],
                 label='Future Predictions', color='orange', linestyle='--')

        plt.title(f"Historical and Future Predictions for {target_column} in 2023")
        plt.xlabel("Date")
        plt.ylabel(target_column)
        plt.legend()
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.show()
INDEX

S.No. List of Experiments (Due date) [Remarks]
1. Write a program to implement various feature selection techniques and compare the performance with a classifier (02/09/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
2. Write a program to implement linear regression without using Python libraries (05/09/24) [Use student dataset]
3. Write a program to implement logistic regression without using Python libraries (05/09/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
4. Write a program to implement the Support Vector Machine algorithm (09/09/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
5. Write a program to implement a back propagation neural network for classification of sample data without using Python libraries (16/09/24) [Use Iris Dataset]
6. Write a program to implement the Principal Component Analysis technique of dimensionality reduction and evaluate the performance with a classifier (23/09/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
7. Write a program to implement the ID3 algorithm (30/09/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
8. Write a program to implement the Random Forest algorithm (07/10/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
9. Write a program to implement the K-nearest neighbor algorithm (14/10/24) [Use Breast Cancer Wisconsin (Diagnostic) Dataset]
10. Write a program to implement Bayes' classifier for classification of sample data in Python without using libraries (21/10/24) [Use Pima Indian Diabetic Dataset]
11. Write a program to implement the K-means clustering algorithm (28/10/24)
12. Mini Project to implement any specific Machine Learning application (04/11/24)
