DSD Notes

Abstract Data Types (ADTs) are models for data structures defined by their operations rather than implementation, facilitating data management. Object-Oriented Programming (OOP) utilizes classes to implement ADTs, promoting encapsulation, inheritance, and polymorphism. Key concepts include shallow and deep copying, namespaces, and algorithm analysis using asymptotic notations.


Unit I

Abstract Data Types


Abstract Data Types (ADTs)

Introduction

Abstract Data Types (ADTs) are a fundamental concept in computer science used to describe data
structures in a way that is independent of their implementation. An ADT is defined by the operations
that can be performed on it and the properties of these operations, without specifying how these
operations are implemented.

Key Concepts

Definition: An ADT is a model for a data structure that specifies the type of data stored, the
operations supported, and the types of the parameters of those operations. The implementation
details are hidden from the user.
Purpose: ADTs provide a way to encapsulate data and operations, making it easier to manage
and manipulate complex data structures.

ADTs and Classes

Introduction to OOP

Object-Oriented Programming (OOP) is a programming paradigm based on the concept of "objects,"
which can contain data and methods. ADTs are closely related to classes in OOP.

Classes in Python

In Python, classes are used to implement ADTs. A class defines the properties and behaviors of an
object.

Example: Implementing a Stack ADT in Python

class Stack:
    def __init__(self):
        self.stack = []

    def push(self, item):
        """Add an item to the top of the stack."""
        self.stack.append(item)

    def pop(self):
        """Remove and return the item from the top of the stack."""
        if not self.is_empty():
            return self.stack.pop()
        else:
            raise IndexError("pop from empty stack")

    def peek(self):
        """Return the item at the top of the stack without removing it."""
        if not self.is_empty():
            return self.stack[-1]
        else:
            raise IndexError("peek from empty stack")

    def is_empty(self):
        """Check if the stack is empty."""
        return len(self.stack) == 0

    def size(self):
        """Return the number of items in the stack."""
        return len(self.stack)

Explanation

__init__ : Initializes the stack as an empty list.
push : Adds an item to the end of the list (top of the stack).
pop : Removes and returns the last item from the list (top of the stack). Raises an error if the
stack is empty.
peek : Returns the last item from the list without removing it. Raises an error if the stack is
empty.
is_empty : Checks if the list is empty.
size : Returns the number of items in the list.
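As a quick usage sketch of the LIFO behavior described above (the class is repeated here, slightly condensed, so the snippet stands alone):

```python
# The Stack class from above, repeated so this snippet runs on its own.
class Stack:
    def __init__(self):
        self.stack = []

    def push(self, item):
        self.stack.append(item)

    def pop(self):
        if not self.is_empty():
            return self.stack.pop()
        raise IndexError("pop from empty stack")

    def peek(self):
        if not self.is_empty():
            return self.stack[-1]
        raise IndexError("peek from empty stack")

    def is_empty(self):
        return len(self.stack) == 0

    def size(self):
        return len(self.stack)

# LIFO behavior: the last item pushed is the first popped.
s = Stack()
s.push(1)
s.push(2)
s.push(3)
print(s.pop())   # 3
print(s.peek())  # 2
print(s.size())  # 2
```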

Inheritance and Namespaces

Inheritance

Inheritance is a mechanism in OOP that allows a new class to inherit properties and behaviors from
an existing class.

Example: Inheriting from Stack

class LimitedStack(Stack):
    def __init__(self, limit):
        super().__init__()
        self.limit = limit

    def push(self, item):
        if self.size() < self.limit:
            super().push(item)
        else:
            raise OverflowError("Stack limit reached")

Explanation

LimitedStack inherits from Stack .
__init__ : Initializes the stack with a limit.
push : Checks if the stack size is less than the limit before adding an item.
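A short usage sketch shows the limit being enforced (a minimal Stack is repeated so the snippet runs on its own):

```python
# Minimal Stack plus the LimitedStack subclass from above,
# repeated so the snippet runs standalone.
class Stack:
    def __init__(self):
        self.stack = []

    def push(self, item):
        self.stack.append(item)

    def size(self):
        return len(self.stack)

class LimitedStack(Stack):
    def __init__(self, limit):
        super().__init__()
        self.limit = limit

    def push(self, item):
        if self.size() < self.limit:
            super().push(item)
        else:
            raise OverflowError("Stack limit reached")

ls = LimitedStack(2)
ls.push("a")
ls.push("b")
try:
    ls.push("c")   # third push exceeds the limit of 2
except OverflowError as e:
    print(e)       # Stack limit reached
```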

Namespaces

Namespaces in Python provide a way to organize and manage the names of variables and functions
in a program. They help avoid naming conflicts.

Shallow and Deep Copying

Shallow Copy

A shallow copy creates a new object but does not create copies of the nested objects; instead, it
copies references to them.

Example: Shallow Copy

import copy

original_list = [1, [2, 3]]
shallow_copied_list = copy.copy(original_list)

Deep Copy

A deep copy creates a new object and recursively copies all objects found in the original.

Example: Deep Copy

import copy

original_list = [1, [2, 3]]
deep_copied_list = copy.deepcopy(original_list)
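Mutating a nested object makes the difference between the two snippets visible (an illustrative extension of the examples above):

```python
import copy

original_list = [1, [2, 3]]
shallow = copy.copy(original_list)
deep = copy.deepcopy(original_list)

# Mutate the nested list inside the original:
original_list[1].append(4)

print(shallow[1])  # [2, 3, 4]  (shares the nested list with the original)
print(deep[1])     # [2, 3]     (has its own independent copy)
```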

Introduction to Analysis of Algorithms

Asymptotic Notations

Asymptotic notations describe the behavior of algorithms as the input size grows.

Big O Notation: Describes the upper bound of the time complexity.
Big Theta Notation: Describes both upper and lower bounds of the time complexity.
Big Omega Notation: Describes the lower bound of the time complexity.
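As an illustration (the function names here are invented for the example, not from the notes), comparing two simple functions shows the kind of growth these notations describe:

```python
def linear_sum(values):
    """O(n): one addition per element."""
    total = 0
    for v in values:
        total += v
    return total

def has_duplicate(values):
    """O(n^2): compares every pair of elements."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] == values[j]:
                return True
    return False

print(linear_sum([1, 2, 3]))        # 6
print(has_duplicate([1, 2, 3, 2]))  # True
```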
Recursion and Recursive Algorithms

Recursion is a technique where a function calls itself. Analyzing recursive algorithms involves
understanding the base case, recursive case, and overall time complexity.

Example: Recursive Factorial Function

def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)

Explanation

Base Case: When n is 0, return 1.
Recursive Case: Multiply n by the factorial of n - 1 .
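A quick check of the function, with its running time noted in a comment (the recurrence analysis is an added remark, not from the notes):

```python
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

# Each call does constant work and makes one recursive call, so the
# running time satisfies T(n) = T(n-1) + O(1), which is O(n).
print(factorial(5))  # 120
```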

Summary

ADTs: Define data types by their behavior, not implementation.
OOP: Uses classes to implement ADTs.
Inheritance: Allows a class to inherit properties and methods from another class.
Namespaces: Help avoid naming conflicts.
Copying: Shallow vs. Deep copying impacts how nested objects are handled.
Algorithm Analysis: Asymptotic notations and recursion are key concepts.

ADTs and Classes in Python

Introduction

In computer science, Abstract Data Types (ADTs) are a fundamental concept that helps in organizing
data and defining operations on data. They provide a high-level description of a data structure,
abstracting away the implementation details. Python, being an object-oriented programming (OOP)
language, provides robust support for implementing ADTs through classes. This lecture will cover the
following aspects:

Abstract Data Types (ADTs): Definition and examples
Classes in Python: Introduction, syntax, and usage
Inheritance: Extending classes and reusing code
Namespaces: Managing scope and avoiding conflicts
Shallow and Deep Copying: Understanding object copying and memory management

Abstract Data Types (ADTs)

Definition and Examples

An Abstract Data Type (ADT) is a mathematical model for certain classes of data structures that
have similar behavior. An ADT defines a set of operations that can be performed on the data, but
does not specify how these operations are implemented. Examples of ADTs include:

Stack: A collection of elements with Last In First Out (LIFO) access.
Queue: A collection of elements with First In First Out (FIFO) access.
List: A collection of elements that can be accessed by index.

Each ADT is defined by its operations and properties, without concern for the underlying
implementation details.
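The Queue ADT mentioned above can be sketched the same way. This minimal implementation is not given in the notes; it uses collections.deque so that removals from the front are efficient:

```python
from collections import deque

class Queue:
    """A minimal FIFO queue ADT: items leave in the order they arrived."""

    def __init__(self):
        self.items = deque()

    def enqueue(self, item):
        self.items.append(item)      # add at the rear

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        return self.items.popleft()  # remove from the front

    def is_empty(self):
        return len(self.items) == 0

q = Queue()
q.enqueue("a")
q.enqueue("b")
print(q.dequeue())  # a  (first in, first out)
```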

Classes in Python

Introduction

In Python, classes are used to define ADTs. A class is a blueprint for creating objects that
encapsulate both data and methods. The syntax for defining a class in Python is as follows:

class ClassName:
    def __init__(self, attributes):
        self.attribute1 = attributes
        # Other initializations

    def method_name(self, parameters):
        # Method implementation
        return result

Example: Stack Implementation

Here is an implementation of a stack using Python classes:

class Stack:
    def __init__(self):
        self.items = []

    def is_empty(self):
        return len(self.items) == 0

    def push(self, item):
        self.items.append(item)

    def pop(self):
        if not self.is_empty():
            return self.items.pop()
        else:
            raise IndexError("pop from empty stack")

    def peek(self):
        if not self.is_empty():
            return self.items[-1]
        else:
            raise IndexError("peek from empty stack")

    def size(self):
        return len(self.items)

Explanation

__init__ method: Initializes a new instance of the class.
push method: Adds an item to the stack.
pop method: Removes and returns the top item from the stack.
peek method: Returns the top item without removing it.
size method: Returns the number of items in the stack.

Inheritance

Introduction

Inheritance allows a class to inherit attributes and methods from another class. This promotes code
reuse and establishes a natural hierarchy between classes.

Example: Inheritance in Python


class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        raise NotImplementedError("Subclasses must implement this method")

class Dog(Animal):
    def speak(self):
        return f"{self.name} says Woof!"

class Cat(Animal):
    def speak(self):
        return f"{self.name} says Meow!"

Explanation

Animal class: A base class with a method speak that is intended to be overridden by
subclasses.
Dog and Cat classes: Inherit from Animal and provide specific implementations of the speak
method.

Namespaces

Introduction

Namespaces are containers that hold a collection of identifiers (names) and their corresponding
objects. They help avoid naming conflicts and organize code.

Example: Using Namespaces

class A:
    def method(self):
        print("Method in class A")

class B:
    def method(self):
        print("Method in class B")

obj_a = A()
obj_b = B()

obj_a.method() # Output: Method in class A
obj_b.method() # Output: Method in class B

Explanation

Classes A and B have a method named method , but they exist in different namespaces, so
there is no conflict.

Shallow and Deep Copying

Introduction

Shallow copying creates a new object but does not create copies of nested objects; it only copies
references to them. Deep copying creates a new object and recursively copies all nested objects.

Example: Shallow vs. Deep Copy

import copy

original_list = [1, [2, 3], 4]

shallow_copied_list = copy.copy(original_list)
deep_copied_list = copy.deepcopy(original_list)

original_list[1][0] = 'Changed'

print("Original:", original_list) # Output: [1, ['Changed', 3], 4]
print("Shallow copy:", shallow_copied_list) # Output: [1, ['Changed', 3], 4]
print("Deep copy:", deep_copied_list) # Output: [1, [2, 3], 4]

Explanation

Shallow Copy: Changes in nested lists affect both the original and the shallow copy.
Deep Copy: The deep copy remains unaffected by changes in the original list.

Summary

Abstract Data Types (ADTs): Defined by operations, not implementation. Python classes can
model ADTs.
Classes in Python: Blueprint for creating objects, encapsulating data and methods.
Inheritance: Allows a class to inherit from another, promoting code reuse.
Namespaces: Manage scopes and avoid naming conflicts.
Shallow and Deep Copying: Shallow copy duplicates references; deep copy duplicates
objects.

Introduction to Object-Oriented Programming

Introduction

Object-Oriented Programming (OOP) is a programming paradigm based on the concept of "objects",
which can contain data and code. The data is often in the form of fields, and the code is in the form
of procedures or methods. OOP aims to increase the modularity and reusability of code by
organizing it into objects. The key principles of OOP include:

Encapsulation: Bundling data with methods that operate on the data.
Inheritance: Mechanism to create a new class using properties and methods of an existing
class.
Polymorphism: Ability of different classes to be treated as instances of the same class through
a common interface.
Abstraction: Hiding complex implementation details and showing only the necessary features
of an object.

This lecture will cover the following aspects:

Basic Concepts of OOP: Encapsulation, inheritance, polymorphism, and abstraction.
Classes and Objects in Python: Definition, syntax, and usage.
Creating and Using Methods: Constructors, instance methods, class methods, and static
methods.
Inheritance and Method Overriding: Extending classes and modifying methods in subclasses.
Encapsulation and Data Hiding: Using private and public attributes and methods.
Polymorphism and Method Overloading: Implementing flexible and interchangeable object
behavior.

Basic Concepts of OOP

Encapsulation

Encapsulation is the concept of wrapping data (attributes) and methods (functions) into a single unit
known as a class. It restricts direct access to some of the object's components, which can prevent
accidental interference and misuse of the methods and attributes.

Example: Encapsulation in Python

class Person:
    def __init__(self, name, age):
        self.name = name
        self.__age = age  # Private attribute

    def get_age(self):
        return self.__age

    def set_age(self, age):
        if age > 0:
            self.__age = age
        else:
            print("Invalid age")

person = Person("Alice", 30)
print(person.get_age()) # Output: 30
person.set_age(35)
print(person.get_age()) # Output: 35

Explanation

Private Attribute: The __age attribute is private and cannot be accessed directly from outside
the class.
Public Methods: get_age and set_age methods are used to interact with the private attribute.

Inheritance

Inheritance allows a class to inherit attributes and methods from another class. This promotes code
reuse and establishes a natural hierarchy between classes.

Example: Inheritance in Python

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        raise NotImplementedError("Subclasses must implement this method")

class Dog(Animal):
    def speak(self):
        return f"{self.name} says Woof!"

class Cat(Animal):
    def speak(self):
        return f"{self.name} says Meow!"

Explanation

Base Class: Animal class is the base class with a method speak that is intended to be
overridden.
Derived Classes: Dog and Cat classes inherit from Animal and provide specific
implementations of the speak method.

Polymorphism

Polymorphism allows objects of different classes to be treated as objects of a common superclass. It
is implemented through method overriding in subclasses.

Example: Polymorphism in Python

def make_animal_speak(animal):
    print(animal.speak())

dog = Dog("Buddy")
cat = Cat("Whiskers")

make_animal_speak(dog) # Output: Buddy says Woof!
make_animal_speak(cat) # Output: Whiskers says Meow!

Explanation

Method Overriding: Both Dog and Cat provide their own implementation of the speak
method.
Common Interface: The make_animal_speak function can call the speak method on any
object that inherits from Animal .

Abstraction

Abstraction means hiding the complex implementation details and showing only the essential
features of the object. It allows focusing on interactions at a higher level.

Example: Abstraction with Abstract Base Classes

from abc import ABC, abstractmethod
import math

class AbstractShape(ABC):
    @abstractmethod
    def area(self):
        pass

class Rectangle(AbstractShape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

class Circle(AbstractShape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius * self.radius

Explanation

Abstract Base Class: AbstractShape defines an abstract method area that must be
implemented by subclasses.
Concrete Classes: Rectangle and Circle provide specific implementations for the area
method.
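A short usage sketch (the classes are repeated so it runs standalone) shows both shapes answering the same abstract interface, and that the abstract class itself cannot be instantiated:

```python
from abc import ABC, abstractmethod
import math

class AbstractShape(ABC):
    @abstractmethod
    def area(self):
        pass

class Rectangle(AbstractShape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

class Circle(AbstractShape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius * self.radius

# Both shapes answer area() through the same abstract interface.
shapes = [Rectangle(3, 4), Circle(1)]
for shape in shapes:
    print(round(shape.area(), 2))  # 12, then 3.14

# The abstract base class itself cannot be instantiated:
try:
    AbstractShape()
except TypeError as e:
    print("cannot instantiate:", e)
```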

Classes and Objects in Python

Introduction

In Python, classes are used to create objects, which are instances of the class. A class defines the
structure and behavior of objects, encapsulating data and methods.

Creating and Using Methods

Methods are functions defined within a class that operate on instances of the class.

Example: Methods in Python

class Calculator:
    def __init__(self, value=0):
        self.value = value

    def add(self, number):
        self.value += number
        return self.value

    def subtract(self, number):
        self.value -= number
        return self.value

    @staticmethod
    def multiply(x, y):
        return x * y

    @classmethod
    def create_calculator(cls, value):
        return cls(value)

Explanation

Instance Methods: add and subtract methods operate on the instance attribute value .
Static Method: multiply does not access instance attributes.
Class Method: create_calculator is used to create an instance of the class.
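A usage sketch ties the three kinds of methods together (the class is repeated so the snippet runs standalone):

```python
class Calculator:
    def __init__(self, value=0):
        self.value = value

    def add(self, number):
        self.value += number
        return self.value

    def subtract(self, number):
        self.value -= number
        return self.value

    @staticmethod
    def multiply(x, y):
        return x * y

    @classmethod
    def create_calculator(cls, value):
        return cls(value)

calc = Calculator()
print(calc.add(10))               # 10  (instance method mutates value)
print(calc.subtract(4))           # 6
print(Calculator.multiply(3, 5))  # 15  (static: needs no instance)
calc2 = Calculator.create_calculator(100)
print(calc2.value)                # 100 (class method as a factory)
```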

Summary

Encapsulation: Bundles data and methods within a class, hiding implementation details.
Inheritance: Allows creating a new class from an existing class, promoting code reuse.
Polymorphism: Enables objects to be treated as instances of a common superclass, using
method overriding.
Abstraction: Hides complex details and exposes only the essential features.
Classes and Methods: Python uses classes to define objects and methods to manipulate
them.

Classes and Objects in Python

Introduction

In Python, classes and objects are fundamental concepts in Object-Oriented Programming (OOP).
Classes serve as blueprints for creating objects, encapsulating data and methods that operate on
the data. Objects are instances of classes and represent real-world entities in a program. This
lecture will cover:

Definition of Classes and Objects: Understanding the basic concepts and syntax.
Creating and Initializing Objects: How to define classes and instantiate objects.
Instance Methods and Attributes: Working with methods and attributes of objects.
Class Methods and Static Methods: Differentiating between instance methods, class
methods, and static methods.
Inheritance and Method Overriding: Extending classes and modifying methods in subclasses.
Encapsulation and Data Hiding: Managing data access and visibility.

Definition of Classes and Objects

Classes

A class in Python is a blueprint for creating objects. It defines a set of attributes and methods that will
be shared by all instances of the class. The syntax for defining a class is:

class ClassName:
    def __init__(self, attributes):
        self.attribute1 = attributes
        # Other initializations

    def method_name(self, parameters):
        # Method implementation
        return result

Objects

An object is an instance of a class. It is created using the class name followed by parentheses. Each
object can have its own attributes and methods as defined by its class.

Example: Creating a Simple Class and Object

class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        return f"{self.name} barks!"

# Creating an object of the Dog class
dog1 = Dog("Buddy", 3)
print(dog1.name) # Output: Buddy
print(dog1.bark()) # Output: Buddy barks!

Explanation

Class Definition: Dog class has two attributes ( name and age ) and one method ( bark ).
Object Instantiation: dog1 is an instance of Dog with name as "Buddy" and age as 3.

Creating and Initializing Objects

Initializing Objects

The __init__ method is a special method called a constructor that is automatically called when an
object is created. It initializes the object's attributes.

Example: Initialization

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person1 = Person("Alice", 30)
print(person1.name) # Output: Alice
print(person1.age) # Output: 30

Explanation

Constructor: The __init__ method initializes name and age attributes for the Person object.

Instance Methods and Attributes

Instance Methods

Instance methods operate on the data within an instance of the class. They are defined using the
def keyword and always take self as the first parameter.

Example: Instance Methods


class Car:
    def __init__(self, make, model):
        self.make = make
        self.model = model

    def display_info(self):
        return f"Car make: {self.make}, model: {self.model}"

car1 = Car("Toyota", "Corolla")
print(car1.display_info()) # Output: Car make: Toyota, model: Corolla

Explanation

Instance Method: display_info method returns information about the car instance.

Class Methods and Static Methods

Class Methods

Class methods are methods that are bound to the class rather than its instances. They are defined
using the @classmethod decorator and take cls as the first parameter.

Example: Class Methods

class Employee:
    company_name = "Tech Corp"

    def __init__(self, name):
        self.name = name

    @classmethod
    def company_info(cls):
        return f"Company: {cls.company_name}"

print(Employee.company_info()) # Output: Company: Tech Corp

Explanation

Class Method: company_info returns the company name, which is shared across all instances.

Static Methods

Static methods do not operate on instance or class data. They are defined using the @staticmethod
decorator and do not take self or cls as parameters.

Example: Static Methods


class Math:
    @staticmethod
    def add(x, y):
        return x + y

print(Math.add(5, 3)) # Output: 8

Explanation

Static Method: add performs an addition operation without accessing instance or class data.

Inheritance and Method Overriding

Inheritance

Inheritance allows a class to inherit attributes and methods from another class. This promotes code
reuse and establishes a natural hierarchy between classes.

Example: Inheritance

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "Animal sound"

class Dog(Animal):
    def speak(self):
        return f"{self.name} says Woof!"

dog = Dog("Buddy")
print(dog.speak()) # Output: Buddy says Woof!

Explanation

Base Class: Animal with a method speak .
Derived Class: Dog inherits from Animal and overrides the speak method.

Encapsulation and Data Hiding

Encapsulation

Encapsulation is about bundling data and methods into a single unit (class) and restricting access to
some of the object's components.

Example: Encapsulation

class BankAccount:
    def __init__(self, balance):
        self.__balance = balance  # Private attribute

    def deposit(self, amount):
        if amount > 0:
            self.__balance += amount

    def get_balance(self):
        return self.__balance

account = BankAccount(1000)
account.deposit(500)
print(account.get_balance()) # Output: 1500

Explanation

Private Attribute: __balance is private and can only be accessed through public methods
deposit and get_balance .

Summary

Classes: Define the blueprint for creating objects, encapsulating data and methods.
Objects: Instances of classes with their own data and behavior.
Instance Methods: Operate on instance data, accessed via self .
Class Methods: Operate on class data, accessed via cls .
Static Methods: Do not access instance or class data.
Inheritance: Allows a class to inherit from another, promoting code reuse.
Encapsulation: Bundles data and methods, restricting direct access to some components.

Inheritance in Python

Introduction

Inheritance is a fundamental concept in Object-Oriented Programming (OOP) that allows a class to
inherit attributes and methods from another class. It promotes code reuse and establishes a
hierarchical relationship between classes. In Python, inheritance is implemented using class
definitions and supports multiple inheritance, which means a class can inherit from more than one
parent class. This lecture will cover:

Definition and Basics of Inheritance: Understanding the core principles.
Single Inheritance: A class inheriting from one parent class.
Multiple Inheritance: A class inheriting from multiple parent classes.
Method Overriding: Customizing inherited methods.
The super() Function: Calling methods from the parent class.
Polymorphism: Using inherited methods in a flexible way.

Definition and Basics of Inheritance

Inheritance allows a new class (subclass or derived class) to inherit properties and behaviors from
an existing class (superclass or base class). This helps in creating a new class with minimal code
duplication.

Syntax

class BaseClass:
    def __init__(self, attribute):
        self.attribute = attribute

    def base_method(self):
        pass

class DerivedClass(BaseClass):
    def __init__(self, attribute, new_attribute):
        super().__init__(attribute)
        self.new_attribute = new_attribute

    def derived_method(self):
        pass

Single Inheritance

Single inheritance involves one class inheriting from a single parent class. This is the simplest form
of inheritance and is used to extend or modify the behavior of a base class.

Example: Single Inheritance

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "Animal sound"

class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__(name)
        self.breed = breed

    def speak(self):
        return f"{self.name} says Woof!"

dog = Dog("Buddy", "Labrador")
print(dog.name) # Output: Buddy
print(dog.breed) # Output: Labrador
print(dog.speak()) # Output: Buddy says Woof!

Explanation

Base Class: Animal with attributes and methods.
Derived Class: Dog inherits from Animal and overrides the speak method.

Multiple Inheritance

Multiple inheritance allows a class to inherit from more than one parent class. This can be useful for
combining functionalities from different base classes.

Example: Multiple Inheritance

class Engine:
    def start(self):
        return "Engine starting"

class Radio:
    def play_music(self):
        return "Playing music"

class Car(Engine, Radio):
    def drive(self):
        return "Car is driving"

car = Car()
print(car.start()) # Output: Engine starting
print(car.play_music()) # Output: Playing music
print(car.drive()) # Output: Car is driving

Explanation

Base Classes: Engine and Radio provide different functionalities.
Derived Class: Car inherits from both Engine and Radio , combining their features.
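One detail worth knowing with multiple inheritance (an added remark, not from the notes): Python resolves attribute lookups using the Method Resolution Order (MRO), searching base classes left to right as they are listed in the class statement:

```python
class Engine:
    def start(self):
        return "Engine starting"

class Radio:
    def play_music(self):
        return "Playing music"

class Car(Engine, Radio):
    def drive(self):
        return "Car is driving"

# The MRO lists the classes Python searches, in order,
# when looking up an attribute on Car.
print([cls.__name__ for cls in Car.__mro__])
# ['Car', 'Engine', 'Radio', 'object']
```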

Method Overriding

Method overriding occurs when a subclass provides a specific implementation of a method that is
already defined in its superclass. The overridden method in the subclass will replace the method in
the superclass.

Example: Method Overriding

class Parent:
    def show(self):
        return "Parent class method"

class Child(Parent):
    def show(self):
        return "Child class method"

child = Child()
print(child.show()) # Output: Child class method

Explanation

Superclass: Parent class with a method show .
Subclass: Child class overrides the show method.

The super() Function

The super() function is used to call methods from a parent class. It allows a derived class to invoke
methods from its base class without explicitly referring to the base class name.

Example: Using super()

class Base:
    def __init__(self, value):
        self.value = value

    def display(self):
        return f"Value: {self.value}"

class Derived(Base):
    def __init__(self, value, extra):
        super().__init__(value)
        self.extra = extra

    def display(self):
        return f"{super().display()} and Extra: {self.extra}"

obj = Derived(10, "Extra Info")
print(obj.display()) # Output: Value: 10 and Extra: Extra Info

Explanation

Base Class: Base class with display method.
Derived Class: Derived class uses super() to call display from Base .

Polymorphism

Polymorphism allows methods to do different things based on the object that is calling them.
Inheritance supports polymorphism by enabling a derived class to provide different implementations
of methods defined in the base class.

Example: Polymorphism

class Bird:
    def sound(self):
        return "Tweet"

class Dog:
    def sound(self):
        return "Woof"

def make_sound(animal):
    print(animal.sound())

bird = Bird()
dog = Dog()

make_sound(bird) # Output: Tweet
make_sound(dog) # Output: Woof

Explanation

Different Classes: Bird and Dog both have a sound method.
Polymorphic Function: make_sound calls sound on different objects.

Summary

Inheritance: Allows a class to inherit attributes and methods from another class, promoting
code reuse.
Single Inheritance: A class inherits from one parent class.
Multiple Inheritance: A class inherits from multiple parent classes.
Method Overriding: A subclass can provide a specific implementation of a method already
defined in its superclass.
super() Function: Calls methods from a parent class.
Polymorphism: Methods can behave differently based on the object that invokes them.

Namespaces in Python

Introduction

Namespaces in Python are fundamental to managing the scope and visibility of variables and
functions in a program. They act as containers that hold a collection of identifiers (names) and their
corresponding objects. Understanding namespaces is crucial for organizing code and avoiding
naming conflicts. This lecture will cover:

Definition and Purpose: What namespaces are and why they are important.
Types of Namespaces: Different namespaces in Python and their scopes.
Scope Resolution: How Python determines the scope of names.
The global and nonlocal Keywords: Modifying the scope of variables.
Namespaces in Modules and Packages: Managing namespaces across files.

Definition and Purpose

A namespace is a mapping between names and objects. It ensures that names are unique within a
given scope and prevents conflicts between identifiers. Namespaces help in managing the visibility
of variables and functions, thus avoiding unintended interactions between different parts of a
program.

Example of a Simple Namespace

def function():
    x = 10  # x is local to the function
    print(x)

function()
print(x) # Raises NameError: name 'x' is not defined

Explanation

The variable x inside function is local to that function and cannot be accessed outside it.

Types of Namespaces

There are several types of namespaces in Python, each with a specific scope:

1. Local Namespace: Contains names local to a function. Created when the function is called and
deleted when the function exits.
2. Enclosing Namespace: Exists in nested functions. It’s the namespace of the outer function.
3. Global Namespace: Contains names at the level of the main program. Created when the
module is included and deleted when the script ends.
4. Built-in Namespace: Contains names built into Python. Always available.

Example: Different Levels of Namespace

x = "global"

def outer():
    y = "enclosing"

    def inner():
        z = "local"
        print(x) # Accesses global namespace
        print(y) # Accesses enclosing namespace
        print(z) # Accesses local namespace

    inner()

outer()

Explanation

x is in the global namespace.
y is in the enclosing namespace.
z is in the local namespace.

Scope Resolution

Python uses the LEGB rule to resolve the scope of names, which stands for Local, Enclosing,
Global, and Built-in namespaces. When a name is referenced, Python searches these namespaces
in the following order:

1. Local: The innermost scope, searched first.
2. Enclosing: Any enclosing scopes in nested functions.
3. Global: The module-level scope.
4. Built-in: The built-in namespace containing standard functions and exceptions.

Example: LEGB Rule

x = "global"

def func():
    x = "enclosing"

    def inner():
        x = "local"
        print(x) # Output: local

    inner()
    print(x) # Output: enclosing

func()
print(x) # Output: global

Explanation

The inner function uses the local x .
The func function uses the enclosing x .
The global x is used outside the function.

The global and nonlocal Keywords

The global keyword allows you to modify a variable in the global namespace from within a function.
The nonlocal keyword allows you to modify a variable in an enclosing (non-global) namespace.

Example: Using global

x = "global"

def func():
    global x
    x = "modified global"
    print(x) # Output: modified global

func()
print(x) # Output: modified global

Explanation

The global keyword allows the function to modify the global variable x .

Example: Using nonlocal

def outer():
    x = "enclosing"

    def inner():
        nonlocal x
        x = "modified enclosing"
        print(x) # Output: modified enclosing

    inner()
    print(x) # Output: modified enclosing

outer()

Explanation

The nonlocal keyword allows the inner function to modify the x in the enclosing outer
function.

Namespaces in Modules and Packages

Modules and packages in Python create their own namespaces. This helps in organizing code and
avoiding name conflicts between different modules.

Example: Namespaces in Modules

module1.py

def func():
return "Function in module1"

module2.py

def func():
return "Function in module2"

main.py

import module1
import module2

print(module1.func()) # Output: Function in module1
print(module2.func()) # Output: Function in module2

Explanation

Each module has its own namespace, so functions with the same name in different modules do
not conflict.
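These namespaces can also be inspected at runtime: the built-in functions globals() and locals() return the dictionaries Python uses to store names. A minimal sketch (the variable and function names are illustrative):

```python
x = "global value"

def show_namespaces():
    y = "local value"
    # locals() returns the function's local namespace as a dict
    local_names = locals()
    # globals() returns the module-level namespace
    global_names = globals()
    return ("y" in local_names, "x" in global_names)

print(show_namespaces())  # Output: (True, True)
```

This makes the LEGB picture concrete: y lives only in the function's local dictionary, while x lives in the module's global dictionary.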

Summary

Namespaces: Containers for managing the scope of names in Python.


Types of Namespaces: Local, Enclosing, Global, and Built-in.
Scope Resolution: LEGB rule determines the scope of names.
global and nonlocal Keywords: Modify variables in different namespaces.
Namespaces in Modules: Modules and packages create separate namespaces to avoid
conflicts.
Shallow and Deep Copying

Introduction

Copying objects in programming is an essential concept for managing data and memory. In Python,
the terms "shallow copying" and "deep copying" refer to different methods of duplicating objects.
Understanding these concepts is crucial for effectively managing mutable objects and their
relationships. This lecture will cover:

Shallow Copying: Definition, behavior, and methods.


Deep Copying: Definition, behavior, and methods.
Comparative Examples: Demonstrating the differences between shallow and deep copying.
Applications: When and why to use each type of copy.

Shallow Copying

Definition

Shallow copying creates a new object but does not create copies of nested objects. Instead, it
copies references to the nested objects. As a result, changes to nested objects in the original will
reflect in the shallow copy, and vice versa.

Methods for Shallow Copying

1. Using the copy Module

The copy module provides a copy() function to create shallow copies of objects.

import copy

original_list = [1, [2, 3], 4]
shallow_copied_list = copy.copy(original_list)

2. Using Object's copy Method

Some Python objects have a copy() method that returns a shallow copy.

original_set = {1, 2, 3}
shallow_copied_set = original_set.copy()

Example: Shallow Copying


import copy

original_list = [1, [2, 3], 4]
shallow_copied_list = copy.copy(original_list)

original_list[1][0] = 'Changed'

print("Original list:", original_list)        # Output: [1, ['Changed', 3], 4]
print("Shallow copy:", shallow_copied_list)   # Output: [1, ['Changed', 3], 4]

Explanation

The change in the nested list affects both the original and shallow copied lists because the
nested list is shared between them.

Deep Copying

Definition

Deep copying creates a new object and recursively copies all nested objects. This means that the
new object is entirely independent of the original object, including its nested structures.

Methods for Deep Copying

1. Using the copy Module

The copy module provides a deepcopy() function to create deep copies of objects.

import copy

original_list = [1, [2, 3], 4]
deep_copied_list = copy.deepcopy(original_list)

Example: Deep Copying

import copy

original_list = [1, [2, 3], 4]
deep_copied_list = copy.deepcopy(original_list)

original_list[1][0] = 'Changed'

print("Original list:", original_list)     # Output: [1, ['Changed', 3], 4]
print("Deep copy:", deep_copied_list)      # Output: [1, [2, 3], 4]

Explanation
The deep copied list remains unaffected by changes in the original list because it is completely
independent.

Comparative Examples

Example: Comparing Shallow and Deep Copy

import copy

original_list = [1, [2, 3], 4]

# Shallow Copy
shallow_copied_list = copy.copy(original_list)
original_list[1][0] = 'Changed'
print("After shallow copy modification:")
print("Original list:", original_list) # Output: [1, ['Changed', 3], 4]
print("Shallow copy:", shallow_copied_list) # Output: [1, ['Changed', 3], 4]

# Deep Copy
original_list = [1, [2, 3], 4]
deep_copied_list = copy.deepcopy(original_list)
original_list[1][0] = 'Changed'
print("After deep copy modification:")
print("Original list:", original_list) # Output: [1, ['Changed', 3], 4]
print("Deep copy:", deep_copied_list) # Output: [1, [2, 3], 4]

Explanation

Shallow Copy: Nested objects are shared, so changes affect both original and copied objects.
Deep Copy: Nested objects are independently copied, so changes only affect the original
object.

Applications

Shallow Copying: Useful when you need a new object but can share nested objects. It's faster
and consumes less memory.
Deep Copying: Necessary when you need complete independence between objects. Useful for
complex structures where nested elements should not be shared.
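Classes can customize both behaviors: the copy module looks for the special methods __copy__ and __deepcopy__ when copying an instance. The sketch below shows both on a hypothetical Matrix class; the class name and its rows attribute are illustrative:

```python
import copy

class Matrix:
    """Hypothetical container used only to illustrate the copy protocol."""
    def __init__(self, rows):
        self.rows = rows  # A list of lists

    def __copy__(self):
        # Shallow: the new Matrix shares the same rows list (and its nested lists)
        return Matrix(self.rows)

    def __deepcopy__(self, memo):
        # Deep: recursively copy the nested row lists
        return Matrix(copy.deepcopy(self.rows, memo))

m = Matrix([[1, 2], [3, 4]])
shallow = copy.copy(m)
deep = copy.deepcopy(m)

m.rows[0][0] = 99
print(shallow.rows[0][0])  # Output: 99 (the nested data is shared)
print(deep.rows[0][0])     # Output: 1 (fully independent)
```

The memo argument passed to __deepcopy__ is how the copy module avoids copying the same object twice and handles cyclic references.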

Summary

Shallow Copying: Creates a new object with references to nested objects; changes to nested
objects affect both copies.
Deep Copying: Creates a completely independent copy of an object and all its nested
elements; changes to one do not affect the other.
Use Cases: Choose shallow copying for performance and when sharing nested objects is
acceptable; use deep copying for full independence.
Introduction to Analysis of Algorithms

Introduction

The analysis of algorithms is a critical aspect of computer science that focuses on evaluating the
efficiency and effectiveness of algorithms. This analysis helps in determining how well an algorithm
performs in terms of time and space as the size of the input data increases. Understanding these
principles is essential for designing efficient algorithms and systems.

This lecture will cover:

Algorithm Analysis: The need and importance of analyzing algorithms.


Asymptotic Notations: Tools for describing the growth of algorithmic complexity.
Recursion: Understanding recursive algorithms and their analysis.
Analyzing Recursive Algorithms: Techniques and methods for evaluating recursive solutions.

Algorithm Analysis

Importance

Analyzing algorithms allows us to:

Evaluate Efficiency: Determine how well an algorithm performs.


Compare Algorithms: Choose the best algorithm for a given problem.
Predict Performance: Estimate resource requirements for large inputs.

Components of Analysis

1. Time Complexity: Measures the amount of time an algorithm takes to complete.


2. Space Complexity: Measures the amount of memory an algorithm uses.

Asymptotic Notations

Asymptotic notations provide a way to describe the behavior of algorithms as the input size grows.
They help in expressing the efficiency of algorithms in a standard format.

Big-O Notation (O)

Definition: Describes an upper bound on the time complexity of an algorithm. It provides a


worst-case scenario.
Example: If an algorithm has a time complexity of (O(n^2)), it means that the time taken grows
quadratically with the input size (n).
def example_function(n):
for i in range(n):
for j in range(n):
print(i, j)

Here, the time complexity is (O(n^2)) because of the nested loops.

Big-Ω Notation (Ω)

Definition: Describes a lower bound on the time complexity of an algorithm. It provides a best-
case scenario.
Example: If an algorithm has a time complexity of (Ω(n)), it means that the algorithm will
take at least linear time in the best case.

Big-Θ Notation (Θ)

Definition: Describes a tight bound on the time complexity. It represents both upper and lower
bounds.
Example: An algorithm with (Θ(n log n)) complexity grows at a rate that is bounded both above
and below by (n log n).

Little-o Notation (o)

Definition: Describes an upper bound that is not tight. It indicates that the algorithm’s
complexity grows slower than a certain function.
Example: If (f(n) = o(n^2)), then (f(n)) grows slower than (n^2) as (n) increases.

Little-ω Notation (ω)

Definition: Describes a lower bound that is not tight. It indicates that the algorithm’s complexity
grows faster than a certain function.
Example: If (f(n) = ω(n)), then (f(n)) grows faster than (n) as (n) increases.
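These growth rates can be observed empirically by counting basic operations. The sketch below compares a linear scan, which performs O(n) comparisons in the worst case, with binary search, which performs O(log n) comparisons on sorted input; the function and variable names are illustrative:

```python
def linear_search(arr, target):
    """Scan left to right, counting comparisons: O(n) worst case."""
    comparisons = 0
    for value in arr:
        comparisons += 1
        if value == target:
            break
    return comparisons

def binary_search(arr, target):
    """Halve the search range each step, counting comparisons: O(log n)."""
    comparisons = 0
    low, high = 0, len(arr) - 1
    while low <= high:
        comparisons += 1
        mid = (low + high) // 2
        if arr[mid] == target:
            break
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return comparisons

data = list(range(1024))
print(linear_search(data, 1023))  # Output: 1024 -- grows linearly with n
print(binary_search(data, 1023))  # roughly log2(1024) = 10 comparisons
```

Doubling the input size doubles the linear count but adds only one comparison to the binary search count, which is exactly what the notations above predict.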

Recursion

Definition

Recursion occurs when a function calls itself to solve a problem. It often breaks a problem into
smaller, more manageable sub-problems.

Example: Recursive Factorial Function

def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)

Explanation:

Base Case: When (n) is 0, return 1.


Recursive Case: Compute factorial by calling the function itself with (n - 1).

Analyzing Recursive Algorithms

Recursive algorithms are analyzed based on:

1. Time Complexity: How many times the function calls itself.


2. Space Complexity: Memory used for the call stack.

Example: Analyzing Merge Sort

Merge Sort is a divide-and-conquer algorithm with the following recurrence relation:

T(n) = 2T(n/2) + O(n)

Using the Master Theorem, the time complexity can be determined as (O(n log n)).
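A direct implementation makes the recurrence concrete: the two recursive calls correspond to the 2T(n/2) term, and the merge step to the O(n) term. A minimal sketch:

```python
def merge_sort(arr):
    # Base case: lists of length 0 or 1 are already sorted
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # T(n/2)
    right = merge_sort(arr[mid:])   # T(n/2)
    return merge(left, right)       # O(n) combine step

def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # Append any leftovers
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))  # Output: [1, 2, 5, 5, 6, 9]
```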

Summary

Algorithm Analysis: Essential for evaluating the efficiency of algorithms.


Asymptotic Notations: Provides a standardized way to describe time and space complexity.
Recursion: A technique for solving problems by breaking them into smaller sub-problems.
Analyzing Recursive Algorithms: Focuses on time and space complexities, often using
recurrence relations.
Asymptotic Notations & Recursion

Introduction

Understanding the efficiency of algorithms is crucial in computer science, and it often starts with the
analysis of their time and space complexities. This analysis helps in predicting how algorithms
perform as the input size grows. Asymptotic notations provide a formal way to describe these
complexities, while recursion is a fundamental concept used to solve problems by dividing them into
simpler sub-problems.

This lecture will cover:

Asymptotic Notations: Descriptions of algorithm complexity in terms of growth rates.


Recursion: An overview of recursive algorithms and their analysis.

Asymptotic Notations

Asymptotic notations are mathematical tools used to describe the behavior of algorithms in terms of
their efficiency. They allow us to express how the running time or space requirements of an algorithm
grow with the input size.

Big-O Notation (O)

Definition: Big-O notation describes an upper bound on the time complexity of an algorithm. It
provides a worst-case scenario of how the running time increases as the input size (n) grows.
Example: If an algorithm has a time complexity of (O(n^2)), it means the running time increases
quadratically with the size of the input.

def bubble_sort(arr):
n = len(arr)
for i in range(n):
for j in range(0, n-i-1):
if arr[j] > arr[j+1]:
arr[j], arr[j+1] = arr[j+1], arr[j]

Explanation:
The bubble_sort function has a time complexity of (O(n^2)) due to the nested loops,
where (n) is the number of elements in the array.

Big-Ω Notation (Ω)


Definition: Big-Ω notation describes a lower bound on the time complexity. It provides a best-
case scenario where the algorithm performs at least this well.
Example: An algorithm with a time complexity of (Ω(n)) means that, in the best case, the
running time grows linearly with the input size.

Big-Θ Notation (Θ)

Definition: Big-Θ notation describes a tight bound on the time complexity. It indicates that the
algorithm's running time grows at the same rate as the given function, bounded both above and below.
Example: An algorithm with a time complexity of (Θ(n log n)) grows at a rate bounded above
and below by (n log n).

Little-o Notation (o)

Definition: Little-o notation provides an upper bound that is not tight. It indicates that the growth
rate of the algorithm is strictly less than a certain function.
Example: If (f(n) = o(n^2)), (f(n)) grows slower than (n^2).

Little-ω Notation (ω)

Definition: Little-ω notation provides a lower bound that is not tight. It shows that the growth
rate of the algorithm is strictly more than a certain function.
Example: If (f(n) = ω(n)), (f(n)) grows faster than (n).

Recursion

Definition

Recursion is a technique where a function calls itself to solve smaller instances of the same
problem. This method is particularly useful for problems that can be broken down into simpler sub-
problems.

Example: Recursive Factorial Function

def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)

Explanation:

Base Case: When (n) is 0, the function returns 1.


Recursive Case: The function calls itself with (n-1) and multiplies the result by (n).

Analyzing Recursive Algorithms

Recursive algorithms can be analyzed based on their time and space complexities.

1. Time Complexity: Determined by the number of recursive calls and the work done outside the
recursive calls.
2. Space Complexity: Mainly depends on the depth of the recursion stack.
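One way to make point 1 concrete is to instrument a recursive function so that it counts its own invocations. For factorial, a call to factorial(n) triggers n + 1 calls in total, matching its O(n) time complexity; the recursion depth (and hence stack space) is also O(n). The counter name below is illustrative:

```python
call_count = 0

def counted_factorial(n):
    global call_count
    call_count += 1          # Record one invocation
    if n == 0:
        return 1
    return n * counted_factorial(n - 1)

result = counted_factorial(5)
print(result)      # Output: 120
print(call_count)  # Output: 6 -- that is, n + 1 calls for n = 5
```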

Example: Analyzing Merge Sort

Merge Sort is a classic divide-and-conquer algorithm with the following recurrence relation:

T(n) = 2T(n/2) + O(n)

Using the Master Theorem, the time complexity of Merge Sort is (O(n log n)).

Explanation:

Divide Step: Splits the array into two halves.


Conquer Step: Recursively sorts the two halves.
Combine Step: Merges the two sorted halves.

Summary

Asymptotic Notations: Provide a way to describe the efficiency of algorithms in terms of


growth rates. They include Big-O, Big-Ω, Big-Θ, Little-o, and Little-ω notations.
Recursion: A method of solving problems by breaking them down into simpler sub-problems.
Analyzing recursive algorithms involves understanding both their time and space complexities.
Unit II
Linear Structures
List ADT

Introduction

The List Abstract Data Type (ADT) represents a collection of elements arranged in a linear order.
Lists are fundamental data structures that support a variety of operations, including insertion,
deletion, and access of elements. Lists can be implemented in different ways, such as using arrays
or linked lists. This section covers the following aspects of the List ADT:

List ADT: Definition and operations


Array-based Implementations: Characteristics and examples
Linked List Implementations: Singly linked lists, circularly linked lists, and doubly linked lists
Applications of Lists: Use cases and examples

List ADT

Definition and Operations

A List ADT is a data structure that provides a way to store a collection of elements in a sequential
manner. The main operations supported by a List ADT include:

Insertion: Adding an element at a specified position.


Deletion: Removing an element from a specified position.
Access: Retrieving the element at a specified position.
Update: Modifying the value of an element at a specified position.
Traversal: Iterating through all elements in the list.

Example: List Operations

Python List Implementation

In Python, lists are a built-in data type that supports a variety of operations:

# Creating a list
my_list = [1, 2, 3, 4, 5]

# Insertion
my_list.append(6) # Adds 6 at the end
my_list.insert(2, 10) # Inserts 10 at index 2

# Deletion
my_list.remove(3) # Removes the first occurrence of 3
del my_list[1] # Removes element at index 1

# Access
element = my_list[2] # Retrieves the element at index 2

# Update
my_list[0] = 0 # Updates the element at index 0

# Traversal
for item in my_list:
print(item)

Explanation:

append : Adds an element to the end of the list.


insert : Inserts an element at a specific index.
remove : Removes the first occurrence of a value.
del : Deletes an element at a specified index.
Indexing: Accesses elements by their index.
Looping: Traverses through the list to print each element.

Array-Based Implementations

Characteristics

Array-based lists use a contiguous block of memory to store elements. This implementation
provides:

Random Access: Constant-time access to elements by index.


Fixed Size: The size of the array is fixed upon creation. Resizing requires creating a new array
and copying elements.

Example: Array-Based List Implementation

Python Example

Python’s built-in list type is implemented using dynamic arrays. Here’s an example of a basic array-
based implementation in Python:

class ArrayList:
    def __init__(self):
        self.array = []

    def append(self, item):
        self.array.append(item)

    def insert(self, index, item):
        self.array.insert(index, item)

    def remove(self, item):
        self.array.remove(item)

    def get(self, index):
        return self.array[index]

    def delete(self, index):
        del self.array[index]

# Usage
array_list = ArrayList()
array_list.append(1)
array_list.append(2)
array_list.insert(1, 3)
print(array_list.get(1)) # Output: 3
array_list.delete(1)
print(array_list.get(1)) # Output: 2

Explanation:

append : Adds an item to the end of the list.


insert : Inserts an item at a specific index.
remove : Removes the first occurrence of an item.
get : Retrieves an item at a specific index.
delete : Deletes an item at a specific index.

Linked List Implementations

Linked lists are a dynamic data structure consisting of nodes where each node points to the next
node. They provide flexibility in terms of size and insertion/deletion operations.

Singly Linked Lists

In a singly linked list, each node contains data and a reference to the next node.

Example: Singly Linked List Implementation

class Node:
def __init__(self, data):
self.data = data
self.next = None

class SinglyLinkedList:
def __init__(self):
self.head = None

    def append(self, data):
        new_node = Node(data)
        if not self.head:
            self.head = new_node
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = new_node

def display(self):
current = self.head
while current:
print(current.data, end=" -> ")
current = current.next
print("None")

# Usage
sll = SinglyLinkedList()
sll.append(1)
sll.append(2)
sll.append(3)
sll.display() # Output: 1 -> 2 -> 3 -> None

Explanation:

Node class: Represents a single node in the list.


append : Adds a new node to the end of the list.
display : Prints all nodes in the list.

Circularly Linked Lists

In a circularly linked list, the last node points back to the first node, forming a circle.

Example: Circularly Linked List Implementation

class CircularNode:
def __init__(self, data):
self.data = data
self.next = None

class CircularLinkedList:
def __init__(self):
self.head = None

    def append(self, data):
        new_node = CircularNode(data)
        if not self.head:
            self.head = new_node
            new_node.next = self.head
        else:
            current = self.head
            while current.next != self.head:
                current = current.next
            current.next = new_node
            new_node.next = self.head

def display(self):
if not self.head:
print("Empty List")
return
current = self.head
while True:
print(current.data, end=" -> ")
current = current.next
if current == self.head:
break
print("... (circular)")

# Usage
cll = CircularLinkedList()
cll.append(1)
cll.append(2)
cll.append(3)
cll.display() # Output: 1 -> 2 -> 3 -> ... (circular)

Explanation:

CircularNode class: Represents a node in a circular linked list.


append : Adds a new node to the end of the list, maintaining the circular reference.
display : Prints all nodes in the list, indicating the circular nature.

Doubly Linked Lists

In a doubly linked list, each node has references to both the next and previous nodes, allowing for
bidirectional traversal.

Example: Doubly Linked List Implementation

class DoublyNode:
def __init__(self, data):
self.data = data
self.next = None
self.prev = None

class DoublyLinkedList:
def __init__(self):
self.head = None

    def append(self, data):
        new_node = DoublyNode(data)
        if not self.head:
            self.head = new_node
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = new_node
            new_node.prev = current

def display_forward(self):
current = self.head
while current:
print(current.data, end=" <-> ")
current = current.next
print("None")

def display_backward(self):
current = self.head
while current and current.next:
current = current.next
while current:
print(current.data, end=" <-> ")
current = current.prev
print("None")

# Usage
dll = DoublyLinkedList()
dll.append(1)
dll.append(2)
dll.append(3)
dll.display_forward() # Output: 1 <-> 2 <-> 3 <-> None
dll.display_backward() # Output: 3 <-> 2 <-> 1 <-> None

Explanation:

DoublyNode class: Represents a node in a doubly linked list.


append : Adds a new node to the end of the list.
display_forward : Prints nodes from head to tail.
display_backward : Prints nodes from tail to head.

Applications of Lists

Lists are versatile and used in various applications, including:

Data Storage: Storing collections of data for processing.


Stacks and Queues: Implementing stack and queue data structures.
Dynamic Arrays: Used in languages that provide dynamic array functionalities.
Graphs: Adjacency lists to represent graphs.

Example: Stack Implementation Using List

class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop() if not self.is_empty() else None

    def is_empty(self):
        return len(self.items) == 0

    def peek(self):
        return self.items[-1] if not self.is_empty() else None

# Usage
stack = Stack()
stack.push(1)
stack.push(2)
print(stack.pop()) # Output: 2

Explanation:

push : Adds an element to the stack.


pop : Removes and returns the top element.
is_empty : Checks if the stack is empty.
peek : Returns the top element without removing it.
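A queue can be built in the same style; collections.deque is used here because removing from the front of a plain Python list is an O(n) operation. A minimal sketch (the enqueue/dequeue method names follow the usual convention):

```python
from collections import deque

class Queue:
    def __init__(self):
        # deque supports O(1) appends and pops at both ends
        self.items = deque()

    def enqueue(self, item):
        self.items.append(item)   # Add at the rear

    def dequeue(self):
        # Remove from the front; None if the queue is empty
        return self.items.popleft() if not self.is_empty() else None

    def is_empty(self):
        return len(self.items) == 0

# Usage
queue = Queue()
queue.enqueue(1)
queue.enqueue(2)
print(queue.dequeue())  # Output: 1 (FIFO order)
```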
Array-Based Implementations of Lists

Introduction

Array-based implementations of lists use a contiguous block of memory to store elements. This
approach provides efficient access to elements via indexing but requires handling dynamic resizing
when the list grows beyond its initial capacity. This section covers:

Characteristics of Array-Based Lists: Benefits and limitations


Dynamic Arrays: How they manage resizing
Example Implementation: Python implementation of an array-based list

Characteristics of Array-Based Lists

Benefits

1. Random Access: Elements can be accessed in constant time (O(1)) using their index, making
retrieval very efficient.
2. Contiguous Memory: All elements are stored in a contiguous block of memory, which can be
advantageous for cache performance.

Limitations

1. Fixed Size: Initial array size must be specified and resizing involves creating a new array and
copying elements, which can be time-consuming.
2. Memory Overhead: If the array is too large, it can waste memory. If too small, it can require
frequent resizing.

Dynamic Arrays

Resizing Strategy

Dynamic arrays grow automatically as elements are added. The typical strategy involves:

1. Initial Capacity: Start with a fixed size.


2. Resizing: When the array is full, create a new larger array (usually double the size of the
current one), and copy the elements from the old array to the new one.

Example Implementation in Python

Python's built-in list type is an example of a dynamic array. Below is a simplified version of an array-
based list implementation that demonstrates resizing:
class ArrayList:
    def __init__(self):
        self.array = [None] * 2  # Initial capacity of 2
        self.size = 0

    def append(self, item):
        if self.size == len(self.array):
            self._resize()
        self.array[self.size] = item
        self.size += 1

    def _resize(self):
        new_capacity = len(self.array) * 2
        new_array = [None] * new_capacity
        for i in range(self.size):
            new_array[i] = self.array[i]
        self.array = new_array

    def insert(self, index, item):
        if self.size == len(self.array):
            self._resize()
        for i in range(self.size, index, -1):
            self.array[i] = self.array[i - 1]
        self.array[index] = item
        self.size += 1

    def remove(self, index):
        if index < 0 or index >= self.size:
            raise IndexError("Index out of bounds")
        for i in range(index, self.size - 1):
            self.array[i] = self.array[i + 1]
        self.array[self.size - 1] = None
        self.size -= 1

    def get(self, index):
        if index < 0 or index >= self.size:
            raise IndexError("Index out of bounds")
        return self.array[index]

    def display(self):
        for i in range(self.size):
            print(self.array[i], end=" ")
        print()

# Usage
array_list = ArrayList()
array_list.append(1)
array_list.append(2)
array_list.append(3)
array_list.display() # Output: 1 2 3
array_list.insert(1, 4)
array_list.display() # Output: 1 4 2 3
array_list.remove(2)
array_list.display() # Output: 1 4 3
Explanation

__init__ : Initializes the array with a fixed size and sets the initial size to 0.
append : Adds an element to the end of the array. Resizes the array if needed.
_resize : Creates a new, larger array and copies existing elements.
insert : Inserts an element at a specific index, shifting elements as necessary.
remove : Removes an element from a specific index, shifting elements to fill the gap.
get : Retrieves an element by index.
display : Prints all elements in the list.

Summary

Array-Based Lists: Efficient for random access but require dynamic resizing for growth.
Dynamic Arrays: Automatically resize when needed to accommodate more elements.
Implementation: Python’s built-in lists provide an example of dynamic arrays, offering a
practical approach to array-based list management.
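The doubling strategy is what makes append amortized O(1): most appends are cheap, and the occasional expensive resize is spread over many operations. The sketch below observes CPython's over-allocation indirectly through sys.getsizeof; the exact numbers are implementation-dependent, so treat this as an illustration rather than a guarantee:

```python
import sys

lst = []
previous_size = sys.getsizeof(lst)
resizes = 0
for i in range(1000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != previous_size:
        resizes += 1  # The backing array was reallocated
        previous_size = size

# Far fewer reallocations than appends, so the average cost per append stays low
print(resizes, "resizes for", len(lst), "appends")
```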
Linked List Implementations

Introduction

Linked lists are a type of data structure where elements, called nodes, are linked together in a linear
sequence. Unlike arrays, linked lists do not require contiguous memory allocation, making them
suitable for scenarios where the size of the data structure is frequently changing. This section
explores:

Singly Linked Lists: Basic structure and operations


Circular Linked Lists: Structure and applications
Doubly Linked Lists: Advantages and operations

Singly Linked Lists

Structure

A singly linked list consists of nodes where each node contains two fields:

Data: The value or data to be stored in the node.


Next: A reference (or pointer) to the next node in the list.

The list starts with a head node and ends with a node whose next reference is None , indicating the
end of the list.

Example Implementation in Python

class Node:
def __init__(self, data):
self.data = data
self.next = None

class SinglyLinkedList:
def __init__(self):
self.head = None

    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        current = self.head
        while current.next:
            current = current.next
        current.next = new_node

    def prepend(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def delete_with_value(self, data):
        if self.head is None:
            return
        if self.head.data == data:
            self.head = self.head.next
            return
        current = self.head
        while current.next and current.next.data != data:
            current = current.next
        if current.next:
            current.next = current.next.next

def display(self):
current = self.head
while current:
print(current.data, end=" -> ")
current = current.next
print("None")

# Usage
linked_list = SinglyLinkedList()
linked_list.append(1)
linked_list.append(2)
linked_list.append(3)
linked_list.display() # Output: 1 -> 2 -> 3 -> None
linked_list.prepend(0)
linked_list.display() # Output: 0 -> 1 -> 2 -> 3 -> None
linked_list.delete_with_value(2)
linked_list.display() # Output: 0 -> 1 -> 3 -> None

Explanation

Node class: Represents a node in the list with data and a reference to the next node.
SinglyLinkedList class: Manages the linked list with methods to append, prepend, delete,
and display nodes.

Circular Linked Lists

Structure

In a circular linked list, the last node points back to the first node, forming a circle. This can be either
a singly circular linked list or a doubly circular linked list.

Example Implementation in Python

class CircularNode:
def __init__(self, data):
self.data = data
self.next = None

class CircularLinkedList:
def __init__(self):
self.head = None

    def append(self, data):
        new_node = CircularNode(data)
        if self.head is None:
            self.head = new_node
            new_node.next = self.head
            return
        current = self.head
        while current.next != self.head:
            current = current.next
        current.next = new_node
        new_node.next = self.head

def display(self):
if self.head is None:
print("List is empty")
return
current = self.head
while True:
print(current.data, end=" -> ")
current = current.next
if current == self.head:
break
print("(circular)")

# Usage
circular_list = CircularLinkedList()
circular_list.append(1)
circular_list.append(2)
circular_list.append(3)
circular_list.display() # Output: 1 -> 2 -> 3 -> (circular)

Explanation

CircularNode class: Represents a node in the circular list.


CircularLinkedList class: Manages the circular linked list with an append method and a
display method.

Doubly Linked Lists

Structure

A doubly linked list allows traversal in both directions by maintaining two references in each node:

Next: A reference to the next node.


Prev: A reference to the previous node.
Example Implementation in Python

class DoublyNode:
def __init__(self, data):
self.data = data
self.next = None
self.prev = None

class DoublyLinkedList:
def __init__(self):
self.head = None

    def append(self, data):
        new_node = DoublyNode(data)
        if self.head is None:
            self.head = new_node
            return
        last = self.head
        while last.next:
            last = last.next
        last.next = new_node
        new_node.prev = last

def display_forward(self):
current = self.head
while current:
print(current.data, end=" <-> ")
current = current.next
print("None")

def display_backward(self):
current = self.head
if not current:
print("List is empty")
return
while current.next:
current = current.next
while current:
print(current.data, end=" <-> ")
current = current.prev
print("None")

# Usage
doubly_list = DoublyLinkedList()
doubly_list.append(1)
doubly_list.append(2)
doubly_list.append(3)
doubly_list.display_forward() # Output: 1 <-> 2 <-> 3 <-> None
doubly_list.display_backward() # Output: 3 <-> 2 <-> 1 <-> None

Explanation

DoublyNode class: Represents a node with data, next, and previous references.
DoublyLinkedList class: Manages the doubly linked list with methods to append and display
nodes in both forward and backward directions.

Summary

Singly Linked Lists: Efficient for basic operations but only support unidirectional traversal.
Circular Linked Lists: Nodes form a circle, allowing continuous traversal.
Doubly Linked Lists: Supports bidirectional traversal, useful for more complex operations and
efficient insertions/deletions.
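The implementation above omits deletion. Deleting from a doubly linked list must fix links in both directions; the sketch below adds a delete_with_value method to a self-contained version of the classes above (the method name mirrors the singly linked list example and is otherwise an assumption):

```python
class DoublyNode:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = DoublyNode(data)
        if self.head is None:
            self.head = new_node
            return
        last = self.head
        while last.next:
            last = last.next
        last.next = new_node
        new_node.prev = last

    def delete_with_value(self, data):
        """Unlink the first node holding `data`, fixing both directions."""
        current = self.head
        while current and current.data != data:
            current = current.next
        if current is None:
            return                        # Value not found
        if current.prev:
            current.prev.next = current.next
        else:
            self.head = current.next      # Deleting the head node
        if current.next:
            current.next.prev = current.prev

    def to_list(self):
        result, current = [], self.head
        while current:
            result.append(current.data)
            current = current.next
        return result

dll = DoublyLinkedList()
for value in (1, 2, 3):
    dll.append(value)
dll.delete_with_value(2)
print(dll.to_list())  # Output: [1, 3]
```

Note that no traversal back to the predecessor is needed: the prev reference makes unlinking a known node O(1), one of the main advantages over singly linked lists.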
Singly Linked Lists

Introduction

Singly linked lists are a fundamental data structure in computer science. They consist of a sequence
of nodes, where each node contains data and a reference (or pointer) to the next node in the
sequence. This structure allows for efficient insertion and deletion of elements as compared to
arrays, especially when the size of the list changes frequently.

Structure

A singly linked list is composed of nodes, where each node contains:

Data: The value or data that the node holds.


Next: A reference to the next node in the list.

The list starts with a head node and ends with a node whose next reference is None , indicating the
end of the list.

Python Implementation

Node Class

The Node class represents a single node in the linked list.

class Node:
def __init__(self, data):
self.data = data # The data stored in the node
self.next = None # Reference to the next node, initially None

SinglyLinkedList Class

The SinglyLinkedList class manages the linked list and provides methods for various operations
such as appending, prepending, deleting, and displaying nodes.

class SinglyLinkedList:
def __init__(self):
self.head = None # Initialize the head of the list as None

    def append(self, data):
        """Append a new node with the given data to the end of the list."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node  # If the list is empty, set head to the new node
            return
        current = self.head
        while current.next:  # Traverse to the last node
            current = current.next
        current.next = new_node  # Link the last node to the new node

    def prepend(self, data):
        """Prepend a new node with the given data to the beginning of the list."""
        new_node = Node(data)
        new_node.next = self.head  # Set the new node's next to the current head
        self.head = new_node  # Update head to be the new node

    def delete_with_value(self, data):
        """Delete the first occurrence of a node with the given data."""
        if self.head is None:
            return  # If the list is empty, do nothing
        if self.head.data == data:
            self.head = self.head.next  # If the head node needs to be removed
            return
        current = self.head
        while current.next and current.next.data != data:
            current = current.next  # Traverse the list to find the node to delete
        if current.next:
            current.next = current.next.next  # Bypass the node to be deleted

def display(self):
"""Display the elements of the list."""
current = self.head
while current:
print(current.data, end=" -> ")
current = current.next
print("None")

Example Usage

# Create a new singly linked list
linked_list = SinglyLinkedList()

# Append elements to the list
linked_list.append(1)
linked_list.append(2)
linked_list.append(3)

# Display the list
linked_list.display()  # Output: 1 -> 2 -> 3 -> None

# Prepend an element to the list
linked_list.prepend(0)
linked_list.display()  # Output: 0 -> 1 -> 2 -> 3 -> None

# Delete an element from the list
linked_list.delete_with_value(2)
linked_list.display()  # Output: 0 -> 1 -> 3 -> None

Explanation

Node Class: Defines the basic structure of a node with data and next attributes.
SinglyLinkedList Class: Manages the linked list operations:
append : Adds a node to the end of the list.
prepend : Adds a node to the beginning of the list.
delete_with_value : Removes the first occurrence of a node with the specified data.
display : Prints the elements of the list in a readable format.

Summary

Singly Linked Lists: Efficient for insertion and deletion at the beginning or end of the list.
Suitable for scenarios where the size of the list varies frequently.
Operations: Basic operations include appending, prepending, deleting, and displaying nodes.
Advantages: Flexibility in size, dynamic memory allocation, efficient for certain operations
compared to arrays.
Circularly Linked Lists

Introduction

Circularly linked lists are a variation of linked lists where the last node points back to the first node,
creating a circular structure. This design allows for efficient traversal of the list as it can be navigated
starting from any node and will eventually loop back to the starting node. Circularly linked lists can
be singly circular or doubly circular, depending on whether they have one or two references to
adjacent nodes.

Types

1. Singly Circularly Linked List: Each node points to the next node, and the last node points
back to the first node.
2. Doubly Circularly Linked List: Each node has references to both the next and the previous
nodes, with the last node pointing back to the first node and vice versa.

Python Implementation

Node Class

The Node class represents a single node in the circularly linked list.

class Node:
    def __init__(self, data):
        self.data = data  # The data stored in the node
        self.next = None  # Reference to the next node, initially None

Singly Circularly Linked List Class

The SinglyCircularLinkedList class manages the list operations for a singly circularly linked list.

class SinglyCircularLinkedList:
    def __init__(self):
        self.head = None  # Initialize the head of the list as None

    def append(self, data):
        """Append a new node with the given data to the end of the list."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            new_node.next = self.head  # Point to itself, creating a circular link
        else:
            current = self.head
            while current.next != self.head:
                current = current.next  # Traverse to the node before head
            current.next = new_node  # Link the last node to the new node
            new_node.next = self.head  # Point the new node to the head

    def prepend(self, data):
        """Prepend a new node with the given data to the beginning of the list."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            new_node.next = self.head  # Point to itself, creating a circular link
        else:
            new_node.next = self.head
            current = self.head
            while current.next != self.head:
                current = current.next
            current.next = new_node
            self.head = new_node

    def delete_with_value(self, data):
        """Delete the first occurrence of a node with the given data."""
        if self.head is None:
            return  # If the list is empty, do nothing
        if self.head.data == data:
            if self.head.next == self.head:
                self.head = None  # If the list has only one node
            else:
                current = self.head
                while current.next != self.head:
                    current = current.next
                current.next = self.head.next  # Bypass the head node
                self.head = self.head.next  # Update head to the next node
            return
        current = self.head
        while current.next != self.head and current.next.data != data:
            current = current.next
        if current.next.data == data:
            current.next = current.next.next  # Bypass the node to be deleted

    def display(self):
        """Display the elements of the list."""
        if self.head is None:
            print("List is empty")
            return
        current = self.head
        while True:
            print(current.data, end=" -> ")
            current = current.next
            if current == self.head:
                break
        print("(head)")

Example Usage
# Create a new singly circularly linked list
circular_list = SinglyCircularLinkedList()

# Append elements to the list
circular_list.append(1)
circular_list.append(2)
circular_list.append(3)

# Display the list
circular_list.display()  # Output: 1 -> 2 -> 3 -> (head)

# Prepend an element to the list
circular_list.prepend(0)
circular_list.display()  # Output: 0 -> 1 -> 2 -> 3 -> (head)

# Delete an element from the list
circular_list.delete_with_value(2)
circular_list.display()  # Output: 0 -> 1 -> 3 -> (head)

Explanation

Node Class: Defines the basic structure of a node with data and next attributes.
SinglyCircularLinkedList Class: Manages the circular linked list operations:
append : Adds a node to the end of the list and maintains the circular link.
prepend : Adds a node to the beginning of the list and updates references.
delete_with_value : Removes the first occurrence of a node with the specified data and
adjusts references accordingly.
display : Prints the elements of the list, indicating the circular nature with (head) .

Summary

Circularly Linked Lists: Efficient for applications requiring circular traversal. Can be singly or
doubly linked.
Operations: Basic operations include appending, prepending, deleting, and displaying nodes.
Advantages: Useful in scenarios where circular traversal is needed, such as in round-robin
scheduling.
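
The round-robin use mentioned above can be sketched directly: because the last node points back to the first, simply following next references cycles through the participants forever. The node layout and round_robin function below are illustrative, standalone code rather than part of the class defined earlier; it assumes a non-empty list of names.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def round_robin(names, turns):
    """Cycle through a circular list of names for a fixed number of turns."""
    # Build a circular singly linked list from the names (assumed non-empty).
    head = Node(names[0])
    current = head
    for name in names[1:]:
        current.next = Node(name)
        current = current.next
    current.next = head  # close the circle

    order = []
    current = head
    for _ in range(turns):
        order.append(current.data)
        current = current.next  # wraps around automatically
    return order

print(round_robin(["A", "B", "C"], 7))  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```

No end-of-list check is needed while scheduling: the circular link makes the traversal wrap by itself.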
Doubly Linked Lists

Introduction

A doubly linked list is a variation of a linked list where each node contains references to both its
previous and next nodes, allowing traversal in both directions. This bi-directional linking provides
greater flexibility compared to singly linked lists, where traversal is limited to one direction. Doubly
linked lists can be used for various applications, including navigation systems, memory
management, and complex data manipulations.

Structure

Each node in a doubly linked list contains:

Data: The value or information stored in the node.
Next Reference: A pointer to the next node in the sequence.
Previous Reference: A pointer to the previous node in the sequence.

Python Implementation

Node Class

The Node class represents a single node in the doubly linked list.

class Node:
    def __init__(self, data):
        self.data = data  # The data stored in the node
        self.next = None  # Reference to the next node, initially None
        self.prev = None  # Reference to the previous node, initially None

Doubly Linked List Class

The DoublyLinkedList class manages the list operations for a doubly linked list.

class DoublyLinkedList:
    def __init__(self):
        self.head = None  # Initialize the head of the list as None

    def append(self, data):
        """Append a new node with the given data to the end of the list."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = new_node
            new_node.prev = current

    def prepend(self, data):
        """Prepend a new node with the given data to the beginning of the list."""
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
        else:
            new_node.next = self.head
            self.head.prev = new_node
            self.head = new_node

    def delete_with_value(self, data):
        """Delete the first occurrence of a node with the given data."""
        if self.head is None:
            return  # If the list is empty, do nothing
        if self.head.data == data:
            if self.head.next is None:
                self.head = None  # If the list has only one node
            else:
                self.head = self.head.next
                self.head.prev = None
            return
        current = self.head
        while current and current.data != data:
            current = current.next
        if current:
            if current.next:
                current.next.prev = current.prev
            if current.prev:
                current.prev.next = current.next

    def display(self):
        """Display the elements of the list from head to tail."""
        if self.head is None:
            print("List is empty")
            return
        current = self.head
        while current:
            print(current.data, end=" <-> ")
            current = current.next
        print("None")

    def display_reverse(self):
        """Display the elements of the list from tail to head."""
        if self.head is None:
            print("List is empty")
            return
        current = self.head
        while current.next:
            current = current.next
        while current:
            print(current.data, end=" <-> ")
            current = current.prev
        print("None")

Example Usage

# Create a new doubly linked list
doubly_list = DoublyLinkedList()

# Append elements to the list
doubly_list.append(1)
doubly_list.append(2)
doubly_list.append(3)

# Display the list
doubly_list.display()  # Output: 1 <-> 2 <-> 3 <-> None

# Prepend an element to the list
doubly_list.prepend(0)
doubly_list.display()  # Output: 0 <-> 1 <-> 2 <-> 3 <-> None

# Delete an element from the list
doubly_list.delete_with_value(2)
doubly_list.display()  # Output: 0 <-> 1 <-> 3 <-> None

# Display the list in reverse order
doubly_list.display_reverse()  # Output: 3 <-> 1 <-> 0 <-> None

Explanation

Node Class: Defines the basic structure of a node with data , next , and prev attributes.
DoublyLinkedList Class: Manages the doubly linked list operations:
append : Adds a node to the end of the list and updates references.
prepend : Adds a node to the beginning of the list and updates references.
delete_with_value : Removes the first occurrence of a node with the specified data and
adjusts references.
display : Prints the elements of the list from head to tail.
display_reverse : Prints the elements of the list from tail to head.

Summary

Doubly Linked Lists: Provide flexibility with bi-directional traversal. Suitable for complex data
manipulations.
Operations: Include appending, prepending, deleting, and displaying nodes in both directions.
Advantages: Allows efficient traversal and manipulation in both directions, unlike singly linked
lists.
Applications of Linked Lists

Introduction

Linked lists are versatile data structures that offer various advantages in certain scenarios compared
to arrays and other data structures. Their ability to efficiently handle dynamic data and provide
flexibility in insertion and deletion operations makes them suitable for numerous applications. This
section explores several key applications of linked lists, illustrating their utility across different
domains.

Applications

1. Dynamic Memory Allocation

Linked lists are fundamental to dynamic memory allocation, where memory is allocated and
deallocated at runtime. The operating system uses linked lists to manage free memory blocks and
allocate space for processes and applications.

Example:

Memory Management: In systems like malloc/free in C or new/delete in C++, linked lists
manage free memory blocks, allowing the system to efficiently allocate and deallocate memory
as needed.

2. Implementing Data Structures

Linked lists are used to implement various abstract data types (ADTs) and data structures, providing
the flexibility needed for dynamic operations.

Examples:

Stacks: A stack can be efficiently implemented using a singly linked list, where push and pop
operations occur at the head of the list.
Queues: A queue can be implemented using a singly linked list with operations at both the head
and tail.
Deques: Double-ended queues (deques) can be implemented using doubly linked lists to allow
operations at both ends efficiently.

3. Undo Mechanism in Applications

Linked lists are used to implement undo mechanisms in applications, where each action is stored in
a linked list of actions. Users can traverse backward to undo previous actions.
Example:

Text Editors: Many text editors use linked lists to maintain a history of changes, allowing users
to undo or redo text modifications.

4. Graph Representations

Linked lists are used to represent graphs, particularly in the adjacency list representation. Each
vertex maintains a list of its adjacent vertices, making graph traversal and manipulation efficient.

Example:

Adjacency List: In a graph, each vertex can use a linked list to store its adjacent vertices,
allowing efficient edge insertions and deletions.

5. Polynomial Arithmetic

Linked lists are used to represent and perform arithmetic operations on polynomials, where each
node represents a term with its coefficient and exponent.

Example:

Polynomial Operations: Linked lists facilitate polynomial addition, subtraction, and
multiplication by allowing efficient traversal and manipulation of polynomial terms.
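
To make the term-per-node idea concrete, here is a minimal, illustrative sketch. The TermNode layout and function names are assumptions for this example, not a standard API; each list stores (coefficient, exponent) terms sorted by decreasing exponent, and addition merges the two lists.

```python
class TermNode:
    """One polynomial term: coefficient * x ** exponent."""
    def __init__(self, coef, exp):
        self.coef = coef
        self.exp = exp
        self.next = None

def from_terms(terms):
    """Build a linked list from (coef, exp) pairs listed by decreasing exp."""
    head = None
    for coef, exp in reversed(terms):
        node = TermNode(coef, exp)
        node.next = head
        head = node
    return head

def add_polynomials(p, q):
    """Merge two sorted term lists, combining terms with equal exponents."""
    result = []
    while p and q:
        if p.exp == q.exp:
            total = p.coef + q.coef
            if total != 0:          # drop terms that cancel out
                result.append((total, p.exp))
            p, q = p.next, q.next
        elif p.exp > q.exp:
            result.append((p.coef, p.exp))
            p = p.next
        else:
            result.append((q.coef, q.exp))
            q = q.next
    while p:                        # copy any leftover terms
        result.append((p.coef, p.exp))
        p = p.next
    while q:
        result.append((q.coef, q.exp))
        q = q.next
    return result

# (3x^2 + 2x + 1) + (5x^2 + 4)  ->  8x^2 + 2x + 5
p = from_terms([(3, 2), (2, 1), (1, 0)])
q = from_terms([(5, 2), (4, 0)])
print(add_polynomials(p, q))  # [(8, 2), (2, 1), (5, 0)]
```

Because both lists are sorted by exponent, one simultaneous traversal suffices, so addition runs in time linear in the total number of terms.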

6. Implementation of Hash Tables

Linked lists are used in hash tables to handle collisions through chaining. Each bucket in a hash
table contains a linked list of entries that hash to the same index, allowing efficient resolution of
collisions.

Example:

Chaining: In hash tables, each bucket is implemented as a linked list to manage collisions and
maintain a list of entries with the same hash value.

7. Music Playlists and Media Players

Linked lists are used in applications like music playlists and media players to manage and traverse a
list of media items efficiently.

Example:
Music Playlists: A media player may use a linked list to represent a playlist, allowing efficient
insertion and deletion of songs and easy traversal of the playlist.

Summary

Dynamic Memory Allocation: Linked lists manage free memory blocks and handle runtime
memory allocation.
Implementing Data Structures: Linked lists are used to implement stacks, queues, deques,
and other ADTs.
Undo Mechanism: Linked lists store actions for undo/redo functionality in applications.
Graph Representations: Linked lists are used in adjacency lists for graph representation.
Polynomial Arithmetic: Linked lists represent and manipulate polynomials.
Hash Tables: Linked lists handle collisions through chaining in hash tables.
Music Playlists: Linked lists manage and traverse playlists and media items efficiently.

Linked lists provide a flexible and efficient way to handle dynamic data and various application
requirements, making them a crucial data structure in computer science.
Stack ADT

Introduction

The Stack Abstract Data Type (ADT) is a fundamental data structure used in computer science to
model a collection of elements with Last In, First Out (LIFO) access. This means that the most
recently added element is the first one to be removed. Stacks are widely used in various
applications, including algorithm implementation, expression evaluation, and memory management.

Definition

A stack is defined by the following operations:

Push: Add an element to the top of the stack.
Pop: Remove and return the top element of the stack.
Peek (or Top): Return the top element without removing it.
Is Empty: Check whether the stack is empty.
Size: Return the number of elements in the stack.

Operations

1. Push

The push operation adds an element to the top of the stack. This operation increases the size of the
stack by one.

Example:

stack.push(element)

2. Pop

The pop operation removes the element from the top of the stack and returns it. This operation
decreases the size of the stack by one. If the stack is empty, an error or exception is usually raised.

Example:

element = stack.pop()

3. Peek (or Top)


The peek operation returns the top element of the stack without removing it. This operation does not
modify the stack.

Example:

element = stack.peek()

4. Is Empty

The is_empty operation checks whether the stack has any elements. It returns True if the stack is
empty and False otherwise.

Example:

empty = stack.is_empty()

5. Size

The size operation returns the number of elements currently in the stack.

Example:

size = stack.size()

Implementation

Stacks can be implemented using arrays or linked lists. Below are implementations using both
approaches.

Array-Based Implementation

Python Code:

class Stack:
    def __init__(self):
        self.items = []

    def is_empty(self):
        return len(self.items) == 0

    def push(self, item):
        self.items.append(item)

    def pop(self):
        if not self.is_empty():
            return self.items.pop()
        else:
            raise IndexError("pop from empty stack")

    def peek(self):
        if not self.is_empty():
            return self.items[-1]
        else:
            raise IndexError("peek from empty stack")

    def size(self):
        return len(self.items)

Explanation:

__init__ method: Initializes an empty list to represent the stack.
push method: Adds an item to the end of the list (top of the stack).
pop method: Removes and returns the last item from the list (top of the stack).
peek method: Returns the last item from the list without removing it.
size method: Returns the number of items in the stack.

Linked List-Based Implementation

Python Code:

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class Stack:
    def __init__(self):
        self.top = None
        self._size = 0

    def is_empty(self):
        return self.top is None

    def push(self, item):
        new_node = Node(item)
        new_node.next = self.top
        self.top = new_node
        self._size += 1

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        value = self.top.value
        self.top = self.top.next
        self._size -= 1
        return value

    def peek(self):
        if self.is_empty():
            raise IndexError("peek from empty stack")
        return self.top.value

    def size(self):
        return self._size

Explanation:

Node class: Represents a single element in the linked list.
Stack class: Manages the stack using linked nodes.
push method: Adds a new node with the given item at the top of the stack.
pop method: Removes the top node and returns its value.
peek method: Returns the value of the top node without removing it.
size method: Returns the number of nodes in the stack.

Applications

Stacks are used in various applications, including:

Function Call Management: Stacks are used to manage function calls and local variables in
programming languages.
Expression Evaluation: Stacks are used in algorithms to evaluate expressions, such as infix,
postfix, and prefix notations.
Backtracking Algorithms: Stacks are used to keep track of choices in backtracking algorithms,
such as solving puzzles or navigating mazes.
Undo Mechanisms: Stacks are used in applications like text editors to implement undo
functionality.
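
As a small concrete instance of expression processing with a stack, the following sketch checks whether brackets are balanced. For brevity it uses a plain Python list as the stack (append = push, pop = pop); the array-based Stack class above would serve equally well.

```python
def is_balanced(expression):
    """Return True if (), [], {} are properly nested in the expression."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []  # a plain list used as a stack
    for ch in expression:
        if ch in "([{":
            stack.append(ch)                 # push every opener
        elif ch in pairs:
            # a closer must match the most recent unmatched opener
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover openers also mean imbalance

print(is_balanced("(a + b) * [c - {d}]"))  # True
print(is_balanced("(a + b]"))              # False
```

The LIFO discipline is exactly what the problem needs: the most recently opened bracket must be the first one closed.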

Summary

Stack ADT: A data structure with Last In, First Out (LIFO) access.
Operations: Push, pop, peek, is_empty, and size.
Implementations: Array-based and linked list-based.
Applications: Function call management, expression evaluation, backtracking algorithms, and
undo mechanisms.

Stacks provide a simple yet powerful way to manage and manipulate data in a controlled manner,
making them an essential tool in various computational problems.
Queue ADT and Double-Ended Queues

Queue ADT

Introduction

The Queue Abstract Data Type (ADT) is a fundamental data structure used in computer science to
model a collection of elements with First In, First Out (FIFO) access. This means that the first
element added to the queue will be the first one to be removed. Queues are essential in various
applications such as task scheduling, buffering, and data streaming.

Definition

A queue is defined by the following operations:

Enqueue: Add an element to the end of the queue.
Dequeue: Remove and return the front element of the queue.
Front (or Peek): Return the front element without removing it.
Is Empty: Check whether the queue is empty.
Size: Return the number of elements in the queue.

Operations

1. Enqueue

The enqueue operation adds an element to the end of the queue, increasing its size by one.

Example:

queue.enqueue(element)

2. Dequeue

The dequeue operation removes the element from the front of the queue and returns it. This
operation decreases the size of the queue by one. If the queue is empty, an error or exception is
usually raised.

Example:

element = queue.dequeue()
3. Front (or Peek)

The front operation returns the front element of the queue without removing it. This operation does
not modify the queue.

Example:

element = queue.front()

4. Is Empty

The is_empty operation checks whether the queue has any elements. It returns True if the queue
is empty and False otherwise.

Example:

empty = queue.is_empty()

5. Size

The size operation returns the number of elements currently in the queue.

Example:

size = queue.size()

Implementation

Queues can be implemented using arrays or linked lists. Below are implementations using both
approaches.

Array-Based Implementation

Python Code:

class Queue:
    def __init__(self):
        self.items = []

    def is_empty(self):
        return len(self.items) == 0

    def enqueue(self, item):
        self.items.append(item)

    def dequeue(self):
        if not self.is_empty():
            # Note: pop(0) shifts every remaining item, so dequeue is O(n) here;
            # collections.deque offers an O(1) popleft for production use.
            return self.items.pop(0)
        else:
            raise IndexError("dequeue from empty queue")

    def front(self):
        if not self.is_empty():
            return self.items[0]
        else:
            raise IndexError("front from empty queue")

    def size(self):
        return len(self.items)

Explanation:

__init__ method: Initializes an empty list to represent the queue.
enqueue method: Adds an item to the end of the list (rear of the queue).
dequeue method: Removes and returns the first item from the list (front of the queue).
front method: Returns the first item from the list without removing it.
size method: Returns the number of items in the queue.

Linked List-Based Implementation

Python Code:

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue:
    def __init__(self):
        self._front = None  # Underscore names so the attributes do not
        self._rear = None   # shadow the front() method below
        self._size = 0

    def is_empty(self):
        return self._front is None

    def enqueue(self, item):
        new_node = Node(item)
        if self.is_empty():
            self._front = new_node
        else:
            self._rear.next = new_node
        self._rear = new_node
        self._size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        value = self._front.value
        self._front = self._front.next
        if self._front is None:
            self._rear = None
        self._size -= 1
        return value

    def front(self):
        if self.is_empty():
            raise IndexError("front from empty queue")
        return self._front.value

    def size(self):
        return self._size

Explanation:

Node class: Represents a single element in the linked list.
Queue class: Manages the queue using linked nodes.
enqueue method: Adds a new node with the given item to the end of the queue.
dequeue method: Removes the front node and returns its value.
front method: Returns the value of the front node without removing it.
size method: Returns the number of nodes in the queue.

Double-Ended Queues (Deque)

Introduction

A Double-Ended Queue (Deque) is a data structure that allows insertion and removal of elements
from both ends—front and rear. This makes deques more flexible than standard queues, as they
support operations from both ends.

Definition

A deque supports the following operations:

Add to Front: Add an element to the front of the deque.
Add to Rear: Add an element to the rear of the deque.
Remove from Front: Remove and return the front element of the deque.
Remove from Rear: Remove and return the rear element of the deque.
Front (or Peek Front): Return the front element without removing it.
Rear (or Peek Rear): Return the rear element without removing it.
Is Empty: Check whether the deque is empty.
Size: Return the number of elements in the deque.

Operations

1. Add to Front

The add_to_front operation adds an element to the front of the deque.

Example:

deque.add_to_front(element)

2. Add to Rear

The add_to_rear operation adds an element to the rear of the deque.

Example:

deque.add_to_rear(element)

3. Remove from Front

The remove_from_front operation removes the element from the front of the deque and returns it.

Example:

element = deque.remove_from_front()

4. Remove from Rear

The remove_from_rear operation removes the element from the rear of the deque and returns it.

Example:

element = deque.remove_from_rear()

5. Front (or Peek Front)

The front operation returns the front element of the deque without removing it.
Example:

element = deque.front()

6. Rear (or Peek Rear)

The rear operation returns the rear element of the deque without removing it.

Example:

element = deque.rear()

7. Is Empty

The is_empty operation checks whether the deque is empty.

Example:

empty = deque.is_empty()

8. Size

The size operation returns the number of elements currently in the deque.

Example:

size = deque.size()

Implementation

Deques can be implemented using arrays or linked lists. Below are implementations using both
approaches.

Array-Based Implementation

Python Code:

from collections import deque

class Deque:
    def __init__(self):
        self.items = deque()

    def add_to_front(self, item):
        self.items.appendleft(item)

    def add_to_rear(self, item):
        self.items.append(item)

    def remove_from_front(self):
        if not self.is_empty():
            return self.items.popleft()
        else:
            raise IndexError("remove from empty deque")

    def remove_from_rear(self):
        if not self.is_empty():
            return self.items.pop()
        else:
            raise IndexError("remove from empty deque")

    def front(self):
        if not self.is_empty():
            return self.items[0]
        else:
            raise IndexError("front from empty deque")

    def rear(self):
        if not self.is_empty():
            return self.items[-1]
        else:
            raise IndexError("rear from empty deque")

    def is_empty(self):
        return len(self.items) == 0

    def size(self):
        return len(self.items)

Explanation:

add_to_front method: Adds an item to the front of the deque.
add_to_rear method: Adds an item to the rear of the deque.
remove_from_front method: Removes and returns the front item.
remove_from_rear method: Removes and returns the rear item.
front method: Returns the front item without removing it.
rear method: Returns the rear item without removing it.
is_empty method: Checks if the deque is empty.
size method: Returns the number of items in the deque.

Linked List-Based Implementation

Python Code:
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.prev = None

class Deque:
    def __init__(self):
        self._front = None  # Underscore names so the attributes do not
        self._rear = None   # shadow the front() and rear() methods below
        self._size = 0

    def add_to_front(self, item):
        new_node = Node(item)
        if self.is_empty():
            self._front = self._rear = new_node
        else:
            new_node.next = self._front
            self._front.prev = new_node
            self._front = new_node
        self._size += 1

    def add_to_rear(self, item):
        new_node = Node(item)
        if self.is_empty():
            self._front = self._rear = new_node
        else:
            new_node.prev = self._rear
            self._rear.next = new_node
            self._rear = new_node
        self._size += 1

    def remove_from_front(self):
        if self.is_empty():
            raise IndexError("remove from empty deque")
        value = self._front.value
        self._front = self._front.next
        if self._front is None:
            self._rear = None
        else:
            self._front.prev = None
        self._size -= 1
        return value

    def remove_from_rear(self):
        if self.is_empty():
            raise IndexError("remove from empty deque")
        value = self._rear.value
        self._rear = self._rear.prev
        if self._rear is None:
            self._front = None
        else:
            self._rear.next = None
        self._size -= 1
        return value

    def front(self):
        if self.is_empty():
            raise IndexError("front from empty deque")
        return self._front.value

    def rear(self):
        if self.is_empty():
            raise IndexError("rear from empty deque")
        return self._rear.value

    def is_empty(self):
        return self._front is None

    def size(self):
        return self._size

Explanation:

Node class: Represents an element in the linked list with references to both the next and
previous nodes.
Deque class: Manages the deque using doubly linked nodes.
add_to_front method: Adds a new node with the given item to the front of the deque.
add_to_rear method: Adds a new node with the given item to the rear of the deque.
remove_from_front method: Removes the front node and returns its value.
remove_from_rear method: Removes the rear node and returns its value.
front method: Returns the value of the front node without removing it.
rear method: Returns the value of the rear node without removing it.
is_empty method: Checks if the deque is empty.
size method: Returns the number of nodes in the deque.

Summary

Queue ADT: A data structure with First In, First Out (FIFO) access. Operations include
enqueue, dequeue, front, is_empty, and size.
Double-Ended Queue (Deque): A more flexible data structure that allows insertion and removal
from both ends. Operations include add_to_front, add_to_rear, remove_from_front,
remove_from_rear, front, rear, is_empty, and size.
Implementations: Both queues and deques can be implemented using arrays or linked lists,
each with specific advantages and trade-offs.

Understanding queues and deques, along with their implementations, provides essential tools for
managing and manipulating data in a wide range of applications.
Unit-III
Sorting & Searching
Introduction to Sorting Algorithms

Overview

Sorting is a fundamental operation in computer science and programming, and it refers to the
process of arranging data in a particular order, typically in ascending or descending order. Sorting
algorithms play a crucial role in improving the efficiency of searching, data retrieval, and various
other operations. They are used in many applications, including database management, file systems,
and networking.

Sorting algorithms differ in terms of efficiency, complexity, stability, and whether they require
additional space. The choice of the sorting algorithm depends on the nature of the data and the
constraints of the system where it is being implemented.

Importance of Sorting

Sorting is important for several reasons:

Improved Data Access: Sorted data allows for faster searching operations, especially in large
datasets. Binary search, for example, requires sorted data to function correctly.
Efficient Organization: Many algorithms that operate on data sets (like searching algorithms)
assume that the data is already sorted, which makes the operations more efficient.
Data Presentation: Sorted data is easier to visualize and analyze, especially when presented
to users.
Optimization: In some cases, sorted data can lead to optimizations in other operations, such as
merging or filtering.
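
The binary search mentioned above illustrates why sorted order pays off: each comparison halves the remaining search range, giving O(log n) lookups instead of O(n). A minimal sketch:

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # discard the lower half
        else:
            high = mid - 1  # discard the upper half
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
print(binary_search([1, 3, 5, 7, 9], 4))  # -1
```

On unsorted input the halving argument breaks down, which is why the data must be sorted first.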

Types of Sorting Algorithms

Sorting algorithms can generally be divided into two categories:

1. Comparison-based Sorting Algorithms: These algorithms determine the order of elements by
comparing them. Examples include:
Bubble Sort
Selection Sort
Insertion Sort
Merge Sort
Quick Sort
Heap Sort
2. Non-Comparison-based Sorting Algorithms: These algorithms sort elements without
comparing them, typically by using some form of counting or distribution. Examples include:
Counting Sort
Radix Sort
Bucket Sort
Characteristics of Sorting Algorithms

Sorting algorithms can be characterized based on several factors:

Time Complexity: This measures the number of operations the algorithm performs relative to
the input size. The worst-case, average-case, and best-case time complexities are important in
determining the efficiency of the algorithm.
Space Complexity: This measures the amount of additional memory space required by the
algorithm aside from the input data.
Stability: A sorting algorithm is stable if it preserves the relative order of equal elements in the
sorted output.
In-Place vs. Out-of-Place: An in-place algorithm sorts the data without requiring extra space for
another copy of the data, while out-of-place algorithms need additional space to store temporary
data.
Adaptive vs. Non-Adaptive: An adaptive algorithm takes advantage of existing order in the
input data, while a non-adaptive algorithm does not.

Common Sorting Algorithms

1. Bubble Sort

Bubble Sort is a simple, comparison-based sorting algorithm that repeatedly steps through the list,
compares adjacent elements, and swaps them if they are in the wrong order. It is not efficient for
large datasets due to its quadratic time complexity.

Time Complexity: O(n²) in the worst and average case.
Space Complexity: O(1) (in-place sorting).
Stability: Yes.
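
The pass-by-pass behavior described above can be sketched as follows; the early-exit flag is a common optimization that stops as soon as a full pass makes no swaps, giving linear time on already-sorted input.

```python
def bubble_sort(items):
    """Sort a list in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):       # the last i elements are already in place
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:                   # no swaps: the list is sorted, stop early
            break
    return items

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```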

2. Selection Sort

Selection Sort is another simple, comparison-based algorithm. It divides the input list into two parts:
the sorted part and the unsorted part. It repeatedly selects the smallest (or largest) element from the
unsorted part and moves it to the sorted part.

Time Complexity: O(n²) in the worst and average case.
Space Complexity: O(1).
Stability: No.

3. Insertion Sort

Insertion Sort builds the sorted list one element at a time by repeatedly picking the next element and
inserting it into the correct position among the already sorted elements. It is efficient for small or
nearly sorted datasets.
Time Complexity: O(n²) in the worst and average case, O(n) in the best case.
Space Complexity: O(1).
Stability: Yes.

4. Merge Sort

Merge Sort is a comparison-based, divide-and-conquer algorithm. It divides the list into two halves,
recursively sorts each half, and then merges the sorted halves to produce the final sorted list. Merge
Sort is efficient for large datasets.

Time Complexity: O(n log n) in the worst, average, and best cases.
Space Complexity: O(n) (out-of-place sorting).
Stability: Yes.

5. Quick Sort

Quick Sort is another divide-and-conquer algorithm. It works by selecting a pivot element,
partitioning the list around the pivot so that elements smaller than the pivot are on one side and
elements greater than the pivot are on the other, and then recursively sorting each partition.

Time Complexity: O(n²) in the worst case, O(n log n) in the average case.
Space Complexity: O(log n) (in-place sorting).
Stability: No.

6. Heap Sort

Heap Sort is a comparison-based sorting algorithm that utilizes a binary heap data structure. It first
builds a max heap from the input data and then repeatedly extracts the largest element from the
heap and rebuilds the heap.

Time Complexity: O(n log n) in the worst, average, and best cases.
Space Complexity: O(1).
Stability: No.
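
The steps above can be sketched with the standard-library heapq module. Note that heapq provides a min-heap, so this illustrative version repeatedly extracts the smallest element rather than building a max heap by hand, and copying the input gives up the O(1) space of the in-place variant:

```python
import heapq

def heap_sort(arr):
    # heapify rearranges the list into a min-heap in O(n) time.
    heap = list(arr)
    heapq.heapify(heap)
    # Each heappop removes the smallest remaining element in O(log n) time.
    return [heapq.heappop(heap) for _ in range(len(heap))]

# Example usage:
print(heap_sort([12, 11, 13, 5, 6, 7]))  # [5, 6, 7, 11, 12, 13]
```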

7. Counting Sort

Counting Sort is a non-comparison-based algorithm that counts the occurrences of each unique
element in the input list and uses this information to place the elements in the correct position in the
sorted output. It is efficient for data with a small range of values.

Time Complexity: O(n + k), where n is the number of elements and k is the range of the input
values.
Space Complexity: O(n + k) (out-of-place sorting).
Stability: Yes.
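
As a rough sketch of this idea, assuming non-negative integers smaller than a known bound k (the function name and prefix-sum layout here are illustrative, not taken from a specific library):

```python
def counting_sort(arr, k):
    # Count occurrences of each value in 0..k-1.
    count = [0] * k
    for x in arr:
        count[x] += 1

    # Prefix sums turn counts into starting output positions;
    # this is what makes the placement pass below stable.
    total = 0
    for value in range(k):
        count[value], total = total, total + count[value]

    # Place each element at its position; equal values keep input order.
    output = [0] * len(arr)
    for x in arr:
        output[count[x]] = x
        count[x] += 1
    return output

# Example usage:
print(counting_sort([4, 2, 2, 8, 3, 3, 1], k=9))  # [1, 2, 2, 3, 3, 4, 8]
```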
8. Radix Sort

Radix Sort is a non-comparison-based sorting algorithm that sorts numbers digit by digit starting
from the least significant digit. It uses a stable sorting algorithm (e.g., Counting Sort) as a subroutine.

Time Complexity: O(nk), where n is the number of elements and k is the number of digits.
Space Complexity: O(n + k).
Stability: Yes.
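
A minimal sketch of least-significant-digit Radix Sort for non-negative integers, using a stable bucket pass per decimal digit in place of Counting Sort (the function name is illustrative):

```python
def radix_sort(arr):
    if not arr:
        return arr
    exp = 1  # 1 = units digit, 10 = tens digit, ...
    while max(arr) // exp > 0:
        # One stable pass: distribute by the current digit, then
        # concatenate buckets 0..9 in order.
        buckets = [[] for _ in range(10)]
        for x in arr:
            buckets[(x // exp) % 10].append(x)
        arr = [x for bucket in buckets for x in bucket]
        exp *= 10
    return arr

# Example usage:
print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # [2, 24, 45, 66, 75, 90, 170, 802]
```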

9. Bucket Sort

Bucket Sort divides the input data into several buckets and then sorts each bucket individually using
another sorting algorithm. It is particularly effective when the input data is uniformly distributed
across a range.

Time Complexity: O(n + k) in the best case, O(n²) in the worst case.
Space Complexity: O(n + k).
Stability: Yes (depends on the sorting algorithm used within buckets).
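
A minimal sketch, assuming the inputs are floats uniformly distributed in [0, 1) so that int(x * num_buckets) spreads them evenly across buckets (the bucket count and function name are illustrative):

```python
def bucket_sort(arr, num_buckets=10):
    # Distribute values into buckets by their leading fraction.
    buckets = [[] for _ in range(num_buckets)]
    for x in arr:
        buckets[int(x * num_buckets)].append(x)

    # Sort each bucket individually (sorted() is stable) and concatenate.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result

# Example usage:
print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12]))
```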

Summary

Sorting algorithms are essential tools in computer science for organizing and retrieving data
efficiently. The choice of sorting algorithm depends on factors such as time complexity, space
complexity, stability, and whether the data is partially sorted. While simpler algorithms like Bubble
Sort, Selection Sort, and Insertion Sort are easy to implement, more complex algorithms like Merge
Sort, Quick Sort, and Heap Sort offer better performance for large datasets. Non-comparison-based
algorithms like Counting Sort, Radix Sort, and Bucket Sort provide optimal performance for specific
types of data. Understanding these algorithms allows for more informed decisions when handling
data-intensive applications.
Bubble Sort

Overview

Bubble Sort is one of the simplest sorting algorithms. It repeatedly steps through the list to be sorted,
compares adjacent elements, and swaps them if they are in the wrong order. The pass through the
list is repeated until the list is sorted. Although it is not efficient for large datasets due to its quadratic
time complexity, Bubble Sort is useful for educational purposes to illustrate fundamental sorting
concepts.

Key Characteristics of Bubble Sort

Time Complexity: O(n²) in the worst and average case, O(n) in the best case (when the list is
already sorted).
Space Complexity: O(1) (in-place sorting).
Stability: Yes, since equal elements maintain their relative order in the sorted array.
Adaptability: It can stop early if the list becomes sorted before all passes are complete.

Algorithm Explanation

The Bubble Sort algorithm works by comparing each pair of adjacent elements in a list and swapping
them if they are in the wrong order. This process "bubbles" the largest unsorted element to its correct
position with each pass through the list. The algorithm repeats this process until no more swaps are
needed, meaning the list is sorted.

Steps of the Algorithm

1. Starting from the beginning: Begin at the first element of the list.
2. Compare adjacent elements: Compare the current element with the next element.
3. Swap if necessary: If the current element is greater than the next element, swap them.
4. Repeat: Continue this process for all elements in the list. At the end of the first pass, the largest
element will have "bubbled" up to the last position.
5. Iterate for remaining elements: Repeat the process for the rest of the list, excluding the last
sorted element after each pass, until no more swaps are needed.

Example

Consider an unsorted list: [5, 3, 8, 4, 2]

1. First Pass:

Compare 5 and 3: Swap → [3, 5, 8, 4, 2]
Compare 5 and 8: No swap
Compare 8 and 4: Swap → [3, 5, 4, 8, 2]
Compare 8 and 2: Swap → [3, 5, 4, 2, 8]

After the first pass, the largest element 8 is in its correct position.
2. Second Pass:

Compare 3 and 5: No swap
Compare 5 and 4: Swap → [3, 4, 5, 2, 8]
Compare 5 and 2: Swap → [3, 4, 2, 5, 8]

After the second pass, 5 is in its correct position.
3. Third Pass:

Compare 3 and 4: No swap
Compare 4 and 2: Swap → [3, 2, 4, 5, 8]

After the third pass, 4 is in its correct position.
4. Fourth Pass:

Compare 3 and 2: Swap → [2, 3, 4, 5, 8]

Now, the list is fully sorted.

Python Implementation of Bubble Sort

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        # Track whether a swap was made in this pass
        swapped = False

        # Last i elements are already sorted
        for j in range(0, n - i - 1):
            # Compare adjacent elements
            if arr[j] > arr[j + 1]:
                # Swap if they are in the wrong order
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True

        # If no elements were swapped, the list is already sorted
        if not swapped:
            break

# Example usage:
arr = [5, 3, 8, 4, 2]
bubble_sort(arr)
print("Sorted array:", arr)
Explanation of the Code

1. Outer Loop: The outer loop controls the number of passes through the list. After each pass, the
largest unsorted element is placed in its correct position.
2. Inner Loop: The inner loop compares adjacent elements and swaps them if necessary. It
iterates only over the unsorted portion of the list.
3. Early Termination: The variable swapped is used to check if any swaps were made during the
pass. If no swaps were made, the list is already sorted, and the algorithm can terminate early,
improving efficiency for partially sorted lists.

Time Complexity

Worst and Average Case: O(n²), when the list is in reverse order or unsorted.
Best Case: O(n), when the list is already sorted.

Space Complexity

O(1): Bubble Sort is an in-place sorting algorithm, meaning it only requires a constant amount of
additional memory space aside from the input list.

Stability of Bubble Sort

Bubble Sort is a stable sorting algorithm. This means that when two elements have equal values,
their relative order will be preserved in the sorted output. This is an important characteristic in
scenarios where the order of equal elements matters, such as when sorting records in a database by
multiple keys.

Advantages of Bubble Sort

Simplicity: The algorithm is straightforward and easy to implement.
Stability: It preserves the order of equal elements.
Adaptability: If the list is already sorted, the algorithm can terminate early, making it more
efficient in such cases.

Disadvantages of Bubble Sort

Inefficiency: Bubble Sort has a time complexity of O(n²), making it inefficient for large datasets.
Not Optimal: There are more efficient algorithms, such as Quick Sort and Merge Sort, for large
or random datasets.

Summary

Bubble Sort is a simple, yet inefficient, comparison-based sorting algorithm. Its primary advantage is
its simplicity and stability, but it suffers from poor time complexity when applied to large or unsorted
datasets. Despite its limitations, Bubble Sort remains a popular teaching tool to introduce students to
basic sorting principles.
Selection Sort

Overview

Selection Sort is a simple, in-place comparison-based sorting algorithm. It divides the input list into
two parts: the sublist of items already sorted (on the left) and the sublist of items yet to be sorted (on
the right). The algorithm repeatedly selects the smallest (or largest, depending on the sorting order)
element from the unsorted sublist and swaps it with the first unsorted element, moving the boundary
between sorted and unsorted sublists one step to the right.

Key Characteristics of Selection Sort

Time Complexity: O(n²) for all cases (best, average, and worst).
Space Complexity: O(1) as it is an in-place sorting algorithm.
Stability: No, because the order of equal elements may not be preserved after sorting.
Adaptability: Does not adapt to pre-sorted lists.

Algorithm Explanation

The algorithm proceeds by iterating over the list and selecting the smallest element from the
unsorted portion of the list and placing it in the correct position in the sorted portion. This process is
repeated until the entire list is sorted.

Steps of the Algorithm

1. Select the minimum element: Begin at the first element, iterate through the unsorted portion of
the list, and find the smallest element.
2. Swap the smallest element: Swap the smallest element found in the previous step with the
first unsorted element.
3. Repeat for remaining elements: Repeat the process for the rest of the list, gradually growing
the sorted portion until the entire list is sorted.

Example

Consider an unsorted list: [64, 25, 12, 22, 11]

1. First Pass:
Find the smallest element (11) and swap it with the first element → [11, 25, 12, 22, 64]
2. Second Pass:
Find the smallest element from the remaining unsorted portion (12) and swap it with the
second element → [11, 12, 25, 22, 64]
3. Third Pass:
Find the smallest element from the remaining unsorted portion (22) and swap it with the
third element → [11, 12, 22, 25, 64]
4. Fourth Pass:
The remaining portion is already sorted, so no more swaps are needed.

The sorted list is: [11, 12, 22, 25, 64]

Python Implementation of Selection Sort

def selection_sort(arr):
    n = len(arr)
    for i in range(n):
        # Assume the first unsorted element is the minimum
        min_index = i

        # Find the minimum element in the unsorted portion
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j

        # Swap the found minimum element with the first unsorted element
        arr[i], arr[min_index] = arr[min_index], arr[i]

# Example usage:
arr = [64, 25, 12, 22, 11]
selection_sort(arr)
print("Sorted array:", arr)

Explanation of the Code

1. Outer Loop: The outer loop runs through each element of the array. The boundary between the
sorted and unsorted sublists is defined by the variable i .
2. Inner Loop: The inner loop finds the smallest element in the unsorted portion of the array by
comparing each element to the current minimum.
3. Swapping: Once the smallest element is found, it is swapped with the first unsorted element,
effectively placing it in its correct sorted position.

Time Complexity

Worst, Average, and Best Case: O(n²). Selection Sort always scans the entire unsorted portion
to find the minimum, performing n(n-1)/2 comparisons in total, regardless of whether the array
is already sorted or not.

Space Complexity

O(1): Selection Sort is an in-place algorithm, meaning it only requires a constant amount of
additional memory.

Stability of Selection Sort


Selection Sort is not stable. In some cases, when swapping elements, the relative order of equal
elements may be changed. For example, if two elements have the same value but different positions
in the original list, their positions may be reversed after sorting.
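
This loss of relative order can be demonstrated with a small illustrative variant of the algorithm that compares (key, label) tuples by their first element only (the helper name is hypothetical):

```python
def selection_sort_by_key(pairs):
    # Same selection logic as Selection Sort, comparing only the key (first element).
    n = len(pairs)
    for i in range(n):
        min_index = i
        for j in range(i + 1, n):
            if pairs[j][0] < pairs[min_index][0]:
                min_index = j
        pairs[i], pairs[min_index] = pairs[min_index], pairs[i]
    return pairs

# The two records with key 2 start in the order "first", "second".
result = selection_sort_by_key([(2, "first"), (2, "second"), (1, "third")])
print(result)  # [(1, 'third'), (2, 'second'), (2, 'first')] -- the equal keys swapped
```

The swap that moves (1, "third") to the front pushes (2, "first") past (2, "second"), reversing the order of the equal keys.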

Advantages of Selection Sort

Simplicity: Easy to understand and implement.
Fewer swaps: Compared to Bubble Sort, Selection Sort performs fewer swaps, which can be
advantageous in situations where swap operations are expensive.

Disadvantages of Selection Sort

Inefficiency: Selection Sort has a time complexity of O(n²), which makes it inefficient for large
datasets.
Not Adaptive: The algorithm does not take advantage of pre-sorted elements, always
performing the same number of comparisons, regardless of the initial order of the list.

Summary

Selection Sort is a straightforward sorting algorithm that repeatedly selects the smallest unsorted
element and swaps it into place. While it is not efficient for large datasets due to its quadratic time
complexity, it has a clear and easy-to-follow logic, making it useful for educational purposes.
Insertion Sort

Overview

Insertion Sort is a simple, intuitive, and efficient algorithm for small datasets or nearly sorted data. It
works similarly to the way people sort playing cards in their hands: by gradually creating a sorted
section of the list and inserting each new element into its correct position. It is an in-place
comparison-based sorting algorithm.

Key Characteristics of Insertion Sort

Time Complexity:
Best Case: O(n) (when the list is already sorted).
Average and Worst Case: O(n²) (when the list is randomly ordered or sorted in reverse).
Space Complexity: O(1) since it requires a constant amount of extra space.
Stability: Yes, Insertion Sort is stable, meaning equal elements retain their relative order after
sorting.
Adaptability: It is adaptive in nature and performs better on nearly sorted data.

Algorithm Explanation

Insertion Sort builds the sorted list one element at a time by comparing the current element with the
previous elements and inserting it in the correct position within the sorted portion of the list.

Steps of the Algorithm

1. Start from the second element: Treat the first element as sorted.
2. Insert each element: Take the next element from the unsorted portion of the list and insert it
into the correct position within the sorted portion by shifting larger elements to the right.
3. Repeat until the list is sorted: Continue this process for all elements until the entire list is
sorted.

Example

Consider an unsorted list: [12, 11, 13, 5, 6]

1. First Pass (Insert 11):
Sorted portion: [12]
Compare 11 with 12 and insert it before 12.
Updated list: [11, 12, 13, 5, 6]
2. Second Pass (Insert 13):
Sorted portion: [11, 12]
13 is already greater than 12, so it remains in its place.
Updated list: [11, 12, 13, 5, 6]
3. Third Pass (Insert 5):
Sorted portion: [11, 12, 13]
Compare 5 with each element in the sorted portion and shift 11, 12, and 13 to the right to
make space for 5 at the beginning.
Updated list: [5, 11, 12, 13, 6]
4. Fourth Pass (Insert 6):
Sorted portion: [5, 11, 12, 13]
Compare 6 with each element in the sorted portion and shift 11, 12, and 13 to the right to
insert 6 in its correct position.
Updated list: [5, 6, 11, 12, 13]

The sorted list is: [5, 6, 11, 12, 13]

Python Implementation of Insertion Sort

def insertion_sort(arr):
    # Traverse from the second element to the end of the list
    for i in range(1, len(arr)):
        key = arr[i]

        # Move elements of arr[0..i-1] that are greater than key
        # one position ahead
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1

        # Insert the key at its correct position
        arr[j + 1] = key

# Example usage:
arr = [12, 11, 13, 5, 6]
insertion_sort(arr)
print("Sorted array:", arr)

Explanation of the Code

1. Outer Loop: The outer loop traverses the list starting from the second element (index 1), as the
first element is considered sorted by default.
2. Inner Loop: The inner loop shifts elements in the sorted portion of the list to the right if they are
greater than the current element ( key ).
3. Insertion: After shifting elements, the current element ( key ) is inserted at its correct position.

Time Complexity

Best Case: O(n). The best case occurs when the list is already sorted. In this case, each
element only needs to be compared once with the previous element.
Average and Worst Case: O(n²). In the worst case, when the list is sorted in reverse order,
each element will need to be compared with every other element in the sorted portion, resulting
in n-1 comparisons for each element.

Space Complexity

O(1): Insertion Sort is an in-place sorting algorithm, meaning it only requires a constant amount
of extra space.

Stability of Insertion Sort

Insertion Sort is stable. Equal elements remain in their original order relative to each other after
sorting because the algorithm only shifts larger elements to the right and inserts the current element
without changing the order of equal elements.

Advantages of Insertion Sort

Simplicity: Easy to understand and implement.
Efficient for Small or Nearly Sorted Datasets: Insertion Sort is particularly efficient for small
datasets and nearly sorted data.
In-Place and Stable: It requires no extra space and preserves the relative order of equal
elements.

Disadvantages of Insertion Sort

Inefficiency for Large Datasets: Insertion Sort is not suitable for large datasets due to its O(n²)
time complexity.
Not Parallelizable: The algorithm is inherently sequential and cannot be easily parallelized.

Summary

Insertion Sort is a simple and effective algorithm for sorting small or nearly sorted lists. While it has
quadratic time complexity in the worst case, it performs efficiently on small datasets and is adaptive
to nearly sorted input. Its stability, in-place sorting, and ease of implementation make it a practical
choice for scenarios with small input sizes.
Merge Sort

Overview

Merge Sort is a classic example of the divide and conquer algorithmic paradigm. It works by
recursively dividing the unsorted list into smaller sublists until each sublist contains a single element
(which is trivially sorted). Then, it merges these sublists back together in the correct order to produce
a fully sorted list.

Key Characteristics of Merge Sort

Time Complexity:
Best, Average, and Worst Case: O(n log n)
Space Complexity: O(n) due to the use of temporary arrays during the merge process.
Stability: Yes, Merge Sort is stable, meaning equal elements retain their relative order after
sorting.
Adaptability: Merge Sort is non-adaptive, meaning it does not perform better for nearly sorted
data.

Algorithm Explanation

Merge Sort operates by repeatedly dividing the list into two halves until each half contains only a
single element. Once the list is divided, the merging phase begins, where the elements are
combined in a sorted order.

Steps of the Algorithm

1. Divide: Split the list into two equal halves.
2. Conquer: Recursively sort both halves.
3. Combine: Merge the two sorted halves into a single sorted list.

Example

Consider an unsorted list: [38, 27, 43, 3, 9, 82, 10]

Step 1: Divide the List

1. Split the list into two halves:
Left half: [38, 27, 43]
Right half: [3, 9, 82, 10]
2. Recursively divide the halves:
[38, 27, 43] → [38] and [27, 43]
[3, 9, 82, 10] → [3, 9] and [82, 10]
3. Continue dividing until each sublist contains a single element:
[27, 43] → [27] and [43]
[3, 9] → [3] and [9]
[82, 10] → [82] and [10]

Step 2: Conquer and Merge the Sublists

1. Merge [27] and [43] into [27, 43]
2. Merge [82] and [10] into [10, 82]
3. Merge [3] and [9] into [3, 9]
4. Continue merging:
Merge [38] and [27, 43] into [27, 38, 43]
Merge [3, 9] and [10, 82] into [3, 9, 10, 82]
5. Finally, merge [27, 38, 43] and [3, 9, 10, 82] into [3, 9, 10, 27, 38, 43, 82]

The sorted list is: [3, 9, 10, 27, 38, 43, 82]

Python Implementation of Merge Sort

def merge_sort(arr):
    # Base case: a list of 1 or fewer elements is already sorted
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]

        # Recursive calls to sort both halves
        merge_sort(left_half)
        merge_sort(right_half)

        # Merging the sorted halves
        i = j = k = 0

        # Merge the two halves while comparing their elements.
        # Taking from the left half on ties (<=) keeps the sort stable.
        while i < len(left_half) and j < len(right_half):
            if left_half[i] <= right_half[j]:
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1

        # Copy any remaining elements of left_half
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1

        # Copy any remaining elements of right_half
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1

# Example usage:
arr = [38, 27, 43, 3, 9, 82, 10]
merge_sort(arr)
print("Sorted array:", arr)

Explanation of the Code

1. Base Case: The recursion stops when the list has one or zero elements, as such lists are
inherently sorted.
2. Recursive Division: The list is split into two halves recursively using slicing.
3. Merge Process: The sorted halves are merged by comparing the elements of the two halves
and inserting the smaller one into the original array.

Time Complexity

O(n log n) in all cases (best, average, and worst). Merge Sort consistently divides the list into
two halves and merges them, leading to a logarithmic number of levels and linear work at each
level.

Space Complexity

O(n): The algorithm requires additional space for storing the left and right halves during the
merge step, resulting in linear space usage.

Stability of Merge Sort

Merge Sort is stable. During the merging process, when two elements are equal, the one from the
left sublist is chosen first, preserving the relative order of equal elements.

Advantages of Merge Sort

Consistent Time Complexity: Merge Sort guarantees O(n log n) time complexity in all cases,
unlike algorithms like Quick Sort, which may degrade to O(n²) in the worst case.
Stability: It is a stable sorting algorithm, which is useful when the relative order of equal
elements is important.
Works Well with Large Data: It is particularly well-suited for sorting large datasets due to its
predictable performance.

Disadvantages of Merge Sort

Space Complexity: The additional space requirement of O(n) may be prohibitive for systems
with limited memory.
Slower for Small Datasets: Although Merge Sort has better time complexity, it is generally
slower than simpler algorithms like Insertion Sort for small datasets due to the overhead of
recursive calls and additional memory usage.

Summary

Merge Sort is a highly efficient, stable sorting algorithm with a guaranteed O(n log n) time
complexity. It is particularly well-suited for large datasets but requires additional space due to the use
of auxiliary arrays. Merge Sort's consistent performance and stability make it a preferred choice for
scenarios where memory is not a concern and the size of the dataset is large.
Quick Sort

Overview

Quick Sort is a highly efficient sorting algorithm based on the divide and conquer paradigm. It
works by selecting a "pivot" element from the array and partitioning the other elements into two
subarrays, according to whether they are less than or greater than the pivot. The subarrays are then
sorted recursively.

Key Characteristics of Quick Sort

Time Complexity:
Best and Average Case: O(n log n)
Worst Case: O(n²)
Space Complexity: O(log n) due to recursion.
Stability: No, Quick Sort is not a stable sorting algorithm.
Adaptability: Quick Sort's performance depends on the choice of the pivot rather than on any existing order in the input.

Algorithm Explanation

Quick Sort repeatedly partitions the array into two subarrays by selecting a pivot element and
ensuring all elements less than the pivot are on its left and all elements greater than the pivot are on
its right. This process is recursively applied to the subarrays.

Steps of the Algorithm

1. Choose a Pivot: Select a pivot element from the array.
2. Partition: Rearrange the array elements so that elements less than the pivot are on the left, and
elements greater than the pivot are on the right.
3. Recursion: Recursively apply Quick Sort to the subarrays formed by the partition.

Example

Consider an unsorted list: [9, 3, 7, 6, 2, 8, 5]

Step 1: Choose a Pivot

Let’s choose the last element as the pivot, 5 .

Step 2: Partition the List

1. Rearrange the list so that elements less than 5 are on the left, and elements greater than 5
are on the right.
2. After the first partition, the list becomes: [3, 2, 5, 7, 6, 9, 8] .

Now, 5 is in its correct position.

Step 3: Apply Recursion to Subarrays

1. Recursively apply Quick Sort to the left subarray [3, 2] and the right subarray [7, 6, 9, 8] .
2. Continue recursively applying the process until all subarrays contain only one element, which
means the array is sorted.

The final sorted list is: [2, 3, 5, 6, 7, 8, 9] .

Python Implementation of Quick Sort

def quick_sort(arr):
    # Base case: arrays of length 1 or less are inherently sorted
    if len(arr) <= 1:
        return arr

    pivot = arr[-1]  # Choosing the last element as the pivot
    left = [x for x in arr[:-1] if x <= pivot]
    right = [x for x in arr[:-1] if x > pivot]

    # Recursively sort the left and right partitions
    return quick_sort(left) + [pivot] + quick_sort(right)

# Example usage:
arr = [9, 3, 7, 6, 2, 8, 5]
sorted_arr = quick_sort(arr)
print("Sorted array:", sorted_arr)

Explanation of the Code

1. Base Case: The recursion terminates when the list has one or zero elements, as these lists are
already sorted.
2. Pivot Selection: The pivot is chosen as the last element in the list ( arr[-1] ).
3. Partitioning: The list is split into two sublists based on the pivot. Elements less than or equal to
the pivot go to the left, and elements greater than the pivot go to the right.
4. Recursive Sorting: Quick Sort is applied recursively to both partitions. The final result is the
concatenation of the sorted left sublist, the pivot, and the sorted right sublist.

Time Complexity

Best and Average Case: O(n log n), which occurs when the partitioning divides the array into
two nearly equal halves.
Worst Case: O(n²), which occurs when the pivot is consistently the smallest or largest element
in the list, leading to unbalanced partitions.
Space Complexity

O(log n): This comes from the recursive call stack when the partitioning is done in place. Note
that the list-building implementation shown above uses O(n) extra space for its sublists; an
in-place partition avoids this and, unlike Merge Sort, needs no auxiliary arrays.

Stability of Quick Sort

Quick Sort is not stable by default. This is because elements that are equal to the pivot may be
rearranged in the partitioning process.

Advantages of Quick Sort

In-place Sorting: Quick Sort sorts the list in place, meaning it does not require additional
memory proportional to the size of the input array.
Efficiency: On average, Quick Sort performs better than other sorting algorithms like Merge
Sort for most inputs.
Cache Efficiency: Quick Sort's in-place nature makes it more cache-efficient than algorithms
like Merge Sort, which rely on auxiliary arrays.

Disadvantages of Quick Sort

Worst-case Time Complexity: Quick Sort can degrade to O(n²) in its worst case, which can be
avoided with better pivot selection strategies (e.g., choosing the median or using
randomization).
Non-Stability: Quick Sort is not stable, which can be a disadvantage when the order of equal
elements matters.

Optimizations of Quick Sort

Randomized Pivot: To avoid the worst-case time complexity, one can randomize the pivot
selection or use the median-of-three method (picking the pivot as the median of the first, middle,
and last elements).
Tail Recursion: Optimizing the recursive calls can reduce the recursion depth, improving the
space efficiency of the algorithm.
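
The randomized-pivot optimization can be sketched as an in-place variant using the Lomuto partition scheme (the function name is illustrative; this is one of several common ways to partition):

```python
import random

def quick_sort_inplace(arr, low=0, high=None):
    # Sorts arr[low..high] in place with a randomized pivot.
    if high is None:
        high = len(arr) - 1
    if low < high:
        # A random pivot makes the O(n²) worst case very unlikely.
        pivot_index = random.randint(low, high)
        arr[pivot_index], arr[high] = arr[high], arr[pivot_index]
        pivot = arr[high]

        # Lomuto partition: move elements <= pivot to the left side.
        i = low - 1
        for j in range(low, high):
            if arr[j] <= pivot:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]

        # Recursively sort the partitions on either side of the pivot.
        quick_sort_inplace(arr, low, i)
        quick_sort_inplace(arr, i + 2, high)

# Example usage:
arr = [9, 3, 7, 6, 2, 8, 5]
quick_sort_inplace(arr)
print("Sorted array:", arr)  # Sorted array: [2, 3, 5, 6, 7, 8, 9]
```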

Summary

Quick Sort is a widely used and efficient sorting algorithm with an average time complexity of O(n log
n). It is an in-place, divide-and-conquer algorithm but can degrade to O(n²) in the worst case. Careful
pivot selection and optimizations like randomization can help mitigate this issue. Quick Sort is ideal
for large datasets where space is a constraint, but the lack of stability may make it less suitable
when the relative order of equal elements matters.
Linear Search

Overview

Linear Search is one of the simplest search algorithms, used to find the position of a target value
within an unsorted list or array. The search process sequentially checks each element of the list,
starting from the first element and moving towards the last, until the target value is found or the list is
fully traversed.

Key Characteristics of Linear Search

Time Complexity:
Best Case: O(1) (when the target is at the first position).
Worst and Average Case: O(n) (when the target is at the end or not in the list).
Space Complexity: O(1), as it requires a constant amount of memory regardless of input size.
Applicability: Can be used with both unsorted and sorted lists.

Algorithm Explanation

Linear Search starts from the first element of the list and compares each element with the target
value. If the target is found, it returns the index of that element. If the target is not found by the end
of the list, it returns an indication that the target is not present in the list.

Steps of the Algorithm

1. Start from the first element of the list.
2. Compare the current element with the target value.
3. If the current element matches the target value, return the index of the element.
4. If the target is not found, move to the next element.
5. Repeat the process until either the target is found or the list is fully traversed.
6. If the target is not found by the end of the list, return a "not found" result.

Example

Consider an unsorted list: [5, 2, 9, 1, 6, 7]

Target Value: 6

Start from the first element: 5 (not a match).
Move to the second element: 2 (not a match).
Move to the third element: 9 (not a match).
Move to the fourth element: 1 (not a match).
Move to the fifth element: 6 (match found).

The target value 6 is found at index 4.

Python Implementation of Linear Search

def linear_search(arr, target):
    # Traverse the list
    for i in range(len(arr)):
        # Check if the current element matches the target
        if arr[i] == target:
            return i  # Return the index if target is found
    return -1  # Return -1 if the target is not found

# Example usage:
arr = [5, 2, 9, 1, 6, 7]
target = 6
index = linear_search(arr, target)

if index != -1:
    print(f"Target found at index: {index}")
else:
    print("Target not found in the list")

Explanation of the Code

1. Loop: The for loop iterates through the list from the first to the last element.
2. Comparison: Each element is compared with the target.
3. Return: If the target is found, the function returns the index of the matching element. If the
target is not found by the end of the list, the function returns -1 , indicating the target is absent.

Time Complexity

Best Case (O(1)): The target is at the first position.
Worst and Average Case (O(n)): The target is either at the end of the list or not present in the
list.

Space Complexity

O(1): Linear Search uses a constant amount of extra space, regardless of the size of the input
list.

Limitations of Linear Search

Efficiency: Linear Search is not efficient for large datasets due to its O(n) time complexity, as it
checks every element in the list.
Unsorted Data: Linear Search can be used with unsorted data, but this comes at the cost of
performance. For sorted data, more efficient algorithms such as Binary Search should be used.
Advantages of Linear Search

Simplicity: The algorithm is easy to understand and implement.
No Preprocessing Required: The list does not need to be sorted, making Linear Search
applicable to any dataset.
Applicability: It works on any list, regardless of the data structure or whether the data is sorted.

Disadvantages of Linear Search

Inefficiency for Large Datasets: Linear Search is slow when dealing with large lists, as it
requires examining each element until the target is found.
Not Suitable for Sorted Data: For sorted datasets, more advanced search algorithms like
Binary Search are preferred due to their logarithmic time complexity.

Summary

Linear Search is a simple search algorithm that checks each element of a list sequentially until the
target value is found or the end of the list is reached. Though straightforward and effective for small
or unsorted datasets, it becomes inefficient for large datasets. Its primary advantage is its simplicity
and applicability to any type of list, but its performance limitations make it unsuitable for larger or
sorted data where more efficient algorithms are available.
Binary Search

Overview

Binary Search is an efficient search algorithm used to find the position of a target value within a
sorted array or list. Unlike linear search, which checks every element sequentially, binary search
divides the search space in half with each step, making it much faster for large datasets.

Key Characteristics of Binary Search

Time Complexity:
Best Case: O(1) (when the target is the middle element).
Worst and Average Case: O(log n) (with each comparison, the search space is halved).
Space Complexity: O(1) for iterative implementation and O(log n) for recursive implementation
due to the call stack.
Applicability: Works only on sorted lists or arrays.

Algorithm Explanation

Binary search repeatedly divides the sorted list into two halves and checks whether the target value
is in the left or right half. By comparing the target with the middle element of the list, the algorithm
eliminates half of the search space with each step.

Steps of the Algorithm

1. Set two pointers, low and high , to the beginning and the end of the list, respectively.
2. Find the middle element mid of the current search range.
3. Compare the target value with mid .
If the target is equal to mid , return the index.
If the target is less than mid , narrow the search to the left half by setting high = mid - 1 .
If the target is greater than mid , narrow the search to the right half by setting low = mid +
1.
4. Repeat steps 2 and 3 until the search space is empty or the target is found.

Example

Consider a sorted list: [1, 3, 5, 7, 9, 11, 13, 15, 17]

Target Value: 7

Step 1: Start with low = 0 and high = 8 .


Step 2: Calculate mid = (0 + 8) // 2 = 4 . The middle element is 9 .
Step 3: Since 7 < 9 , search in the left half ( high = 3 ).
Step 4: Recalculate mid = (0 + 3) // 2 = 1 . The middle element is 3 .
Step 5: Since 7 > 3 , search in the right half ( low = 2 ).
Step 6: Recalculate mid = (2 + 3) // 2 = 2 . The middle element is 5 .
Step 7: Since 7 > 5 , search in the right half ( low = 3 ).
Step 8: Recalculate mid = 3 . The middle element is 7 , which is the target.

The target value 7 is found at index 3.

Python Implementation of Binary Search

Iterative Approach

def binary_search(arr, target):
    low = 0
    high = len(arr) - 1

    while low <= high:
        mid = (low + high) // 2

        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1

    return -1  # Target not found

# Example usage:
arr = [1, 3, 5, 7, 9, 11, 13, 15, 17]
target = 7
index = binary_search(arr, target)

if index != -1:
    print(f"Target found at index: {index}")
else:
    print("Target not found in the list")

Recursive Approach

def binary_search_recursive(arr, target, low, high):
    if low > high:
        return -1  # Target not found

    mid = (low + high) // 2

    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, high)
    else:
        return binary_search_recursive(arr, target, low, mid - 1)

# Example usage:
arr = [1, 3, 5, 7, 9, 11, 13, 15, 17]
target = 7
index = binary_search_recursive(arr, target, 0, len(arr) - 1)

if index != -1:
    print(f"Target found at index: {index}")
else:
    print("Target not found in the list")

Explanation of the Code

Iterative Version:
A while loop continuously narrows the search space by adjusting the low and high
pointers based on the comparison of the target with the middle element.
The search stops when the target is found or when low exceeds high , indicating that the
target is not in the list.
Recursive Version:
The function calls itself recursively, adjusting the search bounds ( low and high ) after each
comparison until the base case is reached (either the target is found or the search space is
empty).

Time Complexity

Best Case (O(1)): The target is found at the middle of the list on the first comparison.
Worst and Average Case (O(log n)): The list is divided in half with each step, resulting in
logarithmic time complexity.

Space Complexity

O(1) (Iterative Version): The iterative approach uses a constant amount of space.
O(log n) (Recursive Version): The recursive approach uses extra space on the call stack
proportional to the depth of the recursion, which is O(log n).

Limitations of Binary Search

Requires Sorted List: Binary search can only be applied to sorted lists. For unsorted data,
sorting the list first is necessary, which incurs additional time complexity (e.g., O(n log n) for
efficient sorting algorithms).
Overhead: For small lists, the overhead of managing the pointers or recursive calls may
outweigh the benefits of binary search.

Advantages of Binary Search


Efficiency: Binary search is much faster than linear search, especially for large datasets, due to
its O(log n) time complexity.
Simplicity: Despite its logarithmic efficiency, binary search is relatively simple to understand
and implement.

Disadvantages of Binary Search

Requires Sorted Data: This limitation can make binary search impractical if sorting is not
feasible due to the size of the dataset or other constraints.
Fixed Dataset: If the dataset changes frequently (inserts or deletions), binary search may not
be as effective since the dataset must remain sorted for the search to work efficiently.

Summary

Binary search is a powerful and efficient search algorithm that drastically reduces the search space
with each comparison by leveraging the sorted nature of the dataset. With O(log n) time complexity,
it is highly effective for large datasets, but requires the data to be sorted beforehand. Both iterative
and recursive implementations are widely used, each with their own advantages and trade-offs in
terms of space and clarity.
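Worth noting: Python's standard library already ships a binary search primitive in the bisect module, which can replace hand-written versions in practice. A brief sketch of using bisect_left for a membership lookup on the same sorted list:

```python
import bisect

arr = [1, 3, 5, 7, 9, 11, 13, 15, 17]
target = 7

# bisect_left returns the leftmost insertion point for the target,
# which equals the target's index whenever the target is present.
i = bisect.bisect_left(arr, target)
if i < len(arr) and arr[i] == target:
    print(f"Target found at index: {i}")  # index 3 here
else:
    print("Target not found in the list")
```

Because bisect_left returns an insertion point rather than -1, the bounds check after the call is what turns it into a search.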
Hashing

Overview

Hashing is a technique used to map data to a fixed-size value (called a hash code or hash value)
that represents the original data in a more compact form. The primary purpose of hashing is to
enable fast data retrieval, making it an essential component in many data structures like hash tables,
dictionaries, and sets.

Key Characteristics of Hashing

Efficiency: Provides average O(1) time complexity for search, insert, and delete operations.
Compactness: Hashing compresses potentially large data into a smaller hash code.
Deterministic: Given the same input, a hashing function will always produce the same output.

Hash Functions

A hash function is used to convert input data (called the key) into an integer, which is then used as
an index in a hash table.

Properties of a Good Hash Function

1. Deterministic: The same input should always yield the same hash code.
2. Uniform Distribution: The hash function should evenly distribute the keys across the table to
minimize collisions.
3. Fast Computation: The hash function should be quick to compute.
4. Minimizes Collisions: The number of collisions (when two keys map to the same index) should
be minimized.

Example: Simple Hash Function

One common and simple hash function is based on the remainder (modulus) operation:

def simple_hash(key, table_size):
    return key % table_size

This function divides the key by the size of the table and returns the remainder, which serves as the
index.

Real-world Hash Functions


In practice, hash functions are more complex and may involve bit manipulation, prime numbers, or
other techniques to ensure good distribution and efficiency. Python’s built-in hash() function is an
example of a more sophisticated hash function.
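As a small illustration (not part of the hash-table code in this section), Python's built-in hash() can be combined with the modulus operation to derive a bucket index. Note that hashes of strings are randomized between interpreter runs (controlled by PYTHONHASHSEED), though they remain deterministic within a single run:

```python
table_size = 10

# hash() maps an object to an integer; taking the result modulo the
# table size yields a bucket index. String hashes vary between
# interpreter runs but are stable within one run.
index = hash("apple") % table_size
same_index = hash("apple") % table_size

print(index == same_index)      # True within a single run
print(0 <= index < table_size)  # True
```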

Hash Table

A hash table is a data structure that implements an associative array, mapping keys to values. It
uses a hash function to compute the index (or location) where the data is stored.

Structure of a Hash Table

Buckets: The hash table consists of an array, each slot of which is called a bucket. The bucket
can store the actual value or a reference to the value.
Key-Value Pairs: Each bucket in the table can store a key-value pair. The key is hashed to
determine its bucket.

Insertion and Retrieval

Insertion: To insert a key-value pair, the key is hashed to determine the bucket index. The value
is then stored at that index.
Retrieval: To retrieve a value, the key is hashed to find the appropriate bucket, and the value is
fetched from that bucket.

Collisions and Collision Handling

A collision occurs when two different keys produce the same hash value and thus are mapped to
the same bucket. Handling collisions is a crucial aspect of designing an efficient hash table.

Common Collision Handling Techniques

1. Chaining (Separate Chaining): Each bucket contains a linked list of key-value pairs. When a
collision occurs, the new key-value pair is added to the linked list at that index.

hash_table = [[] for _ in range(10)]

def insert(hash_table, key, value):
    index = simple_hash(key, len(hash_table))
    hash_table[index].append((key, value))

def retrieve(hash_table, key):
    index = simple_hash(key, len(hash_table))
    for k, v in hash_table[index]:
        if k == key:
            return v
    return None  # Key not found
2. Open Addressing: In open addressing, collisions are resolved by probing for the next available
slot in the array. There are several strategies for probing:

Linear Probing: If a collision occurs, the algorithm checks the next bucket in the array until
an empty slot is found.
Quadratic Probing: The interval between probes increases quadratically.
Double Hashing: A secondary hash function is used to calculate the step size for probing.

Example of linear probing:

def linear_probing_insert(hash_table, key, value):
    # Assumes empty slots are marked with None
    # (i.e., the table was initialized as [None] * size).
    index = simple_hash(key, len(hash_table))
    while hash_table[index] is not None:
        index = (index + 1) % len(hash_table)
    hash_table[index] = (key, value)
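A double-hashing probe can be sketched along the same lines; step_hash below is a hypothetical secondary hash, chosen only so that the step is never 0. The table is assumed to mark empty slots with None and to have at least one free slot:

```python
def step_hash(key, table_size):
    # Hypothetical secondary hash: adding 1 guarantees the probe
    # step is never 0, so the probe always advances.
    return 1 + (key % (table_size - 1))

def double_hashing_insert(hash_table, key, value):
    size = len(hash_table)
    index = key % size
    step = step_hash(key, size)
    # Probe in strides of `step` until an empty (None) slot is found.
    while hash_table[index] is not None:
        index = (index + step) % size
    hash_table[index] = (key, value)

# Example usage (slots start as None):
table = [None] * 11
double_hashing_insert(table, 14, "a")  # 14 % 11 = 3
double_hashing_insert(table, 25, "b")  # 25 % 11 = 3 -> collision, probe by step
```

Using a prime table size (11 here) helps the probe sequence visit every slot before repeating.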

Advantages and Disadvantages of Collision Handling Techniques

Chaining:
Advantages: Simple to implement; allows for an unlimited number of collisions at the same
index.
Disadvantages: Requires extra memory for the linked lists; performance degrades with too
many collisions.
Open Addressing:
Advantages: More memory-efficient since no extra data structures are needed.
Disadvantages: Performance suffers as the hash table fills up due to clustering
(consecutive filled slots).

Load Factor and Rehashing

Load Factor

The load factor of a hash table is the ratio of the number of elements stored to the total number of
buckets. It is used to determine when the hash table needs to be resized to maintain efficient
performance.

Load Factor = Number of Elements / Number of Buckets

A high load factor means the table is filling up, which can lead to more collisions.
A low load factor means the table is under-utilized, wasting space.
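The load-factor check can be expressed directly in code; the 0.7 threshold used here is the common convention rather than a requirement:

```python
def load_factor(num_elements, num_buckets):
    # Load factor = number of stored elements / number of buckets
    return num_elements / num_buckets

def needs_resize(num_elements, num_buckets, threshold=0.7):
    # Resize once the load factor exceeds the threshold.
    return load_factor(num_elements, num_buckets) > threshold

print(load_factor(7, 10))   # 0.7
print(needs_resize(7, 10))  # False (exactly at the threshold)
print(needs_resize(8, 10))  # True
```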

Rehashing
Rehashing is the process of resizing the hash table when the load factor exceeds a certain
threshold. This involves creating a new, larger table and rehashing all the keys to distribute them
across the new table.

When to Rehash: Typically, when the load factor exceeds 0.7 (70% full).
How to Rehash: Create a new hash table with double the size of the original, and re-insert all
elements into the new table using the original hash function.

def rehash(hash_table):
    new_table = [[] for _ in range(len(hash_table) * 2)]
    for bucket in hash_table:
        for key, value in bucket:
            index = simple_hash(key, len(new_table))
            new_table[index].append((key, value))
    return new_table

Example of Hash Table in Python

Here’s a Python implementation of a hash table using separate chaining for collision handling:

class HashTable:
    def __init__(self, size):
        self.size = size
        self.table = [[] for _ in range(size)]

    def hash_function(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        index = self.hash_function(key)
        for pair in self.table[index]:
            if pair[0] == key:
                pair[1] = value  # Update existing value
                return
        self.table[index].append([key, value])  # Insert new value

    def retrieve(self, key):
        index = self.hash_function(key)
        for pair in self.table[index]:
            if pair[0] == key:
                return pair[1]
        return None  # Key not found

    def delete(self, key):
        index = self.hash_function(key)
        for pair in self.table[index]:
            if pair[0] == key:
                self.table[index].remove(pair)
                return

# Example usage:
hash_table = HashTable(10)
hash_table.insert("apple", 10)
hash_table.insert("banana", 20)

print(hash_table.retrieve("apple"))   # Output: 10
print(hash_table.retrieve("banana"))  # Output: 20
hash_table.delete("apple")
print(hash_table.retrieve("apple"))   # Output: None

Explanation of the Code

hash_function : Computes the index by taking the modulo of the hash value of the key with the
table size.
insert : Inserts a key-value pair into the table, updating the value if the key already exists.
retrieve : Retrieves the value associated with the given key.
delete : Removes the key-value pair from the table.

Advantages of Hashing

Fast Lookups: Hashing offers constant-time lookups, making it highly efficient for search,
insert, and delete operations.
Simple Implementation: Hashing can be implemented easily with basic programming
constructs.
Versatile: Hash tables are used in various applications like caching, indexing databases, and
implementing sets and dictionaries.

Disadvantages of Hashing

Collisions: Despite good hash functions, collisions can occur, requiring extra mechanisms to
handle them efficiently.
Inefficient for Sorted Data: Hashing does not preserve order, so it is not suitable for
applications where sorted data is needed.
Memory Overhead: Hash tables can require significant memory, especially when rehashing is
used to maintain performance.

Summary

Hashing is an essential technique for fast data retrieval, particularly when dealing with large
datasets. By using a hash function to map data to fixed-size values, it allows constant-time
operations in average cases. However, handling collisions and managing the load factor are critical
aspects of ensuring that a hash table remains efficient. With proper implementation and good hash
functions, hashing provides a robust solution for implementing associative arrays, dictionaries, and
sets in Python and other programming languages.
Unit IV
Tree Structures
Introduction to Tree ADT

Overview

A Tree Abstract Data Type (ADT) is a hierarchical data structure that represents a collection of
elements in a parent-child relationship. It is widely used in computer science for organizing data in a
way that allows efficient searching, insertion, and deletion operations. Trees are fundamental in
various applications, including file systems, databases, and graphical user interfaces.

Key Characteristics of Trees

Hierarchical Structure: A tree has a hierarchical organization with a single root node and zero
or more subtrees.
Nodes and Edges: The elements in a tree are called nodes, and the connections between
nodes are called edges.
Root Node: The topmost node of a tree from which all other nodes descend.
Leaves: Nodes with no children are called leaves or terminal nodes.
Internal Nodes: Nodes that have at least one child.
Subtrees: Each node (except the root) has a parent node and zero or more child nodes,
forming subtrees.

Terminology

Node: A fundamental part of a tree that contains data and pointers to other nodes.
Root: The top node of the tree from which all nodes originate.
Parent: A node that has one or more children.
Child: A node that descends from another node.
Siblings: Nodes that share the same parent.
Depth: The level of a node, starting with the root at depth 0.
Height: The length of the longest path from a node to a leaf.
Subtree: A tree formed by a node and all its descendants.
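As a sketch of the depth/height definitions above, the height of a node can be computed recursively. This assumes a minimal node class with left/right pointers (a stand-in, not the TreeNode class used later in this unit); giving an empty subtree height -1 makes a single node have height 0:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def height(node):
    # Height of an empty subtree is -1, so a leaf has height 0.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

# A small tree: root -> left child -> left grandchild.
root = Node(1, left=Node(2, left=Node(3)))
print(height(root))  # 2
```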

Types of Trees

Binary Tree

A Binary Tree is a tree in which each node has at most two children, commonly referred to as the
left child and the right child.

Properties of Binary Trees

Full Binary Tree: Every node has either 0 or 2 children.


Complete Binary Tree: All levels are fully filled except possibly for the last level, which is filled
from left to right.
Perfect Binary Tree: All internal nodes have exactly two children, and all leaf nodes are at the
same level.
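One way to make the "perfect" property concrete: a perfect binary tree of height h contains exactly 2^(h+1) - 1 nodes. A small sketch of that check (the Node class here is a minimal stand-in, not the TreeNode class used elsewhere in these notes):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def height(node):
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def count_nodes(node):
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)

def is_perfect(root):
    # A perfect binary tree of height h has exactly 2**(h + 1) - 1 nodes.
    h = height(root)
    return count_nodes(root) == 2 ** (h + 1) - 1

perfect = Node(1, Node(2), Node(3))
lopsided = Node(1, Node(2, Node(4)), Node(3))
print(is_perfect(perfect))   # True
print(is_perfect(lopsided))  # False
```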

Binary Search Tree (BST)

A Binary Search Tree (BST) is a binary tree where the nodes are arranged in such a way that for
each node:

The left subtree contains only nodes with values less than the node’s value.
The right subtree contains only nodes with values greater than the node’s value.

Properties of BST

In-order Traversal: The in-order traversal of a BST yields the nodes in ascending order.

AVL Tree

An AVL Tree is a self-balancing binary search tree where the height of the two child subtrees of any
node differs by at most one.

Properties of AVL Trees

Balance Factor: For any node, the difference in height between its left and right subtrees is at
most 1.
Rotation: AVL trees use rotations (left, right, left-right, and right-left) to maintain balance during
insertions and deletions.
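The balance factor can be sketched as height(left) - height(right), using the convention that an empty subtree has height -1; a tree satisfies the AVL property when every node's factor lies in {-1, 0, 1}. A minimal sketch of the check (rotations themselves are omitted; the Node class is a stand-in):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def height(node):
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    # Difference between left and right subtree heights.
    return height(node.left) - height(node.right)

def is_avl_balanced(node):
    # Every node's balance factor must be -1, 0, or 1.
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_avl_balanced(node.left)
            and is_avl_balanced(node.right))

balanced = Node(5, Node(3), Node(7))
skewed = Node(5, Node(4, Node(3, Node(2))))  # left-heavy chain
print(is_avl_balanced(balanced))  # True
print(is_avl_balanced(skewed))    # False
```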

B-Tree

A B-Tree is a self-balancing tree data structure that maintains sorted data and allows searches,
sequential access, insertions, and deletions in logarithmic time. It is commonly used in databases
and file systems.

Properties of B-Trees

Degree: A B-tree of minimum degree t stores at most 2t - 1 keys per node; every node other
than the root stores at least t - 1 keys.
Balanced: All leaf nodes are at the same level.

Tree Operations

Insertion
Insertion in a tree involves adding a new node while preserving the tree’s properties. For example, in
a binary search tree, the new node is placed in such a way that the BST property is maintained.

Deletion

Deletion in a tree involves removing a node and reorganizing the tree to preserve its properties. In a
binary search tree, this can involve replacing the node with its successor or predecessor if the node
has two children.

Traversal

Traversal refers to visiting all nodes in a specific order. Common traversal methods include:

In-order Traversal: Visit the left subtree, then the node, and finally the right subtree.
Pre-order Traversal: Visit the node first, then the left subtree, and finally the right subtree.
Post-order Traversal: Visit the left subtree, then the right subtree, and finally the node.
Level-order Traversal: Visit nodes level by level from top to bottom.

Example of Tree Representation in Python

Here is an example of how to represent and traverse a binary tree in Python:

class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

class BinaryTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        if self.root is None:
            self.root = TreeNode(key)
        else:
            self._insert(self.root, key)

    def _insert(self, node, key):
        if key < node.value:
            if node.left is None:
                node.left = TreeNode(key)
            else:
                self._insert(node.left, key)
        else:
            if node.right is None:
                node.right = TreeNode(key)
            else:
                self._insert(node.right, key)

    def in_order_traversal(self, node):
        if node:
            self.in_order_traversal(node.left)
            print(node.value, end=' ')
            self.in_order_traversal(node.right)

# Example usage:
tree = BinaryTree()
tree.insert(5)
tree.insert(3)
tree.insert(7)
tree.insert(2)
tree.insert(4)

print("In-order Traversal:")
tree.in_order_traversal(tree.root)  # Output: 2 3 4 5 7

Explanation of the Code

TreeNode Class: Represents a node in the binary tree, containing a value and pointers to left
and right children.
BinaryTree Class: Contains methods for inserting nodes and performing in-order traversal.
insert Method: Adds a new node to the tree.
in_order_traversal Method: Recursively visits nodes in in-order fashion and prints their
values.

Summary

Tree ADT: A hierarchical data structure with nodes connected by edges.


Types of Trees: Binary trees, binary search trees, AVL trees, and B-trees.
Operations: Insertion, deletion, and traversal.
Python Implementation: Example of binary tree representation and in-order traversal.

Trees are crucial for organizing and managing data efficiently and are foundational for many
advanced data structures and algorithms. Understanding trees and their properties is essential for
tackling problems in computer science and software development.
Binary Tree ADT

Overview

A Binary Tree Abstract Data Type (ADT) is a specialized form of a tree data structure where each
node has at most two children, commonly referred to as the left child and the right child. Binary trees
are used in various applications, including expression parsing, decision-making algorithms, and data
storage systems. They provide efficient methods for organizing data and performing operations like
insertion, deletion, and traversal.

Key Characteristics of Binary Trees

Node: The fundamental unit of a binary tree, which contains a value and pointers to left and
right children.
Root: The topmost node in the tree. All other nodes are descendants of the root.
Leaves: Nodes with no children.
Internal Nodes: Nodes with at least one child.
Subtree: A tree consisting of a node and all its descendants.

Properties of Binary Trees

1. Structure

Binary Tree: A tree where each node has at most two children.
Height: The length of the longest path from the root to a leaf. The height of a tree with only one
node is 0.
Depth: The length of the path from the root to a node. The depth of the root is 0.

2. Types of Binary Trees

Full Binary Tree: Every node has either 0 or 2 children.


Complete Binary Tree: All levels are fully filled except possibly for the last level, which is filled
from left to right.
Perfect Binary Tree: All internal nodes have exactly two children, and all leaf nodes are at the
same level.

3. Binary Search Tree (BST)

A Binary Search Tree (BST) is a binary tree where the nodes are arranged in such a way that:

The left subtree of a node contains only nodes with values less than the node’s value.
The right subtree of a node contains only nodes with values greater than the node’s value.
4. AVL Tree

An AVL Tree is a self-balancing binary search tree where:

The height difference between the left and right subtrees of any node is at most 1.
Rotations are used to maintain balance during insertions and deletions.

Operations on Binary Trees

1. Insertion

Inserting a new node into a binary tree involves placing the node in the correct position while
maintaining the tree's properties.

Example: Inserting into a Binary Search Tree

class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        if self.root is None:
            self.root = TreeNode(key)
        else:
            self._insert(self.root, key)

    def _insert(self, node, key):
        if key < node.value:
            if node.left is None:
                node.left = TreeNode(key)
            else:
                self._insert(node.left, key)
        else:
            if node.right is None:
                node.right = TreeNode(key)
            else:
                self._insert(node.right, key)

Explanation:

insert Method: Adds a new node to the binary search tree.


_insert Method: Recursively finds the correct position for the new node.
2. Deletion

Deleting a node from a binary tree involves removing the node and rearranging the tree to maintain
its properties. In a binary search tree, this may involve replacing the node with its successor or
predecessor if it has two children.

3. Traversal

Traversal methods involve visiting all nodes in a specific order. Common methods include:

In-order Traversal: Visit the left subtree, then the node, and finally the right subtree.

def in_order_traversal(self, node):
    if node:
        self.in_order_traversal(node.left)
        print(node.value, end=' ')
        self.in_order_traversal(node.right)

Pre-order Traversal: Visit the node first, then the left subtree, and finally the right subtree.

def pre_order_traversal(self, node):
    if node:
        print(node.value, end=' ')
        self.pre_order_traversal(node.left)
        self.pre_order_traversal(node.right)

Post-order Traversal: Visit the left subtree, then the right subtree, and finally the node.

def post_order_traversal(self, node):
    if node:
        self.post_order_traversal(node.left)
        self.post_order_traversal(node.right)
        print(node.value, end=' ')

Level-order Traversal: Visit nodes level by level from top to bottom using a queue.

from collections import deque

def level_order_traversal(self, root):
    if root is None:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.value, end=' ')
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
Summary

Binary Tree ADT: A hierarchical structure where each node has at most two children.
Types: Includes full binary trees, complete binary trees, perfect binary trees, binary search
trees, and AVL trees.
Operations: Key operations include insertion, deletion, and various traversal methods.
Python Implementation: Examples of insertion and traversal methods in binary search trees.

Binary trees are versatile and efficient for many applications, and understanding their properties and
operations is crucial for effective data management and algorithm design.
Tree Traversals

Overview

Tree traversal is the process of visiting all the nodes in a tree data structure in a specific order. Tree
traversal is essential for various operations like searching, sorting, and processing tree-based data.
There are several methods for traversing a tree, each with different applications and use cases. The
most common traversal methods are in-order, pre-order, post-order, and level-order.

Types of Tree Traversals

1. In-Order Traversal

In in-order traversal, nodes are visited in the following order:

1. Left subtree
2. Current node
3. Right subtree

This traversal is particularly useful for binary search trees (BSTs) as it retrieves nodes in ascending
order.

Python Example: In-Order Traversal

class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

class BinaryTree:
    def __init__(self):
        self.root = None

    def in_order_traversal(self, node):
        if node:
            self.in_order_traversal(node.left)
            print(node.value, end=' ')
            self.in_order_traversal(node.right)

# Example usage
tree = BinaryTree()
tree.root = TreeNode(4)
tree.root.left = TreeNode(2)
tree.root.right = TreeNode(6)
tree.root.left.left = TreeNode(1)
tree.root.left.right = TreeNode(3)
tree.root.right.left = TreeNode(5)
tree.root.right.right = TreeNode(7)

tree.in_order_traversal(tree.root)  # Output: 1 2 3 4 5 6 7

Explanation:

in_order_traversal Method: Recursively visits the left subtree, then the current node,
followed by the right subtree.

2. Pre-Order Traversal

In pre-order traversal, nodes are visited in the following order:

1. Current node
2. Left subtree
3. Right subtree

This traversal method is useful for creating a copy of the tree or for prefix expressions in expression
trees.

Python Example: Pre-Order Traversal

class BinaryTree:
    def __init__(self):
        self.root = None

    def pre_order_traversal(self, node):
        if node:
            print(node.value, end=' ')
            self.pre_order_traversal(node.left)
            self.pre_order_traversal(node.right)

# Example usage
tree = BinaryTree()
tree.root = TreeNode(4)
tree.root.left = TreeNode(2)
tree.root.right = TreeNode(6)
tree.root.left.left = TreeNode(1)
tree.root.left.right = TreeNode(3)
tree.root.right.left = TreeNode(5)
tree.root.right.right = TreeNode(7)

tree.pre_order_traversal(tree.root)  # Output: 4 2 1 3 6 5 7

Explanation:
pre_order_traversal Method: Visits the current node first, followed by the left subtree and
then the right subtree.

3. Post-Order Traversal

In post-order traversal, nodes are visited in the following order:

1. Left subtree
2. Right subtree
3. Current node

This traversal method is useful for deleting a tree or evaluating postfix expressions.

Python Example: Post-Order Traversal

class BinaryTree:
    def __init__(self):
        self.root = None

    def post_order_traversal(self, node):
        if node:
            self.post_order_traversal(node.left)
            self.post_order_traversal(node.right)
            print(node.value, end=' ')

# Example usage
tree = BinaryTree()
tree.root = TreeNode(4)
tree.root.left = TreeNode(2)
tree.root.right = TreeNode(6)
tree.root.left.left = TreeNode(1)
tree.root.left.right = TreeNode(3)
tree.root.right.left = TreeNode(5)
tree.root.right.right = TreeNode(7)

tree.post_order_traversal(tree.root)  # Output: 1 3 2 5 7 6 4

Explanation:

post_order_traversal Method: Visits the left and right subtrees first, then the current node.

4. Level-Order Traversal

In level-order traversal, nodes are visited level by level from top to bottom. This is typically
implemented using a queue.

Python Example: Level-Order Traversal


from collections import deque

class BinaryTree:
    def __init__(self):
        self.root = None

    def level_order_traversal(self, root):
        if root is None:
            return
        queue = deque([root])
        while queue:
            node = queue.popleft()
            print(node.value, end=' ')
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)

# Example usage
tree = BinaryTree()
tree.root = TreeNode(4)
tree.root.left = TreeNode(2)
tree.root.right = TreeNode(6)
tree.root.left.left = TreeNode(1)
tree.root.left.right = TreeNode(3)
tree.root.right.left = TreeNode(5)
tree.root.right.right = TreeNode(7)

tree.level_order_traversal(tree.root)  # Output: 4 2 6 1 3 5 7

Explanation:

level_order_traversal Method: Uses a queue to visit nodes level by level.

Summary

In-Order Traversal: Visits left subtree, current node, right subtree. Useful for BSTs.
Pre-Order Traversal: Visits current node, left subtree, right subtree. Useful for creating copies.
Post-Order Traversal: Visits left subtree, right subtree, current node. Useful for deletions.
Level-Order Traversal: Visits nodes level by level using a queue.

Understanding these traversal methods is crucial for efficiently processing and manipulating tree
data structures.
Binary Search Trees (BST)

Overview

A Binary Search Tree (BST) is a type of binary tree that maintains a specific order among its nodes.
In a BST, for each node:

The left subtree contains only nodes with values less than the node’s value.
The right subtree contains only nodes with values greater than the node’s value.

This property makes BSTs useful for various applications like searching, sorting, and dynamically
managing ordered data.

Key Operations

1. Insertion

Insertion in a BST involves placing a new node in the correct position to maintain the BST property.

Python Example: Insertion in BST

class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        if self.root is None:
            self.root = TreeNode(key)
        else:
            self._insert(self.root, key)

    def _insert(self, node, key):
        if key < node.value:
            if node.left is None:
                node.left = TreeNode(key)
            else:
                self._insert(node.left, key)
        else:
            if node.right is None:
                node.right = TreeNode(key)
            else:
                self._insert(node.right, key)

Explanation:

insert Method: Initiates the insertion process; if the tree is empty, the new node becomes the
root.
_insert Method: Recursively finds the correct location for the new node.

2. Search

Search in a BST involves traversing the tree to find a node with a specific value.

Python Example: Search in BST

class BinarySearchTree:
    # ... (previous methods)

    def search(self, key):
        return self._search(self.root, key)

    def _search(self, node, key):
        if node is None or node.value == key:
            return node
        if key < node.value:
            return self._search(node.left, key)
        else:
            return self._search(node.right, key)

Explanation:

search Method: Begins the search process at the root.
_search Method: Recursively searches the left or right subtree based on the key's value.

3. Deletion

Deletion in a BST involves removing a node while preserving the BST property. This can be more
complex and involves three cases:

1. Node to be deleted has no children: Simply remove the node.
2. Node to be deleted has one child: Remove the node and replace it with its child.
3. Node to be deleted has two children: Find the node's in-order predecessor or successor to replace the node.

Python Example: Deletion in BST


class BinarySearchTree:
    # ... (previous methods)

    def delete(self, key):
        self.root = self._delete(self.root, key)

    def _delete(self, node, key):
        if node is None:
            return node
        if key < node.value:
            node.left = self._delete(node.left, key)
        elif key > node.value:
            node.right = self._delete(node.right, key)
        else:
            # Node with only one child or no child
            if node.left is None:
                return node.right
            elif node.right is None:
                return node.left
            # Node with two children: get the in-order successor
            temp = self._min_value_node(node.right)
            node.value = temp.value
            node.right = self._delete(node.right, temp.value)
        return node

    def _min_value_node(self, node):
        current = node
        while current.left is not None:
            current = current.left
        return current

Explanation:

delete Method: Initiates the deletion process.
_delete Method: Handles node deletion based on the number of children.
_min_value_node Method: Finds the node with the minimum value in a subtree.

4. Traversals

BSTs can be traversed using in-order, pre-order, post-order, and level-order methods, as discussed
earlier.

Python Example: BST Traversal

class BinarySearchTree:
    # ... (previous methods)

    def in_order_traversal(self, node):
        if node:
            self.in_order_traversal(node.left)
            print(node.value, end=' ')
            self.in_order_traversal(node.right)

Explanation:

in_order_traversal Method: Retrieves nodes in ascending order for BSTs.
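The insert and traversal methods above can be combined into a short self-contained demonstration. The sketch below uses standalone functions rather than the exact class from these notes, and collects the traversal into a list instead of printing each value:

```python
class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.value = key

def insert(node, key):
    # Standard recursive BST insertion
    if node is None:
        return TreeNode(key)
    if key < node.value:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node

def in_order(node, out):
    # Left subtree, node, right subtree: keys come out in ascending order
    if node:
        in_order(node.left, out)
        out.append(node.value)
        in_order(node.right, out)
    return out

root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)

print(in_order(root, []))  # [20, 30, 40, 50, 60, 70, 80]
```

Whatever order the keys are inserted in, the in-order traversal always yields them sorted, which is a quick sanity check for a BST implementation.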

Summary

Binary Search Tree (BST): A binary tree where each node follows the BST property (left
children are smaller, right children are larger).
Insertion: Adding a node while maintaining the BST property.
Search: Finding a node in the BST.
Deletion: Removing a node with handling cases for no, one, or two children.
Traversals: Methods for visiting nodes in different orders.

Understanding and implementing BSTs is fundamental for efficient searching and sorting operations
in computer science.
AVL Trees

Overview

An AVL Tree is a self-balancing binary search tree where the difference in heights between the left
and right subtrees of any node is at most one. This balance ensures that the tree remains
approximately balanced, so operations such as insertion, deletion, and search run in O(log n)
time, where n is the number of nodes in the tree.

AVL Tree Properties

1. Height-Balancing: For every node in the tree, the height difference between the left and right
subtrees (known as the balance factor) is between -1 and 1.
2. Balance Factor: Calculated as the difference between the heights of the left and right subtrees:
   bf = height(left subtree) - height(right subtree)
   Valid balance factors are -1, 0, and 1.
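The rotation and insertion examples that follow rely on two helper methods, get_height and get_balance, which are not defined elsewhere in these notes. A minimal sketch of them, assuming each node stores a height attribute that is 1 for a new leaf:

```python
class AVLTree:
    def get_height(self, node):
        # An empty subtree contributes height 0
        if node is None:
            return 0
        return node.height

    def get_balance(self, node):
        # bf = height(left subtree) - height(right subtree)
        if node is None:
            return 0
        return self.get_height(node.left) - self.get_height(node.right)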

Rotations

To maintain the balance factor, AVL trees use rotations. These rotations adjust the tree's structure to
restore balance. There are four types of rotations:

1. Right Rotation (Single Rotation)
2. Left Rotation (Single Rotation)
3. Left-Right Rotation (Double Rotation)
4. Right-Left Rotation (Double Rotation)

Right Rotation (Single Rotation)

Purpose: Used when a left-heavy subtree needs balancing.

Python Example: Right Rotation

class AVLTree:
    # Assuming TreeNode and other methods are defined

    def right_rotate(self, y):
        x = y.left
        T2 = x.right

        # Perform rotation
        x.right = y
        y.left = T2

        # Update heights
        y.height = max(self.get_height(y.left), self.get_height(y.right)) + 1
        x.height = max(self.get_height(x.left), self.get_height(x.right)) + 1

        return x

Explanation:

right_rotate Method: Performs a right rotation on node y , making x the new root of the
subtree.

Left Rotation (Single Rotation)

Purpose: Used when a right-heavy subtree needs balancing.

Python Example: Left Rotation

class AVLTree:
    # Assuming TreeNode and other methods are defined

    def left_rotate(self, x):
        y = x.right
        T2 = y.left

        # Perform rotation
        y.left = x
        x.right = T2

        # Update heights
        x.height = max(self.get_height(x.left), self.get_height(x.right)) + 1
        y.height = max(self.get_height(y.left), self.get_height(y.right)) + 1

        return y

Explanation:

left_rotate Method: Performs a left rotation on node x , making y the new root of the
subtree.

Left-Right Rotation (Double Rotation)

Purpose: Used when the right subtree of the left child is heavier (the left-right case).

Python Example: Left-Right Rotation

class AVLTree:
    # Assuming TreeNode and other methods are defined

    def left_right_rotate(self, node):
        node.left = self.left_rotate(node.left)
        return self.right_rotate(node)

Explanation:

left_right_rotate Method: First left-rotates the left child, then right-rotates the node to balance the subtree.

Right-Left Rotation (Double Rotation)

Purpose: Used when the left subtree of the right child is heavier (the right-left case).

Python Example: Right-Left Rotation

class AVLTree:
    # Assuming TreeNode and other methods are defined

    def right_left_rotate(self, node):
        node.right = self.right_rotate(node.right)
        return self.left_rotate(node)

Explanation:

right_left_rotate Method: First right-rotates the right child, then left-rotates the node to balance the subtree.

AVL Tree Operations

Insertion

Insertion in an AVL tree involves adding a node and then performing rotations if necessary to
maintain balance.

Python Example: Insertion in AVL Tree

class AVLTree:
    # Assuming TreeNode and rotation methods are defined

    def insert(self, root, key):
        if not root:
            return TreeNode(key)
        elif key < root.value:
            root.left = self.insert(root.left, key)
        else:
            root.right = self.insert(root.right, key)

        # Update height and balance factor
        root.height = 1 + max(self.get_height(root.left),
                              self.get_height(root.right))
        balance = self.get_balance(root)

        # Balance the tree
        if balance > 1 and key < root.left.value:    # Left-Left case
            return self.right_rotate(root)
        if balance < -1 and key > root.right.value:  # Right-Right case
            return self.left_rotate(root)
        if balance > 1 and key > root.left.value:    # Left-Right case
            root.left = self.left_rotate(root.left)
            return self.right_rotate(root)
        if balance < -1 and key < root.right.value:  # Right-Left case
            root.right = self.right_rotate(root.right)
            return self.left_rotate(root)

        return root

Explanation:

insert Method: Handles the insertion and performs rotations to keep the tree balanced.

Deletion

Deletion in an AVL tree involves removing a node and then adjusting the tree using rotations to
restore balance.

Python Example: Deletion in AVL Tree

class AVLTree:
    # Assuming TreeNode and rotation methods are defined

    def delete(self, root, key):
        if not root:
            return root
        elif key < root.value:
            root.left = self.delete(root.left, key)
        elif key > root.value:
            root.right = self.delete(root.right, key)
        else:
            if not root.left:
                return root.right
            elif not root.right:
                return root.left
            temp = self._min_value_node(root.right)
            root.value = temp.value
            root.right = self.delete(root.right, temp.value)

        # Update height and balance factor
        root.height = 1 + max(self.get_height(root.left),
                              self.get_height(root.right))
        balance = self.get_balance(root)

        # Balance the tree
        if balance > 1 and self.get_balance(root.left) >= 0:    # Left-Left case
            return self.right_rotate(root)
        if balance > 1 and self.get_balance(root.left) < 0:     # Left-Right case
            root.left = self.left_rotate(root.left)
            return self.right_rotate(root)
        if balance < -1 and self.get_balance(root.right) <= 0:  # Right-Right case
            return self.left_rotate(root)
        if balance < -1 and self.get_balance(root.right) > 0:   # Right-Left case
            root.right = self.right_rotate(root.right)
            return self.left_rotate(root)

        return root

Explanation:

delete Method: Handles the deletion and performs rotations to keep the tree balanced.

Summary

AVL Tree: A self-balancing binary search tree with a height difference constraint between
subtrees.
Rotations: Right, left, right-left, and left-right rotations are used to maintain balance.
Insertion and Deletion: Operations that involve adjustments and rotations to keep the AVL tree
balanced.

AVL trees are essential for scenarios requiring frequent insertions and deletions while ensuring
efficient search operations.
Heaps

Overview

A heap is a specialized tree-based data structure that satisfies the heap property. It is commonly
used to implement priority queues and is crucial for algorithms like heap sort. There are two main
types of heaps: max-heaps and min-heaps.

Max-Heap: In a max-heap, the value of each node is greater than or equal to the values of its
children. This ensures that the maximum value is always at the root.
Min-Heap: In a min-heap, the value of each node is less than or equal to the values of its
children. This ensures that the minimum value is always at the root.

Properties of Heaps

1. Complete Binary Tree: A heap is a complete binary tree, meaning all levels of the tree are fully
filled except possibly for the last level, which is filled from left to right.
2. Heap Property: The key at each node must be greater than or equal to (for max-heaps) or less
than or equal to (for min-heaps) the keys of its children.

Operations on Heaps

Insertion

Insertion in a heap involves adding a new element and then restoring the heap property.

Python Example: Insertion in a Min-Heap

import heapq

class MinHeap:
    def __init__(self):
        self.heap = []

    def insert(self, item):
        heapq.heappush(self.heap, item)

    def extract_min(self):
        return heapq.heappop(self.heap)

    def get_min(self):
        return self.heap[0]

    def __str__(self):
        return str(self.heap)
Explanation:

insert Method: Adds a new item to the heap and ensures the heap property is maintained.
extract_min Method: Removes and returns the minimum element from the heap.
get_min Method: Returns the minimum element without removing it.
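Since MinHeap is a thin wrapper over Python's heapq module, its behavior can be checked with a standalone sketch: repeated extraction yields the items in ascending order.

```python
import heapq

heap = []
for item in [5, 1, 9, 3]:
    heapq.heappush(heap, item)   # insert

print(heap[0])  # 1 -- the minimum sits at the root

# Extracting repeatedly empties the heap in ascending order
result = [heapq.heappop(heap) for _ in range(len(heap))]
print(result)  # [1, 3, 5, 9]
```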

Deletion

Deletion typically involves removing the root (which is the maximum in a max-heap or minimum in a
min-heap) and then re-adjusting the heap.

Python Example: Deletion in a Max-Heap

import heapq

class MaxHeap:
    def __init__(self):
        self.heap = []

    def insert(self, item):
        heapq.heappush(self.heap, -item)  # Invert values for max-heap

    def extract_max(self):
        return -heapq.heappop(self.heap)  # Invert values to retrieve max

    def get_max(self):
        return -self.heap[0]  # Invert values to get max

    def __str__(self):
        return str([-i for i in self.heap])  # Invert values for display

Explanation:

insert Method: Adds a new item to the max-heap (by inserting the negative value to use
Python’s min-heap implementation).
extract_max Method: Removes and returns the maximum element (inverted back to positive).
get_max Method: Returns the maximum element without removing it.
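A short standalone check of the negation trick (the class is repeated in minimal form here so the sketch runs on its own):

```python
import heapq

class MaxHeap:
    def __init__(self):
        self.heap = []

    def insert(self, item):
        heapq.heappush(self.heap, -item)   # store negated values

    def extract_max(self):
        return -heapq.heappop(self.heap)   # negate back on the way out

    def get_max(self):
        return -self.heap[0]

h = MaxHeap()
for item in [5, 1, 9, 3]:
    h.insert(item)

print(h.get_max())      # 9
print(h.extract_max())  # 9
print(h.extract_max())  # 5
```

Because heapq only provides a min-heap, negating on the way in and out makes the smallest stored value correspond to the largest original value.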

Heapify

Heapify is a process of converting an unordered array into a heap. It is used in the heap sort
algorithm.

Python Example: Heapify Function


def heapify(arr, n, i):
    largest = i
    left = 2 * i + 1
    right = 2 * i + 2

    if left < n and arr[i] < arr[left]:
        largest = left

    if right < n and arr[largest] < arr[right]:
        largest = right

    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

Explanation:

heapify Function: Ensures the subtree rooted at index i is a heap. It compares the node with
its children and swaps if necessary, then recursively heapifies the affected subtree.

Heap Sort

Heap Sort is a comparison-based sorting algorithm that uses the heap data structure.

Python Example: Heap Sort

def heap_sort(arr):
    n = len(arr)

    # Build max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    # Extract elements one by one
    for i in range(n - 1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]  # Swap
        heapify(arr, i, 0)  # Heapify the reduced heap

    return arr

Explanation:

heap_sort Function: First builds a max-heap from the array. Then repeatedly extracts the
maximum element and heapifies the reduced heap.

Summary
Heap: A complete binary tree that satisfies the heap property.
Max-Heap and Min-Heap: Max-heaps have the largest value at the root; min-heaps have the
smallest.
Operations: Insertion, deletion, heapify, and heap sort are key operations on heaps.
Applications: Heaps are used in priority queues and heap sort algorithms.

Heaps are powerful data structures that maintain a specific order in a binary tree, enabling efficient
retrieval and manipulation of the maximum or minimum elements.
Multiway Search Trees

Overview

Multiway search trees are a generalization of binary search trees (BSTs) where nodes can have
more than two children. These trees are particularly useful in scenarios where we need to manage a
large amount of data in a balanced manner and perform efficient search operations. Examples of
multiway search trees include B-trees and B+ trees, which are commonly used in databases and file
systems.

Properties of Multiway Search Trees

1. Balanced Structure: Multiway search trees are usually balanced, meaning that all leaf nodes
are at the same level. This balance ensures that operations like search, insertion, and deletion
have logarithmic time complexity.
2. Order: The order of a multiway search tree defines the maximum number of children each node
can have. For example, in a B-tree of order m , each node can have at most m children.
3. Search Property: In multiway search trees, keys are organized in such a way that for any node
with keys k1, k2, ..., kn and children C1, C2, ..., C(n+1) , all keys in C1 are less than
k1 , keys in C2 are between k1 and k2 , and so on.

B-Trees

A B-tree is a type of balanced multiway search tree where nodes can have multiple children. B-trees
are widely used in databases and file systems due to their efficient search, insert, and delete
operations.

B-Tree Properties

Order t : A B-tree of order t has nodes with at most 2t-1 keys and at least t-1 keys (except
for the root, which may have fewer keys).
Height: The height of a B-tree is kept balanced to ensure that all leaf nodes are at the same
level, providing logarithmic time complexity for operations.
Node Structure: Each node contains keys and child pointers. For a node with k keys, there
are k+1 child pointers.

B-Tree Operations

Insertion

Insertion in a B-tree involves placing a new key in the correct position and splitting nodes if
necessary.

Python Example: Insertion in a B-Tree


class BTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t  # Minimum degree (defines the range for number of keys)
        self.leaf = leaf  # True if leaf node
        self.keys = []
        self.children = []

class BTree:
    def __init__(self, t):
        self.root = BTreeNode(t, True)
        self.t = t

    def traverse(self, node, level=0):
        print('Level', level, 'Keys:', node.keys)
        level += 1
        if not node.leaf:
            for child in node.children:
                self.traverse(child, level)

    def insert(self, key):
        root = self.root
        if len(root.keys) == (2 * self.t - 1):
            s = BTreeNode(self.t, False)
            self.root = s
            s.children.append(root)
            self.split_child(s, 0)
            self.insert_non_full(s, key)
        else:
            self.insert_non_full(root, key)

    def insert_non_full(self, node, key):
        i = len(node.keys) - 1
        if node.leaf:
            node.keys.append(None)
            while i >= 0 and key < node.keys[i]:
                node.keys[i + 1] = node.keys[i]
                i -= 1
            node.keys[i + 1] = key
        else:
            while i >= 0 and key < node.keys[i]:
                i -= 1
            i += 1
            if len(node.children[i].keys) == (2 * self.t - 1):
                self.split_child(node, i)
                if key > node.keys[i]:
                    i += 1
            self.insert_non_full(node.children[i], key)

    def split_child(self, parent, i):
        t = self.t
        y = parent.children[i]
        z = BTreeNode(t, y.leaf)
        parent.children.insert(i + 1, z)
        parent.keys.insert(i, y.keys[t - 1])
        z.keys = y.keys[t:(2 * t - 1)]
        y.keys = y.keys[0:(t - 1)]
        if not y.leaf:
            z.children = y.children[t:(2 * t)]
            y.children = y.children[0:t]

# Example usage
b_tree = BTree(3)
b_tree.insert(10)
b_tree.insert(20)
b_tree.insert(5)
b_tree.insert(6)
b_tree.insert(15)

b_tree.traverse(b_tree.root)

Explanation:

BTreeNode Class: Represents a node in the B-tree with properties for keys, children, and
whether it is a leaf.
insert Method: Handles insertion and splits nodes as necessary to maintain the B-tree
properties.
split_child Method: Splits a full child node into two nodes and adjusts the parent node
accordingly.
traverse Method: Prints the keys at each level of the B-tree.

B+ Trees

B+ trees are a variation of B-trees where all values are stored at the leaf level and internal nodes
only store keys to guide searches. B+ trees are often used in databases and file systems because of
their efficient range queries.

B+ Tree Properties

Leaf Nodes: All keys are stored at the leaf nodes, and these leaves are linked in a linked list to
allow for efficient range queries.
Internal Nodes: Only contain keys to guide the search process.

Python Example: B+ Tree Implementation is similar to B-tree but with all values stored at leaf
nodes.

Summary

Multiway Search Trees: Generalization of binary trees where nodes can have more than two
children.
B-Trees: Balanced multiway search trees used in databases and file systems, supporting
efficient search and modification operations.
B+ Trees: A variation of B-trees where all values are stored in the leaf nodes, and internal
nodes only store keys for search guidance.
Multiway search trees are efficient for managing large sets of data and supporting fast search,
insertion, and deletion operations while maintaining a balanced structure.
Implementing Heapsort

Overview

Heapsort is a comparison-based sorting algorithm that utilizes the heap data structure to sort
elements efficiently. It first builds a max-heap (or min-heap, depending on the sorting order) from the
input data and then repeatedly extracts the maximum (or minimum) element from the heap, placing it
in its correct position in the sorted array. The key advantage of heapsort is its O(n log n) time
complexity, which is comparable to other efficient sorting algorithms like quicksort and merge sort,
but with a different approach to data organization and manipulation.

Heap Data Structure

A heap is a complete binary tree that satisfies the heap property. There are two types of heaps:

1. Max-Heap: The value of each node is greater than or equal to the values of its children.
2. Min-Heap: The value of each node is less than or equal to the values of its children.

For heapsort, we use a max-heap to facilitate sorting in ascending order.

Building the Max-Heap

To sort an array using heapsort, we first need to build a max-heap from the array. This process
ensures that the largest element is at the root of the heap.

Max-Heap Property

For a given node at index i, the following conditions must be satisfied:

The left child is located at index 2i + 1.
The right child is located at index 2i + 2.

If the value of the node is less than the value of either of its children, we swap the node with the
largest child and then recursively apply the heap property to the affected subtree.
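These index formulas make it easy to verify the max-heap property directly on an array. The helper below, is_max_heap, is our own illustration rather than part of the heapsort algorithm itself:

```python
def is_max_heap(arr):
    # Every parent at index i must dominate its children at 2i+1 and 2i+2
    n = len(arr)
    for i in range(n):
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and arr[i] < arr[left]:
            return False
        if right < n and arr[i] < arr[right]:
            return False
    return True

print(is_max_heap([90, 15, 10, 7, 12, 2]))  # True
print(is_max_heap([3, 10, 5]))              # False
```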

Steps to Build the Max-Heap

1. Heapify: Adjust the heap to maintain the max-heap property.
2. Build: Construct the heap from the array by calling the heapify function from the bottom of the
tree up to the root.

Heapsort Algorithm
1. Build a max-heap from the input array.
2. Extract the maximum element from the heap (root) and place it at the end of the array.
3. Reduce the size of the heap and heapify the root to maintain the heap property.
4. Repeat steps 2 and 3 until the heap is empty.

Python Implementation

Here’s how you can implement heapsort in Python:

def heapify(arr, n, i):
    largest = i    # Initialize largest as root
    l = 2 * i + 1  # Left child index
    r = 2 * i + 2  # Right child index

    # Check if the left child exists and is greater than root
    if l < n and arr[i] < arr[l]:
        largest = l

    # Check if the right child exists and is greater than largest
    if r < n and arr[largest] < arr[r]:
        largest = r

    # Change root if needed
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]  # Swap
        # Recursively heapify the affected subtree
        heapify(arr, n, largest)

def heapsort(arr):
    n = len(arr)

    # Build a max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    # Extract elements from heap one by one
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]  # Swap
        heapify(arr, i, 0)

# Example usage
array = [12, 11, 13, 5, 6, 7]
heapsort(array)
print("Sorted array is", array)

Explanation

1. heapify Function: Ensures that the subtree rooted at index i follows the max-heap property.
2. heapsort Function: Builds the max-heap and then extracts elements from the heap, placing
them into their correct position in the array.
Summary

Heapsort: Efficient sorting algorithm with a time complexity of O(n log n) that uses a heap data
structure.
Max-Heap: Used for sorting in ascending order, where the largest element is extracted first.
Heapify: The process of maintaining the heap property by rearranging elements.

Heapsort provides a reliable and efficient method for sorting, particularly useful when dealing with
large datasets where in-place sorting is preferred.
Unit V
Graph Structures
Introduction to Graph ADT

Overview

A Graph is a fundamental data structure in computer science used to represent a set of objects with
pairwise connections between them. Graphs are widely used to model complex relationships in
various applications, including network routing, social networks, and more. The Graph Abstract Data
Type (ADT) provides a way to abstractly define these relationships and operations on graphs without
specifying the underlying implementation.

Definition of Graph ADT

A graph consists of two main components:

1. Vertices (or Nodes): These are the fundamental units or points in the graph.
2. Edges (or Arcs): These are the connections or links between pairs of vertices.

Graphs can be classified based on several criteria:

Directed vs. Undirected Graphs:
   Directed Graph (Digraph): Edges have a direction from one vertex to another.
   Undirected Graph: Edges have no direction, meaning the connection is bidirectional.
Weighted vs. Unweighted Graphs:
   Weighted Graph: Each edge has an associated weight or cost.
   Unweighted Graph: All edges are considered equal in terms of weight.
Cyclic vs. Acyclic Graphs:
   Cyclic Graph: Contains at least one cycle (a path that starts and ends at the same vertex).
   Acyclic Graph: Does not contain any cycles.
Connected vs. Disconnected Graphs:
   Connected Graph: There is a path between every pair of vertices.
   Disconnected Graph: Some vertices may not be reachable from others.

Key Operations on Graphs

The Graph ADT supports several operations, including:

1. Adding and Removing Vertices:
   Add Vertex: Insert a new vertex into the graph.
   Remove Vertex: Delete a vertex and all its incident edges.
2. Adding and Removing Edges:
   Add Edge: Create a connection between two vertices.
   Remove Edge: Delete the connection between two vertices.
3. Traversal:
   Breadth-First Search (BFS): Visits all vertices reachable from a given start vertex in breadthward motion.
   Depth-First Search (DFS): Visits all vertices reachable from a given start vertex in depthward motion.
4. Finding Shortest Paths:
   Dijkstra’s Algorithm: Finds the shortest path from a source vertex to all other vertices in a weighted graph.
   Bellman-Ford Algorithm: Computes shortest paths from a source vertex in a graph that may have negative weights.
5. Finding Minimum Spanning Tree:
   Kruskal’s Algorithm: Finds the minimum spanning tree of a graph by adding edges in increasing order of weight.
   Prim’s Algorithm: Finds the minimum spanning tree by growing the tree from an initial vertex.

Example of Graph Representation

Graphs can be represented in various ways, including:

1. Adjacency Matrix:
A 2D array where the cell at row i and column j represents the presence (and possibly
weight) of an edge between vertices i and j .
2. Adjacency List:
An array of lists. Each list corresponds to a vertex and contains a list of adjacent vertices
(and possibly edge weights).

Example Graph Representation

Adjacency Matrix for a graph with 4 vertices:

  A B C D
A 0 1 0 1
B 1 0 1 0
C 0 1 0 1
D 1 0 1 0

Adjacency List for the same graph:

A: [B, D]
B: [A, C]
C: [B, D]
D: [A, C]
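Both representations of this four-vertex graph can be written directly in Python. The sketch below uses a dictionary of lists for the adjacency list and a nested list for the adjacency matrix (the variable names are ours, chosen for illustration):

```python
vertices = ['A', 'B', 'C', 'D']

# Adjacency list: each vertex maps to its neighbors
adj_list = {
    'A': ['B', 'D'],
    'B': ['A', 'C'],
    'C': ['B', 'D'],
    'D': ['A', 'C'],
}

# Equivalent adjacency matrix, built from the adjacency list
index = {v: i for i, v in enumerate(vertices)}
matrix = [[0] * len(vertices) for _ in vertices]
for v, neighbors in adj_list.items():
    for w in neighbors:
        matrix[index[v]][index[w]] = 1

for row in matrix:
    print(row)
# [0, 1, 0, 1]
# [1, 0, 1, 0]
# [0, 1, 0, 1]
# [1, 0, 1, 0]
```

For an undirected graph the resulting matrix is symmetric, since every edge appears in both directions.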

Summary
Graph ADT: Abstract representation of a set of vertices connected by edges.
Types of Graphs: Directed, undirected, weighted, unweighted, cyclic, acyclic, connected, and
disconnected.
Key Operations: Adding/removing vertices and edges, traversals, finding shortest paths, and
minimum spanning trees.
Representations: Adjacency matrix and adjacency list.

Understanding the Graph ADT is crucial for solving complex problems involving relationships and
networks efficiently.
Representations of Graphs

Overview

Graph representations are crucial for efficiently performing operations on graphs. The choice of
representation impacts the complexity of algorithms for various graph operations. Two primary
methods for representing graphs are the adjacency matrix and the adjacency list. Each
representation has its advantages and trade-offs, depending on the specific operations and the
nature of the graph (e.g., dense vs. sparse).

Adjacency Matrix

An adjacency matrix is a 2D array used to represent a graph. The matrix's rows and columns
represent vertices, and the matrix entries indicate the presence and weight of edges.

Characteristics

Size: For a graph with V vertices, the adjacency matrix is a V x V matrix.
Space Complexity: O(V^2)
Edge Lookup: O(1) – direct access to check if an edge exists between two vertices.

Example

Consider the following graph:

A -- B
|    |
C -- D

The adjacency matrix for this undirected graph is:

  A B C D
A 0 1 1 0
B 1 0 0 1
C 1 0 0 1
D 0 1 1 0

Explanation:

matrix[i][j] = 1 indicates an edge between vertex i and vertex j .
matrix[i][j] = 0 indicates no edge between vertex i and vertex j .
Advantages

Fast edge lookup and modification.
Simple and straightforward implementation.

Disadvantages

High space complexity for sparse graphs.
Inefficient for iterating over edges.

Adjacency List

An adjacency list represents a graph using an array of lists. Each index in the array represents a
vertex, and each list contains the vertices adjacent to that vertex.

Characteristics

Size: An array of size V , where each element is a list of adjacent vertices.
Space Complexity: O(V + E) – E is the number of edges.
Edge Lookup: O(V) in the worst case – requires iterating through the adjacency list.

Example

Consider the same graph:

A -- B
|    |
C -- D

The adjacency list for this undirected graph is:

A: [B, C]
B: [A, D]
C: [A, D]
D: [B, C]

Explanation:

The list for vertex A contains vertices B and C (indicating edges A-B and A-C ).
The list for vertex B contains vertices A and D .

Advantages
Efficient for sparse graphs.
Lower space complexity compared to adjacency matrices.

Disadvantages

Slower edge lookup and modification compared to adjacency matrices.
More complex to implement and manage.

Comparison of Representations

Adjacency Matrix: Better for dense graphs where E is close to V^2 . Fast edge operations but
high space cost.
Adjacency List: Better for sparse graphs where E is much smaller than V^2 . Efficient in space
and better for iterating over edges.

Summary

Adjacency Matrix: Uses a 2D array to represent edges. Simple but space-inefficient for sparse
graphs.
Adjacency List: Uses an array of lists to represent edges. Space-efficient for sparse graphs but
may require more complex operations.

Choosing the appropriate graph representation depends on the specific requirements of the
algorithm and the nature of the graph.
Graph Traversals

Overview

Graph traversal refers to the process of visiting all the vertices in a graph. Traversal algorithms are
fundamental in graph theory and are used for a variety of purposes including searching, finding
connected components, and solving problems like shortest paths. The two primary methods for
traversing graphs are Breadth-First Search (BFS) and Depth-First Search (DFS). Each method
explores the graph in a different manner and is suited for different types of problems.

Breadth-First Search (BFS)

Breadth-First Search (BFS) is a graph traversal algorithm that explores the vertices level by level. It
starts at a given source vertex and explores all its neighbors before moving to the next level of
neighbors.

Characteristics

Data Structure: Queue
Time Complexity: O(V + E), where V is the number of vertices and E is the number of edges.
Space Complexity: O(V) – the space required for storing the queue and visited vertices.

Algorithm

1. Initialize a queue and enqueue the starting vertex.
2. Mark the starting vertex as visited.
3. While the queue is not empty:
   Dequeue a vertex v .
   Visit and process vertex v .
   Enqueue all unvisited neighbors of v and mark them as visited.

Example

Consider the following graph:

A -- B -- D
|    |
C -- E

BFS Traversal from vertex A:

1. Start at A: Queue = [A], Visited = {A}
2. Dequeue A, enqueue B and C: Queue = [B, C], Visited = {A, B, C}
3. Dequeue B, enqueue D and E: Queue = [C, D, E], Visited = {A, B, C, D, E}
4. Dequeue C: Queue = [D, E], Visited = {A, B, C, D, E}
5. Dequeue D: Queue = [E], Visited = {A, B, C, D, E}
6. Dequeue E: Queue = [], Visited = {A, B, C, D, E}
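The traversal above can be reproduced with a short BFS implementation over an adjacency list. This is a standalone sketch; the graph dictionary mirrors the figure, with vertices marked as visited when they are enqueued:

```python
from collections import deque

def bfs(graph, start):
    # Visit vertices level by level using a FIFO queue
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order

graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'E'],
    'D': ['B'],
    'E': ['B', 'C'],
}
print(bfs(graph, 'A'))  # ['A', 'B', 'C', 'D', 'E']
```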

Applications

Finding the shortest path in an unweighted graph.
Checking connectivity of a graph.
Level-order traversal of trees.

Depth-First Search (DFS)

Depth-First Search (DFS) is a graph traversal algorithm that explores as far as possible along each
branch before backtracking. It can be implemented using either recursion or a stack.

Characteristics

Data Structure: Stack (or recursion)
Time Complexity: O(V + E), where V is the number of vertices and E is the number of edges.
Space Complexity: O(V) – the space required for storing the stack (or recursion stack) and
visited vertices.

Algorithm

1. Initialize a stack (or use recursion) and push the starting vertex.
2. Mark the starting vertex as visited.
3. While the stack is not empty (or during recursion):
Pop a vertex v from the stack (or visit v in recursion).
Visit and process vertex v .
Push all unvisited neighbors of v onto the stack (or recursively visit them).

Example

Consider the same graph:

A -- B -- D
|    |
C -- E

DFS Traversal from vertex A:

1. Start at A: Stack = [A], Visited = {A}
2. Pop A, push B and C: Stack = [B, C], Visited = {A, B, C}
3. Pop C, push E: Stack = [B, E], Visited = {A, B, C, E}
4. Pop E: Stack = [B], Visited = {A, B, C, E}
5. Pop B, push D: Stack = [D], Visited = {A, B, C, E, D}
6. Pop D: Stack = [], Visited = {A, B, C, E, D}
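This trace can be reproduced with a short iterative DFS over the same adjacency list (a standalone sketch; vertices are marked as visited when pushed, matching the trace above):

```python
def dfs(graph, start):
    # Iterative DFS using an explicit stack
    visited = {start}
    order = []
    stack = [start]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in graph[v]:
            if w not in visited:
                visited.add(w)
                stack.append(w)
    return order

graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'E'],
    'D': ['B'],
    'E': ['B', 'C'],
}
print(dfs(graph, 'A'))  # ['A', 'C', 'E', 'B', 'D']
```

Because a stack is last-in first-out, the most recently pushed neighbor is explored first, which is why C is visited before B here.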

Applications

Topological sorting of a directed graph.
Finding strongly connected components.
Pathfinding in mazes.

Summary

BFS: Explores level by level using a queue. Suitable for finding shortest paths in unweighted
graphs.
DFS: Explores as far as possible using a stack or recursion. Useful for tasks like topological
sorting and component identification.

Both BFS and DFS are essential algorithms in graph theory and are used to solve a wide range of
problems involving graphs. The choice between BFS and DFS depends on the specific requirements
of the problem and the structure of the graph.
Directed Acyclic Graphs (DAGs)

Overview

A Directed Acyclic Graph (DAG) is a directed graph that contains no cycles. In other words, it is a
graph with directed edges where there is no way to start at a vertex and follow a sequence of edges
to return to the starting vertex. DAGs are widely used in various applications, including scheduling,
data processing, and representing hierarchical structures.

Properties of DAGs

1. No Cycles: There are no cycles or loops in a DAG. This property ensures that there is a clear
direction of flow with no feedback loops.
2. Topological Ordering: In a DAG, it is possible to linearly order the vertices such that for every
directed edge u → v , vertex u comes before vertex v in the ordering.

Applications of DAGs

1. Task Scheduling: In scheduling problems, tasks can be represented as vertices and dependencies between tasks as edges. A topological order can be used to schedule tasks so that each task is performed only after its dependencies are completed.
2. Data Processing Pipelines: DAGs are used to represent workflows or data processing
pipelines where each step in the pipeline depends on the output of previous steps.
3. Version Control Systems: DAGs model the history of commits in version control systems like
Git, where each commit depends on previous commits.

Topological Sorting

Topological sorting is the process of finding a linear ordering of vertices in a DAG such that for every
directed edge u → v , vertex u comes before vertex v in the ordering. This is a crucial operation in
many applications such as task scheduling and resolving dependencies.

Algorithm for Topological Sorting

There are two primary algorithms for topological sorting:

1. Kahn’s Algorithm: Uses an in-degree approach and a queue to perform the sort.
2. Depth-First Search (DFS) Based Algorithm: Uses DFS and a stack to perform the sort.

Kahn’s Algorithm

1. Compute In-degrees: Calculate the in-degree (number of incoming edges) for each vertex.
2. Initialize Queue: Enqueue all vertices with an in-degree of 0.
3. Process Queue:
Dequeue a vertex u and append it to the topological order.
For each outgoing edge from u , reduce the in-degree of the connected vertex v by 1.
If the in-degree of v becomes 0, enqueue v .
4. Repeat until the queue is empty.

Depth-First Search (DFS) Based Algorithm

1. Initialize: Create an empty stack to store the topological order.


2. DFS Traversal:
For each vertex, if it has not been visited, perform a DFS traversal.
During DFS, push each vertex onto the stack only after all its neighbors have been visited.
3. Pop from Stack: The stack now contains vertices in topologically sorted order.

Example

Consider the following DAG representing task dependencies:

    A
   / \
  B   C
   \ /
    D

Topological Sorting Using Kahn’s Algorithm:

1. Compute In-degrees:
In-degree of A: 0
In-degree of B: 1 (due to A → B)
In-degree of C: 1 (due to A → C)
In-degree of D: 2 (due to B → D and C → D)
2. Initialize Queue: [A]
3. Process Queue:
Dequeue A, append to topological order: [A]
Reduce in-degree of B and C by 1. Both B and C now have an in-degree of 0, enqueue
them: [B, C]
4. Process B:
Dequeue B, append to topological order: [A, B]
Reduce in-degree of D by 1. In-degree of D is now 1, so D is not yet enqueued.
5. Process C:
Dequeue C, append to topological order: [A, B, C]
Reduce in-degree of D by 1. In-degree of D is now 0, enqueue D: [D]
6. Process D:
Dequeue D, append to topological order: [A, B, C, D]
Topological Order: [A, B, C, D]
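The steps above can be sketched in Python. This is a minimal implementation of Kahn's Algorithm, assuming the graph is given as a mapping from each vertex to its successors; it also reports a cycle when fewer than all vertices make it into the order, which is how Kahn's Algorithm doubles as a cycle detector.

```python
from collections import deque

def kahns_topological_sort(graph):
    """Kahn's algorithm; `graph` maps each vertex to a list of its successors.
    Raises ValueError if the graph contains a cycle."""
    # Step 1: compute in-degrees.
    in_degree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            in_degree[w] += 1

    # Step 2: enqueue all vertices with in-degree 0.
    queue = deque(v for v in graph if in_degree[v] == 0)

    # Step 3: repeatedly dequeue and relax outgoing edges.
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                queue.append(w)

    if len(order) != len(graph):
        raise ValueError("graph contains a cycle; no topological order exists")
    return order

# The DAG from the example: A → B, A → C, B → D, C → D
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(kahns_topological_sort(graph))  # → ['A', 'B', 'C', 'D']
```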

Applications

1. Scheduling: Determining the order of tasks based on dependencies.


2. Course Prerequisites: Determining the order of courses based on prerequisites.
3. Build Systems: Managing dependencies between components in a build system.

Summary

Directed Acyclic Graph (DAG): A graph with directed edges and no cycles.
Topological Sorting: Finding a linear order of vertices where each edge u → v implies u
comes before v .
Algorithms: Kahn’s Algorithm and DFS-based approach for topological sorting.
Applications: Scheduling, data processing, version control, and more.
Topological Ordering

Overview

Topological ordering of a Directed Acyclic Graph (DAG) is a linear ordering of its vertices such that
for every directed edge u → v, vertex u appears before vertex v in the ordering. This ordering is
essential for tasks such as scheduling, where certain tasks must precede others, or for resolving
dependencies in various applications.

Properties

1. Existence: A topological ordering exists if and only if the graph is a DAG. If the graph contains
cycles, no such ordering is possible.
2. Uniqueness: A DAG may have multiple valid topological orderings. The number of possible
orderings depends on the structure of the graph.
3. Applications: Topological ordering is used in task scheduling, course prerequisite planning,
build systems, and more.

Algorithms for Topological Sorting

1. Kahn’s Algorithm

Kahn’s Algorithm uses an in-degree approach and a queue to perform the topological sorting.

Steps:

1. Compute In-degrees: For each vertex, compute the in-degree (number of incoming edges).
2. Initialize Queue: Add all vertices with an in-degree of 0 to a queue.
3. Process Queue:
Dequeue a vertex u and append it to the topological order.
For each outgoing edge from u to v, reduce the in-degree of v by 1.
If the in-degree of v becomes 0, enqueue v.
4. Repeat until the queue is empty.

Example:

Consider the following DAG:

    A
   / \
  B   C
   \ /
    D

Steps Using Kahn’s Algorithm:

1. Compute In-degrees:
In-degree of A: 0
In-degree of B: 1
In-degree of C: 1
In-degree of D: 2
2. Initialize Queue: [A]
3. Process Queue:
Dequeue A, append to topological order: [A]
Reduce in-degrees of B and C by 1. Enqueue B and C: [B, C]
4. Process B:
Dequeue B, append to topological order: [A, B]
Reduce in-degree of D by 1. D's in-degree becomes 1.
5. Process C:
Dequeue C, append to topological order: [A, B, C]
Reduce in-degree of D by 1. D's in-degree becomes 0. Enqueue D: [D]
6. Process D:
Dequeue D, append to topological order: [A, B, C, D]

Topological Order: [A, B, C, D]

2. Depth-First Search (DFS) Based Algorithm

The DFS-based algorithm uses a depth-first traversal and a stack to determine the topological order.

Steps:

1. Initialize: Create an empty stack to store the topological order.


2. DFS Traversal:
For each vertex, if it has not been visited, perform a DFS traversal.
During DFS, push each vertex onto the stack only after all its neighbors have been visited.
3. Pop from Stack: The stack now contains vertices in topologically sorted order.

Example:

Consider the same DAG:

Steps Using DFS-based Algorithm:


1. DFS Traversal:
Start at vertex A and perform DFS: visit B, then D. D has no unvisited successors, so push D, then push B.
Back at A, visit C. C's only successor, D, is already visited, so push C.
All of A's successors are now finished, so push A. Stack (top to bottom): A, C, B, D.
2. Pop from Stack: Popping the stack yields the vertices in topologically sorted order.

Topological Order: [A, C, B, D] (Note: The exact order depends on the order in which neighbors are
visited; any ordering that respects the edges is valid.)
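A recursive sketch of the DFS-based approach, assuming the same successor-mapping representation as before. Note that it may produce a different, but equally valid, ordering than Kahn's Algorithm for the same graph.

```python
def dfs_topological_sort(graph):
    """DFS-based topological sort; a vertex is pushed only after all of its
    successors are fully explored, then the finish order is reversed."""
    visited = set()
    stack = []

    def visit(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                visit(w)
        stack.append(v)  # v finishes after all its descendants

    for v in graph:
        if v not in visited:
            visit(v)
    return stack[::-1]  # reversed finish order = topological order

# The DAG from the example: A → B, A → C, B → D, C → D
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(dfs_topological_sort(graph))  # → ['A', 'C', 'B', 'D'] (one valid order)
```

Here B is explored before C, so B finishes later and C ends up earlier in the output; both [A, C, B, D] and [A, B, C, D] respect every edge.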

Summary

Topological Ordering: Linear ordering of vertices in a DAG where for every directed edge u → v,
vertex u precedes vertex v.
Algorithms: Kahn’s Algorithm and DFS-based Algorithm.
Applications: Scheduling, dependency resolution, build systems.

The choice of algorithm depends on the specific requirements and constraints of the problem. Both
Kahn’s Algorithm and the DFS-based Algorithm are effective for finding a topological ordering of a
DAG.
Applications of Topological Ordering

Overview

Topological ordering of a Directed Acyclic Graph (DAG) is a powerful concept with diverse
applications in various fields. It provides a way to order tasks or entities in a manner that respects
their dependencies, ensuring that each task is performed before any dependent tasks. Below are
some notable applications of topological ordering:

1. Task Scheduling

Description

In project management and computer systems, certain tasks must be completed before others can
begin. Topological ordering helps schedule these tasks efficiently.

Example

Consider a project with the following tasks and dependencies:

Task A must be completed before Task B and Task C.
Task B must be completed before Task D.
Task C must be completed before Task D.

A topological ordering of the tasks will help determine a sequence to complete all tasks respecting
their dependencies.

Topological Order Example:

A, B, C, D

Here, A is scheduled first, followed by B and C, and finally D, which depends on B and C.

2. Course Prerequisite Scheduling

Description

In academic institutions, courses often have prerequisites. Topological ordering can determine a
valid sequence of courses that students should follow to satisfy all prerequisites.

Example

Suppose the following courses and prerequisites:


Course 1 must be completed before Course 2 and Course 3.
Course 2 must be completed before Course 4.
Course 3 must be completed before Course 4.

A topological ordering will provide a sequence in which students can take the courses to meet all
prerequisites.

Topological Order Example:

1, 2, 3, 4

Here, Course 1 is taken first, followed by Courses 2 and 3, and finally Course 4.

3. Build Systems

Description

In software engineering, build systems often involve dependencies between different modules or
components. Topological ordering ensures that each module is built only after all its dependencies
are resolved.

Example

Consider a build system with the following dependencies:

Module A depends on Module B and Module C.
Module B depends on Module D.
Module C depends on Module D.

A topological ordering will help determine the order in which modules should be built.

Topological Order Example:

D, B, C, A

Here, Module D is built first, followed by Module B and Module C, and finally Module A.

4. Resolving Dependencies in Package Managers

Description

Package managers in software systems often need to resolve dependencies between packages.
Topological ordering ensures that packages are installed in the correct sequence to satisfy all
dependencies.
Example

Suppose a package manager needs to install the following packages and their dependencies:

Package A depends on Package B and Package C.
Package B depends on Package D.
Package C depends on Package D.

A topological ordering will provide the sequence in which packages should be installed.

Topological Order Example:

D, B, C, A

Here, Package D is installed first, followed by Package B and Package C, and finally Package A.
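Python's standard library ships a topological sorter for exactly this kind of dependency resolution: graphlib, available since Python 3.9. A sketch using the package example above, where the mapping gives each package's dependencies (its predecessors):

```python
from graphlib import TopologicalSorter

# Each package maps to the set of packages it depends on.
dependencies = {
    'A': {'B', 'C'},   # A depends on B and C
    'B': {'D'},        # B depends on D
    'C': {'D'},        # C depends on D
}

ts = TopologicalSorter(dependencies)
install_order = list(ts.static_order())
print(install_order)  # D first, A last; B and C may appear in either order
```

static_order() raises graphlib.CycleError if the dependency graph contains a cycle, which is how a package manager would detect an unsatisfiable dependency set.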

5. Instruction Scheduling in Compilers

Description

Compilers often need to schedule instructions for execution on a processor, considering dependencies between instructions. Topological ordering helps in scheduling instructions to optimize execution time and avoid hazards.

Example

Consider a set of instructions where:

Instruction A must be executed before Instruction B and Instruction C.
Instruction B must be executed before Instruction D.
Instruction C must be executed before Instruction D.

A topological ordering will determine the sequence of instruction execution.

Topological Order Example:

A, B, C, D

Here, Instruction A is executed first, followed by Instructions B and C, and finally Instruction D.
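A small helper can verify that a proposed schedule respects every dependency. This checker is an illustration rather than part of any real compiler; the edge list mirrors the instruction dependencies above.

```python
def is_valid_topological_order(order, edges):
    """Check that every edge (u, v) has u appearing before v in `order`."""
    position = {v: i for i, v in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)

# Instruction dependencies from the example:
# A before B and C; B and C before D.
edges = [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')]

print(is_valid_topological_order(['A', 'B', 'C', 'D'], edges))  # → True
print(is_valid_topological_order(['B', 'A', 'C', 'D'], edges))  # → False
```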

Summary
Topological ordering is a versatile concept used in various applications to manage dependencies
and schedule tasks effectively. Its applications span from project management and academic course
scheduling to build systems, package managers, and compiler instruction scheduling. By ensuring
that all dependencies are respected, topological ordering helps in optimizing processes and avoiding
conflicts.
Shortest Paths

Overview

The problem of finding the shortest path between two nodes in a graph is fundamental in various
fields, including network routing, geographical mapping, and logistics. Algorithms designed to solve
this problem efficiently are crucial for optimizing routes and minimizing costs.

In graph theory, the shortest path problem involves finding a path between two nodes such that the
sum of the edge weights is minimized. This problem can be solved using different algorithms
depending on the graph's characteristics, such as whether the graph has negative weights or is
weighted/unweighted.

1. Dijkstra's Algorithm

Description

Dijkstra's Algorithm finds the shortest path from a source node to all other nodes in a graph with
non-negative weights. It uses a priority queue to systematically explore the shortest known paths and
update the shortest paths to neighboring nodes.

Algorithm Steps

1. Initialize distances from the source to all other nodes as infinity, except for the source node,
which is set to 0.
2. Use a priority queue to repeatedly extract the node with the smallest distance.
3. For the current node, update the distances to its neighbors if a shorter path is found.
4. Repeat until all nodes have been processed.

Example

Consider a graph with nodes A, B, C, and D, and the following edges with weights:

A → B (2)
A → C (4)
B → C (1)
B → D (7)
C → D (3)

Steps:

1. Initialize distances:
Distance(A) = 0
Distance(B) = ∞
Distance(C) = ∞
Distance(D) = ∞
2. Process node A:
Update Distance(B) = 2
Update Distance(C) = 4
3. Process node B:
Update Distance(C) = min(4, 2 + 1) = 3
Update Distance(D) = 2 + 7 = 9
4. Process node C:
Update Distance(D) = min(9, 3 + 3) = 6
5. Process node D:
No updates needed.

Resulting Shortest Distances:

Distance(A) = 0
Distance(B) = 2
Distance(C) = 3
Distance(D) = 6

Python Code Example

import heapq

def dijkstra(graph, start):
    # Initialize distances and priority queue
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    priority_queue = [(0, start)]

    while priority_queue:
        current_distance, current_node = heapq.heappop(priority_queue)

        # Skip stale queue entries
        if current_distance > distances[current_node]:
            continue

        for neighbor, weight in graph[current_node].items():
            distance = current_distance + weight
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(priority_queue, (distance, neighbor))

    return distances

# Example graph representation
graph = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': 1, 'D': 7},
    'C': {'D': 3},
    'D': {}
}

print(dijkstra(graph, 'A'))  # {'A': 0, 'B': 2, 'C': 3, 'D': 6}

2. Bellman-Ford Algorithm

Description

The Bellman-Ford Algorithm finds the shortest paths from a source node to all other nodes in a
graph, including graphs with negative edge weights. It can also detect negative weight cycles in the
graph.

Algorithm Steps

1. Initialize distances from the source to all other nodes as infinity, except for the source node,
which is set to 0.
2. Relax all edges V − 1 times, where V is the number of vertices.
3. Check for negative weight cycles by verifying if any distance can still be updated.

Example

Consider the following graph with possible negative weights:

A → B (2)
A → C (4)
B → C (-1)
B → D (3)
C → D (2)

Steps:

1. Initialize distances:
Distance(A) = 0
Distance(B) = ∞
Distance(C) = ∞
Distance(D) = ∞
2. Relax edges multiple times:

After 1st iteration:
Distance(B) = 2
Distance(C) = 4
Distance(D) = ∞

After 2nd iteration:
Distance(C) = 1 (updated from B)
Distance(D) = 3 (updated from C)

After 3rd iteration:
No further updates needed.
3. Check for negative weight cycles:
No negative weight cycles found.

Resulting Shortest Distances:

Distance(A) = 0
Distance(B) = 2
Distance(C) = 1
Distance(D) = 3

Python Code Example

def bellman_ford(graph, start):
    # Initialize distances
    distances = {node: float('inf') for node in graph}
    distances[start] = 0

    # Relax all edges |V| - 1 times
    for _ in range(len(graph) - 1):
        for u in graph:
            for v, weight in graph[u].items():
                if distances[u] + weight < distances[v]:
                    distances[v] = distances[u] + weight

    # Check for negative weight cycles
    for u in graph:
        for v, weight in graph[u].items():
            if distances[u] + weight < distances[v]:
                raise ValueError("Graph contains a negative weight cycle")

    return distances

# Example graph representation
graph = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': -1, 'D': 3},
    'C': {'D': 2},
    'D': {}
}

print(bellman_ford(graph, 'A'))  # {'A': 0, 'B': 2, 'C': 1, 'D': 3}

3. Floyd-Warshall Algorithm

Description

The Floyd-Warshall Algorithm finds shortest paths between all pairs of nodes in a graph. It can
handle graphs with negative weights and is particularly useful for dense graphs.

Algorithm Steps

1. Initialize a distance matrix where the entry at (i, j) represents the weight of the edge from
node i to node j. Set diagonal entries to 0.
2. Update the distance matrix by considering all possible paths through intermediate nodes.
3. After processing, the matrix contains the shortest path distances between all pairs of nodes.

Example

Consider the following graph:

A → B (2)
A → C (4)
B → C (1)
B → D (7)
C → D (3)

Steps:

1. Initialize distance matrix:

      A    B    C    D
A     0    2    4    ∞
B     ∞    0    1    7
C     ∞    ∞    0    3
D     ∞    ∞    ∞    0
2. Update matrix considering intermediate nodes:

After processing, the distance from A to C is updated to 3 (via B), the distance from B to D to 4 (via C), and the distance from A to D to 6 (via B and then C).
Resulting Distance Matrix:

      A    B    C    D
A     0    2    3    6
B     ∞    0    1    4
C     ∞    ∞    0    3
D     ∞    ∞    ∞    0

Python Code Example

def floyd_warshall(graph):
    nodes = list(graph.keys())

    # Initialize distance matrix
    dist = {u: {v: float('inf') for v in nodes} for u in nodes}
    for node in nodes:
        dist[node][node] = 0
    for u in graph:
        for v, weight in graph[u].items():
            dist[u][v] = weight

    # Update distances through each intermediate node k
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][j] > dist[i][k] + dist[k][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    return dist

# Example graph representation
graph = {
    'A': {'B': 2, 'C': 4},
    'B': {'C': 1, 'D': 7},
    'C': {'D': 3},
    'D': {}
}

print(floyd_warshall(graph))

Summary

Dijkstra's Algorithm: Efficient for finding shortest paths from a single source in graphs with
non-negative weights.
Bellman-Ford Algorithm: Handles graphs with negative weights and detects negative weight
cycles.
Floyd-Warshall Algorithm: Computes shortest paths between all pairs of nodes, useful for
dense graphs.

These algorithms are essential tools in computer science for solving various pathfinding and
optimization problems.
Minimum Spanning Trees

Overview

A Minimum Spanning Tree (MST) of a graph is a subset of the edges that connects all the vertices
together, without any cycles, and with the minimum possible total edge weight. MSTs are useful in
network design, clustering, and various optimization problems where a minimal connection cost is
required.

1. Kruskal's Algorithm

Description

Kruskal's Algorithm is a popular algorithm for finding the Minimum Spanning Tree of a graph. It is a
greedy algorithm that works by sorting all the edges in the graph by their weight and then adding
them one by one to the MST, ensuring that no cycles are formed.

Algorithm Steps

1. Sort all edges: Arrange all edges in ascending order of their weights.
2. Initialize MST: Create an empty set for the MST.
3. Add edges: Iterate through the sorted edges and add each edge to the MST if it does not form
a cycle. Use a union-find data structure to detect cycles.
4. Repeat: Continue until the MST contains V − 1 edges, where V is the number of vertices in
the graph.

Example

Consider a graph with the following edges:

A-B (1)
A-C (3)
B-C (3)
B-D (6)
C-D (4)
C-E (2)
D-E (5)

Steps:

1. Sort edges by weight:
A-B (1)
C-E (2)
A-C (3)
B-C (3)
C-D (4)
D-E (5)
B-D (6)
2. Build MST (processing edges in sorted order):
Add A-B (1)
Add C-E (2)
Add A-C (3)
Skip B-C (3) (would form a cycle)
Add C-D (4)
Skip D-E (5) (would form a cycle)
Skip B-D (6) (would form a cycle)

Resulting MST edges:

A-B (1)
C-E (2)
A-C (3)
C-D (4)

Total weight = 10

Python Code Example

class UnionFind:
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size

    def find(self, u):
        if self.parent[u] != u:
            self.parent[u] = self.find(self.parent[u])  # path compression
        return self.parent[u]

    def union(self, u, v):
        root_u = self.find(u)
        root_v = self.find(v)
        if root_u != root_v:
            # Union by rank
            if self.rank[root_u] > self.rank[root_v]:
                self.parent[root_v] = root_u
            elif self.rank[root_u] < self.rank[root_v]:
                self.parent[root_u] = root_v
            else:
                self.parent[root_v] = root_u
                self.rank[root_u] += 1

def kruskal(graph, vertices):
    edges = sorted(graph, key=lambda x: x[2])
    uf = UnionFind(vertices)
    mst = []

    for u, v, weight in edges:
        if uf.find(u) != uf.find(v):
            uf.union(u, v)
            mst.append((u, v, weight))

    return mst

# Example graph as list of edges (u, v, weight)
graph = [
    (0, 1, 1),
    (0, 2, 3),
    (1, 2, 3),
    (1, 3, 6),
    (2, 3, 4),
    (2, 4, 2),
    (3, 4, 5)
]

vertices = 5  # Number of vertices

mst = kruskal(graph, vertices)
print("Edges in the Minimum Spanning Tree:", mst)

2. Prim's Algorithm

Description

Prim's Algorithm is another greedy algorithm used to find the Minimum Spanning Tree of a graph. It
grows the MST one edge at a time, starting from an arbitrary vertex and expanding to the closest
vertex not yet in the MST.

Algorithm Steps

1. Initialize MST: Start with an arbitrary node and mark it as part of the MST.
2. Grow MST: Add the minimum weight edge connecting a node in the MST to a node outside the
MST.
3. Repeat: Continue until all nodes are included in the MST.

Example

Consider the same graph as above.

Steps:

1. Start with an arbitrary node, say A (node 0):
MST = {A}
Candidate edges: A-B (1), A-C (3)
2. Add the smallest candidate edge, A-B (1):
MST = {A, B}
Candidate edges: A-C (3), B-C (3), B-D (6)
3. Add the smallest candidate edge, A-C (3):
MST = {A, B, C}
Candidate edges: C-E (2), C-D (4), B-D (6)
4. Add the smallest candidate edge, C-E (2):
MST = {A, B, C, E}
Candidate edges: C-D (4), D-E (5), B-D (6)
5. Add the smallest candidate edge, C-D (4):
MST = {A, B, C, D, E}

Resulting MST edges:

A-B (1)
C-E (2)
C-D (4)
A-C (3)

Total weight = 10

Python Code Example

import heapq

def prim(graph, start, vertices):
    mst = []  # edges (u, v, weight) in the MST
    visited = set()
    # Heap entries: (weight, vertex, parent); the start vertex has no parent
    min_heap = [(0, start, None)]

    while min_heap and len(visited) < vertices:
        weight, u, parent = heapq.heappop(min_heap)

        if u in visited:
            continue

        visited.add(u)
        if parent is not None:
            mst.append((parent, u, weight))

        for v, w in graph[u]:
            if v not in visited:
                heapq.heappush(min_heap, (w, v, u))

    return mst

# Example graph as adjacency list: {node: [(neighbor, weight), ...]}
graph = {
    0: [(1, 1), (2, 3)],
    1: [(0, 1), (2, 3), (3, 6)],
    2: [(0, 3), (1, 3), (3, 4), (4, 2)],
    3: [(1, 6), (2, 4), (4, 5)],
    4: [(2, 2), (3, 5)]
}

vertices = 5  # Number of vertices

mst = prim(graph, 0, vertices)
print("Edges in the Minimum Spanning Tree:", mst)

Summary

Kruskal's Algorithm: Sorts edges and uses a union-find data structure to build the MST,
suitable for sparse graphs.
Prim's Algorithm: Expands the MST from a starting vertex using a priority queue, suitable for
dense graphs.

Both algorithms are essential for solving MST problems and have applications in network design,
circuit design, and clustering.
