PDC-Assignment 03 1

Uploaded by

Aqib khan

CS-402 Parallel and Distributed Computing

Department of Computer Science, UET Taxila.


Assignment No. 03    Spring-2025    Session: 2021-CS    Marks: 30
PLO-3 Alignment:
This assignment targets PLO-3: Problem Analysis, which requires students to identify, formulate, research the
literature on, and solve complex computing problems. Students are expected to reach substantiated
conclusions using fundamental principles of mathematics, computing sciences, and relevant domain disciplines.

Matrix Multiplication Using CUDA


Objective:
To implement and optimize matrix multiplication using CUDA, and compare its performance with a CPU-based
implementation.
Instructions:
1. Problem Statement:
   • Implement matrix multiplication for two square matrices of size N×N.
2. CUDA Implementation:
   • Write a CUDA program to perform matrix multiplication.
   • Ensure proper memory allocation and management.
   • Optimize the kernel functions for better performance.
3. CPU Implementation:
   • Write a CPU-based program to perform the same matrix multiplication.
   • Use a straightforward nested loop approach.
4. Performance Comparison:
   • Measure the execution time for both CUDA and CPU implementations for different matrix sizes
     (e.g., N = 256, 512, 1024).
   • Record the execution times and calculate the speedup achieved by the CUDA implementation.
5. Analysis:
   • Plot the execution times and speedup against matrix sizes.
   • Discuss the results and explain the observed performance differences.
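
As an illustration of the CPU baseline and the timing step described above, the following is a minimal C++ sketch, not a reference solution. It assumes row-major storage in a flat std::vector<float>, and the function names (matmul_cpu, time_matmul_cpu_ms, speedup) are hypothetical names chosen for this example. The GPU time passed to speedup is assumed to be measured separately, for instance with CUDA events around the kernel launch.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Multiply two N x N row-major matrices with the straightforward triple loop.
// (Hypothetical helper name; the row-major layout is an assumption.)
std::vector<float> matmul_cpu(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t n) {
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Run one CPU multiplication, store the product in `out`, and return the
// elapsed wall-clock time in milliseconds.
double time_matmul_cpu_ms(const std::vector<float>& a,
                          const std::vector<float>& b,
                          std::size_t n,
                          std::vector<float>& out) {
    const auto start = std::chrono::steady_clock::now();
    out = matmul_cpu(a, b, n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Speedup of the GPU version over the CPU version: CPU time / GPU time.
// Both times must be in the same unit.
double speedup(double cpu_ms, double gpu_ms) {
    return cpu_ms / gpu_ms;
}
```

Calling time_matmul_cpu_ms once per matrix size gives the CPU column of the comparison. Averaging several runs per size would likely reduce timing noise, and the returned matrix can also serve to check the CUDA result for correctness.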
Submission Requirements:
• Source code files for both CUDA and CPU implementations.
• A report (2-3 pages) including:
    - Problem statement and approach.
    - Performance comparison results (tables and plots).
    - Analysis and discussion of results.
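
One possible way to prepare the table and plot data for the report is to collect the measurements into CSV text. This is a small C++ sketch under assumptions: the Measurement struct, the format_results_csv function, and the column names are all hypothetical, and the timings must come from your own runs.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One row of the comparison table (hypothetical struct for this sketch).
struct Measurement {
    std::size_t n;   // matrix dimension N
    double cpu_ms;   // measured CPU time in milliseconds
    double gpu_ms;   // measured CUDA time in milliseconds
};

// Build CSV text with one line per matrix size, including the speedup
// computed as cpu_ms / gpu_ms.
std::string format_results_csv(const std::vector<Measurement>& rows) {
    std::ostringstream out;
    out << "N,cpu_ms,gpu_ms,speedup\n";
    for (const Measurement& r : rows) {
        out << r.n << ',' << r.cpu_ms << ',' << r.gpu_ms << ','
            << (r.cpu_ms / r.gpu_ms) << '\n';
    }
    return out.str();
}
```

The resulting text can be written to a file and loaded into a spreadsheet or plotting tool to produce the execution-time and speedup plots.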
