This is my 🔥 100 Days of GPU — a wild, hands-on journey through CUDA/CUTLASS kernels, Triton spells, and PTX sorcery.
Updated Nov 2, 2025 - HTML
Profiling with NVIDIA Nsight Tools Bootcamp
References content from the OLCF CUDA Training Series (https://github.com/olcf/cuda-training-series).
CUDA Samples and Nsight Guided Profiling Samples
University project for the "Computer Architecture" course (MSc Computer Engineering @ University of Pisa). Implementation of a parallelized nearest-neighbor upscaler using CUDA.
Julia tools for NVIDIA Nsight Compute
This project demonstrates the integration of a CUDA kernel within an NVIDIA Holoscan application. It consists of two custom operators: one for memory allocation and data initialization, and another for executing the CUDA kernel. The application was profiled with Nsight Systems and the kernel with Nsight Compute.
A comprehensive, hardware-agnostic GPU benchmarking suite that compares CUDA, OpenCL, and DirectCompute performance using identical workloads. Built from scratch with a professional architecture, extensive documentation, and a production-ready GUI.
Roofline profiling for Deep Learning models
libHPC