
Notes Overview

Check the notes. They cover three levels:

  • Graph-level (e.g., operator fusion, kernel scheduling, memory planning)
  • Kernel-level (e.g., CUDA, Triton, custom operators for specialized hardware)
  • System-level (e.g., distributed training across GPUs/TPUs, inference serving at scale)
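As a rough illustration of the graph-level item, operator fusion can be sketched in plain Python. This is a toy model, not how any real compiler implements it: the names `run_unfused`, `fuse`, and `run_fused` are hypothetical, and a Python list stands in for a tensor. The unfused version makes one full pass over the data per op and materializes an intermediate each time; the fused version composes the ops and makes a single pass.

```python
from typing import Callable, List, Sequence

# Hypothetical type alias: an elementwise op maps one scalar to one scalar.
ElementwiseOp = Callable[[float], float]


def run_unfused(ops: Sequence[ElementwiseOp], xs: Sequence[float]) -> List[float]:
    """Apply each op in its own pass, building one intermediate list per op."""
    data = list(xs)
    for op in ops:
        data = [op(x) for x in data]  # separate "kernel launch" per op
    return data


def fuse(ops: Sequence[ElementwiseOp]) -> ElementwiseOp:
    """Compose a chain of elementwise ops into a single function."""
    def fused(x: float) -> float:
        for op in ops:
            x = op(x)
        return x
    return fused


def run_fused(ops: Sequence[ElementwiseOp], xs: Sequence[float]) -> List[float]:
    """Apply the whole chain in one pass, with no intermediate lists."""
    kernel = fuse(ops)
    return [kernel(x) for x in xs]


# Toy chain: scale, shift, then ReLU.
ops = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: max(x, 0.0)]
inputs = [-2.0, -0.5, 0.0, 1.5]

print(run_unfused(ops, inputs))
print(run_fused(ops, inputs))
```

Both calls produce the same values; what changes is how many passes over the data are made. On a GPU the analogous saving is fewer kernel launches and fewer round trips to global memory.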

Projects

  • PyTorch Compiler (torch.fx, TorchInductor, IR Graph, functorch)
  • CUDA programming
  • Triton
  • LeetGPU
  • CUDA Graphs
  • nvFuser
  • Modular (Mojo)