This repository collects GPU kernel practice projects. It is meant for learning by implementing small Triton kernels by hand on RTX 4090 / RTX 5090-class GPUs. The projects are derived from reading ...
of 4.9.2 or later. Detection of out-of-bounds accesses to stack or global variables requires GCC 5.0 or later. This feature consumes about 1/8 of available memory and ...