CUDA

Posts about CUDA

CUDA C - Tutorials and other resources
A collection of CUDA C tutorials and other useful resources (continuously updated).

Memories from CUDA - Constant Memory (I)
We introduce constant memory and explain how it can be accessed from the device through a comprehensive step-by-step example.

Memories from CUDA - Symbol Addresses (II)
In this post we focus on how to use cudaGetSymbolAddress to get the address of a device variable (either a __constant__ or a __device__ variable).

Memories from CUDA - Pinned Memory (III)
In this post we demonstrate some of the features of pinned memory (a.k.a. page-locked memory), which is host memory accessible from the device.

Static allocation of __device__ variables
This is a brief post on how to allocate device memory statically using the __device__ keyword.

Matrix-vector multiplication in CUDA (I)
An introduction to shared memory through an example: matrix-vector multiplication in CUDA. Along the way we make use of templates, a very powerful feature of C++.

Matrix-vector multiplication in CUDA (II)
Benchmarking of a custom CUDA C kernel which, with proper tuning, may outperform cuBLAS's sgemv!

CUDA pointers to pointers
A brief article about pointers to pointers in CUDA C: how to pass a set of arrays to a CUDA kernel.
