Friday 17 October 2014

CUDA C - Tutorials and other resources

This is a collection of tutorials, blogs, articles and other resources for CUDA C that I hope you'll find useful. If you know of something that can help people learn CUDA and optimise their code, feel free to suggest it in a comment. I plan to keep this post updated, so stay tuned. Most of the links here point to free resources.








1. GENERAL LEARNING RESOURCES

 

1.1. Official resources and APIs

  1. CUDA C Programming Guide (pdf)
  2. CUDA Runtime API (pdf)
  3. CUDA Best Practices Guide (pdf)
  4. CUDA Driver API

1.2. Blogs

  1. CUDA Programming
  2. Parallel Forall, and a list of its posts on Disqus
  3. Solarian Programmer

1.3. Slides and courseware

  1. Official slides
  2. Qwiklab's CUDA lab (among other interesting tutorials)
  3. An excellent collection of course material by the University of Illinois
  4. Yong Cao's (Virginia Tech) presentation of the CUDA programming model
  5. A very good collection of lecture notes on the Stanford University website
  6. More slides with a very good tutorial on reduction and scan
  7. Rice University lecture notes


1.4. Books

  1. CUDA Application Design and Development by Rob Farber, which is an excellent book
  2. CUDA by Example by Jason Sanders and Edward Kandrot
  3. The CUDA Handbook by Nicholas Wilt
  4. More suggestions by NVIDIA can be found here


2. CUDA ACCELERATED LIBRARIES


A collection of general-purpose libraries and links to the official websites:
  1. ArrayFire
  2. cuBLAS (also the User Guide in PDF)
  3. MAGMA
  4. Thrust


3. SPECIAL TOPICS


3.1. Thrust

  1. First steps
  2. Thrust wiki on GitHub - a good reference

3.2. Matrix multiplication

  1. Optimized matrix-matrix multiplication kernel
  2. Bank conflicts and more

3.3. Advanced Topics

  1. Optimising your CUDA code: performance assessment, memory, instructions, CUDA and concurrent execution, and more, on this web page of the Institute for CyberScience at Penn State University.
  2. M. Harris, S. Sengupta and J.D. Owens, "Parallel Prefix Sum (Scan) with CUDA," GPU Gems 3, by NVIDIA. 
 More is coming soon - stay tuned!
