GPU Function in C++
In this note, I cover the basic usage of CUDA (C++) for accelerating computation. It is a note from my own CUDA learning.
All the code can be found in the official NVIDIA lab course at learn.nvidia.com
📝Heterogeneous Systems
In a modern accelerated system, the CPU dispatches computing tasks to the GPU, the GPU then runs them (the CPU can keep working while the GPU runs), and finally the CPU collects all the results and outputs them.
(I will cite a book here one day.)
Difference between code on CPU and GPU
Here is a code segment from a .cu file:
Then we run a simple .cu file.
We can use nvcc to compile it:
-arch specifies the GPU architecture to compile for (sm_70 is the value used in the NVIDIA learning lab).
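The pieces described in this section can be sketched as one small .cu file. This is a minimal sketch, not the course's exact listing: the file name hello-gpu.cu and the function names CPUFunction and GPUFunction are assumptions for illustration, and the compile command in the comment assumes a device that supports sm_70.

```cuda
// hello-gpu.cu (hypothetical file name)
// Compile and run in one step, assuming an sm_70 device as in the text:
//   nvcc -arch=sm_70 -o hello-gpu hello-gpu.cu -run
#include <cstdio>

// Ordinary C++ function: runs on the CPU (host).
void CPUFunction() {
    printf("This function runs on the CPU.\n");
}

// __global__ marks a kernel: it runs on the GPU (device)
// and is launched from host code.
__global__ void GPUFunction() {
    printf("This function runs on the GPU.\n");
}

int main() {
    CPUFunction();

    // <<<number of blocks, threads per block>>>: 1 block with 1 thread here.
    GPUFunction<<<1, 1>>>();

    // Kernel launches are asynchronous, so wait for the GPU to finish
    // before the program exits; otherwise its output may never appear.
    cudaDeviceSynchronize();
    return 0;
}
```

The only visible differences from plain C++ are the __global__ qualifier, the <<<...>>> launch syntax, and the explicit synchronization at the end.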
Running kernel functions in parallel
Figure (from an official NVIDIA slide): a grid of 2 blocks with 4 threads each.
Every block has the same number of threads. In the picture above, there are 2 blocks with 4 threads each.
All threads execute the kernel function (which we called a “GPU function” earlier) at the same time.
However, this has some consequences that come from how the GPU is physically implemented: for now, the order of the output cannot be controlled. I may cover this in a future note.
Note that for the condition (threadIdx.x == 1023 && blockIdx.x == 255) to be satisfied, we launch with <<<256, 1024>>>, because indices start from 0: 256 blocks give blockIdx.x values 0 to 255, and 1024 threads per block give threadIdx.x values 0 to 1023.
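A minimal sketch of that launch configuration follows. The kernel name lastThreadReports is hypothetical; only the condition and the <<<256, 1024>>> configuration come from the text.

```cuda
#include <cstdio>

// Every one of the 256 * 1024 threads runs this kernel,
// but only the last thread of the last block prints.
__global__ void lastThreadReports() {
    if (threadIdx.x == 1023 && blockIdx.x == 255) {
        printf("Success: reached thread 1023 of block 255.\n");
    }
}

int main() {
    // 256 blocks  -> blockIdx.x  ranges over 0..255
    // 1024 threads -> threadIdx.x ranges over 0..1023
    lastThreadReports<<<256, 1024>>>();

    // Wait for the kernel to finish so the message is printed.
    cudaDeviceSynchronize();
    return 0;
}
```

With a smaller configuration such as <<<255, 1024>>>, no thread would ever have blockIdx.x == 255, so nothing would be printed.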
Accelerating a 'for' loop
Here we achieve parallel acceleration by replacing the loop iteration variable with threadIdx.x: each thread handles one iteration.
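As a sketch of this idea, the CPU loop and its single-block GPU counterpart can be put side by side. The function names loopOnCPU and loopOnGPU and the iteration count of 10 are assumptions for illustration.

```cuda
#include <cstdio>

// CPU version: a single thread walks through all iterations in order.
void loopOnCPU(int n) {
    for (int i = 0; i < n; ++i) {
        printf("CPU iteration %d\n", i);
    }
}

// GPU version: there is no loop. Each thread performs one iteration,
// and threadIdx.x plays the role of the loop variable i.
__global__ void loopOnGPU() {
    printf("GPU iteration %d\n", threadIdx.x);
}

int main() {
    const int n = 10;

    loopOnCPU(n);

    // One block with n threads: threadIdx.x takes the values 0..n-1.
    loopOnGPU<<<1, n>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

This single-block form only works while n does not exceed the maximum number of threads per block (1024 on current NVIDIA GPUs).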
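What if we want to map a vector (such as the integers 0 to 7) onto several blocks (such as 2 blocks with 4 threads each)?
In our example, blockDim.x = 4, so each thread computes its global index as threadIdx.x + blockIdx.x * blockDim.x.
For instance, integer 6 = 2 + 1 * 4, that is, threadIdx.x = 2 in the block with blockIdx.x = 1.
When this is run, the order of the output is a mess.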
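The index mapping can be sketched as follows, using the same 2 blocks of 4 threads. The kernel name printGlobalIndex is hypothetical.

```cuda
#include <cstdio>

__global__ void printGlobalIndex() {
    // blockDim.x is the number of threads per block (4 in this launch).
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    printf("block %d, thread %d -> integer %d\n",
           blockIdx.x, threadIdx.x, i);
}

int main() {
    // 2 blocks * 4 threads cover the integers 0..7 exactly once.
    printGlobalIndex<<<2, 4>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Every integer from 0 to 7 appears exactly once, but the order in which the lines are printed is not guaranteed.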
Memory Allocation and Deallocation
To get a pointer that both the CPU and the GPU can use, we simply replace malloc and free with cudaMallocManaged and cudaFree.
Example: double each integer in an int array.
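A minimal sketch of the doubling example, assuming an array of 8 integers processed by 2 blocks of 4 threads. The kernel name doubleElements and the sizes are assumptions for illustration, and error checking is reduced to the allocation call.

```cuda
#include <cstdio>

// Each thread doubles exactly one element of the array.
__global__ void doubleElements(int *a) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    a[i] *= 2;
}

int main() {
    const int n = 8;
    const size_t size = n * sizeof(int);
    int *a = nullptr;

    // Replaces malloc: the memory is accessible from both host and device.
    if (cudaMallocManaged(&a, size) != cudaSuccess) {
        printf("cudaMallocManaged failed\n");
        return 1;
    }

    // Initialize on the CPU through the same pointer.
    for (int i = 0; i < n; ++i) a[i] = i;

    // 2 blocks * 4 threads = 8 threads, one per element,
    // so no bounds check is needed in this particular launch.
    doubleElements<<<2, 4>>>(a);

    // Wait for the GPU before reading the results on the CPU.
    cudaDeviceSynchronize();

    for (int i = 0; i < n; ++i) printf("%d ", a[i]);
    printf("\n");  // prints: 0 2 4 6 8 10 12 14

    // Replaces free.
    cudaFree(a);
    return 0;
}
```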
What if the number of elements in the vector is smaller than the total number of threads?
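One common way to handle this is to pass the element count to the kernel and let the surplus threads do nothing. The sketch below assumes 1000 elements and 256 threads per block; the kernel name initWith and these sizes are hypothetical.

```cuda
#include <cstdio>

__global__ void initWith(int *a, int n, int value) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    // Threads whose index falls past the end of the array skip the write,
    // which prevents out-of-bounds memory access.
    if (i < n) {
        a[i] = value;
    }
}

int main() {
    const int n = 1000;
    const int threadsPerBlock = 256;
    // Round up so that every element gets a thread:
    // (1000 + 255) / 256 = 4 blocks, i.e. 1024 threads for 1000 elements.
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;

    int *a = nullptr;
    if (cudaMallocManaged(&a, n * sizeof(int)) != cudaSuccess) {
        printf("cudaMallocManaged failed\n");
        return 1;
    }

    initWith<<<blocks, threadsPerBlock>>>(a, n, 7);
    cudaDeviceSynchronize();

    // Verify on the CPU that every element was written.
    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (a[i] != 7) ok = false;
    }
    printf(ok ? "all elements set\n" : "mismatch found\n");

    cudaFree(a);
    return 0;
}
```

Here the last 24 threads of the final block fail the i < n test and simply return.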