Last updated on: a year ago

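On the principle that anyone starting out in deep learning should understand the fundamentals, Dave sorted out how GPUs and deep learning relate to each other.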


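A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory in order to accelerate the creation of images in a frame buffer for output to a display device. GPUs are used in microcontrollers and embedded systems, mobile phones, PCs, and game consoles. Modern GPUs are very efficient at computer graphics and image processing. Their highly parallel structure makes them better suited than traditional central processing units (CPUs) to processing large blocks of data in parallel.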


GTX 280


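  • A very large number of cores, which makes parallel computation easier
  • High memory bandwidth
  • Can handle larger datasets
  • Harder to optimize for than a CPU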

Unsupervised learning on GPUs


CPU Memory Usage

The DataLoader reads data too slowly; consider a more efficient data loader.

Python lists store only references to their objects; the objects themselves are kept separately in memory. Every object has a refcount, so every item in a list has its own refcount.
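The refcount behaviour described above can be observed directly. The following is a minimal sketch, assuming CPython (where `sys.getrefcount` is available); the `filenames` list is a hypothetical stand-in for whatever a dataset keeps in memory.

```python
import sys

# Hypothetical stand-in for a list of filenames held by a dataset.
filenames = ["img_%05d.png" % i for i in range(3)]

# Each element is a separate object with its own reference count.
# (sys.getrefcount counts one extra reference for its own argument.)
before = sys.getrefcount(filenames[0])

# Binding an item to another name updates the refcount stored in the
# object itself, even though the string's contents are never modified.
item = filenames[0]
after = sys.getrefcount(filenames[0])

print("refcount before:", before, "after:", after)
```

The point of the sketch is that simply reading an item writes to memory that belongs to that item.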

Replace the lists / dicts used in your Dataset's `__getitem__` with NumPy arrays.
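One possible way to follow this advice for a list of strings is sketched below. It assumes the strings can be packed into a single byte buffer plus an array of offsets; the class name `PackedStrings` is hypothetical and not part of PyTorch or NumPy.

```python
import numpy as np


class PackedStrings:
    """Hypothetical helper: stores many strings in two NumPy arrays."""

    def __init__(self, strings):
        encoded = [s.encode("utf-8") for s in strings]
        # One contiguous byte buffer instead of one Python object per string.
        self._data = np.frombuffer(b"".join(encoded), dtype=np.uint8)
        # offsets[i]:offsets[i + 1] delimits the i-th string in the buffer.
        lengths = np.array([len(e) for e in encoded], dtype=np.int64)
        self._offsets = np.concatenate(([0], np.cumsum(lengths)))

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, index):
        start, end = self._offsets[index], self._offsets[index + 1]
        # A fresh str is built on demand; no long-lived per-item object exists.
        return self._data[start:end].tobytes().decode("utf-8")


filenames = PackedStrings(["cat_001.jpg", "dog_002.jpg", "bird_003.jpg"])
print(len(filenames), filenames[1])
```

A Dataset's `__getitem__` could then index `filenames[i]` much as it would index a plain list (non-negative indices only in this sketch), while the underlying storage consists of two array objects rather than one Python object per filename.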

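If your DataLoader workers iterate over a list of filenames, every access changes the refcounts of the list's items, so the memory holding them is gradually copied into each worker process and memory use grows over time.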

Strictly speaking, this is not a memory leak but a copy-on-access problem in forked Python processes, caused by changing refcounts. It is not a PyTorch issue either; it is simply a consequence of how Python is structured.

Deep Learning Concepts

A multi-layer neural network

The chain rule of derivatives

Forward pass

Backward pass
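The forward pass, the backward pass and the chain rule listed above can be illustrated with a minimal sketch. It assumes a tiny scalar network with a single sigmoid hidden unit, a linear output and a squared-error loss; these choices, and all the numbers used, are made up for illustration and are not taken from [3].

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, w2):
    """Forward pass: input -> hidden unit -> output."""
    z = w1 * x      # pre-activation of the hidden unit
    h = sigmoid(z)  # hidden activation
    y = w2 * h      # linear output unit
    return z, h, y


def backward(x, target, w1, w2):
    """Backward pass: chain rule from the loss back to each weight."""
    z, h, y = forward(x, w1, w2)
    loss = 0.5 * (y - target) ** 2
    dloss_dy = y - target                # dL/dy
    dloss_dw2 = dloss_dy * h             # dL/dw2 = dL/dy * dy/dw2
    dloss_dh = dloss_dy * w2             # dL/dh  = dL/dy * dy/dh
    dloss_dz = dloss_dh * h * (1.0 - h)  # dL/dz  = dL/dh * sigmoid'(z)
    dloss_dw1 = dloss_dz * x             # dL/dw1 = dL/dz * dz/dw1
    return loss, dloss_dw1, dloss_dw2


loss, g1, g2 = backward(x=1.5, target=0.3, w1=0.8, w2=-0.4)
print("loss:", loss, "dL/dw1:", g1, "dL/dw2:", g2)
```

Each line of `backward` multiplies the gradient arriving from the layer above by one local derivative, which is the chain rule applied step by step.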


[1] Graphics processing unit

[2] What is a GPU and do you need one in Deep Learning?

[3] LeCun, Y., Bengio, Y. and Hinton, G., 2015. Deep learning. Nature, 521(7553), pp.436-444.

[4] Raina, R., Madhavan, A. and Ng, A.Y., 2009, June. Large-scale deep unsupervised learning using graphics processors. In Proceedings of the 26th annual international conference on machine learning (pp. 873-880).

[5] DataLoader num_workers > 0 causes CPU memory from parent process to be replicated in all worker processes