General-purpose programming interfaces such as CUDA make the immense speed of graphics cards available for a multitude of parallel tasks. Most image analysis algorithms work independently on different regions of an image. They are therefore inherently parallel and can profit greatly from parallel hardware.
Speedup factors on the order of two orders of magnitude make it possible to process and extract information from huge datasets, for example the images of the ImageNet Large Scale Visual Recognition Challenge. Experiments with learning algorithms also take drastically less time.
In the lab, you learn how to implement learning algorithms from the area of visual pattern recognition and accelerate them using the CUDA C++ extension. The lab is split into two parts. In the first part, you acquire knowledge of CUDA by programming and accelerating simple algorithms using parallel programming. In the second part, you implement learning algorithms with the help of an existing CUDA library.

Prerequisites