A preview of CUDA Toolkit 5 is already available to registered developers, and NVIDIA is expected to roll out the production release soon. Besides the usual addition of more image processing functionality, the new toolkit offers some great features, including:
- Dynamic parallelism
- GPUDirect for clusters (RDMA)
- GPU object linking
- NVIDIA Nsight, Eclipse Edition
CUVILib has finally come out of beta. We have added a lot more functionality and made sure that it runs smoothly in mission-critical applications. Its simple API, performance that is orders of magnitude better than competing solutions, and cross-platform support give you a complete imaging package. Before we get into what’s new in version 1.2, here are some useful links worth checking out:
The next release of the CUVI library is due within the next 30 days, and we are pleased to announce that it will include many functions from the Image Enhancement domain. Our filter module just got better and now supports dozens of predefined filters as well as the option to supply your own custom taps and anchor position. One function that I’m particularly excited about in the new release is adjust, which is equivalent to MATLAB’s imadjust function.
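To make the imadjust comparison concrete, here is a minimal CPU sketch in C++ of the intensity mapping that MATLAB documents for imadjust: clip to an input range, normalize, apply gamma, then rescale to an output range. The function name `adjustSketch`, its parameters, and the 16-bit pixel type are assumptions for illustration only. This is not the CUVI API, and it says nothing about how CUVI implements adjust on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of CUVI): remaps 16-bit intensities in the
// style of MATLAB's imadjust. All four limits are fractions in [0, 1].
std::vector<std::uint16_t> adjustSketch(const std::vector<std::uint16_t>& src,
                                        double lowIn, double highIn,
                                        double lowOut, double highOut,
                                        double gamma = 1.0) {
    assert(lowIn < highIn && gamma > 0.0);
    const double maxVal = 65535.0;
    std::vector<std::uint16_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        // Normalize to [0, 1] and clip to the input range.
        double v = src[i] / maxVal;
        v = std::min(std::max(v, lowIn), highIn);
        // Position inside the input range, shaped by gamma.
        double t = std::pow((v - lowIn) / (highIn - lowIn), gamma);
        // Rescale to the output range; lowOut > highOut inverts the image.
        double out = lowOut + (highOut - lowOut) * t;
        out = std::min(std::max(out, 0.0), 1.0);
        dst[i] = static_cast<std::uint16_t>(std::lround(out * maxVal));
    }
    return dst;
}
```

Calling `adjustSketch(pixels, 0.25, 0.75, 0.0, 1.0)` stretches the middle half of the intensity range across the full output range, while swapping the output limits to `1.0, 0.0` produces a negative.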
CUVI version 0.5 has been cooked up in our labs, and we are testing and documenting it at the moment. The new release will be out any time in the coming week. We have been working for almost six months on the new framework, which couldn’t be any simpler or easier to use. In this release we are also enabling our premium feature detectors, which are 10 times faster than OpenCV 2.2.
EVGA has announced the GTX 460 2Win, the first dual-Fermi graphics card, featuring 672 CUDA cores (at 700 MHz) and 2 GB of GDDR5 memory (3600 MHz effective). According to the company, this combination of two low-end Fermi chips will beat the 3DMark score of the NVIDIA GTX 580. That’s not a biggie, as the GTX 580 has only 512 CUDA cores, but the better news is that the GTX 460 2Win will cost less than the GTX 580, says EVGA.
Image filtering is one of the most basic utilities in image processing and computer vision. Almost any image processing application, such as feature detection, involves applying a series of filters to the image. After reading this guide, you’ll be able to apply filters to images efficiently using the shared memory of the CUDA architecture. Here’s a step-by-step guide to writing your own filter of any type and size. For simplicity, I’ll use a 16-bit unsigned grayscale image in this tutorial.
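Before moving a filter to the GPU, it helps to have a plain CPU reference to check results against. The sketch below is such a reference in C++ for a 16-bit unsigned grayscale image with a filter of arbitrary size and anchor. It is not the tutorial's CUDA shared-memory kernel. The function name `filterReference`, the replicate-edge border handling, the use of correlation rather than flipped-kernel convolution, and the rounding and saturation rules are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference (not the tutorial's CUDA kernel): applies a
// kw x kh filter to a row-major 16-bit grayscale image.
// Assumptions: correlation (taps are not flipped), edge pixels are replicated
// at the border, and results are rounded and saturated to [0, 65535].
std::vector<std::uint16_t> filterReference(const std::vector<std::uint16_t>& src,
                                           int width, int height,
                                           const std::vector<float>& taps,
                                           int kw, int kh,
                                           int anchorX, int anchorY) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    assert(taps.size() == static_cast<std::size_t>(kw) * kh);
    assert(anchorX >= 0 && anchorX < kw && anchorY >= 0 && anchorY < kh);

    std::vector<std::uint16_t> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int ky = 0; ky < kh; ++ky) {
                // Source row under this tap, clamped to the image.
                int sy = std::min(std::max(y + ky - anchorY, 0), height - 1);
                for (int kx = 0; kx < kw; ++kx) {
                    // Source column under this tap, clamped to the image.
                    int sx = std::min(std::max(x + kx - anchorX, 0), width - 1);
                    sum += taps[ky * kw + kx] * src[sy * width + sx];
                }
            }
            float clamped = std::min(std::max(sum, 0.0f), 65535.0f);
            dst[y * width + x] = static_cast<std::uint16_t>(std::lround(clamped));
        }
    }
    return dst;
}
```

A 3x3 filter with a single 1 at the center and anchor (1, 1) returns the image unchanged, which makes a quick sanity check. A GPU version would typically have each thread block stage its tile of the image, plus a border as wide as the filter, in shared memory before running the same inner loops.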
The release of the next-generation CUDA architecture, Fermi, shows that CUDA is still an evolving architecture. Fermi, with compute capability 2.0, differs from previous architectures in several ways. In addition to increasing the number of threads per block and packing 512 cores into a single chip, Fermi can also run multiple kernels simultaneously. Shared memory has been increased from 16 KB to up to 48 KB and, most importantly, the number of streaming processors in one SM has been increased to 32. The comparison below, by NVIDIA, gives a complete picture of the differences between compute capabilities 1.0, 1.1, 1.2, 1.3 and 2.0 of NVIDIA’s CUDA-enabled devices.