NVIDIA plans to drop support for GPUs with the Tesla architecture (compute capability 1.x) in upcoming releases of the CUDA Toolkit. In fact, GPUs with compute capability 1.0 have already been removed as a target device from CUDA Toolkit 6.5, released in August 2014: with toolkit 6.5 you can no longer specify compute_10, sm_10 for code generation. Not only that, NVIDIA has also removed CC 1.0 from the comparison tables in the Programming Guide 6.5.
The default architecture has been changed to compute_20, sm_20 in the rules file of CUDA Toolkit 6.5. As for the rest of the Tesla architectures, i.e. CC 1.1, 1.2 and 1.3, they are still supported as targets but are marked as deprecated. The compiler generates the following warning if we attempt to compile code for the Tesla architecture with CUDA 6.5:
CUDACOMPILE : nvcc warning : The 'compute_11', 'compute_12', 'compute_13', 'sm_11', 'sm_12', and 'sm_13' architectures are deprecated, and may be removed in a future release.
Since its first release back in 2007 with compute capability 1.0, CUDA has seen three more architectural releases and eight more compute capabilities, which shows that it's an ever-evolving architecture. Although CUDA is forward compatible, every new release comes with its own new features worth using and with increased thread/memory limits. As a rule of thumb, every new architecture runs CUDA code faster than the previous generation, given that both cards have the same number of cores.
The comparison below lists feature/functionality support across the compute capabilities of NVIDIA's CUDA-enabled devices. Note that atomic operations weren't supported in the first release, and since they are so important, NVIDIA now practically compares architectures from 1.1 onwards.
Continuing our tradition of providing the best imaging algorithms at lightning-fast speed, we are proud to announce the addition of the DFPD debayer algorithm to CUVI. It is more robust than the existing demosaic and shows no artifacts in high-feature areas. The previous demosaic implementation (which uses bilinear interpolation) is super fast, giving a throughput of more than 500 fps on a full-HD image on a common GPU, yet it has its downsides.
Since the color planes have severe aliasing, a simple interpolation (or HQ bilinear interpolation, for that matter) of the individual planes does little to remove the artifacts that appear in high-feature regions. Hence we need a better reconstruction approach:
Not only does the new algorithm remove artifacts in high-feature regions, the colors also come out more natural and crisp. This is because the DFPD (directional filtering with a posteriori decision) algorithm better estimates the green plane by taking the natural edges of the image into account, and then reconstructs the missing red/blue pixels based on that reconstructed green plane instead of calculating all values directly.
This huge improvement over the existing implementation comes at a price: more computational cost. The DFPD algorithm has roughly half the throughput of the previous one; however, it still delivers a whopping 263 fps on a full-HD image. Note that this time excludes memory transfers. And as always with CUVI, you can use this GPU-accelerated DFPD debayer with just a couple of lines of code:
CuviImage input("D:/bayer.tif", CUVI_LOAD_IMAGE_GRAYSCALE_KEEP_DEPTH), output;
cuvi::colorOperations::demosaic_DFPD(input, output, CUVI_BAYER_RGGB);
There's an additional, optional refinement step that comes with DFPD to further refine the pixel values and cut down unnatural high frequencies. By default it's set to false, but you can enable it with a flag:
// Further refine the results
cuvi::colorOperations::demosaic_DFPD(input, output, CUVI_BAYER_RGGB, true);
Download the latest CUVI from here or get more information on the features at our wiki.
CUVILib provides out-of-the-box, hyper-accelerated imaging functionality, ready for use in your film scanning, restoration & recoloring applications. With CUVI, you can deliver supercomputing-like performance to your users without the need to set up expensive high-end CPUs.
Nsight Eclipse Edition is a full-featured IDE, powered by the Eclipse platform, that provides a complete integrated development environment to edit, build, debug and profile CUDA C/C++ applications on Mac and Linux platforms. The combination of a CUDA-aware source editor and powerful debugging and profiling tools makes Nsight Eclipse Edition the ultimate development platform for heterogeneous computing.
A preview of CUDA Toolkit 5 is already available to registered developers, and NVIDIA is expected to roll out the production release soon. Besides the habitual addition of more image-processing functionality, the new toolkit offers some great features, including:
- Dynamic parallelism
- GPUDirect for clusters (RDMA)
- GPU object linking
- NVIDIA Nsight, Eclipse Edition
It's one thing to compare GPU code performance with CPU code performance: if the algorithm is parallel, the GPU will beat the CPU any day. In our case, CUVI beats the best (performance-wise) CPU primitives library on the planet, Intel(R) IPP. Take a look at the performance figures.
CUVILib has finally come out of Beta. We have added a lot more functionality and made sure that it runs smoothly in mission-critical applications. Its simple API, performance magnitudes better than competing solutions, and cross-platform support provide you with a complete imaging package. Before we get into what's new in version 1.2, here are some useful links worth checking out:
Ever wanted to add color to and edit really, really old movies? Our grandparents did not have HD cameras, which means that today we have hundreds of thousands of hours of video recorded decades ago: movies, documentaries and family videos gathering dust in some archive or on a shelf in your home. With the new recoloring and digitizing technology, it's now possible to digitize (meaning you can open them in video editing tools on your computer) and recolor old, shaky videos. Some very interesting work is being done in Sweden by a company which provides software for this. Film studios use this software to restore old movies. It's called AgiScan.
The era of next-generation graphics is finally upon us. If you've been hankering after a graphics-card upgrade lately and wanted to see NVIDIA's reply to AMD's Radeon HD 7970, wait no longer. The green squad has taken the wraps off its GeForce GTX 680, a new piece of graphics silicon targeted at consumers and the enthusiast mob and based on its latest Kepler architecture. NVIDIA claims that the new GTX 680 is the fastest and most power-efficient GPU ever made, offering significant performance enhancements over its rivals.