Since its first release back in year 2007 with compute capability 1.0, CUDA has three more architectural releases and eight more compute capabilities which marks the fact that it’s an ever evolving architecture. Although CUDA is forward compatible but every new release comes with its own new features worth using and an increased thread/memory support. As a rule of thumb every new architecture runs the CUDA code faster than previous generation given both cards have same number of cores.
The comparison below gives a list of feature/functionality support between compute capabilities of NVIDIA’s CUDA enabled devices. Note that atomic operations weren’t supported in the first release and since they are so important, NVIDIA now practically compares architectures from 1.1 and later.
Continue reading “CUDA Differences b/w Architectures and Compute Capability”