Nvidia recently announced the availability of the new Tesla K80 GPU, continuing its steady evolution of high-performance compute cards.  The K80 sports an impressive 4,992 CUDA cores, 24 GB of GDDR5 RAM, 480 GB/sec of memory bandwidth, and 1.87 Tflops of double-precision floating-point performance, all while drawing only 300 W.  With built-in GPU Boost, a form of automated overclocking, the card can deliver up to 2.9 Tflops of double-precision performance on many applications.  If you only need single-precision floating-point performance, these figures jump to 5.6 Tflops at the base clock speed and 8.7 Tflops with GPU Boost.  Most scientific applications need double-precision arithmetic, but it’s comforting to know that we can jump to 8+ Tflops if necessary.
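
For readers who want to see where these Tflops figures come from, here is a rough back-of-the-envelope sketch in Python.  It is not taken from Nvidia's materials.  It assumes the usual peak-throughput estimate of cores x 2 floating-point operations per cycle (one fused multiply-add) x clock speed.  The 560 MHz base clock, the 875 MHz maximum boost clock, and the 1:3 double-to-single-precision ratio are assumptions that do not appear in the announcement text above, and the function and constant names are purely illustrative.

```python
# Rough peak-throughput estimate for the Tesla K80.
# Assumptions (not stated in the article): 560 MHz base clock,
# 875 MHz maximum GPU Boost clock, and a 1:3 ratio of
# double-precision to single-precision throughput.

CUDA_CORES = 4992          # from the article
BASE_CLOCK_MHZ = 560       # assumed base clock
BOOST_CLOCK_MHZ = 875      # assumed maximum boost clock
DP_TO_SP_RATIO = 1.0 / 3   # assumed double:single precision ratio


def peak_tflops(cores, clock_mhz, rate=1.0, flops_per_cycle=2):
    """Theoretical peak in Tflops.

    flops_per_cycle=2 counts one fused multiply-add per core per cycle.
    rate scales the result, e.g. for double precision.
    """
    flops = cores * flops_per_cycle * clock_mhz * 1e6 * rate
    return flops / 1e12


sp_base = peak_tflops(CUDA_CORES, BASE_CLOCK_MHZ)
sp_boost = peak_tflops(CUDA_CORES, BOOST_CLOCK_MHZ)
dp_base = peak_tflops(CUDA_CORES, BASE_CLOCK_MHZ, rate=DP_TO_SP_RATIO)
dp_boost = peak_tflops(CUDA_CORES, BOOST_CLOCK_MHZ, rate=DP_TO_SP_RATIO)

print(f"single precision: {sp_base:.2f} base, {sp_boost:.2f} boost (Tflops)")
print(f"double precision: {dp_base:.2f} base, {dp_boost:.2f} boost (Tflops)")
```

Under these assumptions the estimates land close to the quoted figures.  Keep in mind that these are theoretical peaks, and real applications will typically see less.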

The Tesla K80 is built on the current Kepler GPU architecture.  The Kepler design includes numerous features that maximize GPU performance, including SMX streaming multiprocessors, dynamic parallelism, Hyper-Q, GPUDirect, the Grid Management Unit, and quad warp schedulers.  See the whitepaper NVIDIA-Kepler-GK110-GK210-Architecture-Whitepaper for more detailed information on Kepler.  The combination of 4,992 cores, high flop count, and Kepler design should provide terrific performance for CUDA-enabled scientific applications.


View Richard Casey's profile on LinkedIn