NVIDIA Announces Tesla T4 Based on Turing GPU For Inferencing – 65 TFLOPs FP16, 130 TOPs INT8, 260 TOPs INT4 at Just 75W

NVIDIA has just announced the Turing-based Tesla T4, its latest graphics card for inference acceleration. The card was announced by NVIDIA’s CEO, Jensen Huang, at the GTC 2018 Japan keynote as the first Tesla graphics card featuring the brand new Turing GPU.

NVIDIA Tesla T4 With Turing GPU Announced at GTC Japan – Aiming at The Inferencing Market With Multi-TFLOPs of Performance at Just 75W, 2560 Cores

The Turing-based NVIDIA Tesla T4 graphics card is aimed at the inference acceleration market. It is designed to accelerate deep learning performance by an order of magnitude over its predecessors and is also going to deliver breakthrough performance for AI video applications. NVIDIA’s own estimate puts the graphics card at twice as fast in video decoding as the previous generation, enabling users to decode up to 38 full-HD video streams, which just wasn’t possible before.


The NVIDIA Tesla T4 GPU is the world’s most advanced inference accelerator. Powered by NVIDIA Turing Tensor Cores, T4 brings revolutionary multi-precision inference performance to accelerate the diverse applications of modern AI. Packaged in an energy-efficient 75-watt, small PCIe form factor, T4 is optimized for scale-out servers and is purpose-built to deliver state-of-the-art inference in real time.

As the volume of online videos continues to grow exponentially, demand for solutions to efficiently search and gain insights from video continues to grow as well. Tesla T4 delivers breakthrough performance for AI video applications, with dedicated hardware transcoding engines that bring twice the decoding performance of prior-generation GPUs. T4 can decode up to 38 full-HD video streams, making it easy to integrate scalable deep learning into video pipelines to deliver innovative, smart video services.

via NVIDIA

The specifications of the Tesla T4 are very impressive given its single-slot PCIe form factor. The graphics card packs the Turing TU104 GPU with 2560 CUDA cores and 320 Tensor Cores. It delivers 8.1 TFLOPs of FP32 performance, 65 TFLOPs of FP16 mixed-precision, 130 TOPs of INT8 and 260 TOPs of INT4 performance. All of this compute performance is achieved with a TDP of just 75W. That means you don’t need any external power connector, as the graphics card pulls all of its juice from the PCIe slot, and it can be put inside a 1U, 4U or any other rack server, since the small form factor design allows for large-scale compatibility across many servers.
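To put those peak numbers in perspective, here is a minimal Python sketch that works out throughput per watt and the scaling between precisions. It uses only the figures quoted above; the names `PEAK_TERA_OPS`, `per_watt` and `speedup_over` are our own for illustration, and the results are peak-rate arithmetic, not benchmark measurements.

```python
# Peak figures quoted in the article for the Tesla T4 (vendor peak rates,
# not measured results). FP32/FP16 are in TFLOPs, INT8/INT4 in TOPs.
PEAK_TERA_OPS = {"FP32": 8.1, "FP16": 65.0, "INT8": 130.0, "INT4": 260.0}
TDP_WATTS = 75.0  # board power quoted in the article


def per_watt(tera_ops: float, watts: float = TDP_WATTS) -> float:
    """Peak tera-operations per second for each watt of TDP."""
    return tera_ops / watts


def speedup_over(baseline: str, precision: str) -> float:
    """Ratio of peak throughput at `precision` to peak throughput at `baseline`."""
    return PEAK_TERA_OPS[precision] / PEAK_TERA_OPS[baseline]


if __name__ == "__main__":
    for name, peak in PEAK_TERA_OPS.items():
        print(
            f"{name}: {peak:6.1f} peak, "
            f"{per_watt(peak):.2f} per watt, "
            f"{speedup_over('FP32', name):.1f}x FP32"
        )
```

Going by the quoted peaks, each halving of precision from FP16 down to INT4 doubles throughput, and FP16 on the Tensor Cores works out to a little under 0.9 TFLOPs per watt.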


Additionally, the graphics card is coupled with 16 GB of GDDR6 memory, delivering a bandwidth of more than 320 GB/s, which is just stunning.

The NVIDIA TensorRT Hyperscale Platform includes a comprehensive set of hardware and software offerings optimized for powerful, highly efficient inference. Key elements include:

  • NVIDIA Tesla T4 GPU – Featuring 320 Turing Tensor Cores and 2,560 CUDA cores, this new GPU provides breakthrough performance with flexible, multi-precision capabilities, from FP32 to FP16 to INT8, as well as INT4. Packaged in an energy-efficient, 75-watt, small PCIe form factor that easily fits into most servers, it offers 65 teraflops of peak performance for FP16, 130 TOPs for INT8 and 260 TOPs for INT4.
  • NVIDIA TensorRT 5 – An inference optimizer and runtime engine, NVIDIA TensorRT 5 supports Turing Tensor Cores and expands the set of neural network optimizations for multi-precision workloads.
  • NVIDIA TensorRT inference server – This containerized microservice software enables applications to use AI models in data center production. Freely available from the NVIDIA GPU Cloud container registry, it maximizes data center throughput and GPU utilization, supports all popular AI models and frameworks, and integrates with Kubernetes and Docker.

NVIDIA Tesla T4 GPU Specifications

There’s no word on pricing or availability yet, but we will keep you updated as we get more info on the new Tesla T4 graphics card.