The Pascal architecture is continuing to find its way through Nvidia's product line-up, with today marking the introduction of its Tesla P4 and P40 GPUs.
According to Nvidia's spec sheet, the P40 delivers 12 teraflops of single-precision performance and 47 trillion int8 operations per second, backed by 3,840 CUDA cores and 24GB of GDDR5 memory with 346GBps of bandwidth. The less powerful P4 offers 5.5 teraflops of single-precision performance and 22 trillion int8 operations per second, backed by 2,560 CUDA cores and 8GB of GDDR5 memory with 192GBps of bandwidth.
The company said the chips, which are designed for artificial intelligence workloads and running neural networks, offer four times the performance of the M40 and M4 it launched last year. Compared with an Intel Xeon E5-2690v4, which was launched earlier this year, Nvidia claimed its offering is 40 times more power efficient and 45 times faster to respond.
Alongside its hardware, Nvidia is releasing its TensorRT library and DeepStream SDK. TensorRT takes trained 16-bit or 32-bit neural networks and "optimises" them for reduced-precision 8-bit operations, while DeepStream can analyse up to 93 HD video streams simultaneously in real time.
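
To give a rough sense of what running a trained network at reduced 8-bit precision involves, here is a minimal sketch of symmetric linear quantisation in plain Python. It is a generic textbook technique, not Nvidia's method and not the TensorRT API. The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: generic symmetric int8 quantisation.
# This is NOT TensorRT's API or its actual algorithm (assumption for illustration).

def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # The largest magnitude maps to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights standing in for a trained 32-bit layer.
weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"float {w:+.4f} -> int8 code {c:+4d} -> restored {r:+.4f}")
```

Each restored value differs from the original by at most half of one quantisation step (`scale / 2`). That small loss of accuracy is the trade-off for being able to use the much higher int8 throughput quoted for the P4 and P40.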