Are GPUs about to enter IT’s mainstream?

By Charles Brett

Until relatively recently most in IT regarded Graphics Processing Units (GPUs[1]) as specialty technology: dozens or hundreds of cores optimised for complex mathematical modelling and analysis, numerically intense calculations, or sophisticated graphics and rendering. In effect, GPUs were used where conventional processors lacked the horsepower to solve specific types of ‘scientific’ problems that could exploit parallel processing. This implicit specialisation is, however, changing. GPUs are becoming general purpose, as so-called GPGPUs, and are likely to enter IT’s mainstream because of three separate yet coincidental developments which complement each other. All three were in evidence at Nvidia’s recent GTC conference.

The first concerns hardware. Modern x86 servers for data centres are packed densely into the smallest practical space, as blades or similar. IT normally views these servers as being about data (rather than graphics). As such, they are optimised to process, store and communicate that data. Adding more than simple graphics capabilities to these dense servers has been regarded as wasteful of valuable data centre real estate. It has also been seen as unnecessary when results can be delivered to end user workstations which possess the power to do any high end presentation or calculation.

This view of servers is evolving. Cisco, Dell, HP, IBM and SuperMicro, among others, all now offer blade or other densely packaged server designs which include support for GPUs. Such designs are necessary because GPUs are both power hungry and heat intensive, and need enclosures built to cope with the additional power and cooling requirements. With such GPU-enabled servers, GPU hardware is moving into IT’s mainstream.

The second indicator that GPUs will be of increasing relevance to IT is their introduction into IT infrastructure, in particular how they can support VDI (Virtual Desktop Infrastructure[1]) for design and power users. The key lies in how hypervisors and GPUs work together. There are various flavours, from hypervisor pass-through to hypervisor GPU sharing (where the capabilities of one GPU complex may be split across several or many users using vGPU or virtual GPU[2]). Citrix’s XenServer already supports this. In 2015 VMware will add vGPU support to ESX, thereby opening up the ability to share GPUs in virtualised data centres. One such GPU is the Nvidia Titan Z, which has some 5760 Cuda[3] cores, 12GB of memory and an alleged 8 teraflops of performance per card. With native driver support in the hypervisor, traditional GPU functions that were previously delivered only by high end workstations can move from the desktop into the data centre.

The third indicator of increasing IT relevance comes with new types of software able to take advantage of GPUs. While Hadoop and MapReduce have not yet quite made it into the IT mainstream, Hadoop’s ability to handle huge datasets is becoming ever more attractive. That GPUs and Hadoop can combine to provide massively parallel processing on relatively inexpensive hardware (each Titan Z will cost c. GBP2K) brings both Hadoop and GPUs closer to IT usage. Already various vendors are talking of using clusters of eight GPUs with 96GB of main memory that is accessible by thousands upon thousands of dedicated cores.
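The arithmetic behind that cluster figure can be checked with a minimal sketch. It assumes eight cards, each with the per-card Titan Z figures quoted in this article, and simple linear scaling, which real clusters will not fully achieve. The function name `cluster_totals` is purely illustrative.

```python
# Per-card figures as quoted in this article for the Nvidia Titan Z.
CORES_PER_CARD = 5760       # "some 5760 Cuda cores"
MEMORY_GB_PER_CARD = 12     # "12GB of memory"
TFLOPS_PER_CARD = 8         # "an alleged 8 teraflops"
COST_GBP_PER_CARD = 2000    # "c. GBP2K"


def cluster_totals(cards: int) -> dict:
    """Scale the per-card figures linearly to a cluster.

    Linear scaling is an assumption: real clusters lose some
    throughput to interconnect and scheduling overheads.
    """
    return {
        "cores": cards * CORES_PER_CARD,
        "memory_gb": cards * MEMORY_GB_PER_CARD,
        "peak_tflops": cards * TFLOPS_PER_CARD,
        "cost_gbp": cards * COST_GBP_PER_CARD,
    }


totals = cluster_totals(8)
print(totals)
# {'cores': 46080, 'memory_gb': 96, 'peak_tflops': 64, 'cost_gbp': 16000}
```

On these assumptions eight cards give the 96GB the vendors mention, some 46,000 cores and a hardware cost of roughly GBP16K for the cards alone.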

Yet that is not all. At GTC other software for exploiting GPUs was ‘on show’ (mostly in the early stages of development, but indicative of what will be possible). For example, Israel-based SQREAM provides SQL on a GPU. By using compression and removing conventional database indices, it reduces a 100TB database to 7TB so that it can run on a GPU and deliver brute force access to warehouse data. (It also reduces the load time from over fifteen hours to less than an hour.) Map-D, an MIT spin-off, takes a different approach. It has built a database in GPU memory into which it can load years of, say, Twitter data, which it can then analyse in seconds if not milliseconds. The reason for such high performance is partly that the need to process data indices disappears when you can make use of the many GPU cores and substantial memory. In effect all the data is processed in raw form.
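The index-free, brute force idea can be sketched in a few lines. This is not how SQREAM or Map-D implement it. It is a minimal CPU-side illustration in which a small thread pool stands in for the many GPU cores, and the sample rows and the names `scan_chunk` and `brute_force_count` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk, predicate):
    """Count matching rows in one chunk by testing every row (no index)."""
    return sum(1 for row in chunk if predicate(row))


def brute_force_count(rows, predicate, workers=4):
    """Split rows into chunks, scan each chunk independently,
    then add the partial counts together."""
    size = max(1, -(-len(rows) // workers))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    # The thread pool is only a stand-in for GPU cores (an assumption
    # made for illustration; it does not give GPU-style speed-ups).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: scan_chunk(c, predicate), chunks)
        return sum(partials)


# Hypothetical sample data: (user, text) pairs standing in for tweets.
tweets = [
    ("ann", "GPUs in the data centre"),
    ("bob", "lunch was good"),
    ("cat", "vGPU support announced"),
    ("dan", "Hadoop on GPUs"),
    ("eve", "weather is fine"),
]

matches = brute_force_count(tweets, lambda row: "gpu" in row[1].lower())
print(matches)  # 3
```

Every row is examined, so no index has to be built or maintained. The cost is that the work grows with the size of the data, which is why the approach depends on having many cores and enough memory to hold the data close to them.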

While general purpose GPU-enabled software remains relatively underdeveloped (compared to what is available in hardware and for infrastructure), it is clear that broader GPU acceptance is making headway. No longer will GPUs be primarily the province of rocket scientists, protein modellers or algorithmic traders. Inexpensive GPU hardware capable of installation in standard data centre server enclosures provides a platform for broader GPU exploitation. With hypervisors supporting vGPU, the infrastructure exists on which new big data analysis software can arrive. For IT, GPUs have the potential to add big data capabilities that previously seemed outlandish, impossible or simply unaffordable.

That said, IT will need to take care. One simple example demonstrates why. You cannot just add a GPU card, never mind multiple ones, to an existing blade or other dense form factor server – even if the PCIe slots are available. Failing to plan for the GPU-related electrical and heat loads has already proven expensive for those who did not think ahead.

[1] Desktop Virtualization – Aligning options with user and business requirements

[2] Virtual GPU definition

[3] Nvidia Cuda
