GPU Inference Momentum Continues to Build

AI algorithms trained on NVIDIA GPUs have proven their mettle at drawing insights from huge swaths of data. They have enabled researchers and companies to gain deeper insights and deliver them in less time. This evolution has cut training times from days to minutes, and researchers have invented sophisticated techniques that combine multiple networks to solve knotty problems, ranging from image search to real-time voice-driven services to sentiment analysis and recommendation engines.

As researchers continue to push the boundaries of what's possible, networks and datasets continue to expand rapidly. At the same time, many new services have real-time requirements, needing an answer delivered in a matter of milliseconds. These two trends are driving the demand for accelerated inference, and NVIDIA GPUs are the go-to solution for meeting these challenges.

Recently, PayPal set out to deploy a new fraud detection system. The team working on it set a high bar: the system had to operate worldwide, 24/7, and work in real time to protect customer transactions from potential fraud. In spec’ing the system, it became evident that CPU-only servers couldn’t meet these requirements. According to Sri Shivananda, PayPal’s CTO and Senior VP, “PayPal needed GPUs to accelerate the deployment of our newest worldwide system, and to enable capabilities that were previously impossible.” Using NVIDIA T4 GPUs, PayPal delivered a new level of service, improving real-time fraud detection by 10 percent while reducing server capacity requirements by nearly 8x.

Companies such as Microsoft, Snap, Pinterest and Twitter are also turning to GPU inference to accelerate their newest services, and the list keeps growing. We’ll be bringing you their stories in upcoming posts.


By Dave Salvator, Senior Manager, Product Marketing