Developer Blog: Optimizing the Deep Learning Recommendation Model on NVIDIA GPUs

Recommender systems help people find what they’re looking for among an exponentially growing number of options. They are a critical component for driving user engagement on many online platforms.

With the rapid growth in scale of industry datasets, deep learning (DL) recommender models, which capitalize on large amounts of training data, have started to show advantages over traditional methods. Current DL-based models for recommender systems include the Wide and Deep model, the Deep Learning Recommendation Model (DLRM), neural collaborative filtering (NCF), the Variational Autoencoder (VAE) for Collaborative Filtering, and BERT4Rec, among others.

Large-scale recommender system solutions face multiple performance challenges: huge datasets, complex data preprocessing and feature engineering pipelines, and extensive repeated experimentation. To meet the computational demands of large-scale DL recommender system training and inference, recommender-on-GPU solutions aim to provide fast feature engineering and high training throughput (to enable both fast experimentation and production retraining), as well as low-latency, high-throughput inference.

In this post, we discuss our reference implementation of DLRM, which is part of the NVIDIA GPU-accelerated DL model portfolio. The portfolio covers a wide range of network architectures and applications across many domains, including image, text, and speech analysis, and recommender systems. With DLRM, we systematically tackle the challenges mentioned above.
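For readers unfamiliar with the DLRM architecture, the following is a minimal PyTorch-style sketch of its core structure: a bottom MLP for dense features, one embedding table per categorical feature, pairwise dot-product feature interactions, and a top MLP that produces a click probability. The layer widths, feature counts, and table sizes here are illustrative placeholders, not the configuration used in the reference implementation.

```python
import torch
import torch.nn as nn

class DLRM(nn.Module):
    """Minimal DLRM sketch: bottom MLP for dense features, embedding tables
    for categorical features, dot-product interactions, and a top MLP."""

    def __init__(self, num_dense, embedding_sizes, embed_dim=16):
        super().__init__()
        # Bottom MLP maps dense features into the embedding space.
        self.bottom_mlp = nn.Sequential(
            nn.Linear(num_dense, 64), nn.ReLU(),
            nn.Linear(64, embed_dim), nn.ReLU(),
        )
        # One embedding table per categorical (sparse) feature.
        self.embeddings = nn.ModuleList(
            [nn.Embedding(cardinality, embed_dim) for cardinality in embedding_sizes]
        )
        # Number of pairwise dot products among the dense and sparse vectors.
        num_vectors = 1 + len(embedding_sizes)
        num_interactions = num_vectors * (num_vectors - 1) // 2
        # Top MLP consumes the dense vector plus the interaction terms.
        self.top_mlp = nn.Sequential(
            nn.Linear(embed_dim + num_interactions, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, dense, sparse):
        # dense: (batch, num_dense) floats; sparse: (batch, num_sparse) integer IDs
        d = self.bottom_mlp(dense)                                  # (B, E)
        e = [emb(sparse[:, i]) for i, emb in enumerate(self.embeddings)]
        vectors = torch.stack([d] + e, dim=1)                       # (B, V, E)
        # Pairwise dot products between all feature vectors.
        dots = torch.bmm(vectors, vectors.transpose(1, 2))          # (B, V, V)
        i, j = torch.triu_indices(dots.size(1), dots.size(2), offset=1)
        interactions = dots[:, i, j]                                # (B, V*(V-1)/2)
        logits = self.top_mlp(torch.cat([d, interactions], dim=1))  # (B, 1)
        return torch.sigmoid(logits)

# Example usage with illustrative sizes (hypothetical, not the reference config):
model = DLRM(num_dense=13, embedding_sizes=[1000, 500, 200])
dense = torch.randn(32, 13)
sparse = torch.stack([torch.randint(0, c, (32,)) for c in [1000, 500, 200]], dim=1)
probs = model(dense, sparse)  # (32, 1) predicted click probabilities
```

The reference implementation follows this same overall structure but adds the engineering needed for production-scale training, such as large embedding tables, mixed-precision arithmetic, and optimized data pipelines.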

Read the full post on the NVIDIA Developer Blog.