Accelerating Automated and Explainable Machine Learning with RAPIDS and NVIDIA GPUs

RAPIDS aims to democratize accelerated data science through accessibility and innovation. The most recent release reflects major strides in these efforts through integrations with TPOT, a popular tool for AutoML, and innovations in SHAP, a method for providing deep interpretability to machine learning.

Faster AutoML for Data-Driven Enterprises

AutoML makes machine learning accessible to non-experts and makes its practice more efficient. By automating the most tedious parts of the machine learning process, such as feature engineering, model selection, and hyperparameter optimization, AutoML lets practitioners and enterprises apply machine learning to business problems more easily. TPOT makes AutoML even easier by learning patterns so that it tests only the most effective models, reducing computation time and cost.

Although AutoML reduces the effort of building models, automating the construction of machine learning pipelines is computationally demanding. Recent integrations between RAPIDS and TPOT help solve this issue by running the computationally intensive steps on NVIDIA GPUs. In two tests conducted on publicly available datasets, NVIDIA GPUs allowed TPOT to find a better model in just one hour than a CPU-based implementation found in eight hours. These speeds make AutoML at scale feasible for enterprises across industries. Faster runs also allow more iterations in the same time, which in turn can yield more accurate models. Between the increased model accuracy and the costs saved from shorter run times, enterprises can add millions to their bottom line using TPOT and NVIDIA GPUs.
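As a rough sketch of what switching a TPOT search onto GPUs can look like: TPOT exposes a built-in configuration named "TPOT cuML" that limits the search to GPU-accelerated estimators. The helper names `tpot_settings` and `run_search`, the one-hour budget, and the other parameter values below are illustrative assumptions, not the settings used in the tests described above. Running the search itself assumes TPOT, RAPIDS cuML, and an NVIDIA GPU are available.

```python
def tpot_settings(use_gpu: bool, max_time_mins: int = 60) -> dict:
    """Keyword arguments for tpot.TPOTClassifier (values are illustrative)."""
    return {
        # "TPOT cuML" restricts the search to GPU-accelerated estimators;
        # None selects TPOT's default CPU search space.
        "config_dict": "TPOT cuML" if use_gpu else None,
        "generations": None,             # stop on the time budget instead
        "max_time_mins": max_time_mins,  # wall-clock budget for the search
        "cv": 5,
        "n_jobs": 1,  # TPOT's docs ask for a single job with the cuML config
        "random_state": 0,
        "verbosity": 2,
    }


def run_search(X_train, y_train, X_test, y_test, use_gpu: bool = True) -> float:
    """Run a TPOT search and return the best pipeline's score on held-out data."""
    # Imported lazily so the settings helper works without TPOT installed.
    from tpot import TPOTClassifier

    automl = TPOTClassifier(**tpot_settings(use_gpu))
    automl.fit(X_train, y_train)
    return automl.score(X_test, y_test)
```

Calling `run_search(..., use_gpu=True)` and `run_search(..., use_gpu=False)` with the same time budget is one way to compare the two search spaces on your own data.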

Explainable Machine Learning without Sacrificing Speed

As machine learning adoption grows across enterprises and industries, explainability has become critical to understanding automated decision-making. SHapley Additive exPlanations (SHAP) provides well-justified analyses of how each feature contributes to a given prediction. Because SHAP is computationally intensive, however, data scientists have often avoided it, since it can add substantial cost to a modeling workload.
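To make the idea of feature contributions concrete, here is a minimal brute-force sketch of exact Shapley values for a tiny model. It is not how SHAP libraries or tree-specific algorithms work internally; the `toy_model`, the all-zeros baseline, and the choice to "switch off" a feature by substituting its baseline value are assumptions made for illustration. The nested loops over feature subsets also hint at why naive explanation costs grow quickly with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value
    (a simplifying assumption for this sketch).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight shared by every coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                # Marginal contribution of feature i when added to the coalition.
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(features):
    """Hypothetical model with an interaction between the last two features."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

For this toy model the attributions come out at approximately 2, 3, and 3: the first feature gets its full additive effect, and the interaction term is split evenly between the other two. The values sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that gives SHAP its name.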

To address these pain points, RAPIDS 0.16 includes a snapshot of XGBoost that incorporates GPUTreeSHAP to accelerate SHAP-based model explanations with a few lines of code. Using a single GPU, GPUTreeSHAP can provide explanations 20x faster than a 40-core CPU node for moderate-sized tree models, with even further acceleration possible for explanations of feature interactions. These speeds let enterprises in regulated industries, such as financial services, explain their machine learning models while also improving efficiency and reducing costs.
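The "few lines of code" might look roughly like the sketch below, which uses XGBoost's `pred_contribs` and `pred_interactions` prediction flags. The helper names `explain` and `train_and_explain`, the regression objective, and the number of boosting rounds are illustrative assumptions. The `gpu_hist` and `gpu_predictor` parameter values match XGBoost releases from around this time; newer XGBoost versions select the GPU with a `device` parameter instead, so check the documentation for the version you have installed.

```python
def explain(booster, data, interactions: bool = False):
    """Return SHAP values from an xgboost.Booster for an xgboost.DMatrix.

    With interactions=False the result has one column per feature plus a
    final bias column; with interactions=True it holds pairwise
    feature-interaction values for each row.
    """
    if interactions:
        return booster.predict(data, pred_interactions=True)
    return booster.predict(data, pred_contribs=True)


def train_and_explain(X, y):
    """Train a small GPU model and explain its training rows (illustrative)."""
    # Imported lazily so explain() can be used without XGBoost installed.
    import xgboost as xgb

    dtrain = xgb.DMatrix(X, label=y)
    params = {"tree_method": "gpu_hist", "objective": "reg:squarederror"}
    booster = xgb.train(params, dtrain, num_boost_round=100)
    # Ask XGBoost to run prediction, and with it SHAP computation, on the GPU.
    booster.set_param({"predictor": "gpu_predictor"})
    return explain(booster, dtrain)
```

Each row of the returned contributions sums to the model's raw prediction for that row, so the output can be checked directly against `booster.predict(data, output_margin=True)`.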

RAPIDS 0.17 will expand upon these advancements, including support for non-tree models. 

These new developments in the RAPIDS ecosystem highlight the innovations that accelerated computing fuels in data science.

For a deeper understanding of the latest RAPIDS features and integrations, read more about the 0.16 release here.
