At GTC Europe in Munich, Germany, NVIDIA announced RAPIDS, a suite of open-source software libraries for executing end-to-end data science and analytics pipelines entirely on GPUs.
RAPIDS aims to accelerate the entire data science pipeline, including data loading, ETL, model training, and inference, to enable more productive, interactive, and exploratory workflows.
The RAPIDS libraries are written in Python and built on Apache Arrow. The software is being developed as open source in partnership with enterprises globally.
RAPIDS is now available as a container image on NVIDIA GPU Cloud (NGC) and Docker Hub for use on-premises or on public cloud services such as AWS, Azure, and GCP. The RAPIDS source code is also available on GitHub. Visit the RAPIDS site for more information.
For a walkthrough of how to download and run the RAPIDS container, visit the original post on the NVIDIA Developer Blog.
Related resources
- DLI course: Accelerating End-to-End Data Science Workflows
- DLI course: Speed Up DataFrame Operations With RAPIDS cuDF
- GTC session: Reducing the Cost of your Data Science Workloads on the Cloud
- GTC session: RAPIDS in 2024: Accelerated Data Science Everywhere
- GTC session: RAPIDS Accelerator for Apache Spark Propels Data Center Efficiency and Cost Savings
- SDK: RAPIDS