As more deep learning models are deployed to production, there is a growing need to separate work on the model itself from the work of integrating it into a production pipeline. Windows ML addresses this need by enabling efficient deployment of pretrained deep learning models in Windows applications.
Developing and training a model requires familiarity with the underlying science and the practical know-how that goes with it. When a pretrained model is used in a pipeline for inference, however, it can be treated as simply a series of arbitrary computations on incoming data. These computations are fully described by an ONNX file representing the deep learning model. At the deployment stage, the ONNX model can be edited and processed to apply simple but frequently needed tweaks and optimizations.
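To make the "series of computations" view concrete, here is a minimal sketch in plain Python. It is a toy stand-in, not the ONNX format or the Windows ML API: the `Node` class, the `OPS` table, and the `run` function are all hypothetical names invented for illustration. The graph computes `y = Relu(x * w + b)`, and the deployment-stage tweak shown is simply dropping the final activation node so that the pre-activation value becomes the output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operator table for this toy example only.
# Real ONNX defines a much larger, standardized operator set.
OPS: Dict[str, Callable[..., List[float]]] = {
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}


@dataclass
class Node:
    """One computation step: an operator, its input names, and its output name."""
    op: str
    inputs: List[str]
    output: str


def run(nodes: List[Node],
        constants: Dict[str, List[float]],
        feeds: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Evaluate the nodes in order and return every named value."""
    values = {**constants, **feeds}
    for node in nodes:
        args = [values[name] for name in node.inputs]
        values[node.output] = OPS[node.op](*args)
    return values


# The "model": y = Relu(x * w + b), with made-up weights.
nodes = [
    Node("Mul", ["x", "w"], "xw"),
    Node("Add", ["xw", "b"], "z"),
    Node("Relu", ["z"], "y"),
]
constants = {"w": [2.0, 2.0, 2.0], "b": [1.0, -10.0, 0.0]}
feeds = {"x": [1.0, 2.0, 3.0]}

# Inference needs only the graph description and the input data.
result = run(nodes, constants, feeds)["y"]

# A deployment-stage tweak: remove the trailing Relu and read "z" instead.
trimmed_nodes = nodes[:-1]
trimmed_result = run(trimmed_nodes, constants, feeds)["z"]

print(result)
print(trimmed_result)
```

The point of the sketch is that the inference code never needs to know how the weights were obtained. Everything required to run or modify the model lives in the graph description, which is the role the ONNX file plays in a real pipeline.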
Windows ML Overview
Introduced by Facebook and Microsoft, ONNX is an open interchange format for ML models that makes it easier to move models between frameworks such as PyTorch, TensorFlow, and Caffe2. An actively evolving ecosystem has grown up around ONNX.
By combining a straightforward, robust, and efficient machine learning inference framework with a comprehensive and richly supported neural network model format such as ONNX, Windows ML lets you integrate state-of-the-art AI models developed by research scientists directly into real-world applications.
Read the full blog, Using Windows ML, ONNX, and NVIDIA Tensor Cores, on the NVIDIA Developer Blog.