Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environment, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices.
The Model Optimizer process assumes you have a network model trained with a supported deep learning framework. The diagram below illustrates the typical workflow for deploying a trained deep learning model:
Model Optimizer produces an Intermediate Representation (IR) of the network, which can be read, loaded, and inferred with the Inference Engine. The Inference Engine offers a unified API across a number of supported Intel® platforms. The Intermediate Representation is a pair of files describing the model:
.xml - Describes the network topology
.bin - Contains the weights and biases binary data
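Because the .xml file is plain XML, the topology it describes can be inspected with standard tools. The sketch below parses a hand-written, minimal IR-like document (not produced by Model Optimizer; layer names and the exact schema details are illustrative assumptions) and lists its layers:

```python
import xml.etree.ElementTree as ET

# A toy, hand-written IR-like document. A real .xml produced by Model
# Optimizer is structured similarly (a <net> root with <layers> and
# <edges>), but contains many more attributes and layer parameters.
IR_XML = """<net name="toy_model" version="10">
  <layers>
    <layer id="0" name="input" type="Parameter"/>
    <layer id="1" name="conv1" type="Convolution"/>
    <layer id="2" name="output" type="Result"/>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
    <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
  </edges>
</net>"""

root = ET.fromstring(IR_XML)
# Collect (name, type) for every layer element in the topology.
layers = [(layer.get("name"), layer.get("type")) for layer in root.iter("layer")]
print(layers)
```

The .bin file, by contrast, is an opaque binary blob; the .xml refers into it by byte offsets, so the two files must always be kept together.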
--input_shape CLI parameter and image resizer type that is defined in the
PriorBoxClustered nodes instead of the Const node with priorboxes so that you can reshape SSD models in the Inference Engine.
--tensorflow_custom_layer_libraries to load shared libraries with custom TensorFlow* operations to reuse the shape inference function.
--generate_deprecated_IR_V2 command-line parameter.
Note that certain topology-specific layers (like DetectionOutput used in SSD*) are now shipped as source code, which assumes that the extensions library is compiled and loaded. The extensions are also required for inference of the pre-trained models.
Typical Next Step: Introduction to Intel® Deep Learning Deployment Toolkit