Converting a Model to Intermediate Representation (IR)

To convert a model to the Intermediate Representation (IR), run Model Optimizer with the command that matches your type of OpenVINO™ installation.

If you installed OpenVINO™ with the installer, run the mo.py script from the installation directory:

python3 <INSTALL_DIR>/deployment_tools/model_optimizer/mo.py \
   --input_model INPUT_MODEL --output_dir <OUTPUT_MODEL_DIR>

If you installed Model Optimizer via pip (the openvino-dev package), use the mo entry point:

mo --input_model INPUT_MODEL --output_dir <OUTPUT_MODEL_DIR>

You must have write permissions for the output directory: either run Model Optimizer from a directory where you can write, or specify a different output path with the --output_dir option.

Note

The color channel order (RGB or BGR) of the input data should match the channel order of the model training dataset. If they differ, perform the RGB<->BGR conversion by specifying the --reverse_input_channels command-line parameter. Otherwise, inference results may be incorrect. For details, refer to When to Reverse Input Channels.

To adjust the conversion process, you may use the general parameters described in General Conversion Parameters below, as well as framework-specific parameters for Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX*.

General Conversion Parameters

To adjust the conversion process, you can also use the general (framework-agnostic) parameters:

optional arguments:
  -h, --help            show this help message and exit
  --framework {tf,caffe,mxnet,kaldi,onnx}
                        Name of the framework used to train the input model.

Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
                        TensorFlow*: a file with a pre-trained model (binary
                        or text .pb file after freezing). Caffe*: a model
                        proto file with model weights
  --model_name MODEL_NAME, -n MODEL_NAME
                        Model_name parameter passed to the final create_ir
                        transform. This parameter is used to name a network in
                        a generated IR and output .xml/.bin files.
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory that stores the generated IR. By default, it
                        is the directory from where the Model Optimizer is
                        launched.
  --input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.
                        Alternatively, specify shapes with the --input
                        option.
  --scale SCALE, -s SCALE
                        All input values coming from the original network
                        inputs will be divided by this value. When a list of
                        inputs is overridden by the --input parameter, this
                        scale is not applied to any input that does not match
                        the original inputs of the model.
  --reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if the number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT         Quoted list of comma-separated input nodes names with
                        shapes, data types, and values for freezing. The shape
                        and value are specified as space-separated lists. The
                        data type of input node is specified in braces and can
                        have one of the values: f64 (float64), f32 (float32),
                        f16 (float16), i64 (int64), i32 (int32), u8 (uint8),
                        boolean. For example, use the following format to set
                        input port 0 of the node `node_name1` with the shape
                        [3 4] as an input node and freeze output port 1 of the
                        node `node_name2` with the value [20 15] of the int32
                        type and shape [2]: "0:node_name1[3
                        4],node_name2:1[2]{i32}->[20 15]".
  --output OUTPUT       The name of the output operation of the model. For
                        TensorFlow*, do not add :0 to this name.
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --data_type {FP16,FP32,half,float}
                        Data type for all intermediate tensors and weights. If
                        the original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are quantized
                        to FP16.
  --disable_fusing      Turn off fusing of linear operations to Convolution
  --disable_resnet_optimization
                        Turn off resnet optimization
  --finegrain_fusing FINEGRAIN_FUSING
                        Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
  --disable_gfusing     Turn off fusing of grouped convolutions
  --enable_concat_optimization
                        Turn on Concat optimization.
  --extensions EXTENSIONS
                        Directory or a comma separated list of directories
                        with extensions. To disable all extensions including
                        those that are placed at the default location, pass an
                        empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Suppress all output messages except those at the
                        ERROR log level, which can be set with the
                        --log_level option. By default, the log level is
                        already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces an input layer with a constant node with the
                        provided value, for example: "node_name->True". This
                        option will be DEPRECATED in future releases. Use the
                        --input option to specify a value for freezing.
  --static_shape        Enables IR generation for fixed input shape (folding
                        `ShapeOf` operations and shape-calculating sub-graphs
                        to `Constant`). Changing model input shape using
                        the Inference Engine API in runtime may fail for such an IR.
  --disable_weights_compression
                        Disable compression and store weights with original
                        precision.
  --progress            Enable model conversion progress display.
  --stream_output       Switch model conversion progress display to a
                        multiline mode.
  --transformations_config TRANSFORMATIONS_CONFIG
                        Use the configuration file with transformations
                        description.

The sections below provide details on using particular parameters and examples of CLI commands.

When to Specify Mean and Scale Values

Usually, neural network models are trained with normalized input data, which means the input values are converted to a specific range, for example, [0, 1] or [-1, 1]. Sometimes mean values (mean images) are subtracted from the input data as part of pre-processing. There are two common ways input data pre-processing is implemented:

  • The input pre-processing operations are part of the topology. In this case, the application that uses the framework to infer the topology does not pre-process the input.

  • The input pre-processing operations are not part of the topology, and pre-processing is performed within the application that feeds the model with input data.

In the first case, Model Optimizer generates the IR with the required pre-processing layers, and Inference Engine samples may be used to infer the model.

In the second case, information about mean/scale values should be provided to Model Optimizer so it can be embedded in the generated IR. Model Optimizer provides a number of command-line parameters to specify them: --scale, --scale_values, --mean_values, --mean_file.

If both mean and scale values are specified, the mean is subtracted first, and then the result is divided by the scale value(s).
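
For illustration, here is a minimal NumPy sketch of the equivalent pre-processing arithmetic; the image shape, layout, and the value 127.5 are hypothetical, not taken from any particular model:

import numpy as np

# Hypothetical input image in HWC layout; shape and value range are illustrative.
image = np.random.rand(224, 224, 3).astype(np.float32) * 255.0

# Per-channel values, as would be passed via --mean_values / --scale_values.
mean_values = np.array([127.5, 127.5, 127.5], dtype=np.float32)
scale_values = np.array([127.5, 127.5, 127.5], dtype=np.float32)

# The mean is subtracted first, then the result is divided by the scale,
# mapping the [0, 255] input range roughly onto [-1, 1].
normalized = (image - mean_values) / scale_values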

There is no universal recipe for determining the mean/scale values for a particular model. The steps below can help you determine them:

  • Read the model documentation. Usually the documentation describes mean/scale values if pre-processing is required.

  • Open the example script/application executing the model and track how the input data is read and passed to the framework.

  • Open the model in a visualization tool and check for layers performing subtraction or multiplication (such as Sub, Mul, ScaleShift, Eltwise) on the input data. If such layers exist, pre-processing is probably part of the model.

When to Specify Input Shapes

There are situations when the input data shape for the model is not fixed, for example, for fully-convolutional neural networks. In this case, TensorFlow* models may contain -1 values in the shape attribute of the Placeholder operation. Inference Engine does not support input layers with undefined size, so if the input shapes are not defined in the model, Model Optimizer fails to convert it. The solution is to provide the input shape(s) using the --input or --input_shape command-line parameter for all input(s) of the model, or to provide the batch size using the -b command-line parameter if the model contains a single input and only the batch size is undefined. In the latter case, the Placeholder shape of the TensorFlow* model looks like [-1, 224, 224, 3].
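
For instance, either of the following commands resolves the undefined batch dimension of such a model (the file name model.pb is illustrative):

mo --input_model model.pb --input_shape [1,224,224,3] --output_dir <OUTPUT_MODEL_DIR>
mo --input_model model.pb -b 1 --output_dir <OUTPUT_MODEL_DIR>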

When to Reverse Input Channels

Input data for your application can be in either RGB or BGR color order. For example, Inference Engine samples load input images in the BGR channel order. However, the model may be trained on images loaded in the opposite order (for example, most TensorFlow* models are trained with images in RGB order). In this case, inference results produced with the Inference Engine samples may be incorrect. The solution is to pass the --reverse_input_channels command-line parameter. With this parameter, Model Optimizer modifies the weights of the first convolution or another channel-dependent operation so that these operations produce the same output as if the image had been passed in the original channel order; the sketch below illustrates the idea.
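
A minimal NumPy sketch of the weight modification, assuming a hypothetical first convolution with weights in [out_channels, in_channels, kH, kW] layout (the shapes are illustrative, not Model Optimizer internals):

import numpy as np

# Hypothetical first-convolution weights trained on RGB input.
weights_rgb = np.random.rand(64, 3, 7, 7).astype(np.float32)

# Reversing the input-channel axis of the weights lets the convolution
# accept BGR images while producing the same output as for RGB images.
weights_bgr = weights_rgb[:, ::-1, :, :]

rgb_patch = np.random.rand(3, 7, 7).astype(np.float32)
bgr_patch = rgb_patch[::-1, :, :]

# The per-filter dot products match for swapped weights and swapped input.
assert np.allclose(
    np.tensordot(weights_rgb[0], rgb_patch, axes=3),
    np.tensordot(weights_bgr[0], bgr_patch, axes=3),
)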

When to Specify --static_shape Command Line Parameter

If the --static_shape command-line parameter is specified, Model Optimizer evaluates the shapes of all operations in the model (shape propagation) for the fixed input shape(s). During shape propagation, Model Optimizer evaluates `ShapeOf` operations and removes them from the computation graph. With that approach, the initial model, which could consume inputs of different shapes, is converted to an IR that works with one fixed input shape only. For example, consider the case when some blob is reshaped from a 4D shape [N, C, H, W] to a shape [N, C, H * W]. During model conversion, Model Optimizer calculates the output shape as a constant 1D blob with values [N, C, H * W]. So if the input shape changes to some other value [N, C, H1, W1] (a possible scenario for a fully-convolutional model), the reshape layer becomes invalid. The resulting Intermediate Representation cannot be resized with the Inference Engine API; the sketch below illustrates the failure mode.
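
A minimal NumPy sketch of why the folded constant breaks resizing; the concrete dimensions are illustrative:

import numpy as np

# The model reshapes a 4D blob [N, C, H, W] to [N, C, H * W].
n, c, h, w = 1, 3, 224, 224
blob = np.zeros((n, c, h, w), dtype=np.float32)

# With --static_shape, the shape-calculating sub-graph is folded into the
# constant computed for the original resolution: (1, 3, 50176).
folded_shape = (n, c, h * w)
print(blob.reshape(folded_shape).shape)  # (1, 3, 50176)

# An input of a different resolution no longer matches the folded constant:
blob2 = np.zeros((1, 3, 300, 300), dtype=np.float32)
try:
    blob2.reshape(folded_shape)
except ValueError as err:
    print(err)  # cannot reshape array of size 270000 into shape (1, 3, 50176)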

Examples of CLI Commands

Each example below shows both forms of the command, the mo.py script and the mo entry point; use the one that matches your installation.

Launch the Model Optimizer for the Caffe bvlc_alexnet model with the debug log level:

python3 mo.py --input_model bvlc_alexnet.caffemodel --log_level DEBUG --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --log_level DEBUG --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with the output IR named result.* in the specified output directory:

python3 mo.py --input_model bvlc_alexnet.caffemodel --model_name result --output_dir /../../models/
mo --input_model bvlc_alexnet.caffemodel --model_name result --output_dir /../../models/

Launch the Model Optimizer for the Caffe bvlc_alexnet model with one input with scale values:

python3 mo.py --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59] --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59] --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs with scale values:

python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5] --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5] --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs with scale and mean values specified for the particular nodes:

python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5] --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5] --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with specified input layer, overridden input shape, scale 5, batch 8 and specified name of an output operation:

python3 mo.py --input_model bvlc_alexnet.caffemodel --input "data[1 3 224 224]" --output pool5 -s 5 -b 8 --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --input "data[1 3 224 224]" --output pool5 -s 5 -b 8 --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with disabled fusing for linear operations to Convolution and grouped convolutions:

python3 mo.py --input_model bvlc_alexnet.caffemodel --disable_fusing --disable_gfusing --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --disable_fusing --disable_gfusing --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with reversed input channels order between RGB and BGR, specified mean values to be used for the input image per channel and specified data type for input tensor values:

python3 mo.py --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16 --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16 --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the Caffe bvlc_alexnet model with extensions from the listed directories and a specified mean_images binaryproto file. For more information about extensions, refer to the Model Optimizer extensibility documentation.

python3 mo.py --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto --output_dir <OUTPUT_MODEL_DIR>
mo --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for the TensorFlow* FaceNet* model with a frozen placeholder value. This replaces the placeholder with a constant layer that contains the passed value. For more information, refer to the instructions on converting TensorFlow* FaceNet models.

python3 mo.py --input_model FaceNet.pb --input "phase_train->False" --output_dir <OUTPUT_MODEL_DIR>
mo --input_model FaceNet.pb --input "phase_train->False" --output_dir <OUTPUT_MODEL_DIR>

Launch the Model Optimizer for any model, freezing a placeholder with a tensor of values. This replaces the placeholder with a constant layer that contains the passed values.

The tensor is specified in square brackets, with values separated by whitespace. If a data type is set in the model, the tensor is reshaped to the placeholder shape and cast to the placeholder data type. Otherwise, it is cast to the data type passed to the --data_type parameter (FP32 by default).

python3 mo.py --input_model FaceNet.pb --input "placeholder_layer_name->[0.1 1.2 2.3]" --output_dir <OUTPUT_MODEL_DIR>
mo --input_model FaceNet.pb --input "placeholder_layer_name->[0.1 1.2 2.3]" --output_dir <OUTPUT_MODEL_DIR>