This is a deprecated page. Please consider reading the page describing the new approach to converting Object Detection API models, which gives inference results closer to TensorFlow.
As explained in the Sub-graph Replacement in Model Optimizer section, there are multiple ways to set up the sub-graph matching. In this example, we focus on defining the sub-graph via a set of "start" and "end" nodes. The result of matching is two buckets of nodes:
A distinct layer of any SSD topology is the
DetectionOutput layer. This layer is implemented with dozens of primitive operations in TensorFlow, while in the Inference Engine it is one layer. Thus, to convert an SSD model from TensorFlow, the Model Optimizer should replace the entire sub-graph of operations that implement the
DetectionOutput layer with a single well-known
DetectionOutput layer.
The Inference Engine
DetectionOutput layer consumes three tensors in the following order:
DetectionOutput layer produces one tensor with seven numbers for each actual detection. There are more output tensors in the TensorFlow Object Detection API, but the values in them are consistent with the Inference Engine ones.
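To make the output format concrete, here is a small Python sketch of reading one detection row. The field order shown ([image_id, label, confidence, x_min, y_min, x_max, y_max], with coordinates normalized to [0, 1]) follows the usual Inference Engine DetectionOutput convention and should be treated as an assumption of this sketch:

```python
# Illustrative sketch: parsing one seven-element row of the DetectionOutput
# tensor. The field order is an assumption based on the common Inference
# Engine convention, not quoted from this page.

def parse_detection(row):
    """Convert a seven-element DetectionOutput row into a dictionary."""
    image_id, label, confidence, x_min, y_min, x_max, y_max = row
    return {
        'image_id': int(image_id),
        'label': int(label),
        'confidence': confidence,
        'box': (x_min, y_min, x_max, y_max),  # normalized corner coordinates
    }

# A hypothetical detection: class 1 with 90% confidence.
detection = parse_detection([0.0, 1.0, 0.9, 0.1, 0.2, 0.5, 0.6])
print(detection['label'], detection['confidence'])  # 1 0.9
```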
The difference from other examples is that here the
DetectionOutput sub-graph is replaced with a new sub-graph (not a single layer).
Look at the sub-graph replacement configuration file
<INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/tf/legacy_ssd_support.json that is used to enable the two models listed above:
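The file content is not quoted on this page. A hedged sketch of its structure, reconstructed from the replacer identifiers and Postprocessor input node names discussed on this page (the exact attribute set and the end_points list are assumptions), might look like:

```json
[
    {
        "custom_attributes": {},
        "id": "PostprocessorReplacement",
        "include_inputs_to_sub_graph": true,
        "include_outputs_to_sub_graph": true,
        "instances": {
            "start_points": [
                "Postprocessor/Shape",
                "Postprocessor/Slice",
                "Postprocessor/ExpandDims",
                "Postprocessor/Reshape_1"
            ],
            "end_points": [
                "detection_boxes"
            ]
        },
        "match_kind": "points"
    },
    {
        "custom_attributes": {},
        "id": "PreprocessorReplacement",
        "instances": ["Preprocessor"],
        "match_kind": "scope"
    }
]
```

The first replacer matches the sub-graph by "start" and "end" points; the second matches the whole Preprocessor scope by name.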
The second sub-graph replacer with identifier
PreprocessorReplacement is used to remove the
Preprocessor block from the graph. The replacer removes all nodes from this scope except the nodes performing mean value subtraction and scaling (if applicable). The implementation of the replacer is in the
<INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/tf/Preprocessor.py file.
Now let's analyze the structure of the topologies generated with the Object Detection API. There are several blocks in the graph, each performing a particular task:
Preprocessor block resizes, scales, and subtracts mean values from the input image.
FeatureExtractor block is a MobileNet or another backbone to extract features.
MultipleGridAnchorGenerator block creates initial bounding box locations (anchors).
Postprocessor block acts as a
DetectionOutput layer, so we need to replace it with the Inference Engine
DetectionOutput layer. It is necessary to add all input nodes of the
Postprocessor scope to the list
start_points. Consider the inputs of each of these nodes:
Postprocessor/Shape consumes the tensor with locations.
Postprocessor/Slice consumes the tensor with confidences.
Postprocessor/ExpandDims consumes the tensor with prior boxes.
Postprocessor/Reshape_1 consumes the tensor with locations similarly to the
Postprocessor/Shape node. Despite the fact that the last node
Postprocessor/Reshape_1 gets the same tensor as the node
Postprocessor/Shape, it must be explicitly added to the list.
The Object Detection API
Postprocessor block generates the following output nodes:
Now consider the implementation of the sub-graph replacer, available in the
<INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/tf/SSDs.py file. The file is rather big, so only selected code snippets are shown:
These lines define the new
PostprocessorReplacement class, inherited from
FrontReplacementFromConfigFileSubGraph, which is designed to replace a sub-graph of operations described in the configuration file. There are several methods to override in order to implement the custom replacement logic that we need:
generate_sub_graph performs new sub-graph generation and returns a dictionary where the key is an alias name for a node and the value is the Node object. The dictionary has the same format as the parameter of the
replace_sub_graph method in the example with the networkx sub-graph isomorphism pattern. This dictionary is passed as an argument to the next three methods, so it should contain entries for the nodes that those functions need.
input_edges_match specifies the mapping between input edges to the sub-graph before replacement and after replacement. The key of the dictionary is a tuple specifying an input tensor of the sub-graph before replacement: the sub-graph input node name and the input port number for this node. The value for this key is also a tuple specifying the node where this tensor should be attached during replacement: the node name (or alias name of the node) and the input port for this node. If the port number is zero, the parameter can be omitted, so the key or value is just a node name (alias). The default implementation of the method returns an empty dictionary, so the Model Optimizer does not create new edges.
output_edges_match returns the mapping between old output edges of the matched nodes and the new sub-graph node and output edge index. The format is similar to the dictionary returned by the
input_edges_match method. The only difference is that instead of specifying input port numbers for the nodes, it is necessary to specify output port numbers. Of course, this mapping is needed for the output nodes only. The default implementation of the method returns an empty dictionary, so the Model Optimizer does not create new edges.
nodes_to_remove specifies the list of nodes that the Model Optimizer should remove after the sub-graph replacement. The default implementation of the method removes all sub-graph nodes.
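To make the method contract above concrete, here is a minimal, self-contained Python skeleton. The base class below is a stand-in for Model Optimizer's FrontReplacementFromConfigFileSubGraph (so the snippet runs without OpenVINO installed), and the alias name is hypothetical:

```python
class FrontReplacementFromConfigFileSubGraph:
    """Stand-in for the Model Optimizer base class of the same name."""

    def input_edges_match(self, graph, match, new_sub_graph):
        return {}  # default: no new input edges are created

    def output_edges_match(self, graph, match, new_sub_graph):
        return {}  # default: no new output edges are created

    def nodes_to_remove(self, graph, match):
        return []  # simplified here; the real default removes all matched nodes


class PostprocessorReplacement(FrontReplacementFromConfigFileSubGraph):
    replacement_id = 'PostprocessorReplacement'

    def generate_sub_graph(self, graph, match):
        # In the real replacer this creates Reshape and DetectionOutput nodes
        # and returns {alias_name: Node}; a placeholder entry is used here.
        return {'detection_output_node': 'DetectionOutput'}


replacer = PostprocessorReplacement()
print(sorted(replacer.generate_sub_graph(None, None)))  # ['detection_output_node']
```

The dictionary returned by generate_sub_graph is what the other three methods receive as their new_sub_graph argument.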
Let us review the replacer code, considering the details of the
DetectionOutput layer implementation in the Inference Engine. There are several constraints on the input tensors of the DetectionOutput layer:
The tensor with locations must have shape
[#batch, #prior_boxes * 4] or
[#batch, #prior_boxes * 5], depending on whether locations are shared between different batches or not.
The tensor with confidences must have shape
[#batch, #prior_boxes * #classes], and the confidence values must be in the range [0, 1], that is, passed through a Softmax operation.
The tensor with prior boxes must have shape
[#batch, 2, #prior_boxes * 4]. The Inference Engine expects it to contain variance values, which the TensorFlow Object Detection API does not add.
To enable these models, add
Reshape operations for the locations and confidences tensors and update the values for the prior boxes to include the variance constants (they are not present in the TensorFlow Object Detection API).
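A minimal pure-Python sketch of the prior-box update. The variance constants (0.1, 0.1, 0.2, 0.2) are the values commonly used for SSD models and are an assumption of this sketch, not quoted from this page:

```python
# Sketch: extend a flat prior-box tensor to the [2, #prior_boxes * 4] layout
# that the Inference Engine DetectionOutput layer expects, where the second
# row holds variance values absent from the TensorFlow Object Detection API.

def add_variance_to_priors(prior_boxes, variances=(0.1, 0.1, 0.2, 0.2)):
    """prior_boxes: flat list of length #prior_boxes * 4.

    Returns a [2, #prior_boxes * 4] structure: the original coordinates and
    a repeated row of variance constants (assumed typical SSD values).
    """
    assert len(prior_boxes) % 4 == 0
    num_priors = len(prior_boxes) // 4
    variance_row = list(variances) * num_priors  # one 4-tuple per prior box
    return [list(prior_boxes), variance_row]

priors = [0.0, 0.0, 0.5, 0.5,   # first prior box
          0.5, 0.5, 1.0, 1.0]   # second prior box
result = add_variance_to_priors(priors)
print(len(result), len(result[1]))  # 2 8
```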
Look at the implementation of the generate_sub_graph method:
The method has two inputs: the graph to operate on and an instance of the
SubgraphMatch object, which describes the matched sub-graph. The latter class has several useful methods to get a particular input/output node of the sub-graph by input/output index or by node name pattern. Examples of using these methods are given below.
Find the
Softmax operation and the graph Node object corresponding to it.
Create operations of type
Reshape to reshape the locations and confidences tensors correspondingly.
Create the operation of type
DetectionOutput and the graph Node object corresponding to it.
Update the
reshape node and connect the two reshaped locations and confidences tensors with the DetectionOutput node.
The implementation of the
input_edges_match method is the following:
The method has three parameters: the input graph, the
match object describing the matched sub-graph, and the
new_sub_graph dictionary with alias names returned from the generate_sub_graph method.
match.input_nodes(ind) returns a list of tuples where the first element is a Node object and the second is the input port of this node which consumes the ind-th input tensor of the sub-graph. The
input_points list in the configuration file defines the order of input tensors to the sub-graph. For example, the
locs_consumer_node object of type Node is a node that consumes the tensor with locations through the corresponding port. The
id attribute of the Node object contains the name of the node in the graph.
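The mechanics can be illustrated with a toy stand-in for SubgraphMatch (so the snippet runs without Model Optimizer installed). The node names follow the Postprocessor inputs listed earlier; the alias name do_reshape_loc is hypothetical:

```python
# Toy stand-in for SubgraphMatch: input_nodes(ind) returns a list of
# (Node, input_port) tuples for the ind-th input tensor of the sub-graph.

class Node:
    def __init__(self, node_id):
        self.id = node_id  # the name of the node in the graph

class ToySubgraphMatch:
    def __init__(self, inputs):
        self._inputs = inputs
    def input_nodes(self, ind):
        return self._inputs[ind]

match = ToySubgraphMatch([
    [(Node('Postprocessor/Shape'), 0)],       # 0th input: locations
    [(Node('Postprocessor/Slice'), 0)],       # 1st input: confidences
    [(Node('Postprocessor/ExpandDims'), 0)],  # 2nd input: prior boxes
])

def input_edges_match(graph, match, new_sub_graph):
    """Map an old sub-graph input tensor to a new node.
    'do_reshape_loc' stands for a hypothetical alias from generate_sub_graph."""
    locs_consumer_node, locs_consumer_node_port = match.input_nodes(0)[0]
    return {
        (locs_consumer_node.id, locs_consumer_node_port): 'do_reshape_loc',
    }

mapping = input_edges_match(None, match, {})
print(mapping)  # {('Postprocessor/Shape', 0): 'do_reshape_loc'}
```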
The implementation of the
output_edges_match method is the following:
The method has the same three parameters as the
input_edges_match method. The returned dictionary contains a mapping just for one tensor initially produced by the first output node of the sub-graph (which is
detection_boxes according to the configuration file) to a single output tensor of the created
DetectionOutput node. In fact, it is possible to use any output node of the initial sub-graph in the mapping, because the sub-graph output nodes are the output nodes of the whole graph (their output is not consumed by any other nodes).
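A minimal sketch of the dictionary this method returns, assuming the created DetectionOutput node has the hypothetical alias detection_output_node:

```python
# Sketch of output_edges_match: the tensor produced by the old sub-graph's
# first output node ('detection_boxes', per the configuration file) is
# re-attached to output port 0 of the new node. When the port number is
# zero, the plain node name (alias) can be used instead of a tuple.

def output_edges_match(graph, match, new_sub_graph):
    return {
        'detection_boxes': 'detection_output_node',
    }

print(output_edges_match(None, None, {}))
```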
Now the Model Optimizer knows how to replace the sub-graph. The last step to enable the model is to cut off some parts of the graph that are not needed during inference.
It is necessary to remove the
Preprocessor block where the image is resized. The Inference Engine does not support dynamic input shapes, so the Model Optimizer must freeze the input image size, and thus resizing of the image is not necessary. This is achieved by the replacer
<INSTALL_DIR>/deployment_tools/model_optimizer/extensions/front/tf/Preprocessor.py, which is executed automatically.
There are several
Switch operations in the
Postprocessor block without output edges. For example:
The Model Optimizer marks these nodes as output nodes of the topology. Because of that, some parts of the
Postprocessor block are not removed during sub-graph replacement. In order to fix this issue, it is necessary to specify the output nodes of the graph manually using the
--output command line parameter.
The final command line to convert SSDs from the TensorFlow Object Detection API Zoo is:
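The exact command is not reproduced on this page. A typical invocation of the legacy Model Optimizer (the model path is a placeholder, and the output node names are assumed from the discussion above) looked roughly like:

```shell
# Hypothetical example: the frozen graph path is a placeholder.
python3 mo_tf.py \
    --input_model frozen_inference_graph.pb \
    --tensorflow_use_custom_operations_config extensions/front/tf/legacy_ssd_support.json \
    --output detection_boxes,detection_scores,num_detections
```

The --output parameter lists the graph output nodes explicitly, as explained above, so that the dangling Switch nodes are not treated as outputs.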
The MobileNet V2 model differs from the previous version, so converting the model requires a new sub-graph replacement configuration file and new command line parameters. The major differences are:
Preprocessor block has two outputs: the pre-processed image and the pre-processed image size.
Postprocessor block has one more input (in comparison with models created with the TensorFlow Object Detection API version 1.6 or lower): the pre-processed image size.
The updated sub-graph replacement configuration file
extensions/front/tf/ssd_v2_support.json reflecting these changes is the following:
The final command line to convert MobileNet SSD V2 from the TensorFlow Object Detection Zoo is the following: