Starting with the OpenVINO™ release 2020.1, the Inference Engine integrates the nGraph Core, which is a part of the nGraph compiler stack. As a result, the Inference Engine uses a new run-time model representation underneath the conventional `CNNNetwork` API: an instance of `ngraph::Function`.
Besides the representation update, nGraph integration resulted in the following changes and new features:
1. New operation set. When operations from the nGraph Core were combined with the conventional layers from `CNNNetwork`, a new set of operations called `opset1` was created. It covers both interfaces except for a few minor cases. Operations from `opset1` are generated by the Model Optimizer and are accepted by the Inference Engine.
2. Creating models at run time. You can construct an `ngraph::Function` and pass it to `CNNNetwork`. This is a replacement for the NNBuilder API.
The conventional flow that is not based on nGraph is still available. The rest of this document describes how the legacy and new flows coexist, as shown in the picture below:
Because a new operation set is introduced, the Model Optimizer generates IR version 10 with the new operations by default. Each layer in the generated IR has semantics that match the corresponding operation from the nGraph namespace `opset1`. IR version 10 automatically triggers the nGraph flow inside the Inference Engine. When an application reads such an IR, the Inference Engine IR reader produces a `CNNNetwork` that encapsulates an `ngraph::Function` instance underneath. Thus the OpenVINO IR becomes a new serialization format for the nGraph IR, and it can be deserialized by reading it into a `CNNNetwork`.
IMPORTANT: Conventional interfaces are used (`CNNNetwork`, the reader), so most applications require no changes.
NOTE: While you can still use the old APIs, the Inference Engine API is being continuously improved as an independent process. For example, `Core::ReadNetwork` is recommended instead of `CNNNetworkReader`. These changes are independent of the nGraph integration and do not enable or disable new features.
IR version 10 is interpreted differently from the old IR versions. Besides having a different operation set, IR version 10 ignores the shapes and data types assigned to the ports in the XML file. Both shapes and types are re-inferred while the model is loaded into the Inference Engine, using the shape and type propagation function that is a part of each nGraph operation.
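The re-inference described above can be illustrated with a minimal, standard-library-only sketch. It does not use the nGraph API: `ToyOp`, `makeRelu`, `makeFlatten`, and `reinferShapes` are hypothetical names invented for this illustration, and the sketch handles only shapes (not data types) on a linear chain of operations.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical stand-in for an operation: it carries the output shape recorded
// in the file and its own shape propagation rule.
struct ToyOp {
    std::string name;
    Shape outputShape;                         // value from the file, may be stale
    std::function<Shape(const Shape&)> infer;  // propagation rule owned by the op
};

// Element-wise operation: the output shape equals the input shape.
inline ToyOp makeRelu(Shape shapeFromFile) {
    return ToyOp{"Relu", std::move(shapeFromFile),
                 [](const Shape& in) -> Shape { return in; }};
}

// Flattens all dimensions after the first one into a single dimension.
inline ToyOp makeFlatten(Shape shapeFromFile) {
    return ToyOp{"Flatten", std::move(shapeFromFile),
                 [](const Shape& in) -> Shape {
                     if (in.empty()) return Shape{};
                     std::size_t rest = 1;
                     for (std::size_t i = 1; i < in.size(); ++i) rest *= in[i];
                     return Shape{in[0], rest};
                 }};
}

// Walks the chain and recomputes every output shape from the input shape,
// overwriting whatever was stored in the file.
inline Shape reinferShapes(const Shape& input, std::vector<ToyOp>& ops) {
    Shape current = input;
    for (auto& op : ops) {
        current = op.infer(current);
        op.outputShape = current;
    }
    return current;
}
```

In this toy model, calling `reinferShapes` with an input shape of `{1, 3, 4, 4}` on a Relu followed by a Flatten replaces any stored shapes with values computed by the operations themselves, which mirrors the idea that the shapes written in the XML file are not trusted.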
You can read old versions of the IR in the Inference Engine. Every version less than or equal to 7 is treated as an old one. When the Inference Engine reader reads an old version of the IR, it does not use the nGraph representation. There is no way to activate the nGraph flow with an old IR version, so the rest of this document does not apply in this case.
The Model Optimizer generates IR version 10 by default. The command-line key `--generate_deprecated_IR_V7` switches generation to the legacy IR version 7, which is useful when the new nGraph flow does not work for some reason.
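The version rules above can be summarized in a short sketch. `ReaderFlow` and `selectFlow` are hypothetical names, not part of the Inference Engine API. The text only states the behavior for version 10 and for versions less than or equal to 7, so every other version is reported as unspecified here rather than guessed.

```cpp
// Which flow the reader would take for a given IR version (toy model).
enum class ReaderFlow { NGraph, Legacy, Unspecified };

inline ReaderFlow selectFlow(int irVersion) {
    if (irVersion == 10) return ReaderFlow::NGraph;  // triggers the nGraph flow
    if (irVersion <= 7) return ReaderFlow::Legacy;   // treated as an old IR
    return ReaderFlow::Unspecified;                  // not covered by the text
}
```

For example, `selectFlow(7)` yields `ReaderFlow::Legacy`, which corresponds to an IR generated with `--generate_deprecated_IR_V7`.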
An alternative way to feed the Inference Engine with a model is to create the model at run time. To do so, construct an `ngraph::Function` using nGraph operation classes and, optionally, user-defined operations. For details, see Add Custom nGraph Operations and examples. At this stage, the code is completely independent of the rest of the Inference Engine code and can be built separately. After you construct an instance of `ngraph::Function`, you can use it to create a `CNNNetwork` by passing it to the new constructor of this class.
Creating a `CNNNetwork` from the nGraph Function means encapsulating the object, not converting it to the conventional representation. At a low level, this is achieved by using a different class for the `CNNNetwork` internals. The old representation, which is used for IR versions before version 10, uses `CNNNetworkImpl`. The new representation, which is built around nGraph, uses `CNNNetworkNGraphImpl`.
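The idea of one public class backed by two interchangeable internals can be sketched as follows. This is a toy model built only on the standard library: `ToyFunction`, `ToyImplBase`, `ToyLegacyImpl`, `ToyNGraphImpl`, and `ToyNetwork` are hypothetical stand-ins and do not reproduce the actual Inference Engine classes.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for ngraph::Function.
struct ToyFunction {
    std::string name;
};

// Common interface of the two kinds of internals.
struct ToyImplBase {
    virtual ~ToyImplBase() = default;
    virtual bool isNGraphBased() const = 0;
    virtual std::shared_ptr<ToyFunction> function() const { return nullptr; }
};

// Plays the role of the old, layer-based internals.
struct ToyLegacyImpl : ToyImplBase {
    bool isNGraphBased() const override { return false; }
};

// Plays the role of the nGraph-based internals: it keeps the function as is.
struct ToyNGraphImpl : ToyImplBase {
    explicit ToyNGraphImpl(std::shared_ptr<ToyFunction> f) : func(std::move(f)) {}
    bool isNGraphBased() const override { return true; }
    std::shared_ptr<ToyFunction> function() const override { return func; }
    std::shared_ptr<ToyFunction> func;
};

// Public facade: callers use the same class regardless of the internals.
class ToyNetwork {
public:
    ToyNetwork() : impl_(std::make_shared<ToyLegacyImpl>()) {}
    explicit ToyNetwork(std::shared_ptr<ToyFunction> f)
        : impl_(std::make_shared<ToyNGraphImpl>(std::move(f))) {}

    bool isNGraphBased() const { return impl_->isNGraphBased(); }
    std::shared_ptr<ToyFunction> function() const { return impl_->function(); }

private:
    std::shared_ptr<ToyImplBase> impl_;
};
```

In this sketch, a `ToyNetwork` built from a function returns the very same object from `function()`, which is what "encapsulating, not converting" means: no copy and no translation to another representation takes place at construction time.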
The old representation is still required in the cases listed below. When it is required, the conversion from the `ngraph::Function` to the old representation is called automatically. The following methods lead to the automatic conversion:

1. Using the old API, which is expected to produce an old representation. These calls are guaranteed to be read-only. Once you call such a method, the original nGraph representation is preserved and continues to be used in subsequent calls.
   1.1. `CNNNetwork::serialize`. Dumps the old representation after the automatically called conversion. Cannot be used to dump IR V10. For details, see Graph Debug Capabilities.
2. Calling `CNNNetwork` methods that modify the model. After such a call, the nGraph representation is lost and cannot be used afterwards.
   1.2. `CNNNetwork::setBatchSize`. Still implemented through the old logic for backward compatibility, without using nGraph capabilities. For details, see Using Shape Inference.
3. Using methods that return objects inside the old representation. These methods are potentially modifying:
   1.1. `Data::getInputTo`
   1.2. `Data::getCreatorLayer`
   1.3. `CNNNetwork::getLayerByName`
   1.4. Iterating over `CNNLayer` objects in `CNNNetwork`: `CNNNetwork::begin`, `details::CNNNetworkIterator` class.

Though the conversion is always a one-way process, which means there is no method to convert back, there are important caveats.

In cases 1 and 3, both representations are held underneath, and the caller should use the old representation in read-only mode only. It is hard to track from the Inference Engine side whether the API is used in read-only mode or to modify the model.

That is why you should not modify the model through the potentially modifying methods listed in case 3 above. Manipulate the nGraph function directly instead.
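The difference between a read-only legacy call and a modifying call can be modeled with a small sketch. It is a toy under stated assumptions, not the Inference Engine implementation: `NewRep`, `OldRep`, `DualNetwork`, and all of its methods are hypothetical names, and the "conversion" is a trivial renaming of operations.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct NewRep { std::vector<std::string> ops; };     // stand-in for the nGraph function
struct OldRep { std::vector<std::string> layers; int batch = 1; };  // stand-in for the old representation

class DualNetwork {
public:
    explicit DualNetwork(std::shared_ptr<NewRep> f) : newRep_(std::move(f)) {
        if (!newRep_) throw std::invalid_argument("function must not be null");
    }

    // Read-only legacy call: converts on first use and keeps the new representation.
    std::string serializeLegacy() {
        ensureOld();
        std::string out;
        for (std::size_t i = 0; i < oldRep_->layers.size(); ++i) {
            if (i != 0) out += ",";
            out += oldRep_->layers[i];
        }
        return out;
    }

    // Modifying legacy call: converts, changes the old representation,
    // and drops the new one because it no longer matches the model.
    void setBatchSizeLegacy(int batch) {
        ensureOld();
        oldRep_->batch = batch;
        newRep_.reset();
    }

    bool hasNewRep() const { return newRep_ != nullptr; }
    bool hasOldRep() const { return oldRep_ != nullptr; }
    int conversions() const { return conversions_; }

private:
    // One-way conversion, performed at most once. The new representation is
    // only dropped after the old one exists, so newRep_ is valid here.
    void ensureOld() {
        if (oldRep_) return;
        oldRep_ = std::make_shared<OldRep>();
        for (const auto& op : newRep_->ops) oldRep_->layers.push_back("legacy_" + op);
        ++conversions_;
    }

    std::shared_ptr<NewRep> newRep_;
    std::shared_ptr<OldRep> oldRep_;
    int conversions_ = 0;
};
```

In this model, calling `serializeLegacy()` leaves both representations alive and repeated calls reuse the converted copy, while `setBatchSizeLegacy()` discards the new representation for good. The toy has no way to detect edits made through returned legacy objects, which reflects why such edits are discouraged.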
The Inference Engine implements a conversion function that is used when the nGraph function is transformed to the old `CNNNetworkImpl` representation. This function is hidden, and you cannot call it directly from an application. Nevertheless, it is an important component of the model transformation pipeline in the Inference Engine. Some model issues may be caught during the conversion, in which case this function throws an exception, so knowing what the function does helps you find the root cause.
The conversion function performs the following steps:
1. Convert operations in `opset1` to the legacy layer semantics described in the Legacy Layers Catalog. The model is still represented as the nGraph function at this stage, but the operation set is completely different.
2. Convert the nGraph representation to the corresponding `CNNNetworkImpl` without changing its semantics. You can see the result of the conversion by calling the `CNNNetwork::serialize` method, which produces legacy IR semantics that are not nGraph-based even if the method is applied to a `CNNNetwork` constructed from the nGraph Function. It may help in debugging. See Graph Debug Capabilities to view all options for dumping new and old IR representations.