After you run an initial inference and your performance data appears on the dashboard, you can evaluate performance and tune your model. The data appears in the Model Performance Summary on the Configuration Settings page. When you have multiple inference results, you can click specific data points to view model performance details.
The Layers Table at the bottom of the page shows each layer of the executed graph of a model:
For each layer, the table displays the parameters listed below.
To see details about a layer:
TIP: To download a .csv inference report for your model, click Download report.
You can sort layers by any parameter by clicking the name of the corresponding column.
To filter layers, select a column and a filter in the boxes above the table. Some filters on the Execution Order and Execution Time columns require a numerical value, which you enter in the box that opens automatically:
To filter by multiple columns, click Add new filter after you specify all the data for the current column. To remove a filter, click the red remove symbol to the left of it:
NOTE: The filters you select are applied simultaneously.
Once you configure the filters, press Apply Filter. To apply a different filter, press Clear Filter and configure new filters.
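The sorting and filtering described above can also be reproduced offline on a downloaded .csv report. The following is a minimal sketch only: the column names (`Layer Name`, `Execution Order`, `Execution Time`) and the sample rows are assumptions made for illustration, and the actual report layout may differ.

```python
import csv
import io

# Hypothetical report contents; the real .csv columns and values may differ.
SAMPLE_REPORT = """Layer Name,Execution Order,Execution Time
conv1,0,1.25
relu1,1,0.05
conv2,2,2.40
fc,3,0.80
"""


def load_layers(text):
    """Parse report text into a list of dicts with numeric fields converted."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": row["Layer Name"],
            "order": int(row["Execution Order"]),
            "time": float(row["Execution Time"]),
        })
    return rows


def filter_layers(layers, predicates):
    """Keep only layers that satisfy every predicate (filters apply together)."""
    return [layer for layer in layers if all(p(layer) for p in predicates)]


layers = load_layers(SAMPLE_REPORT)

# Two filters applied simultaneously: execution time above 0.5 and order >= 2.
slow_late = filter_layers(
    layers,
    [lambda l: l["time"] > 0.5, lambda l: l["order"] >= 2],
)

# Sort by execution time, slowest first, like clicking the column header.
by_time = sorted(layers, key=lambda l: l["time"], reverse=True)

print([l["name"] for l in slow_late])
print([l["name"] for l in by_time])
```

To use it with a real report, read the file contents and pass them to `load_layers`, adjusting the column names to match the downloaded file.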
To compare layers of a model before and after calibration, follow the steps described in Compare Performance between Two Versions of Models. After that, find the Layers Table at the bottom of the page:
NOTE: Make sure you select points on both graphs.
Each row of the table represents a layer of the executed graphs of the two model versions. The table displays execution time and precision. If a layer was executed in both versions, the table shows the difference between the layer's execution time values in the two versions.
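The per-layer comparison can be sketched in a few lines of Python. This is an illustration only: the layer names and timings are made up, the sign convention (second version minus first) is an assumption, and `compare_layers` is a hypothetical helper, not part of DL Workbench.

```python
def compare_layers(before, after):
    """Return the execution time difference (after - before) per layer.

    Layers that were executed in only one of the versions map to None,
    because no difference can be computed for them.
    """
    result = {}
    for name in sorted(set(before) | set(after)):
        if name in before and name in after:
            result[name] = after[name] - before[name]
        else:
            result[name] = None
    return result


# Hypothetical execution times for two versions of a model.
original = {"conv1": 1.25, "relu1": 0.05, "fc": 0.80}
calibrated = {"conv1": 0.70, "fc": 0.55, "fc/quantize": 0.02}

diff = compare_layers(original, calibrated)
for name, delta in diff.items():
    print(name, "not executed in both versions" if delta is None else f"{delta:+.2f}")
```

A negative value here means the layer ran faster in the second version.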
Click the layer name to see the details that appear to the right of the table. Switch between tabs to see parameters of layers that differ between the versions of the model:
If a layer was not executed in one of the versions, the tool notifies you:
To the right of the Layers Table, find the visualization of your model as executed by the Inference Engine. Click Visualize Original IR to see the graph of the original model in the OpenVINO™ IR format before it is executed by the Inference Engine.
Layers in the runtime graph and the IR (Intermediate Representation) graph have different meanings. The IR graph reflects the structure of a model, while the runtime graph shows how a specific version of the model was executed on a specific device. The runtime graphs usually have different structures for different model versions and for the same model run on different devices, because every device executes models in a certain way to achieve the best performance.
To adjust the scale, use the magnifying glass icons or your mouse scroll wheel. To quickly find a layer, use the Search Layer button. Enter a layer name or input dimensions in the FIND field that opens:
To learn details about a layer, select the layer and click Show Node Properties name. The Node Properties window appears on the right:
DL Workbench supports layer mapping between the table, the runtime graph, and the original IR graph, which visually represents whether a layer was fused, tiled, or stayed intact. Once you click a layer in the table or in any of the graphs, the same layer or the layers corresponding to it are highlighted in other places:
If an original IR layer does not have a corresponding runtime layer, nothing in the table or in the Runtime Graph is highlighted:
You can also visualize execution time of layers. In the Coloring drop-down menu, select Execution Time Coloring. The graph gets colored according to the scale that appears above it. To return to the generic view, select No Coloring.
To learn about graph optimization algorithms supported on different plugins, see the Inference Engine supported plugins documentation for CPU, Intel® Processor Graphics, and Intel® Movidius™ Neural Compute Stick 2 and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs. For additional details on reading graphs of a model executed on a VPU plugin, see the section below.
If layers are joined, the Runtime graph displays an HwOp layer with two input and two output layers:
In the Runtime graph, FullyConnected, GEMM, and 3D Convolution layers are expressed as a sequence of 2D Convolution layers.
Certain graph features may indicate a low-performance model:
NOTE: Intel® Neural Compute Stick 2 does not support asymmetrical paddings. Therefore, asymmetrical paddings in the Intermediate Representation (IR) graph result in several Pad layers in the Runtime Graph, which causes lower performance of a model.
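Asymmetrical paddings can be looked for in the IR before running inference. The sketch below is a rough illustration, assuming the IR .xml stores paddings as `pads_begin` and `pads_end` attributes on a layer's `<data>` element; the sample fragment is made up and is not a complete IR, and `find_asymmetric_paddings` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up, incomplete IR fragment used only to exercise the function.
SAMPLE_IR = """<net name="sample" version="10">
  <layers>
    <layer id="0" name="conv_sym" type="Convolution">
      <data pads_begin="1,1" pads_end="1,1"/>
    </layer>
    <layer id="1" name="conv_asym" type="Convolution">
      <data pads_begin="0,0" pads_end="1,1"/>
    </layer>
    <layer id="2" name="relu" type="ReLU"/>
  </layers>
</net>
"""


def parse_pads(value):
    """Convert a comma-separated attribute such as "0,1" into a tuple of ints."""
    return tuple(int(part) for part in value.split(","))


def find_asymmetric_paddings(ir_xml):
    """Return names of layers whose pads_begin differs from pads_end."""
    names = []
    for layer in ET.fromstring(ir_xml).iter("layer"):
        data = layer.find("data")
        if data is None:
            continue  # layer has no attributes to inspect
        begin, end = data.get("pads_begin"), data.get("pads_end")
        if begin is None or end is None:
            continue  # layer type without paddings
        if parse_pads(begin) != parse_pads(end):
            names.append(layer.get("name"))
    return names


asymmetric = find_asymmetric_paddings(SAMPLE_IR)
print(asymmetric)
```

Layers reported by such a check are candidates for the extra Pad layers described in the note above.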