This demo shows how to run Text Spotting models. Text Spotting models allow us to simultaneously detect and recognize text.
NOTE: Only a batch size of 1 is supported.
The demo application expects a text spotting model that is split into three parts. Every model part must be in the Intermediate Representation (IR) format.
The first model is a Mask-RCNN-like text detector with the following constraints:
Inputs:
im_data for the input image
im_info for meta-information about the image (actual height, width and scale)
Outputs:
boxes with absolute bounding box coordinates of the input image
scores with confidence scores for all bounding boxes
classes with object class IDs for all bounding boxes
raw_masks with fixed-size segmentation heat maps for all classes of all bounding boxes
text_features with text features that are then fed to the Text Recognition Head
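As a quick sanity check before running the demo, the blob names listed above can be compared against the names a detector model actually exposes. The sketch below is only an illustration: the helper `missing_detector_blobs` is hypothetical, and how you obtain the name lists from your inference runtime is left to you.

```python
# Blob names taken from the detector constraints described above.
REQUIRED_INPUTS = {"im_data", "im_info"}
REQUIRED_OUTPUTS = {"boxes", "scores", "classes", "raw_masks", "text_features"}


def missing_detector_blobs(input_names, output_names):
    """Hypothetical helper: return (missing_inputs, missing_outputs) as sorted lists."""
    missing_inputs = sorted(REQUIRED_INPUTS - set(input_names))
    missing_outputs = sorted(REQUIRED_OUTPUTS - set(output_names))
    return missing_inputs, missing_outputs


# Example: a model that lacks the text_features output.
print(missing_detector_blobs(
    ["im_data", "im_info"],
    ["boxes", "scores", "classes", "raw_masks"],
))
```

Extra outputs are ignored by this check; it only reports names that are absent.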
The second model is a Text Recognition Encoder that takes text_features as input and produces encoded text.
The third model is a Text Recognition Decoder that takes the encoded text from the Text Recognition Encoder, the previous symbol, and the hidden state. On the first step, a special Start Of Sequence (SOS) symbol and a zero hidden state are fed to the Text Recognition Decoder. At each step, the decoder produces the next symbol prediction and the current hidden state, until the End Of Sequence (EOS) symbol is generated.
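The decoding loop described above can be sketched as follows. This is a minimal illustration and not the demo's implementation. It assumes the decoder step returns a list of per-symbol scores plus the new hidden state, and that the highest-scoring symbol is picked greedily. The names `greedy_decode`, `toy_decoder_step`, the symbol IDs and `max_steps` are all hypothetical.

```python
SOS, EOS, VOCAB_SIZE = 0, 1, 5  # hypothetical symbol IDs and alphabet size


def greedy_decode(decoder_step, encoded_text, zero_hidden, max_steps=28):
    """Feed the previous symbol and hidden state back into the decoder until EOS."""
    symbol, hidden = SOS, zero_hidden
    decoded = []
    for _ in range(max_steps):
        scores, hidden = decoder_step(encoded_text, symbol, hidden)
        # Greedy choice: index of the highest score (assumed output format).
        symbol = max(range(len(scores)), key=scores.__getitem__)
        if symbol == EOS:
            break
        decoded.append(symbol)
    return decoded


def toy_decoder_step(encoded_text, prev_symbol, hidden):
    """Stand-in for the real decoder: replays encoded_text, then emits EOS.

    Here the hidden state is just a step counter and prev_symbol is unused.
    """
    target = encoded_text[hidden] if hidden < len(encoded_text) else EOS
    scores = [0.0] * VOCAB_SIZE
    scores[target] = 1.0
    return scores, hidden + 1


print(greedy_decode(toy_decoder_step, [2, 3, 4], zero_hidden=0))
```

The `max_steps` cap is a safeguard added for the sketch so the loop ends even if EOS is never produced.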
Examples of valid inputs to specify with the command-line argument -i are a path to a video file or a numeric ID of a web camera.
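One way to tell the two kinds of -i values apart is to treat an all-digit string as a camera ID and anything else as a file path. The helper below, `parse_input_source`, is a hypothetical sketch of that idea and not code taken from the demo.

```python
def parse_input_source(value):
    """Return an int camera ID for all-digit strings, otherwise the path unchanged."""
    value = value.strip()
    return int(value) if value.isdecimal() else value


print(parse_input_source("0"))          # camera ID as an integer
print(parse_input_source("video.mp4"))  # path kept as a string
```

The result can then be passed to a capture API that accepts either a device index or a file name, such as OpenCV's `cv2.VideoCapture`.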
The demo workflow is the following:
The im_info input blob passes the resulting resolution and scale of a pre-processed image to the network to perform inference of the Mask-RCNN-like text detector.
If the --show_scores argument is specified together with the option that enables bounding boxes, bounding boxes and confidence scores are also shown.
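To make the role of im_info more concrete, the sketch below computes a resolution and scale for a frame that is resized to fit a network input while keeping its aspect ratio. Both the aspect-preserving resize and the `[height, width, scale]` layout are assumptions based on the description of im_info above. The helper `fit_to_input` is hypothetical, and the demo's actual pre-processing may differ.

```python
def fit_to_input(frame_hw, input_hw):
    """Return the resized (height, width) and an im_info-style list.

    Assumes the frame is scaled uniformly so it fits inside the input blob.
    """
    frame_h, frame_w = frame_hw
    input_h, input_w = input_hw
    scale = min(input_h / frame_h, input_w / frame_w)
    resized_h, resized_w = round(frame_h * scale), round(frame_w * scale)
    # Assumed layout: [height, width, scale] of the pre-processed image.
    im_info = [float(resized_h), float(resized_w), scale]
    return (resized_h, resized_w), im_info


print(fit_to_input((1440, 2560), (720, 1280)))
```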
NOTE: By default, Open Model Zoo demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the demo application or reconvert your model using the Model Optimizer tool with the --reverse_input_channels argument specified. For more information about the argument, refer to the When to Reverse Input Channels section of Converting a Model Using General Conversion Parameters.
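Rearranging the channel order manually amounts to reversing the last axis of every pixel. The sketch below shows the idea on a plain nested list so that it runs without extra dependencies. The helper `reverse_channels` is hypothetical.

```python
def reverse_channels(image):
    """Swap BGR <-> RGB for an image given as rows of [c0, c1, c2] pixels."""
    return [[pixel[::-1] for pixel in row] for row in image]


bgr = [[[255, 0, 0], [0, 128, 64]]]  # one row, two pixels in BGR order
print(reverse_channels(bgr))         # same pixels in RGB order
```

For a frame held as a NumPy array of shape (height, width, 3), the equivalent operation is the slice `frame[:, :, ::-1]`.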
Run the application with the -h option to see the usage message.
Running the application with an empty list of options yields the short version of the usage message and an error message.
NOTE: Before running the demo with a trained model, make sure the model is converted to the Inference Engine format (*.xml + *.bin) using the Model Optimizer tool.
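Since an IR model consists of an .xml file and a .bin file, a small pre-flight check can catch a missing half before the demo starts. The helpers below are hypothetical, and they assume the weights file sits next to the .xml file under the same base name.

```python
from pathlib import Path


def ir_weights_path(xml_path):
    """Assumed convention: weights share the .xml file's name with a .bin suffix."""
    return Path(xml_path).with_suffix(".bin")


def is_complete_ir(xml_path):
    """True only if both the .xml file and its sibling .bin file exist."""
    xml_path = Path(xml_path)
    return (
        xml_path.suffix == ".xml"
        and xml_path.is_file()
        and ir_weights_path(xml_path).is_file()
    )


print(ir_weights_path("models/detector.xml"))
```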
To run the demo, please provide paths to the model in the IR format and to an input with images:
The application uses OpenCV to display the resulting text instances and the current inference performance.