This network is designed for handwritten Japanese text recognition. It consists of a VGG16-like backbone, a reshape layer, and a fully connected layer. The network can recognize Japanese text composed of characters from the Kondate and Nakayosi datasets.
| Metric | Value |
|---|---|
| Accuracy on the Kondate test set and a test set generated from Nakayosi | 98.16% |
This demo adopts label error rate as the metric for accuracy.
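The source does not spell out how the label error rate is computed. As a rough illustration, the sketch below assumes one common variant: the total character-level edit distance between predictions and references, divided by the total number of reference characters. The function names `edit_distance` and `label_error_rate` are hypothetical and not part of the model package.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_error_rate(pairs):
    """pairs: iterable of (predicted_text, reference_text).

    Assumption: errors are aggregated over the whole set and normalized
    by the total reference length; other normalizations exist.
    """
    pairs = list(pairs)
    total_errors = sum(edit_distance(pred, ref) for pred, ref in pairs)
    total_labels = sum(len(ref) for _, ref in pairs)
    return total_errors / total_labels
```

Under this assumption, an accuracy figure would correspond to one minus the label error rate.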
Shape: [1x1x96x2000] - An input image in the format [BxCxHxW], where:
- B - batch size
- C - number of channels
- H - image height
- W - image width
Note that the source image should be converted to grayscale, resized to a specific height (such as 96) while keeping the aspect ratio, and padded on the right and bottom.
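The preprocessing steps described above can be sketched with NumPy alone. This is a minimal illustration, not the reference pipeline. The function name `preprocess`, the channel-mean grayscale conversion, the nearest-neighbour resize, and the `edge` padding mode are all assumptions made for this example. A real application would more likely use an image library for conversion and resizing.

```python
import numpy as np


def preprocess(image, target_h=96, target_w=2000, pad_mode="edge"):
    """Turn an HxW or HxWx3 image array into a [1, 1, target_h, target_w] blob."""
    img = np.asarray(image, dtype=np.float32)
    if img.ndim == 3:
        # Assumption: a plain channel mean stands in for grayscale conversion.
        img = img.mean(axis=2)
    h, w = img.shape

    # Resize to the target height while keeping the aspect ratio.
    new_w = max(1, int(round(w * target_h / h)))
    if new_w > target_w:
        raise ValueError("resized image is wider than the network input")

    # Nearest-neighbour resize (a stand-in for a proper interpolating resize).
    rows = np.minimum((np.arange(target_h) * h / target_h).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) * w / new_w).astype(int), w - 1)
    resized = img[rows[:, None], cols]

    # Pad on the bottom and right up to the network input size.
    # Assumption: the padding mode is not stated in the source.
    padded = np.pad(
        resized,
        ((0, target_h - resized.shape[0]), (0, target_w - resized.shape[1])),
        mode=pad_mode,
    )
    # [H, W] -> [B, C, H, W]
    return padded[None, None, :, :]
```

For example, a 48x100 grayscale image is scaled to 96x200 and then padded on the right to 96x2000.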
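The network outputs a blob with the shape [186, 1, 4442] in the format [WxBxL], where:
- W - output sequence length
- B - batch size
- L - confidence distribution across the supported symbols from Kondate and Nakayosi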
The network output can be decoded with a CTC greedy decoder.
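A CTC greedy decoder takes the most likely symbol at each output step, collapses consecutive repeats, and drops the blank symbol. The sketch below applies this to a [WxBxL] array. The function name `ctc_greedy_decode`, the `charset` list, and the blank index of 0 are assumptions for illustration. The actual blank position and symbol table depend on the character list shipped with the model.

```python
import numpy as np


def ctc_greedy_decode(output, charset, blank_index=0):
    """Decode a [W, B, L] score array into a list of B strings.

    charset: sequence of L symbols, indexed like the L axis.
    blank_index: position of the CTC blank (assumed, not from the source).
    """
    best = np.argmax(output, axis=2)  # [W, B]: best class per step
    texts = []
    for b in range(best.shape[1]):
        chars = []
        prev = None
        for idx in best[:, b]:
            # Emit a symbol only when it differs from the previous step
            # and is not the blank.
            if idx != prev and idx != blank_index:
                chars.append(charset[idx])
            prev = idx
        texts.append("".join(chars))
    return texts
```

With a per-step best sequence of `[1, 1, 0, 1, 2, 2]` and a symbol table `["<blank>", "a", "b", "c"]`, the repeated `1` collapses, the blank separates the two `a` symbols, and the decoded text is `aab`.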
[*] Other names and brands may be claimed as the property of others.