8. CONCLUSION
In this work, we have proposed a method for training an embedded neural network for Korean hieroglyph recognition. Our study shows that recognition quality depends strongly not only on the presence of different symbols in the training data, but also on the similarity of the impostor pairs. The neural network trained with the suggested method has a very small number of weights compared to heavy state-of-the-art models that require substantial computational power, so it can be used on mobile devices and in other settings with limited computational resources. Even so, compared with the open-source Tesseract OCR engine, our neural network shows very good quality.
In future work, we plan to study this field further and generalize the pair generation method to other languages. We will also try to rely on the training process rather than on the hieroglyph keys, since in some tasks the keys are impossible to obtain: instead of the information from the hieroglyph keys, we will use the answers of the trained network to extend and improve our method for non-systematic data. In addition, we plan to improve quality by using a more complex final embedding space and loss function.
ACKNOWLEDGMENTS
This work is partially supported by the Russian Foundation for Basic Research (projects 17-29-03370, 17-29-07093).