Nguyen Thi Thanh Tan, Le Hong Lam, Nguyen Ha Nam

Abstract

Optical character and handwriting recognition has long attracted researchers' interest and has produced strong results in both theory and practical applications. However, recognition accuracy is still limited, especially for low-quality input images. In this article, we propose an efficient method for recognizing the information fields of ID cards using a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) networks. The proposed method was trained on a large dataset of varying image quality comprising over three thousand ID card image samples. The implementation achieved better results than previous studies, with precision, recall, and F-measure ranging from over 95% to over 99% across all information fields to be recognized.
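
As a rough illustration of the metrics reported above, precision, recall, and F-measure for a single information field can be computed from counts of correct and incorrect recognitions. The sketch below is only an assumption about how such figures are typically derived: the function name `field_metrics` and the counts are hypothetical, they are not taken from the article's experiments, and F-measure is assumed to mean the balanced F1 score.

```python
def field_metrics(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall, f_measure) for one information field.

    Assumes the balanced F1 definition of F-measure; returns 0.0 for any
    metric whose denominator would be zero.
    """
    predicted = true_positives + false_positives   # items the recognizer output
    actual = true_positives + false_negatives      # items present in ground truth
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical counts for one field (not from the article's dataset).
precision, recall, f_measure = field_metrics(
    true_positives=96, false_positives=4, false_negatives=2
)
print(f"precision={precision:.4f} recall={recall:.4f} f_measure={f_measure:.4f}")
```

Computing the metrics per field rather than over all fields at once matches the way the abstract reports a range across information fields.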

Keywords: HPC, academic, industrial applications, calculations.