Boise State University
ScholarWorks
2020 Undergraduate Research Showcase
Undergraduate Research and Scholarship
Showcases
4-24-2020
Segmentation-Free Korean Handwriting Recognition Using Neural Network Training
Steven Kim
Boise State University
Elisa H. Barney Smith
Boise State University
Nishatul Majid
Boise State University
Abstract
The idea of segmentation-free handwriting recognition has emerged with the rise of deep
learning. This technique is designed to recognize any script or symbol set, as long as a suitable
training image set exists. A VGG-16 convolutional neural network is used as a character-spotting
network within Faster R-CNN. Through manual tagging, the location, size, and type of each
recognizable symbol are provided to train the network. This approach was previously tested on text
written in the Bangla script, where it showed over 90% accuracy overall. For Bangla, the network is
trained and tested on the Boise State Bangla Handwriting dataset. For Korean, the network is trained
using the PE_92 Handwritten Korean character image database and shows promising results.
This student presentation is available at ScholarWorks:
https://scholarworks.boisestate.edu/under_showcase_2020/91
Segmentation-free Bangla/Korean handwriting
recognition using Neural Network training
by Steven Kim, Nishatul Majid, Advisor: Dr. Elisa H. Barney Smith
Department of Electrical and Computer Engineering and Department of Computer Science
INTRODUCTION
The purpose of this research is to recognize handwriting without prior
segmentation. We implemented the method for Bangla and Korean. This
work has been submitted to the International Conference on Frontiers in
Handwriting Recognition (ICFHR) 2020.
Image recognition is often used for transcription, because data stored
as a file is much more reliable than data written on a physical
object.
To achieve high accuracy and productivity, we used the pre-trained
VGG-16 neural network as the character-spotting network of a
Faster R-CNN detector and adapted it through transfer learning.
The Bangla network uses two individual networks to detect the
characters and the diacritics.
The Korean network uses one network to detect the Hangul
characters.
Figure 1. Example of Bangla Network Detection
Figure 2. Example of Korean Network Detection
Image detection tools
Faster R-CNN
Detects objects with bounding boxes of different sizes, as shown in
Figures 1 and 2.
An efficient method for detecting objects that overlap one another,
which is common in many styles of handwriting.
VGG-16
A pre-trained network that we converted into a character-detecting
network.
Reusing it through transfer learning is more efficient than building
a network from scratch.
Has 16 weight layers (13 convolutional and 3 fully connected) and was
developed by the Visual Geometry Group (VGG) at the University of Oxford.
PE92 Korean Handwriting Database
Our team used the PE92 database, which contains various
compositions of handwritten Korean characters. It was collected
by POSTECH and funded by ETRI in 1992.
The dataset contains 2350 classes with about 100 samples per
class.
Korean handwriting recognition is known to be a difficult problem
because of the variety of compositions and patterns.
So far, researcher In-Jung Kim has applied a convolutional neural
network and achieved 92.92% accuracy using PE92.
Figure 3. Example classes and samples
Manual/Automated Tagging
To teach a network how to recognize different objects, a tagging
process is required: a bounding box is drawn around each character
and labeled with its class. Covering every Korean composition is
crucial, since we want the machine to recognize every possible case.
Overall, we have manually tagged:
133 classes of different compositions
1468 samples
In the meantime, Dr. Barney Smith created an automated tagging
framework. This framework uses existing tagged ground truth to
produce more ground truth, which allowed us to obtain 133 classes
with about 100 samples each.
Figure 4. Example of manual tagging and automated tagging process
K-Net
Using the manual and automated tagging, we created a network
called “K-Net”. K-Net detects each individual letter (jamo) and
combines the detections into a compound syllable character. In this
research, we measured both the individual and the compound detection
accuracy to make sure that the network assembles each combination in
the correct order.
Figure 5. Example of K-Net detection
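As an illustration of how detected letters can be combined into one compound character, the following minimal Python sketch uses the standard Unicode Hangul composition formula. It is not the K-Net implementation: the function name `compose_syllable` and the assumption that the detected jamo already arrive in the order initial, medial, final are hypothetical and chosen only for this example.

```python
# Jamo tables in the order used by the Unicode Hangul syllable block.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
# 27 final consonants, preceded by "" for syllables without a final.
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable


def compose_syllable(initial, medial, final=""):
    """Combine ordered jamo labels into one precomposed Hangul syllable.

    Hypothetical helper: assumes the detector output is already sorted
    as (initial, medial, optional final). Raises ValueError for a label
    that is not valid in the given position.
    """
    i = INITIALS.index(initial)
    m = MEDIALS.index(medial)
    f = FINALS.index(final)
    return chr(HANGUL_BASE + (i * len(MEDIALS) + m) * len(FINALS) + f)


print(compose_syllable("ㅎ", "ㅏ", "ㄴ"))  # 한
```

Because the formula depends on the position of each jamo, swapping two detections gives a different syllable or an error, which is why the order of the combination matters.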
Training Parameters
All images were resized to 600 pixels at their smallest dimension
during training.
Stochastic Gradient Descent with Momentum (SGDM) was used
for training.
The initial learning rate was set to 0.001 and the maximum number
of epochs to 10.
Overlap ratios of up to 0.6 were used for negative training samples.
The number of region proposals to randomly sample from was set to
64.
Increasing the number of epochs or regions, or decreasing the
learning rate, usually improves the training result but makes
training slower.
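Two of the parameters above can be made concrete with a short Python sketch. It assumes that "overlap ratio" means intersection over union (IoU) between a region proposal and a ground-truth box, and that resizing keeps the aspect ratio while scaling the shorter side to 600 pixels. The function names and the (x, y, width, height) box format are hypothetical and are not taken from the training code used for this poster.

```python
def resized_size(width, height, smallest=600):
    """Scale an image so its shorter side equals `smallest` pixels."""
    short = min(width, height)
    return (round(width * smallest / short), round(height * smallest / short))


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_negative(proposal, ground_truth_boxes, max_overlap=0.6):
    """A proposal is a negative sample if it overlaps no ground-truth box
    by more than `max_overlap` (0.6 is the value stated above)."""
    return all(iou(proposal, gt) <= max_overlap for gt in ground_truth_boxes)


tagged = [(0, 0, 10, 10)]                # one tagged character box
print(resized_size(1200, 900))           # (800, 600)
print(is_negative((5, 0, 10, 10), tagged))  # True: IoU is 1/3
print(is_negative((1, 0, 10, 10), tagged))  # False: IoU is about 0.82
```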
Training/Testing Environment (R2 Cluster)
Training a VGG-16-based network on a low-end laptop without a
discrete graphics card can take a very long time.
However, thanks to the R2 computer cluster provided by the Research
Computing Department, we were able to offload the computation
efficiently by simply uploading the network and running it
on the server.
Testing Result
Since we used a subset of the whole database, we were able to
finish this process quickly, but with some loss of accuracy. Using
the autonomous tagging, we obtained a Jamo Recognition Accuracy
(JRA) of 91.22% and a Syllable Recognition Accuracy (SRA) of
84.66%.
Figure 6. Result comparison data from the article "Autonomous Data Tagging for Offline Handwriting
Recognition: Tested with Bangla and Korean Scripts" by Nishatul Majid and Elisa H. Barney Smith
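The relationship between the two metrics can be sketched in Python. This sketch assumes that JRA is the fraction of individual jamo recognized correctly and that SRA is the fraction of syllables in which every jamo is correct. Both definitions, the position-by-position matching, and the function name `recognition_accuracies` are assumptions made for illustration, not the evaluation code behind the reported numbers.

```python
def recognition_accuracies(predicted, truth):
    """Return (JRA, SRA) for parallel lists of syllables.

    Each syllable is a tuple of jamo labels. Jamo are compared position
    by position, which is a simplifying assumption of this sketch.
    """
    if not truth:
        return (0.0, 0.0)
    jamo_total = 0
    jamo_correct = 0
    syllable_correct = 0
    for pred, true in zip(predicted, truth):
        jamo_total += len(true)
        jamo_correct += sum(p == t for p, t in zip(pred, true))
        if tuple(pred) == tuple(true):
            syllable_correct += 1
    jra = jamo_correct / jamo_total if jamo_total else 0.0
    sra = syllable_correct / len(truth)
    return (jra, sra)


truth = [("ㅎ", "ㅏ", "ㄴ"), ("ㄱ", "ㅡ", "ㄹ")]
predicted = [("ㅎ", "ㅏ", "ㄴ"), ("ㄱ", "ㅜ", "ㄹ")]  # one vowel is wrong
jra, sra = recognition_accuracies(predicted, truth)
print(jra, sra)  # 5 of 6 jamo correct, 1 of 2 syllables correct
```

Under these assumptions a single wrong jamo makes the whole syllable wrong, so SRA cannot exceed JRA, which is consistent with the 84.66% and 91.22% reported above.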
Conclusion
This project is a derivative work of Nishatul Majid’s framework for
Bangla Offline Handwriting Recognition. This is a new direction for
his approach.
Most scripts are written linearly from side to side, but Korean first
arranges individual letters into syllables using different compositions
and then writes those syllables from left to right.
Our maximum syllable recognition accuracy was 84.66%. We have not yet
applied any post-processing or trained on the whole dataset, so this
framework may be able to reach a higher accuracy.
References
1. N. Majid and E. H. Barney Smith, "Segmentation-Free Bangla
Offline Handwriting Recognition using Sequential Detection of
Characters and Diacritics with a Faster R-CNN," in International
Conference on Document Analysis and Recognition (ICDAR),
September 2019.
2. G.-R. Park, I.-J. Kim, and C.-L. Liu, "An evaluation of
statistical methods in handwritten hangul recognition," International
Journal on Document Analysis and Recognition (IJDAR), vol. 16, no. 3,
pp. 273-283, 2013.
3. I.-J. Kim and X. Xie, "Handwritten Hangul recognition
using deep convolutional neural networks," International Journal on
Document Analysis and Recognition (IJDAR), vol. 18, no. 1, pp. 1-13, 2015.