Off-line Cursive Handwriting Recognition Using Synthetic Training Data
Automatic recognition of scanned handwritten documents is important in many scientific, business, industrial and personal applications that require reading and processing human-written text. The ultimate goal is for computers to approach, or even surpass, human performance in text recognition. Despite the large body of research already devoted to this problem, it is considered very difficult and is still not satisfactorily solved. This work proposes novel methods to improve the performance of today's handwriting recognition technology. An important part of the work develops methods for generating synthetic handwriting by computer, so that handwriting recognizers can be trained better on synthetically expanded sets of training samples. Two approaches are presented: the geometrical perturbation of existing human-written text images, and the generation of handwriting from ASCII transcriptions using ideal character templates and a motor model of handwriting generation. A number of experiments demonstrate that synthetic training set expansion can significantly improve the performance of current handwriting recognizers.
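
To make the idea of training set expansion by geometrical perturbation concrete, here is a minimal sketch in Python. It is an illustration only and not the perturbation model used in this work, which the abstract does not specify: a random horizontal shear (slant change) stands in for the actual distortions. The names `shear_image` and `expand_training_set`, the list-of-rows image format, and the parameters `n_variants` and `max_shear` are all assumptions introduced for this example.

```python
import random


def shear_image(img, shear):
    """Return a horizontally sheared copy of a binary image.

    img is a list of equal-length rows of 0/1 pixels (an assumed format).
    Each row is shifted in proportion to its height above the bottom row,
    which changes the slant of the writing. Pixels shifted outside the
    image are dropped.
    """
    h = len(img)
    w = len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        shift = round(shear * (h - 1 - y))  # the bottom row stays fixed
        for x in range(w):
            nx = x + shift
            if 0 <= nx < w:
                out[y][nx] = img[y][x]
    return out


def expand_training_set(samples, n_variants=3, max_shear=0.3, seed=0):
    """Add n_variants randomly sheared copies of every (image, label) pair.

    The label is unchanged because the perturbation alters only the
    appearance of the writing, not the text it represents.
    """
    rng = random.Random(seed)
    expanded = []
    for img, label in samples:
        expanded.append((img, label))  # keep the original sample
        for _ in range(n_variants):
            shear = rng.uniform(-max_shear, max_shear)
            expanded.append((shear_image(img, shear), label))
    return expanded
```

Calling `expand_training_set(samples)` with a list of `(image, label)` pairs returns the original samples together with their perturbed variants, which can then be passed to whatever recognizer training procedure is in use. A realistic setup would likely combine several distortion types and keep their strength small enough that the text stays legible.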