Saturday, December 12, 2015

Project Reading 8: Automatic License Plate Recognition

Citation

Chang, Shyang-Lih, et al. "Automatic license plate recognition." Intelligent Transportation Systems, IEEE Transactions on 5.1 (2004): 42-53.

PDF

Summary

License plate recognition is an important application of optical character recognition. It is used at automated toll booths, gate entries, and many other points where vehicles are authenticated by their number plates. The key difference from sketch recognition techniques is that the input is never available as a sketch or a set of stroke points; it is always a captured image of the vehicle. The task of number recognition is therefore divided into two phases:
1) Locating the number plate of a vehicle in a given capture of the car.
2) Predicting the number present on the number plate.

The overall process flowchart is as follows:

[Figure: overall process flowchart]

Once the license plate image has been obtained, it goes through preprocessing steps. The image is first converted to a binary image, i.e. only black and white pixels are kept. Character segmentation is then performed, which yields images of the individual characters from which useful features can be extracted.
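The two preprocessing steps can be illustrated with a minimal Python sketch. It assumes a fixed global threshold and a simple vertical-projection split, with dark characters on a light plate; the function names `binarize` and `segment_columns` are made up for this example, and the paper's own binarization and segmentation procedures may well differ.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 for dark ink, 0 for background.

    The fixed threshold is an assumption made for this sketch.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a character begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width))    # character touches the right edge
    return segments
```

Each returned column range can then be cropped out of the binary image to get one character image. Touching or broken characters would need more than a blank-column test.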

Discussion

The first step in feature extraction is constructing the contour of the character. Once the contour is formed, it is used to generate a set of points spaced equidistantly along the contour lines. These points are then used to match the features against a set of templates. A Kohonen self-organizing (SO) neural model is used for the classification step.

[Figure: example of feature extraction and template matching]
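Placing points at equal spacing along a contour can be sketched as an arc-length resampling of a polyline. This is only an illustration under assumptions: the contour is given as an ordered list of (x, y) vertices, it is treated as open (for a closed contour, repeat the first vertex at the end), and the name `resample_equidistant` is hypothetical rather than taken from the paper.

```python
import math


def resample_equidistant(points, n):
    """Return n points spaced at equal arc length along an open polyline."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0:
        raise ValueError("polyline has zero length")
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        # Linear interpolation inside the segment.
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

For example, `resample_equidistant([(0, 0), (4, 0), (4, 4)], 5)` walks an L-shaped path of length 8 and places a point every 2 units.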


Project Reading 7: A study on the use of 8-directional features for online handwritten Chinese character recognition

Citation

Bai, Zhen-Long, and Qiang Huo. "A study on the use of 8-directional features for online handwritten Chinese character recognition." Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on. IEEE, 2005.

PDF

Summary

This paper presents a new approach to Chinese character recognition that uses directional features only. The input sketch is passed through several preprocessing steps: linear size normalization, adding imaginary strokes, nonlinear shape normalization, equidistance resampling, and smoothing. After these preprocessing steps, a 64x64 normalized character sample is obtained. Eight directional features are then computed from each online trajectory point, and feature values are computed at 8x8 uniformly sampled locations using a filter similar to the Gaussian envelope of a Gabor filter. This gives a feature vector of 8 x 8 x 8 = 512 values, which is then used for classification. The system is tested extensively on the 3755 level-1 Chinese characters in the GB2312-80 standard.
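To make the 512-value count concrete, here is a deliberately simplified Python sketch. It hard-assigns each trajectory segment to one of 8 direction bins and one of 8x8 grid cells, and it leaves out the Gaussian-like blurring filter entirely, so it is not the feature extraction used in the paper. The constants and the name `directional_histogram` are assumptions for illustration.

```python
import math

GRID = 8         # 8x8 spatial cells
DIRECTIONS = 8   # direction bins of 45 degrees each
SIZE = 64        # the normalized character is SIZE x SIZE


def directional_histogram(points):
    """points: list of (x, y) with coordinates in [0, SIZE).

    Returns a flat list of 8 * 8 * 8 = 512 counts.
    """
    features = [0.0] * (GRID * GRID * DIRECTIONS)
    cell = SIZE // GRID
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # no direction for a repeated point
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        # Nearest of the 8 directions (0 = +x, 2 = +y, and so on).
        d = int(round(angle / (math.pi / 4))) % DIRECTIONS
        col = min(int(x0) // cell, GRID - 1)
        row = min(int(y0) // cell, GRID - 1)
        features[(row * GRID + col) * DIRECTIONS + d] += 1.0
    return features
```

The layout is row-major over cells with the 8 direction bins stored contiguously per cell, which is just one possible ordering.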

Discussion

The overall flowchart of the process described above:

[Figure: overall process flowchart]

The authors use two simple character classifiers at the classification step. The first one is a maximum discriminant function based classifier with a single prototype. The prototype is the mean of the training feature vectors, and the discriminant function is the negative Euclidean distance between a testing feature vector and the prototype feature vector.
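The single-prototype classifier described above can be sketched in a few lines of Python. The function names and the data layout (a list of `(label, feature_vector)` pairs) are assumptions for this example; only the rule itself (mean prototype, negative Euclidean distance, pick the maximum) comes from the description.

```python
import math


def train_prototypes(samples):
    """samples: iterable of (label, feature_vector). Returns label -> mean vector."""
    sums, counts = {}, {}
    for label, vec in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def discriminant(vec, prototype):
    """Negative Euclidean distance: larger means closer to the prototype."""
    return -math.dist(vec, prototype)


def classify(vec, prototypes):
    """Return the label whose prototype maximizes the discriminant function."""
    return max(prototypes, key=lambda label: discriminant(vec, prototypes[label]))
```

In the setting of the paper each feature vector would have 512 entries; the sketch works for any fixed length. `math.dist` requires Python 3.8 or later.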

Sunday, December 6, 2015

Project Reading 6: Improving Offline Handwritten Text Recognition with Hybrid HMM/ANN Models

Citation

Espana-Boquera, Salvador, et al. "Improving offline handwritten text recognition with hybrid HMM/ANN models." Pattern Analysis and Machine Intelligence, IEEE Transactions on 33.4 (2011): 767-779.

PDF


Summary

The paper presents a hybrid approach to optical character recognition. The idea is to use Hidden Markov Models (HMMs) to model the structural part of the optical input and a multilayer perceptron to estimate the classification probabilities. The paper also presents new methods for preprocessing the input images: slope correction, size normalization, and slant correction. The system was tested on the IAM database.

Discussion

The paper uses a hybrid of Hidden Markov Models and Artificial Neural Networks. First the images are preprocessed. The resulting feature vector, together with context vectors from the left and right of the current position, is then processed by a Multilayer Perceptron (MLP). The MLP outputs are used as emission probabilities in the HMMs.
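How MLP outputs can stand in for HMM emission probabilities can be sketched with a small Viterbi decoder. The sketch assumes the MLP posteriors are already computed and strictly positive, and it divides them by state priors to get scaled likelihoods, which is a common convention in hybrid systems and is assumed here rather than taken from this post. The function name and argument layout are hypothetical.

```python
import math


def viterbi(posteriors, priors, log_trans, log_init):
    """Most likely state path given MLP posteriors.

    posteriors[t][s]: MLP output for state s at frame t (must be > 0).
    priors[s]:        prior probability of state s (must be > 0).
    log_trans[r][s]:  log transition probability from state r to state s.
    log_init[s]:      log initial probability of state s.
    """
    n_states = len(priors)
    # Scaled log-likelihoods: log p(s | x_t) - log p(s).
    emit = [[math.log(p[s]) - math.log(priors[s]) for s in range(n_states)]
            for p in posteriors]
    score = [log_init[s] + emit[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(emit)):
        prev, new = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best = max(range(n_states), key=lambda r: score[r] + log_trans[r][s])
            prev.append(best)
            new.append(score[best] + log_trans[best][s] + emit[t][s])
        back.append(prev)
        score = new
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for prev in reversed(back):
        state = prev[state]
        path.append(state)
    return path[::-1]
```

In a full recognizer the states would belong to character HMMs and the posteriors would come from the trained MLP; here they are plain lists so the decoding step can be read in isolation.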

Thursday, December 3, 2015

Project Reading 5: Structural Offline Handwriting Character Recognition Using Levenshtein Distance

Citation

Putra, Made Edwin Wira, and Iping Supriana. "Structural Offline Handwriting Character Recognition Using Levenshtein Distance."

PDF


Summary

This recent paper discusses offline handwriting recognition using a new similarity metric. Earlier methods rely on preprocessing that is very expensive and uses a lot of computing resources. The paper reports significantly improved recognition accuracy without relying on normalization techniques. The similarity metric used is Levenshtein distance. The method was tested on digit and character images taken from the ETL-1 and AIST databases. Levenshtein distance gives an accuracy of 84.69% on digits and 67.01% on alphabetic characters.

Discussion

In the preprocessing step, the image is passed through thresholding, then thinning, and then slant correction. Features are extracted through curve extraction, string feature representation, and string graph representation. A string edit distance algorithm based on Levenshtein distance is then applied. The algorithm uses dynamic programming with a 2-D array to calculate the edit distance, which speeds up the computation. Levenshtein distance is the minimum number of edit operations required to change one string into another. The edit operations are insertion, substitution, and deletion.
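The 2-D array computation of Levenshtein distance is the textbook dynamic program, sketched below in Python with unit cost for every operation. How the paper encodes character structure as strings, and whether it weights the operations differently, is not covered here.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    # dist[i][j] = edit distance between a[:i] and b[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i          # delete all i symbols of a[:i]
    for j in range(cols):
        dist[0][j] = j          # insert all j symbols of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[-1][-1]
```

For example, `levenshtein("kitten", "sitting")` is 3: two substitutions and one insertion. The table has (len(a) + 1) x (len(b) + 1) cells, so time and memory both grow with the product of the two lengths.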

Tuesday, December 1, 2015

Project Reading 4: Historical review of OCR research and development

Citation

Mori, Shunji, Ching Y. Suen, and Kazuhiko Yamamoto. "Historical review of OCR research and development." Proceedings of the IEEE 80.7 (1992): 1029-1058.

PDF


Summary

This is a review paper focused specifically on OCR (Optical Character Recognition). OCR is a branch of character recognition in which the input character, which might originally be in the form of stroke points, is first converted into an image before recognition is performed. The paper discusses research and development in the OCR field and the commercial uses of such techniques.

Discussion

The origins of Optical Character Recognition (OCR) systems can be dated back to the 1930s, but OCR remained only a dream until the advent of computers in the 1950s. The main research started in the 1960s with the template matching approach. The second generation applied a structural analysis approach. The third generation of OCR brought many commercial applications of the academic research, particularly zip code recognition in postal services. Over time, OCR R&D moved towards word recognition using contextual knowledge such as addresses and names, and machines that read postal addresses and names came into use. This history shows that OCR technology has been researched for a long time. A lack of computing resources was a major hindrance to testing many techniques, but the situation improved with the advent of fast processors and computers.