Friday, November 27, 2015

Project Reading 3: Online and off-line handwriting recognition: a comprehensive survey

Citation

Plamondon, Réjean, and Sargur N. Srihari. "Online and off-line handwriting recognition: a comprehensive survey." Pattern Analysis and Machine Intelligence, IEEE Transactions on 22.1 (2000): 63-84.

PDF


Summary

This survey paper provides a broad view of online and offline handwriting recognition techniques, covering the state of the art as well as techniques specific to certain domains. For offline handwriting recognition, the paper discusses a variety of preprocessing steps: thresholding, noise removal, line segmentation (an important problem already discussed thoroughly in previous readings), and word and character segmentation. Then comes the character recognition step, which can be divided into two main areas. The first is OCR (Optical Character Recognition), which treats the input character as an image rather than a set of sketched points and performs recognition on the image. The second is sketch-based recognition, which takes the input character as a set of points with x, y coordinates and possibly a timestamp.

Discussion

In the online character recognition space, the problem is even more complex because the system has to process the input character and return the recognition result within a fixed time frame to be usable in a real-time application. Many structural and rule-based methods have been explored in this area, some of which were discussed in previous readings, such as the papers by Rubine and Long. Another family is statistical methods, which treat the sketch as a sequence that reveals more information as more strokes are drawn and use this information to make a prediction. Markov models have been used to model this process.

Thursday, November 26, 2015

Project Reading 2: Template-based online character recognition

Citation

Connell, Scott D., and Anil K. Jain. "Template-based online character recognition." Pattern Recognition 34.1 (2001): 1-14.

PDF


Summary

The paper presents and demonstrates a template-based system for online character recognition in which the number of representative templates is determined automatically. These templates can be viewed as representing different styles of writing a particular character. The templates are then used to classify characters, in particular using decision trees. Overall, the system achieved an accuracy of 86.9% on a dataset of around 18,000 characters, which included the 26 lowercase letters and the 10 digits.

Discussion

The input data was resampled to produce points that are equidistant in space rather than in time. Gaussian filtering was then applied to both the x and y coordinates. The paper discusses two methods for classification: nearest neighbor classification and decision tree based classification. It uses a custom distance metric to compute the distance between an input sketch and a set of template sketches.
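The two preprocessing steps above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the linear interpolation along arc length, and the `sigma` and `radius` defaults are all assumptions made for the example.

```python
import math

def resample_equidistant(points, n):
    """Resample a stroke to n points spaced evenly along its arc length."""
    if n < 2 or len(points) < 2:
        return list(points)
    # Cumulative arc length at each input point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [points[0]] * n
    out = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        x = points[seg][0] + t * (points[seg + 1][0] - points[seg][0])
        y = points[seg][1] + t * (points[seg + 1][1] - points[seg][1])
        out.append((x, y))
    return out

def gaussian_smooth(values, sigma=1.0, radius=2):
    """Smooth one coordinate sequence with a truncated Gaussian kernel.

    sigma and radius are illustrative defaults, not values from the paper.
    """
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    out = []
    for i in range(len(values)):
        acc = norm = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(values):  # renormalize at the stroke ends
                acc += w * values[j]
                norm += w
        out.append(acc / norm)
    return out
```

The smoothing is applied separately to the x sequence and the y sequence of the resampled stroke.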

Wednesday, November 25, 2015

Project Reading 1: Online handwriting recognition: the NPen++ recognizer

Citation

Jaeger, Stefan, et al. "Online handwriting recognition: the NPen++ recognizer." International Journal on Document Analysis and Recognition 3.3 (2001): 169-180.

PDF


Summary

The paper presents an online handwriting recognition system called NPen++. The recognition engine is based on a Multi-State Time Delay Neural Network (MS-TDNN). The recognition accuracy was found to be about 96 percent for a dictionary of 5,000 words and around 93 percent for a dictionary of around 20,000 words. The preprocessing stage has several steps: normalizing size, normalizing rotation, interpolating missing points, smoothing, normalizing inclination, resampling, and removing delayed strokes.

Discussion

The features computed for recognition are writing direction, curvature, pen-up/pen-down times, the hat feature, aspect, curliness, lineness, slope, ascenders/descenders, and context bitmaps. These features are the input to the Multi-State Time Delay Neural Network (MS-TDNN). The system was evaluated on several datasets from UKA, CMU and MIT, which included both printed and cursive writing.
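As a rough illustration of two of the local features listed above, writing direction and curvature can each be encoded as the cosine and sine of an angle between neighboring points. This is a common encoding for such features; the function names and the choice of immediate neighbors are assumptions, and the exact formulation in NPen++ may differ.

```python
import math

def writing_direction(p_prev, p_next):
    """Return (cos, sin) of the direction from p_prev to p_next."""
    dx = p_next[0] - p_prev[0]
    dy = p_next[1] - p_prev[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)  # coincident points: direction undefined
    return (dx / d, dy / d)

def curvature(p_prev, p, p_next):
    """Return (cos, sin) of the turning angle at point p."""
    c1, s1 = writing_direction(p_prev, p)
    c2, s2 = writing_direction(p, p_next)
    # Angle-difference identities for cos(b - a) and sin(b - a).
    return (c1 * c2 + s1 * s2, c1 * s2 - s1 * c2)
```

A straight segment gives a curvature of (1, 0), and a 90 degree left turn gives (0, 1).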


Sunday, November 15, 2015

Reading 27: A Visual Approach to Sketched Symbol Recognition

Citation

Ouyang, Tom Y., and Randall Davis. "A visual approach to sketched symbol recognition." (2009).



Summary

The paper presents an image-based method for sketch recognition. Earlier methods have used geometric or gesture-based features for sketch recognition. This paper applies ideas from computer vision to sketch recognition. It presents 5 novel image-based features and an efficient distance metric for recognition, which together improve accuracy significantly compared to state-of-the-art systems.
The intuition is that humans perceive sketches more as images than as stroke points and their geometric properties. The authors present 5 new features:
1) 4 features based on the stroke direction with respect to 4 reference angles (0, 45, 90 and 135 degrees). The feature values are calculated as the difference between the stroke angle and the reference angle, and vary linearly between 1.0 (if the two are equal) and 0.0 (if they differ by more than 45 degrees)
2) 1 feature based on the endpoints of the stroke. It is equal to 1.0 if the point is at the beginning or end of a stroke and 0.0 otherwise.
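The five per-point feature values described above can be computed as in the sketch below. The function names are hypothetical, and treating stroke orientation modulo 180 degrees (so that 0 and 180 are the same direction) is an assumption of this example.

```python
def direction_features(stroke_angle_deg):
    """Four orientation features for reference angles 0, 45, 90, 135 degrees."""
    refs = (0.0, 45.0, 90.0, 135.0)
    feats = []
    for ref in refs:
        # Smallest angular difference, treating orientation modulo 180 (assumption).
        diff = abs(stroke_angle_deg - ref) % 180.0
        diff = min(diff, 180.0 - diff)
        # 1.0 when equal, falling linearly to 0.0 at a 45 degree difference.
        feats.append(max(0.0, 1.0 - diff / 45.0))
    return feats

def endpoint_feature(index, stroke_length):
    """1.0 if the point is the first or last point of its stroke, else 0.0."""
    return 1.0 if index in (0, stroke_length - 1) else 0.0
```

For example, a point whose stroke angle is 22.5 degrees contributes 0.5 to both the 0 degree and the 45 degree features and 0.0 to the other two.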

The overall process is as follows:


Discussion

The distance metric the authors use to find the distance between an input sketch and a template sketch is as follows:


dx represents the shift of a point inside a 3x3 box. The minimum distance among these shifts is taken to be the distance for that x, y. This is done for all x, y boxes. The image below shows this:

The paper applies two pruning methods:
1) Coarse Candidate Pruning, based on taking the first k principal component features and using the Euclidean (L2) distance metric.
2) Hierarchical Clustering.
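The shift-tolerant distance described earlier in this discussion can be sketched as follows. This is a simplified version under stated assumptions: it compares single pixel values of one feature image, whereas the paper works on several feature images at once, and the function name and the `max_shift` parameter are made up for the example.

```python
def deformation_distance(img_a, img_b, max_shift=1):
    """Sum over all pixels of the best squared difference within a small shift.

    img_a and img_b are equally sized 2D lists of numbers. With max_shift=1
    each pixel of img_b is matched against a 3x3 neighborhood of img_a.
    """
    h, w = len(img_a), len(img_a[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            best = None
            for dy in range(-max_shift, max_shift + 1):
                for dx in range(-max_shift, max_shift + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        d = (img_a[yy][xx] - img_b[y][x]) ** 2
                        if best is None or d < best:
                            best = d
            total += best  # the zero shift is always valid, so best is set
    return total
```

Two images that differ only by a one-pixel shift get a distance of 0 here, while a plain pixel-wise L2 distance would penalize them. That tolerance to small local deformations is the point of the metric.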

Saturday, November 14, 2015

Reading 26: Envisioning sketch recognition: a local feature based approach to recognizing informal sketches

Citation

Oltmans, Michael. Envisioning sketch recognition: a local feature based approach to recognizing informal sketches. Diss. Massachusetts Institute of Technology, 2007.

PDF


Summary

This thesis presents a novel way to recognize sketches, not by their geometric features but by a new type of feature that the author calls the bullseye feature. The idea is to overlay the input sketch on a bullseye shape made of concentric circles and count the number of ink points in each part of the shape. Shapes are then matched by comparing how close the point counts are in each part of the bullseye. The bullseye features look like these:
The bullseye feature is the number of ink points lying in each part of the bullseye. The radius of the concentric circles increases on a log scale, so the inner parts are smaller and the outer parts are larger. The intuition is that the central parts of a sketch are much more important than the outer parts.

Discussion

By calculating the bullseye's histogram relative to the stroke direction, instead of relative to the x-axis, the bullseyes are rotationally invariant and do not change when the shape is drawn at a different orientation.
Each part is also divided into 4 sub-bins, each containing the number of points at a particular stroke angle. Doing this encodes some sense of direction in the bullseye features. The bins look like this:
The distance metric used to measure the closeness of two bullseye features is weighted such that inner circles are weighted higher than the outer circles, thereby compensating for the larger size of the outer bins. The distance metric is a modified version of the common chi-squared (χ²) distance.
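For reference, the common chi-squared distance between two histograms is sketched below, with an optional per-bin weight to show where an inner-versus-outer weighting could enter. The thesis's exact modification is not reproduced here; the `weights` parameter is a hypothetical stand-in for it.

```python
def chi_squared_distance(p, q, weights=None):
    """Chi-squared distance between two flattened histograms p and q.

    weights is an optional per-bin multiplier (hypothetical; used here only
    to illustrate weighting inner bins higher than outer bins).
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if weights is None:
        weights = [1.0] * len(p)
    total = 0.0
    for pi, qi, w in zip(p, q, weights):
        denom = pi + qi
        if denom > 0:  # bins that are empty in both histograms contribute 0
            total += w * (pi - qi) ** 2 / denom
    return total
```

Some formulations multiply the sum by 1/2; that constant factor does not change which template is closest.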
The shapes are represented in the form of Match Vectors in the Codebook.

Reading 25: Who dotted that 'i'?: context free user differentiation through pressure and tilt pen data

Citation

Eoff, Brian David, and Tracy Hammond. "Who dotted that 'i'?: context free user differentiation through pressure and tilt pen data." Proceedings of Graphics Interface 2009. Canadian Information Processing Society, 2009.

PDF


Summary

The paper presents a novel way of distinguishing who is using the pen on a sketch input surface, using features based on pen tilt, pressure and speed. The paper presents two experiments:
1) The first experiment shows that tilt, pressure and speed features are consistent for a user within a given sketch and also remain consistent over time, across hours or days.
2) The second experiment tries to match a given set of tilt, pressure and speed features to a unique user. The paper uses a k-nearest-neighbor classifier and reports an accuracy of 97.5 percent when distinguishing between two users and 83.5 percent when distinguishing among ten users simultaneously.
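The classification step in the second experiment can be pictured with a plain k-nearest-neighbor vote, as in the sketch below. The feature vectors, the Euclidean distance, and the value of `k` are assumptions for illustration; the paper's actual features and settings are not reproduced.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the user label for a query feature vector by majority vote.

    train is a list of (feature_vector, user_label) pairs. Distances are
    Euclidean, which is an assumption of this sketch.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: two hypothetical users with (tilt, pressure) style features.
samples = [
    ((0.10, 0.20), "user_a"),
    ((0.20, 0.10), "user_a"),
    ((0.15, 0.15), "user_a"),
    ((0.90, 0.80), "user_b"),
    ((0.80, 0.90), "user_b"),
]
print(knn_predict(samples, (0.0, 0.0)))
```

In practice the features would usually be scaled to comparable ranges first, since tilt, pressure and speed are measured in different units.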

Discussion

The paper presents 24 features based on pen tilt, pressure and speed. 14 are based on pen tilt, 7 based on pressure and 3 based on speed. The features used in the paper are as follows:

The authors found that accuracy decreases as the number of users to distinguish increases.

Sunday, November 8, 2015

Reading 24: SketchREAD: a multi-domain sketch recognition engine

Citation

Alvarado, Christine, and Randall Davis. "SketchREAD: a multi-domain sketch recognition engine." Proceedings of the 17th annual ACM symposium on User interface software and technology. ACM, 2004.

PDF

Summary

SketchREAD (Sketch Recognition Engine for Any Domain) is a multi-domain sketch recognition system capable of understanding freely drawn, messy, two-dimensional diagrammatic sketches. The system provides a way for users to describe shapes, and these descriptions are used to set up the appropriate recognizers. The system builds a Bayesian network and assigns a probability to each interpretation it produces.

Discussion

The system models each shape by a set of rules, and these rules can be used to recognize objects later. This is much more intuitive than earlier methods, since writing a domain-specific recognizer for each domain can be a very tedious task. Instead, the system lets the user describe the shapes. An example showing the pattern used to describe a new shape:

Reading 23: A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition

Citation

Rabiner, Lawrence R. "A tutorial on hidden Markov models and selected applications in speech recognition." Proceedings of the IEEE 77.2 (1989): 257-286.

PDF


Summary

The paper presents an in-depth overview of Hidden Markov Models (HMMs) and then gives a real-world example of their use in a speech recognition application. In a Hidden Markov Model, the system being modeled is assumed to be a Markov process with unobserved (hidden) states. An HMM can be viewed as the simplest dynamic Bayesian network. For example, an HMM with 4 states and 3 observed events is shown below (probabilities are not shown along the edges):


Discussion

The forward procedure is a method in which we compute the probabilities of the observed events by moving forward through the sequence of events. At each step we calculate the probabilities for the current event by multiplying them with the probabilities carried over from past events. In the related Viterbi algorithm, at each stage we keep only the most probable path into each state and prune the paths that cannot be the maximum. The backward procedure does the same thing in the reverse direction, starting from the last event and moving back toward earlier events.
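A minimal sketch of the forward recursion is shown below. It uses plain dictionaries for the model parameters; the parameter names and the two-state toy model are assumptions made for the example, and no scaling is applied, so it is only suitable for short sequences.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return the probability of an observation sequence under an HMM.

    start_p[s]      : probability of starting in state s
    trans_p[r][s]   : probability of moving from state r to state s
    emit_p[s][o]    : probability of observing o while in state s
    """
    # Initialization: start in each state and emit the first observation.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    # Induction: combine past probabilities with transitions and emissions.
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    # Termination: sum over all states the sequence could end in.
    return sum(alpha.values())

# Toy model (hypothetical numbers): each state emits exactly one symbol.
states = ("A", "B")
start_p = {"A": 0.5, "B": 0.5}
trans_p = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}
emit_p = {"A": {"x": 1.0, "y": 0.0}, "B": {"x": 0.0, "y": 1.0}}
print(forward(["x", "y"], states, start_p, trans_p, emit_p))  # prints 0.25
```

Replacing the `sum` over previous states with `max` turns the same recursion into the scoring step of the Viterbi algorithm, which is where the pruning of non-maximal paths happens.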