Saturday, September 26, 2015

Reading 9: Gesture recognition features overview

Citation

Class handout (not for distribution).

Summary

The chapter covers the basic features which can be used to classify gestures or sketches. These features include the 13 features introduced by Rubine in the paper discussed in the previous reading. Apart from features introduced by Rubine 11 more features introduced by Long are mentioned in the chapter. These features are more or less derived or similar features from Rubine's features, such as Density Metric or Area of bounding box, etc. These extra features help in classifying more complex gestures but do not do much improvement in accuracy over Rubine's feature set.

Discussion

Certain features such as Area of bounding box can have dramatic effects in classification as area is a multiplicative feature and the delta's get multiplied. So a log of area is taken to map this dramatically changing feature to a less dramatic feature. Most of the features are derived features from Rubine's features. Example: One of the density metric is total stroke length divided by the length of the bounding box diagonal.

Friday, September 25, 2015

Reading 8: Specifying Gestures by Example

Citation

Rubine, Dean. Specifying gestures by example. Vol. 25. No. 4. ACM, 1991.

PDF

Summary

Rubine has 13 features which can be used to classify sketches. The 13 features are as follows:
1) Cosine of starting angle
2) Sine of starting angle
3) Length of bounding box diagonal
4) Angle of bounding box diagonal
5) Length of start to end point
6) Cosine of angle from start to end point
7) Sine of angle from start to end point
8) Stroke length
9) Total relative rotation
10) Total absolute rotation
11) Total squared rotation (sharpness)
12) Maximum speed
13) Total time

Discussion

A linear classifier is also specified in which each new class is trained by creating a new linear function. When doing classification, the function value is calculated for all possible classes and the class with the maximum function value is chosen as the selected class. This provides a very easy and efficient way to classify sketches but might be influenced hugely by a bad initial training example, thereby degrading the accuracy of classification step later.

Thursday, September 24, 2015

Reading 7: Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes

Citation

Wobbrock, Jacob O., Andrew D. Wilson, and Yang Li. "Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes."Proceedings of the 20th annual ACM symposium on User interface software and technology. ACM, 2007.

PDF

Summary

The paper introduces a relatively simple algorithm to classify gestures. This is intended for people who have little or no knowledge about pattern recognition. The algorithm as 3 preprocessing steps and one step for training or classification. The training can be done by a single example. The three preprocessing steps are:
1) Resampling to points to a set of 64 points.
2) Getting rotational invariance. This is done by making the angle between the first point and the centroid of the points to be zero.
3) The next step is to do uniform scaling and aligning the centroid with a reference point.

The last step is different for training or classification. In training the the positions of points is stored for later retrieval. During classification, average distance between the points is computed and a final similarity match from all possible classes is choose based on this distance. 

Discussion

The algorithm is called $1 algorithm because it is very cheap and easy. The code itself can be implemented in less than 100 lines of code. Also the idea behind eliminating rotational invariance is very interesting and works in most easy cases. Golden state search which is a variation of hill climbing algorithm is used to find the indicative rotation angle.

Wednesday, September 16, 2015

Reading 6: Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams

Citation

Bhat, Akshay, and Tracy Hammond. "Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams." IJCAI. Vol. 9. 2009.

PDF

Summary

Sketches in mechanical design or UML diagrams consist of both shapes and text. Existing systems are good at recognizing each separately but not together. The paper presents a single feature (zero order entropy) which can be used to distinguish between shapes and text with accuracy of 92.06%. The main underlying idea is that entropy (or randomness) of text strokes is much higher than shape strokes.

Discussion

The series of strokes are converted into a string of characters from (A, B, C, D, E, F, X) where A to F represent the angle a point in the stroke makes with the two neighboring points. A to F represent angles from 0 to pie. X for starting points.  This string of characters is pre-processed and entropy value is calculated. The entropy of text is visibly higher than other shapes. One question that arises is there could be shapes that are very complex, their entropy could also be higher.

Sunday, September 13, 2015

Reading 5: An image-based, trainable symbol recognizer for hand-drawn sketches

Citation

Kara, Levent Burak, and Thomas F. Stahovich. "An image-based, trainable symbol recognizer for hand-drawn sketches." Computers & Graphics 29.4 (2005): 501-517.

PDF

Summary

Paper presents a new way to recognize hand-drawn figures in a sketch. The paper uses image-based approach. Achieve rotational in-variance (using polar coordinates), which is difficult in template matching systems. Current systems achieve rotation in-variance but at high cost which is not suited for interactive use. The system requires single prototype example to learn an new symbol. 

Discussion

First the definitions which are markedly dissimilar using polar coordinates are pruned. The remaining are passed through 4 classifiers, the results of which are pooled and final decision is made. The 4 classifiers use Hausdorff Distance, Modified Hausdorff Distance, Tanimoto Similarity Coefficient and Yule Coefficient. The output of the 4 classifiers is different, they are converted into a similar unit of measurement, normalized and then combined.

Thursday, September 10, 2015

Reading 4: K-sketch: a 'kinetic' sketch pad for novice animators

Citation

Davis, Richard C., Brien Colwell, and James A. Landay. "K-sketch: a'kinetic'sketch pad for novice animators." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2008.

PDF

Summary

The paper presents a 2-D animation sketching system called K-sketch. Provides simple and fast interface. Comparison made with PowerPoint. Test participants showed work done three times faster and with less cognitive load. 

Discussion

The interface design was considered as an optimization problem. There are many operations that are supported such as rotating the sketch, moving it, transforming it, rotating it, etch. The final output is a playable video which shows the animation. I think such tools are not rich enough to replace professional animation tools where accuracy and a method to define the animation accurately is more needed than being able to describe the animation easily and vaguely.

Wednesday, September 9, 2015

Reading 3: iCanDraw? – Using Sketch Recognition and Corrective Feedback to Assist a User in Drawing Human Faces

Citation

Dixon, Daniel, Manoj Prasad, and Tracy Hammond. "iCanDraw: using sketch recognition and corrective feedback to assist a user in drawing human faces."Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2010.

PDF

Summary

System designed to help people draw human faces. Human face in an image is first pre-processed using facial recognition algorithms to extract major feature points of the face. These facial points are then used to provide suggestions and similarity scores to users. Huge emphasis is on teaching style and how to guide user to draw better images. The teaching style adopted in this paper is step-by-step instruction coupled with timely corrective feedback.

Discussion

The reference images is pre-processed to extract 40 facial features. These 40 point features are used to set a template which is used to match and evaluate the user's drawn sketch. A similarity measures is established by considering a sliding window of 4 points in the user drawn sketch and calculating the distance between the closest 4 points in the set template. The user is given feedback based on the similarity score.

Tuesday, September 8, 2015

Reading 2: Sketch Recognition Algorithms for Comparing Complex and Unpredictable Shapes

Citation

Field, Martin, et al. "Sketch recognition algorithms for comparing complex and unpredictable shapes." IJCAI. 2011.

PDF

Summary

This paper provides further insights into the algorithms discussed in Mechanix paper discussed in Reading 1. This paper more focuses on the algorithms used in Mechanix to compare two sketches. The three metrics used for comparison are Hausdorff Distance, Modified Hausdorff Distance and a variant of Tanimoto Coefficient.

Discussion

Hausdorff Distance gives the maximum of all the minimum distances between any two points in the two sketches being compared. Modified Hausdorff Distance takes the mean of Hausdorff distances from A to B and B to A. Then a match probability is calculated. The third metric is a modified version of Tanimoto Coefficient which takes into account the matching from sketch A to B as well from B to A. An average of the above three metrics is computed and if it is above a certain threshold (0.65) the sketches are considered to be matched.

Reading 1: Mechanix: A Sketch-Based Tutoring and Grading System for Free-Body Diagrams

Citation

Valentine, Stephanie, et al. "Mechanix: a sketch-based tutoring and grading system for free-body diagrams." AI Magazine 34.1 (2012): 55.


Summary

The paper introduces a tool named Mechanix which is used for teaching courses which require sketch based assignments. When number of students is very large, it becomes difficult for the instructor to review and grade individual's assignments. Also the final result doesn't really show if the student went through the right steps to come with the outcome. Mechanix provides instructors to create questions and expected sketched solution. Primary focus is drawing Truss and applying Forces and reaction forces at different points. From the students's sketch, first Truss is detected and forces are matched at different nodes. Then the student's response with instructor's expected solution using Hausdorff Distance, Modified Hausdorff Distance and Tanimoto Coefficient. Student's are also provided by a creative response if their response is wrong at certain step. Student has to go through all the required steps to complete the question.

Discussion

First the sketch is segmented into line segments. Then two nodes with shared edge and two paths between them are considered as Truss. To detect this Breadth First Search is used to first find the shortest path, and then if there exists any other path between the two nodes, that implies Truss is detected.