Baseline Structure Analysis of Handwritten Mathematics Notation
Richard Zanibbi
Dorothea Blostein
James R. Cordy
Department of Computing & Information Science,
Queen's University, Kingston, Ontario, Canada
{zanibbi, blostein, cordy}@cs.queensu.ca
Abstract
The structure of mathematics notation is particularly
difficult to recognize in handwritten notation because irregular
symbol placements are common. We present an efficient and
robust method of parsing handwritten and typeset mathematics
notation without backtracking. The system is designed to be
easily adaptable to various dialects of mathematics notation.
The following strategies are used: (1) separate the analysis of
layout, syntax, and semantics; (2) recursively apply search
functions and image partitioning to recognize dominant and
nested baselines; and (3) use tree transformations to express
computations in a compact, efficiently executable form.
1. Introduction
Mathematics notation conveys information using a
two-dimensional arrangement of symbols. Recognition software
must analyze this spatial structure in order to convert from a
document image to a structural representation such as LaTeX or
a semantic representation such as an operator tree or Maple.
However, it is difficult to define robust, general and efficient
methods for analyzing the spatial structure of mathematics
notation. This problem is particularly difficult in handwritten
mathematics notation (obtained from scanned document images,
or from data tablet input), where irregular placement of symbols
is common.
1.1 Summary of Existing Work
Research into automatic recognition of mathematical
expressions has been ongoing for over thirty years [3,5].
Methods developed for recognizing the two-dimensional layout
of symbols in a math expression can be roughly categorized into
syntactic (grammar-based) and algorithmic approaches.
Syntactic methods have been used extensively, including
coordinate grammars [1,25], attributed string grammars
[2,12,13,14,34], stochastic grammars [10,26], structure
specification schemes [6], and graph transformation
[17,22,23,28]. Algorithmic approaches have included
recursively locating vertically stacked groups of symbols using
procedural [24] and blackboard-style methods [15,30], recursive
baseline location [20], and minimization of penalty functions on
symbol relations [16].
Another algorithmic approach, projection profile cutting with
subsequent adjustments, has been used to obtain expression
structure directly from pixel maps [18,27,29]. Ambiguities of
symbol layout and identity have been handled by constructing
multiple interpretations and then eliminating unsyntactic [31]
or unlikely [26] interpretations.
0-7695-1263-1/01 $10.00 © 2001 IEEE
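As an illustration of the projection profile cutting idea mentioned above, the following is a minimal sketch, not the method of any of the cited systems. It assumes a binary image stored as a list of rows of 0/1 values; the names `runs` and `xy_cut` and the box layout `(top, bottom, left, right)` are hypothetical. The "subsequent adjustments" that the cited work applies after cutting are not modelled here.

```python
def runs(profile):
    """Return (start, end) pairs of maximal nonzero runs; end is exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def xy_cut(img, box, vertical=True, stuck=False):
    """Recursively split a region at blank gaps in its projection profile.

    box is (top, bottom, left, right) with exclusive bottom/right.
    Cut directions alternate; a region is a leaf when neither
    direction splits or trims it. Returns the list of leaf boxes.
    """
    box = tuple(box)
    top, bottom, left, right = box
    if vertical:
        # Column sums: gaps of zeros separate side-by-side blocks.
        prof = [sum(img[r][c] for r in range(top, bottom))
                for c in range(left, right)]
        parts = [(top, bottom, left + a, left + b) for a, b in runs(prof)]
    else:
        # Row sums: gaps of zeros separate vertically stacked blocks.
        prof = [sum(img[r][c] for c in range(left, right))
                for r in range(top, bottom)]
        parts = [(top + a, top + b, left, right) for a, b in runs(prof)]
    if not parts:
        return []  # region contains no ink
    if parts == [box]:
        if stuck:
            return [box]  # no cut possible in either direction
        return xy_cut(img, box, not vertical, stuck=True)
    leaves = []
    for part in parts:
        leaves.extend(xy_cut(img, part, not vertical))
    return leaves


# Toy image: two stacked marks on the left, one tall mark on the right.
image = [
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
]
boxes = xy_cut(image, (0, 3, 0, 5))
print(boxes)  # [(0, 1, 0, 2), (2, 3, 0, 2), (0, 3, 4, 5)]
```

The first vertical cut separates the left column of marks from the tall mark on the right; the horizontal cut then separates the two stacked marks, which is the kind of nesting a fraction or a limit expression would produce.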
We obtain two insights from this literature. First, almost all
authors use trees to describe the spatial structure of mathematics
notation. In many cases the tree is an explicit data structure; in
other cases an implicit parse tree is created. Second,
mathematical expressions have a preferred direction of
interpretation, as used by human readers; this directionality can
be exploited by a recognition system [1,14,25].
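The two insights can be made concrete with a small sketch. It is an assumption-laden illustration, not the authors' data structure: the class `Symbol`, the region labels `"super"` and `"sub"`, and the function `to_string` are hypothetical names. A baseline is modelled as a left-to-right list of symbols, and each symbol may carry nested baselines in named regions, so the whole expression forms a tree that is read in one direction.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Symbol:
    name: str
    # Region label -> nested baseline (symbols in left-to-right order).
    regions: Dict[str, List["Symbol"]] = field(default_factory=dict)


def to_string(baseline: List[Symbol]) -> str:
    """Read a baseline left to right, descending into nested baselines."""
    out = []
    for sym in baseline:
        text = sym.name
        if "super" in sym.regions:
            text += "^{" + to_string(sym.regions["super"]) + "}"
        if "sub" in sym.regions:
            text += "_{" + to_string(sym.regions["sub"]) + "}"
        out.append(text)
    return " ".join(out)


# Dominant baseline "x + y", with a superscript on x and a subscript on y.
expr = [
    Symbol("x", {"super": [Symbol("2")]}),
    Symbol("+"),
    Symbol("y", {"sub": [Symbol("i")]}),
]
print(to_string(expr))  # x^{2} + y_{i}
```

Because the traversal visits each symbol once, in reading order, producing the LaTeX-like string requires no backtracking over the tree.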