Representing and describing words flexibly with the dictionary application TshwaneLex

David Joffe
TshwaneDJe HLT

Gilles-Maurice de Schryver
Ghent University & TshwaneDJe HLT
gillesmaurice.deschryver@{,}

ASIALEX 2005: Words in Asian Cultural Contexts

Abstract

This paper describes the fully customisable, built-in DTD (Document Type Definition) editor of TshwaneLex. This powerful tool, which is based on XML (eXtensible Markup Language) standards, allows lexicographers to tailor the dictionary grammar of any project, and thus to truly represent and describe words flexibly.

The demands placed on modern dictionary databases

A lexicographer typically summarises each word’s analysis in the form of a meticulously constructed dictionary article. Thousands of such analyses are then brought together in reference works. Today’s reference works are electronic, and so are the databases underlying them. With representations and descriptions of words becoming increasingly multifaceted and interlinked, the demands placed on modern dictionary databases grow exponentially. This paper shows how one of the main challenges, namely that of providing lexicographers with a customisable DTD, was met in the dictionary application TshwaneLex.

TshwaneLex is a true hybrid in that it allows for the creation of monolingual, bilingual and semi-bilingual dictionaries for virtually any language, thanks to full Unicode support as well as support for the Windows Input Method Editors (“soft keyboards”). At the heart of TshwaneLex lies a fully customisable DTD with which the lexicographers themselves can modify the dictionary grammar, without the need for an IT expert. Although the mindset that creating and modifying a DTD is “something you give to your IT expert to do for you” is already present in the dictionary industry, TshwaneLex gives users the choice of doing this themselves if they want or need to.

Basics of hierarchical dictionary data modelling in TshwaneLex

The basic structure of each dictionary article is hierarchical. One lemma may contain several word senses, and each of those word senses may in turn contain sub-senses. Any of the senses or sub-senses may contain usage examples or MWUs (multi-word units), where the latter may in turn contain one or more senses, etc.

When creating a dictionary article, the lexicographer’s task can be seen as consisting of two essentially separate but closely related sub-tasks: (1) to specify the basic skeletal structure or layout of each dictionary article (represented as a Tree View in TshwaneLex), and (2) to flesh out that basic structure with content (which may be entered in or selected from sub-windows accessible with the function keys F1 and F2 in TshwaneLex). Borrowing terminology from XML, the root and branches of the Tree View may
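
The hierarchy and the two sub-tasks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TshwaneLex's actual data model or file format: the element names (Entry, Sense, Subsense, Example, MWU), the GRAMMAR table, the helpers add_child, is_valid and build_sample_entry, and the sample entry for "run" are all hypothetical, chosen to mirror the lemma / sense / sub-sense / example / MWU nesting in the text. The GRAMMAR table stands in for a very small dictionary grammar of the kind a DTD would express.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary grammar (an assumption, not TshwaneLex's own DTD):
# each element name maps to the set of child elements it may contain.
GRAMMAR = {
    "Entry": {"Sense"},
    "Sense": {"Subsense", "Example", "MWU"},
    "Subsense": {"Example", "MWU"},
    "MWU": {"Sense"},      # a multi-word unit may again contain senses
    "Example": set(),      # leaf element: holds text only
}


def add_child(parent, tag, **attrs):
    """Attach a child element, refusing anything the grammar does not allow."""
    if tag not in GRAMMAR[parent.tag]:
        raise ValueError(f"<{tag}> is not allowed inside <{parent.tag}>")
    return ET.SubElement(parent, tag, attrs)


def is_valid(element):
    """Recursively check that every parent/child pair respects GRAMMAR."""
    allowed = GRAMMAR.get(element.tag)
    if allowed is None:
        return False
    return all(child.tag in allowed and is_valid(child) for child in element)


def build_sample_entry():
    # Sub-task 1: specify the skeletal structure of the article.
    entry = ET.Element("Entry")
    sense = add_child(entry, "Sense")
    example = add_child(sense, "Example")
    mwu = add_child(sense, "MWU")
    mwu_sense = add_child(mwu, "Sense")

    # Sub-task 2: flesh out that structure with (made-up) content.
    entry.set("lemma", "run")
    sense.set("definition", "to move quickly on foot")
    example.text = "She runs every morning."
    mwu.set("form", "run out of")
    mwu_sense.set("definition", "to have no more of something left")
    return entry


entry = build_sample_entry()
print(ET.tostring(entry, encoding="unicode"))
```

In this sketch, changing what an article may contain means editing the GRAMMAR table rather than the code that builds articles, which is the separation a user-editable DTD is meant to provide. Calling add_child(entry, "Example") raises a ValueError, because this hypothetical grammar only permits examples inside senses and sub-senses.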