MIT CogNet, The Brain Sciences Connection. From the MIT Press.

The Java Computational Linguistics Environment (JaCLE)

Duffy Gillman, William Lewis and D. Terence Langendoen, University of Arizona, Tucson
  
 

Abstract:
Research in computational linguistics requires the continual development and redevelopment of data sources and tools. A number of existing tools are theory-specific and force users to model within the constraints of a given theory. The Java Computational Linguistics Environment (JaCLE) is designed to give linguists tools for testing and modeling theoretical assumptions without forcing a commitment to a particular theory. JaCLE makes use of annotation graphs (Bird and Liberman, 1999) together with features and feature bundles as developed for the Text Encoding Initiative (Langendoen and Simons, 1995). These provide flexibility in modeling linguistic data and formalisms, in tandem with varying degrees of unification in the modeling of grammar productions.

The competition between the roles of the grammar and the lexicon is a major issue in the implementation of linguistic models, and it accounts for some of the most significant differences among syntactic theories. How much power is given to the grammar or to the lexicon bears on the efficacy of a given theory or implementation. Likewise, the degree of overlap between the grammar and the lexicon determines how much ambiguity an implementation permits. JaCLE allows the user to test varying degrees of unification on specific lexical and grammatical fragments. This was the major impetus for the design of JaCLE: it allows the authors to readily test specific theories or formalisms on small fragments of lexicon and grammar without developing a separate system for each theory.
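As a rough illustration of what unification over feature bundles involves, the following minimal sketch merges two flat bundles and fails when they disagree on a shared attribute. It is not taken from JaCLE: the class name `FeatureBundles`, the method `unify`, and the attribute names (`cat`, `num`, `per`) are all hypothetical, and real feature structures may be nested rather than flat.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; these names are assumptions, not JaCLE's API.
final class FeatureBundles {
    private FeatureBundles() {}

    /**
     * Unifies two flat feature bundles (attribute -> atomic value).
     * Returns the merged bundle, or an empty Optional when the two
     * bundles assign different values to the same attribute.
     */
    static Optional<Map<String, String>> unify(Map<String, String> a,
                                               Map<String, String> b) {
        Map<String, String> merged = new LinkedHashMap<>(a);
        for (Map.Entry<String, String> entry : b.entrySet()) {
            String existing = merged.get(entry.getKey());
            if (existing != null && !existing.equals(entry.getValue())) {
                return Optional.empty(); // value clash: unification fails
            }
            merged.put(entry.getKey(), entry.getValue());
        }
        return Optional.of(merged);
    }

    public static void main(String[] args) {
        // A lexical entry and two constraints a grammar production might impose.
        Map<String, String> lexicalNoun = Map.of("cat", "N", "num", "sg");
        Map<String, String> wantsSingular = Map.of("num", "sg", "per", "3");
        Map<String, String> wantsPlural = Map.of("num", "pl");

        System.out.println(unify(lexicalNoun, wantsSingular).isPresent());
        System.out.println(unify(lexicalNoun, wantsPlural).isPresent());
    }
}
```

In a scheme like this, the fewer attributes the grammar-side bundle specifies, the less it constrains the lexical entry, which is one simple way to picture shifting power between grammar and lexicon.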

Written in Java, within the framework of Open Source typically seen in the Linux environment, JaCLE is intended as a foundation for testing theories of syntax as applied to parsing, with flexibility provided by altering its grammar and lexicon. Because of its modular design and its framework built from a core set of primitive data types, namely segments and features, research at varying degrees of sophistication can use JaCLE for testing theories of syntax and parsing.
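One way to picture segments and features as primitives, loosely in the spirit of annotation graphs, is a set of labeled spans between numbered positions in the input. The sketch below is an assumption made for illustration: the classes `Segment` and `SegmentGraph`, their fields, and the method `spanning` are hypothetical and do not describe JaCLE's actual data types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class and member names are assumptions, not JaCLE's API.
final class Segment {
    final int start;                    // position where the span begins
    final int end;                      // position where the span ends
    final Map<String, String> features; // feature bundle labeling the span

    Segment(int start, int end, Map<String, String> features) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        this.start = start;
        this.end = end;
        this.features = Map.copyOf(features); // defensive, immutable copy
    }
}

final class SegmentGraph {
    private final List<Segment> arcs = new ArrayList<>();

    void add(Segment segment) {
        arcs.add(segment);
    }

    /** Returns every segment covering exactly the span start..end, in insertion order. */
    List<Segment> spanning(int start, int end) {
        List<Segment> hits = new ArrayList<>();
        for (Segment s : arcs) {
            if (s.start == start && s.end == end) {
                hits.add(s);
            }
        }
        return Collections.unmodifiableList(hits);
    }

    public static void main(String[] args) {
        // "the dog barks" with positions 0..3 between the words.
        SegmentGraph graph = new SegmentGraph();
        graph.add(new Segment(0, 1, Map.of("orth", "the", "cat", "Det")));
        graph.add(new Segment(1, 2, Map.of("orth", "dog", "cat", "N")));
        graph.add(new Segment(0, 2, Map.of("cat", "NP")));
        // An ambiguous word is simply two arcs over the same span.
        graph.add(new Segment(2, 3, Map.of("orth", "barks", "cat", "V")));
        graph.add(new Segment(2, 3, Map.of("orth", "barks", "cat", "N")));

        System.out.println(graph.spanning(2, 3).size());
    }
}
```

Because several arcs may share the same endpoints, competing analyses of one stretch of input can coexist in the graph until a grammar or lexicon rules some of them out.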

Bird, Steven and Mark Liberman. (1999) A Formal Framework for Linguistic Annotation. Technical Report MS-CIS-99-01, Department of Computer and Information Science, University of Pennsylvania.
Langendoen, D. Terence and Gary F. Simons. (1995) A Rationale for the TEI Recommendations for Feature-Structure Markup. Computers and the Humanities. V. 29. pp. 191-209.

 
 


© 2010 The MIT Press