MIT CogNet, The Brain Sciences ConnectionFrom the MIT Press, Link to Online Catalog
SPARC Communities
Subscriber : Stanford University Libraries » LOG IN

space

Powered By Google 
Advanced Search

 

Agnostic Classification of Markovian Sequences

 Ran El-Yaniv, Shai Fine and Naftali Tishby
  
 

Abstract:
Classification of finite sequences without explicit knowledge of their statistical origins is of great importance to various applications, such as text categorization and biological sequence analysis. We propose a new information theoretic algorithm for this problem based on two important ingredients. The first, motivated by the "two sample problem", provides an (almost) optimal model independent criterion for comparing two statistical sources. The second ingredient is a novel idea for agnostic "statistical merging" of sequences in an asymptotically optimal way.

In this paper we introduce the algorithm and illustrate its application for synthesis of text sources and for hierarchical classification of short text segments.

 
 


© 2010 The MIT Press
MIT Logo