MIT CogNet, The Brain Sciences ConnectionFrom the MIT Press, Link to Online Catalog
SPARC Communities
Subscriber : Stanford University Libraries » LOG IN

space

Powered By Google 
Advanced Search

 

The Parallel Problems Server: an Interactive Tool for Large Scale Machine Learning

 Charles Lee Isbell, Jr. and Parry Husbands
  
 

Abstract:
Imagine that you wish to classify data consisting of tens of thousands of examples residing in a twenty thousand dimensional space. How can one apply standard machine learning algorithms? We describe the Parallel Problems Server (PPServer) and MATLAB*P. In tandem they allow users of networked computers to work transparently on large data sets from within Matlab. This work is motivated by the desire to bring the many benefits of scientific computing algorithms and computational power to machine learning researchers. We demonstrate the usefulness of the system on a number of tasks. For example, we perform {\em independent components analysis} on very large text corpora consisting of tens of thousands of documents, making minimal changes to the original Bell and Sejnowski Matlab source. Applying ML techniques to data previously beyond their reach leads to interesting analyses of both data and algorithms.

 
 


© 2010 The MIT Press
MIT Logo