Abstract:
We prove that the Canonical Distortion Measure (CDM) is the
optimal distance measure to use for 1-nearest-neighbour (1-NN)
classification, and show that it reduces to squared Euclidean
distance in feature space for function classes that can be
expressed as linear combinations of a fixed set of features.
PAC-like bounds are given on the sample complexity required to
learn the CDM. An experiment is presented in which a neural network
CDM was learnt for a Japanese OCR environment and then used for
1-NN classification.
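
The reduction to squared Euclidean distance for linear function classes can be sketched as follows. This is an illustrative derivation under stated assumptions, not a reproduction of the paper's proof: it assumes the CDM between two inputs is the expected loss between f(x) and f(x') under a distribution Q over the functions in the environment, that the loss is the squared loss, and that the weights have identity second moment. The symbols \rho, Q, \sigma, \phi_i, \Phi and w are introduced here for illustration only.

```latex
\begin{align*}
% Assumed definition of the CDM, with squared loss
\rho(x, x') &= \int_{\mathcal{F}} \sigma\bigl(f(x), f(x')\bigr)\, dQ(f),
  \qquad \sigma(y, y') = (y - y')^2, \\
% Function class: linear combinations of k fixed features
f_w(x) &= \sum_{i=1}^{k} w_i\, \phi_i(x) = w^\top \Phi(x), \\
% Substitute and pull the feature difference out of the expectation
\rho(x, x') &= \mathbb{E}_{w \sim Q}\Bigl[\bigl(w^\top (\Phi(x) - \Phi(x'))\bigr)^2\Bigr] \\
  &= \bigl(\Phi(x) - \Phi(x')\bigr)^\top \, \mathbb{E}_{w \sim Q}\bigl[w w^\top\bigr] \, \bigl(\Phi(x) - \Phi(x')\bigr) \\
% Assumption: E[w w^T] = I
  &= \bigl\lVert \Phi(x) - \Phi(x') \bigr\rVert^2 .
\end{align*}
```

If the second moment is instead a general positive-definite matrix with factorisation L L^T, the same expression is a squared Euclidean distance in the linearly transformed features L^T \Phi(x).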