Abstract:
The rule-based bootstrapping introduced by Yarowsky, and its
co-training variant by Blum and Mitchell, have met with
considerable empirical success. Earlier work on the theory of
co-training has been only loosely related to empirically useful
co-training algorithms. Here we give a new PAC-style bound on
generalization error which justifies both the use of confidences
-- partial rules and partial labeling of the unlabeled data --
and the use of an agreement-based objective function as suggested
by Collins and Singer. Our bounds apply to the multiclass case,
i.e., where instances are to be assigned one of k labels for k ≥ 2.