Abstract:
Structure in a visual scene can be described at many levels
of granularity. At a coarse level, the scene is composed of
objects; at a finer level, each object is made up of parts, and the
parts of subparts. In this work, I propose a simple principle by
which such hierarchical structure can be extracted from visual
scenes: Regularity in the relations among different parts of an
object is weaker than in the internal structure of a part. This
principle can be applied recursively to define part-whole
relationships among elements in a scene. The principle does not
make use of object models, categories, or other sorts of
higher-level knowledge; rather, part-whole relationships can be
established based on the statistics of a set of sample visual
scenes. I illustrate with a model that performs unsupervised
decomposition of simple scenes. The model can account for the
results from a human learning experiment on the ontogeny of
part-whole relationships.
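
To make the principle concrete, the following is a minimal sketch, not the model described in this work. It assumes that each scene is a tuple of binary feature activations, that "regularity" can be approximated by the absolute Pearson correlation between features across sample scenes, and that the recursive grouping can be approximated by greedy average-linkage merging. The names `correlation`, `build_hierarchy`, and the toy features `a` through `f` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def build_hierarchy(scenes, names):
    """Greedily merge the two most regular groups until one group remains.

    Assumption: regularity between groups is the average |correlation|
    between their features. Returns a list of (members, strength) merges
    in the order they were made.
    """
    columns = {name: [s[i] for s in scenes] for i, name in enumerate(names)}
    sim = {frozenset(p): abs(correlation(columns[p[0]], columns[p[1]]))
           for p in combinations(names, 2)}

    def linkage(g, h):
        total = sum(sim[frozenset((x, y))] for x in g for y in h)
        return total / (len(g) * len(h))

    groups = [frozenset([name]) for name in names]
    merges = []
    while len(groups) > 1:
        g, h = max(combinations(groups, 2), key=lambda pair: linkage(*pair))
        merges.append((g | h, linkage(g, h)))
        groups = [x for x in groups if x not in (g, h)] + [g | h]
    return merges


# Toy sample scenes (hypothetical data). Features a, b always co-occur (one
# part), as do c, d and e, f. Parts (a, b) and (c, d) co-occur in most but not
# all scenes (parts of one object); (e, f) is independent of both.
pairs = [(1, 1)] * 3 + [(0, 0)] * 3 + [(1, 0), (0, 1)]
scenes = [(p1, p1, p2, p2, p3, p3) for p1, p2 in pairs for p3 in (0, 1)]
names = ["a", "b", "c", "d", "e", "f"]

merges = build_hierarchy(scenes, names)
for members, strength in merges:
    print(sorted(members), round(strength, 2))
```

In this toy setting the strongest regularities are merged first and form the parts; the weaker regularity between two parts then forms an object; and the independent part joins only at the top of the hierarchy. The merge strength therefore decreases from part to whole, which is the ordering the principle predicts.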