## Neural Computation

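From a smooth, strictly convex function Φ: R^{n} → R, a parametric family of divergence functions *D*_{Φ}^{(α)} may be introduced:

$$D_{\Phi}^{(\alpha)}(x, y) = \frac{4}{1-\alpha^{2}}\left[\frac{1-\alpha}{2}\,\Phi(x) + \frac{1+\alpha}{2}\,\Phi(y) - \Phi\!\left(\frac{1-\alpha}{2}\,x + \frac{1+\alpha}{2}\,y\right)\right]$$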

for *x*, *y* ∈ int dom(Φ) ⊂ R^{n} and for *α* ∈ R, with *D*_{Φ}^{(±1)} defined by taking the limit *α* → ±1. Each member is shown to induce an *α*-independent Riemannian metric, as well as a pair of dual *α*-connections, which are generally nonflat except for *α* = ±1. In the latter case, *D*_{Φ}^{(±1)} reduces to the (nonparametric) Bregman divergence, which is representable using Φ and its convex conjugate Φ* and becomes the canonical divergence for dually flat spaces (Amari, 1982, 1985; Amari & Nagaoka, 2000). This formulation based on convex analysis naturally extends the information-geometric interpretation of divergence functions (Eguchi, 1983) to allow the distinction between two different kinds of duality: referential duality (*α* ⟷ −*α*) and representational duality (Φ ⟷ Φ*). When applied to (not necessarily normalized) probability densities, the concept of conjugated representations of densities is introduced, so that ±*α*-connections defined on probability densities embody both referential and representational duality and are hence themselves bidual. When restricted to a finite-dimensional affine submanifold, the natural parameters of a certain representation of densities and the expectation parameters under its conjugate representation form biorthogonal coordinates. The alpha representation (now indexed by *β*, with *β* ∈ [−1, 1]) is shown to be the only measure-invariant representation. The resulting two-parameter family of divergence functionals *D*^{(α, β)}, (*α*, *β*) ∈ [−1, 1] × [−1, 1], induces identical Fisher information but bidual alpha-connection pairs; it reduces in form to Amari's alpha-divergence family when *α* = ±1 or when *β* = 1, but to the family of Jensen difference (Rao, 1987) when *β* = −1.
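
The referential duality and the Bregman limit described above can be checked numerically. The following is a minimal sketch, not the paper's own code. It assumes the family has the form *D*_{Φ}^{(α)}(*x*, *y*) = 4/(1 − *α*²) · [(1 − *α*)/2 · Φ(*x*) + (1 + *α*)/2 · Φ(*y*) − Φ((1 − *α*)/2 · *x* + (1 + *α*)/2 · *y*)], and it picks the negative entropy Φ(*x*) = Σ (*x*_i log *x*_i − *x*_i) on the positive orthant purely as an illustrative convex function. The names `phi`, `grad_phi`, `d_alpha`, `bregman` and the sample points are hypothetical.

```python
import math

def phi(x):
    # Illustrative choice (an assumption): negative entropy, smooth and
    # strictly convex on the positive orthant.
    return sum(xi * math.log(xi) - xi for xi in x)

def grad_phi(x):
    # Gradient of the negative entropy above: d/dx_i = log(x_i).
    return [math.log(xi) for xi in x]

def d_alpha(x, y, alpha):
    # Assumed form of the alpha-divergence built from phi; valid for
    # alpha strictly between -1 and 1 (the endpoints are limits).
    a = (1 - alpha) / 2
    b = (1 + alpha) / 2
    mix = [a * xi + b * yi for xi, yi in zip(x, y)]
    return 4 / (1 - alpha ** 2) * (a * phi(x) + b * phi(y) - phi(mix))

def bregman(x, y):
    # Bregman divergence: phi(x) - phi(y) - <x - y, grad phi(y)>.
    g = grad_phi(y)
    return phi(x) - phi(y) - sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))

# Hypothetical sample points with strictly positive coordinates.
x = [0.2, 0.5, 1.3]
y = [0.7, 0.4, 0.9]

# Referential duality: swapping alpha -> -alpha swaps the two arguments.
dual_gap = abs(d_alpha(x, y, 0.3) - d_alpha(y, x, -0.3))

# Limits alpha -> +1 and alpha -> -1 approach the two Bregman divergences.
near_plus = d_alpha(x, y, 1 - 1e-6)
near_minus = d_alpha(x, y, -1 + 1e-6)

print("duality gap:", dual_gap)
print("alpha -> +1:", near_plus, "vs Bregman(x, y):", bregman(x, y))
print("alpha -> -1:", near_minus, "vs Bregman(y, x):", bregman(y, x))
```

With this Φ the Bregman divergence is the generalized Kullback-Leibler divergence between unnormalized positive vectors, so the two limits correspond to the two argument orders of that divergence. At *α* = 0 the same function gives four times the Jensen difference of Φ at the midpoint.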