Loading…

Chained Kullback-Leibler divergences

We define and characterize the "chained" Kullback-Leibler divergence min w D(p||w) + D(w||q) minimized over all intermediate distributions w and the analogous k-fold chained K-L divergence min D(p||w k -1 ) + ... + D(w 2 ||w 1 ) + D(w 1 ||q) minimized over the entire path (w 1 , ... w k -1...

Full description

Saved in:
Bibliographic Details
Published in:2016 IEEE International Symposium on Information Theory (ISIT) 2016-07, Vol.2016, p.580-584
Main Authors: Pavlichin, Dmitri S., Weissman, Tsachy
Format: Article
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We define and characterize the "chained" Kullback-Leibler divergence min w D(p||w) + D(w||q) minimized over all intermediate distributions w and the analogous k-fold chained K-L divergence min D(p||w k -1 ) + ... + D(w 2 ||w 1 ) + D(w 1 ||q) minimized over the entire path (w 1 , ... w k -1 ). This quantity arises in a large deviations analysis of a Markov chain on the set of types - the Wright-Fisher model of neutral genetic drift: a population with allele distribution q produces offspring with allele distribution w, which then produce offspring with allele distribution p, and so on. The chained divergences enjoy some of the same properties as the K-L divergence (like joint convexity in the arguments) and appear in k-step versions of some of the same settings as the K-L divergence (like information projections and a conditional limit theorem). We further characterize the optimal k-step "path" of distributions appearing in the definition and apply our findings in a large deviations analysis of the Wright-Fisher process. We make a connection to information geometry via the previously studied continuum limit, where the number of steps tends to infinity, and the limiting path is a geodesic in the Fisher information metric. Finally, we offer a thermodynamic interpretation of the chained divergence (as the rate of operation of an appropriately defined Maxwell's demon) and we state some natural extensions and applications (a k-step mutual information and k-step maximum likelihood inference). We release code for computing the objects we study.
ISSN:2157-8095
2157-8117
DOI:10.1109/ISIT.2016.7541365