Empirical measure large deviations for reinforced chains on finite spaces

Bibliographic Details
Published in:	Systems & control letters 2022-11, Vol.169, p.105379, Article 105379
Main Authors:	Budhiraja, Amarjit; Waterbury, Adam
Format:	Article
Language:	English
Subjects:	Empirical measure; Infinite horizon discounted cost; Large deviation principle; Reinforced random walks; Stochastic approximation; Time-reversal

Description
Summary:	Let $A$ be a transition probability kernel on a finite state space $\Delta^o = \{1, \ldots, d\}$ such that $A(x,y) > 0$ for all $x, y \in \Delta^o$. Consider a reinforced chain given as a sequence $\{X_n, n \in \mathbb{N}_0\}$ of $\Delta^o$-valued random variables, defined recursively according to $L^n = \frac{1}{n}\sum_{i=0}^{n-1} \delta_{X_i}$, $P(X_n \in \cdot \mid X_0, \ldots, X_{n-1}) = L^n A(\cdot)$. We establish a large deviation principle for $\{L^n, n \in \mathbb{N}\}$. The rate function takes a strikingly different form than the Donsker–Varadhan rate function associated with the empirical measure of the Markov chain with transition kernel $A$ and is described in terms of a novel deterministic infinite horizon discounted cost control problem with an associated linear controlled dynamics and a nonlinear running cost involving the relative entropy function. Proofs are based on an analysis of time-reversal of controlled dynamics in representations for log-transforms of exponential moments, and on weak convergence methods.
ISSN:	0167-6911; 1872-7956
DOI:	10.1016/j.sysconle.2022.105379
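
The recursive definition in the summary can be illustrated with a short simulation. The following is only a minimal sketch, not code from the article: the function name `simulate_reinforced_chain`, the fixed starting state `x0`, the seed, and the 0-based state labels (0, …, d−1 instead of 1, …, d) are all assumptions made for illustration. At each step the next state is drawn from the empirical measure of the past states pushed through the kernel A.

```python
import random


def simulate_reinforced_chain(A, n_steps, x0=0, seed=0):
    """Simulate X_0, ..., X_{n_steps-1} and return their empirical measure.

    A is a d x d row-stochastic matrix (list of lists) with positive entries.
    x0 and seed are illustrative assumptions; the summary does not fix them.
    """
    rng = random.Random(seed)
    d = len(A)
    counts = [0] * d
    counts[x0] += 1  # record X_0 = x0
    for n in range(1, n_steps):
        # Unnormalised law of X_n: sum_x counts[x] * A[x][y].
        # Dividing by n would give (empirical measure) A (y); random.choices
        # normalises the weights itself, so the division is skipped.
        weights = [sum(counts[x] * A[x][y] for x in range(d)) for y in range(d)]
        x_next = rng.choices(range(d), weights=weights)[0]
        counts[x_next] += 1
    # Empirical measure of the first n_steps states.
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    # Hypothetical 2-state kernel with strictly positive entries.
    A = [[0.9, 0.1], [0.2, 0.8]]
    print(simulate_reinforced_chain(A, n_steps=5000))
```

Running the script for several seeds gives a rough feel for how the empirical measure fluctuates; it says nothing about the rate function itself, which the article characterises through a control problem.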