SUPPORT POINTS

This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6]...

Bibliographic Details
Published in: The Annals of statistics 2018-12, Vol.46 (6A), p.2562-2592
Main Authors: Mak, Simon; Joseph, V. Roshan
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923
cites cdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923
container_end_page 2592
container_issue 6A
container_start_page 2562
container_title The Annals of statistics
container_volume 46
creator Mak, Simon
Joseph, V. Roshan
description This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to F and enjoy an improved error rate over Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide better integration performance than both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.
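The energy distance named in the description can be illustrated with a minimal sketch. It evaluates the sample form of the Székely and Rizzo energy distance between a candidate point set and a large sample standing in for F. The function names, the use of NumPy, and the standard normal test data are assumptions made for illustration; the sketch only evaluates the criterion and does not reproduce the paper's difference-of-convex algorithms.

```python
import numpy as np


def pairwise_mean_distance(a, b):
    """Mean Euclidean distance over all pairs (a_i, b_j)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).mean()


def energy_distance(points, sample):
    """Sample energy distance between arrays of shape (n, d) and (m, d).

    Computes 2 * E||X - Y|| - E||X - X'|| - E||Y - Y'||, where X is drawn
    from `points` and Y from `sample` (hypothetical helper, not the
    authors' code).
    """
    cross = pairwise_mean_distance(points, sample)
    within_points = pairwise_mean_distance(points, points)
    within_sample = pairwise_mean_distance(sample, sample)
    return 2.0 * cross - within_points - within_sample


# Assumed test data: a standard normal sample stands in for F.
rng = np.random.default_rng(0)
sample = rng.normal(size=(500, 2))
subset = sample[:25]      # a crude 25-point representation of the sample
shifted = subset + 3.0    # the same points moved away from the sample

d_subset = energy_distance(subset, sample)
d_shifted = energy_distance(shifted, sample)
```

A point set that represents the sample well yields a small value, so `d_subset` is expected to be smaller than `d_shifted`. Support points, as described above, are the point set that minimizes this quantity.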
doi_str_mv 10.1214/17-AOS1629
format article
fullrecord <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_journals_2155912617</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26542875</jstor_id><sourcerecordid>26542875</sourcerecordid><originalsourceid>FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</originalsourceid><addsrcrecordid>eNo9z81LwzAYBvAgCs6pB--C4E2I5s3Hm-Q4hh-DYYvtziHtErConUl38L-30uHpufx4Hh5CroDdAwf5AJouigqQ2yMy44CGGot4TGaMWUaVQHlKznLuGGPKSjEjl9WmLIu3-qYsVq91dU5Oov_I4eKQc7J5eqyXL3RdPK-WizVtBeiBevASohcoVNAxbDVKbE2DsWmEbBHklgOYyL0ykcXGco8iRM9k8MBby8Wc3E69u9R_70MeXNfv09c46TgoZYEj6FHdTapNfc4pRLdL758-_Thg7u-vA-0Of0d8PeEuD336lxyV5EYr8QvNU02q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2155912617</pqid></control><display><type>article</type><title>SUPPORT POINTS</title><source>JSTOR Archival Journals and Primary Sources Collection</source><creator>Mak, Simon ; Joseph, V. Roshan</creator><creatorcontrib>Mak, Simon ; Joseph, V. Roshan</creatorcontrib><description>This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. 
Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.</description><identifier>ISSN: 0090-5364</identifier><identifier>EISSN: 2168-8966</identifier><identifier>DOI: 10.1214/17-AOS1629</identifier><language>eng</language><publisher>Hayward: Institute of Mathematical Statistics</publisher><subject>Bayesian analysis ; Computer simulation ; Energy conservation ; Euclidean geometry ; Fourier transforms ; Goodness of fit ; Markov chains ; Monte Carlo simulation ; Optimization ; Statistical analysis ; Statistical methods</subject><ispartof>The Annals of statistics, 2018-12, Vol.46 (6A), p.2562-2592</ispartof><rights>Institute of Mathematical Statistics, 2018</rights><rights>Copyright Institute of Mathematical Statistics Dec 
2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</citedby><cites>FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26542875$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26542875$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,58213,58446</link.rule.ids></links><search><creatorcontrib>Mak, Simon</creatorcontrib><creatorcontrib>Joseph, V. Roshan</creatorcontrib><title>SUPPORT POINTS</title><title>The Annals of statistics</title><description>This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. 
Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.</description><subject>Bayesian analysis</subject><subject>Computer simulation</subject><subject>Energy conservation</subject><subject>Euclidean geometry</subject><subject>Fourier transforms</subject><subject>Goodness of fit</subject><subject>Markov chains</subject><subject>Monte Carlo simulation</subject><subject>Optimization</subject><subject>Statistical analysis</subject><subject>Statistical methods</subject><issn>0090-5364</issn><issn>2168-8966</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNo9z81LwzAYBvAgCs6pB--C4E2I5s3Hm-Q4hh-DYYvtziHtErConUl38L-30uHpufx4Hh5CroDdAwf5AJouigqQ2yMy44CGGot4TGaMWUaVQHlKznLuGGPKSjEjl9WmLIu3-qYsVq91dU5Oov_I4eKQc7J5eqyXL3RdPK-WizVtBeiBevASohcoVNAxbDVKbE2DsWmEbBHklgOYyL0ykcXGco8iRM9k8MBby8Wc3E69u9R_70MeXNfv09c46TgoZYEj6FHdTapNfc4pRLdL758-_Thg7u-vA-0Of0d8PeEuD336lxyV5EYr8QvNU02q</recordid><startdate>20181201</startdate><enddate>20181201</enddate><creator>Mak, Simon</creator><creator>Joseph, V. Roshan</creator><general>Institute of Mathematical Statistics</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope></search><sort><creationdate>20181201</creationdate><title>SUPPORT POINTS</title><author>Mak, Simon ; Joseph, V. 
Roshan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bayesian analysis</topic><topic>Computer simulation</topic><topic>Energy conservation</topic><topic>Euclidean geometry</topic><topic>Fourier transforms</topic><topic>Goodness of fit</topic><topic>Markov chains</topic><topic>Monte Carlo simulation</topic><topic>Optimization</topic><topic>Statistical analysis</topic><topic>Statistical methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mak, Simon</creatorcontrib><creatorcontrib>Joseph, V. Roshan</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>The Annals of statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mak, Simon</au><au>Joseph, V. Roshan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SUPPORT POINTS</atitle><jtitle>The Annals of statistics</jtitle><date>2018-12-01</date><risdate>2018</risdate><volume>46</volume><issue>6A</issue><spage>2562</spage><epage>2592</epage><pages>2562-2592</pages><issn>0090-5364</issn><eissn>2168-8966</eissn><abstract>This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. 
Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.</abstract><cop>Hayward</cop><pub>Institute of Mathematical Statistics</pub><doi>10.1214/17-AOS1629</doi><tpages>31</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0090-5364
ispartof The Annals of statistics, 2018-12, Vol.46 (6A), p.2562-2592
issn 0090-5364
2168-8966
language eng
recordid cdi_proquest_journals_2155912617
source JSTOR Archival Journals and Primary Sources Collection
subjects Bayesian analysis
Computer simulation
Energy conservation
Euclidean geometry
Fourier transforms
Goodness of fit
Markov chains
Monte Carlo simulation
Optimization
Statistical analysis
Statistical methods
title SUPPORT POINTS
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T13%3A50%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SUPPORT%20POINTS&rft.jtitle=The%20Annals%20of%20statistics&rft.au=Mak,%20Simon&rft.date=2018-12-01&rft.volume=46&rft.issue=6A&rft.spage=2562&rft.epage=2592&rft.pages=2562-2592&rft.issn=0090-5364&rft.eissn=2168-8966&rft_id=info:doi/10.1214/17-AOS1629&rft_dat=%3Cjstor_proqu%3E26542875%3C/jstor_proqu%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2155912617&rft_id=info:pmid/&rft_jstor_id=26542875&rfr_iscdi=true