SUPPORT POINTS
This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6]...
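As an illustration of the quantity being minimized, here is a minimal sketch of the two-sample form of the Székely–Rizzo energy distance between a candidate point set and a sample standing in for F. It is not the authors' optimization algorithm; it only evaluates the objective. The function names `mean_pairwise_distance` and `energy_distance` and the toy data are assumptions made for this example.

```python
import math
from itertools import product


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (u, v), u in a and v in b."""
    total = sum(math.dist(u, v) for u, v in product(a, b))
    return total / (len(a) * len(b))


def energy_distance(points, sample):
    """Sample energy distance between two finite point sets.

    Computes 2*E||X - Y|| - E||X - X'|| - E||Y - Y'||, with each
    expectation replaced by an average over all pairs of points.
    """
    return (
        2.0 * mean_pairwise_distance(points, sample)
        - mean_pairwise_distance(points, points)
        - mean_pairwise_distance(sample, sample)
    )


# Toy one-dimensional data (hypothetical): a small sample standing in for F
# and two candidate two-point summaries of it.
sample = [(0.0,), (1.0,), (2.0,), (3.0,)]
spread_out = [(0.5,), (2.5,)]
clustered = [(0.0,), (0.5,)]

# A smaller value means the candidate set represents the sample better.
print(energy_distance(spread_out, sample))
print(energy_distance(clustered, sample))
```

In this toy case the spread-out candidate scores lower than the clustered one, which matches the intuition that representative points should cover the distribution rather than bunch together.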
Published in: | The Annals of statistics 2018-12, Vol.46 (6A), p.2562-2592 |
---|---|
Main Authors: | Mak, Simon ; Joseph, V. Roshan |
Format: | Article |
Language: | English |
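Subjects: | Bayesian analysis ; Computer simulation ; Energy conservation ; Euclidean geometry ; Fourier transforms ; Goodness of fit ; Markov chains ; Monte Carlo simulation ; Optimization ; Statistical analysis ; Statistical methods |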
cited_by | cdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923 |
---|---|
cites | cdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923 |
container_end_page | 2592 |
container_issue | 6A |
container_start_page | 2562 |
container_title | The Annals of statistics |
container_volume | 46 |
creator | Mak, Simon ; Joseph, V. Roshan |
description | This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation. |
doi_str_mv | 10.1214/17-AOS1629 |
format | article |
fullrecord | <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_journals_2155912617</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26542875</jstor_id><sourcerecordid>26542875</sourcerecordid><originalsourceid>FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</originalsourceid><addsrcrecordid>eNo9z81LwzAYBvAgCs6pB--C4E2I5s3Hm-Q4hh-DYYvtziHtErConUl38L-30uHpufx4Hh5CroDdAwf5AJouigqQ2yMy44CGGot4TGaMWUaVQHlKznLuGGPKSjEjl9WmLIu3-qYsVq91dU5Oov_I4eKQc7J5eqyXL3RdPK-WizVtBeiBevASohcoVNAxbDVKbE2DsWmEbBHklgOYyL0ykcXGco8iRM9k8MBby8Wc3E69u9R_70MeXNfv09c46TgoZYEj6FHdTapNfc4pRLdL758-_Thg7u-vA-0Of0d8PeEuD336lxyV5EYr8QvNU02q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2155912617</pqid></control><display><type>article</type><title>SUPPORT POINTS</title><source>JSTOR Archival Journals and Primary Sources Collection</source><creator>Mak, Simon ; Joseph, V. Roshan</creator><creatorcontrib>Mak, Simon ; Joseph, V. Roshan</creatorcontrib><description>This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. 
Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.</description><identifier>ISSN: 0090-5364</identifier><identifier>EISSN: 2168-8966</identifier><identifier>DOI: 10.1214/17-AOS1629</identifier><language>eng</language><publisher>Hayward: Institute of Mathematical Statistics</publisher><subject>Bayesian analysis ; Computer simulation ; Energy conservation ; Euclidean geometry ; Fourier transforms ; Goodness of fit ; Markov chains ; Monte Carlo simulation ; Optimization ; Statistical analysis ; Statistical methods</subject><ispartof>The Annals of statistics, 2018-12, Vol.46 (6A), p.2562-2592</ispartof><rights>Institute of Mathematical Statistics, 2018</rights><rights>Copyright Institute of Mathematical Statistics Dec 
2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</citedby><cites>FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26542875$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26542875$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,58213,58446</link.rule.ids></links><search><creatorcontrib>Mak, Simon</creatorcontrib><creatorcontrib>Joseph, V. Roshan</creatorcontrib><title>SUPPORT POINTS</title><title>The Annals of statistics</title><description>This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. 
Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.</description><subject>Bayesian analysis</subject><subject>Computer simulation</subject><subject>Energy conservation</subject><subject>Euclidean geometry</subject><subject>Fourier transforms</subject><subject>Goodness of fit</subject><subject>Markov chains</subject><subject>Monte Carlo simulation</subject><subject>Optimization</subject><subject>Statistical analysis</subject><subject>Statistical methods</subject><issn>0090-5364</issn><issn>2168-8966</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNo9z81LwzAYBvAgCs6pB--C4E2I5s3Hm-Q4hh-DYYvtziHtErConUl38L-30uHpufx4Hh5CroDdAwf5AJouigqQ2yMy44CGGot4TGaMWUaVQHlKznLuGGPKSjEjl9WmLIu3-qYsVq91dU5Oov_I4eKQc7J5eqyXL3RdPK-WizVtBeiBevASohcoVNAxbDVKbE2DsWmEbBHklgOYyL0ykcXGco8iRM9k8MBby8Wc3E69u9R_70MeXNfv09c46TgoZYEj6FHdTapNfc4pRLdL758-_Thg7u-vA-0Of0d8PeEuD336lxyV5EYr8QvNU02q</recordid><startdate>20181201</startdate><enddate>20181201</enddate><creator>Mak, Simon</creator><creator>Joseph, V. Roshan</creator><general>Institute of Mathematical Statistics</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope></search><sort><creationdate>20181201</creationdate><title>SUPPORT POINTS</title><author>Mak, Simon ; Joseph, V. 
Roshan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bayesian analysis</topic><topic>Computer simulation</topic><topic>Energy conservation</topic><topic>Euclidean geometry</topic><topic>Fourier transforms</topic><topic>Goodness of fit</topic><topic>Markov chains</topic><topic>Monte Carlo simulation</topic><topic>Optimization</topic><topic>Statistical analysis</topic><topic>Statistical methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mak, Simon</creatorcontrib><creatorcontrib>Joseph, V. Roshan</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>The Annals of statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mak, Simon</au><au>Joseph, V. Roshan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SUPPORT POINTS</atitle><jtitle>The Annals of statistics</jtitle><date>2018-12-01</date><risdate>2018</risdate><volume>46</volume><issue>6A</issue><spage>2562</spage><epage>2592</epage><pages>2562-2592</pages><issn>0090-5364</issn><eissn>2168-8966</eissn><abstract>This paper introduces a new way to compact a continuous probability distribution F into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. 
Using this duality, we show that support points converge in distribution to F, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.</abstract><cop>Hayward</cop><pub>Institute of Mathematical Statistics</pub><doi>10.1214/17-AOS1629</doi><tpages>31</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0090-5364 |
ispartof | The Annals of statistics, 2018-12, Vol.46 (6A), p.2562-2592 |
issn | 0090-5364 ; 2168-8966 |
language | eng |
recordid | cdi_proquest_journals_2155912617 |
source | JSTOR Archival Journals and Primary Sources Collection |
subjects | Bayesian analysis ; Computer simulation ; Energy conservation ; Euclidean geometry ; Fourier transforms ; Goodness of fit ; Markov chains ; Monte Carlo simulation ; Optimization ; Statistical analysis ; Statistical methods |
title | SUPPORT POINTS |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T13%3A50%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SUPPORT%20POINTS&rft.jtitle=The%20Annals%20of%20statistics&rft.au=Mak,%20Simon&rft.date=2018-12-01&rft.volume=46&rft.issue=6A&rft.spage=2562&rft.epage=2592&rft.pages=2562-2592&rft.issn=0090-5364&rft.eissn=2168-8966&rft_id=info:doi/10.1214/17-AOS1629&rft_dat=%3Cjstor_proqu%3E26542875%3C/jstor_proqu%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c317t-a1a41fa3635e7fed7646c8b6fbb34c614d2118f2a58f0fb92a63efa04ea12c923%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2155912617&rft_id=info:pmid/&rft_jstor_id=26542875&rfr_iscdi=true |