An EM-Like Algorithm for Semi- and Nonparametric Estimation in Multivariate Mixtures

We propose an algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm. The vectors are assumed to have independent coordinates conditional upon knowing from which mixture component they come, but otherwise their density fu...

Bibliographic Details
Published in:	Journal of computational and graphical statistics 2009-06, Vol.18 (2), p.505-526
Main Authors:	Benaglia, Tatiana; Chauveau, Didier; Hunter, David R.
Format:	Article
Language:	English
Subjects:	Algorithms; Coordinate systems; Datasets; Density estimation; EM algorithm; EM-Type Algorithms; Estimating techniques; Estimation methods; Euclidean space; Identifiability; Kernel density estimation; Modeling; Multivariate analysis; Multivariate mixture; Nonparametric mixture; Parametric models; Product labeling; Sample size; Simulation; Stochastic models; Studies

cited_by	cdi_FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3
cites	cdi_FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3
container_end_page	526
container_issue	2
container_start_page	505
container_title	Journal of computational and graphical statistics
container_volume	18
creator	Benaglia, Tatiana; Chauveau, Didier; Hunter, David R.
description	We propose an algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm. The vectors are assumed to have independent coordinates conditional upon knowing from which mixture component they come, but otherwise their density functions are completely unspecified. Sometimes, the density functions may be partially specified by Euclidean parameters, a case we call semiparametric. Our algorithm is much more flexible and easily applicable than existing algorithms in the literature; it can be extended to any number of mixture components and any number of vector coordinates of the multivariate observations. Thus it may be applied even in situations where the model is not identifiable, so care is called for when using it in situations for which identifiability is difficult to establish conclusively. Our algorithm yields much smaller mean integrated squared errors than an alternative algorithm in a simulation study. In another example using a real dataset, it provides new insights that extend previous analyses. Finally, we present two different variations of our algorithm, one stochastic and one deterministic, and find anecdotal evidence that there is not a great deal of difference between the performance of these two variants. The computer code and data used in this article are available online.
doi_str_mv	10.1198/jcgs.2009.07175
format	article
fullrecord	<record><control><sourceid>jstor_infor</sourceid><recordid>TN_cdi_jstor_primary_25651257</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>25651257</jstor_id><sourcerecordid>25651257</sourcerecordid><originalsourceid>FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3</originalsourceid><addsrcrecordid>eNp1kL1PwzAQxSMEEqUwMyFZ7Gl9Tm0nbFVVPqQWBspsOYlTHJK42A7Q_x6XIDamO7179073i6JLwBOALJ3WxdZNCMbZBHPg9CgaAU14TDjQ49BjBnHKMD6NzpyrMcbAMj6KNvMOLdfxSr8pNG-2xmr_2qLKWPSsWh0j2ZXo0XQ7aWWrvNUFWjqvW-m16ZDu0LpvvP6QVkuv0Fp_-d4qdx6dVLJx6uK3jqOX2-VmcR-vnu4eFvNVXCQp8XGSQMm4woTSLGW0KmdFjikJUg6plCylkGYMlKogA8ryKgfCZ0laFUGBTCXj6HrI3Vnz3ivnRW1624WTgiSU0xlhEEzTwVRY45xVldjZ8IDdC8DiQE4cyIkDOfFDLmxcDRu188b-2QllFAjlYX4zzHUXQLXy09imFF7uG2MrK7tCO5H8F_4Nhgl-Nw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>235754261</pqid></control><display><type>article</type><title>An EM-Like Algorithm for Semi- and Nonparametric Estimation in Multivariate Mixtures</title><source>JSTOR Archival Journals and Primary Sources Collection</source><source>Taylor and Francis Science and Technology Collection</source><creator>Benaglia, Tatiana ; Chauveau, Didier ; Hunter, David R.</creator><creatorcontrib>Benaglia, Tatiana ; Chauveau, Didier ; Hunter, David R.</creatorcontrib><description>We propose an algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm. The vectors are assumed to have independent coordinates conditional upon knowing from which mixture component they come, but otherwise their density functions are completely unspecified. Sometimes, the density functions may be partially specified by Euclidean parameters, a case we call semiparametric. 
Our algorithm is much more flexible and easily applicable than existing algorithms in the literature; it can be extended to any number of mixture components and any number of vector coordinates of the multivariate observations. Thus it may be applied even in situations where the model is not identifiable, so care is called for when using it in situations for which identifiability is difficult to establish conclusively. Our algorithm yields much smaller mean integrated squared errors than an alternative algorithm in a simulation study. In another example using a real dataset, it provides new insights that extend previous analyses. Finally, we present two different variations of our algorithm, one stochastic and one deterministic, and find anecdotal evidence that there is not a great deal of difference between the performance of these two variants. The computer code and data used in this article are available online.</description><identifier>ISSN: 1061-8600</identifier><identifier>EISSN: 1537-2715</identifier><identifier>DOI: 10.1198/jcgs.2009.07175</identifier><language>eng</language><publisher>Alexandria: Taylor & Francis</publisher><subject>Algorithms ; Coordinate systems ; Datasets ; Density estimation ; EM algorithm ; EM-Type Algorithms ; Estimating techniques ; Estimation methods ; Euclidean space ; Identifiability ; Kernel density estimation ; Modeling ; Multivariate analysis ; Multivariate mixture ; Nonparametric mixture ; Parametric models ; Product labeling ; Sample size ; Simulation ; Stochastic models ; Studies</subject><ispartof>Journal of computational and graphical statistics, 2009-06, Vol.18 (2), p.505-526</ispartof><rights>2009 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America 2009</rights><rights>2009 The American Statistical Association, the Institute of Mathematical Statistics, and the Interface Foundation of North America</rights><rights>Copyright American Statistical Association Jun 
2009</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3</citedby><cites>FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/25651257$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/25651257$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,58238,58471</link.rule.ids></links><search><creatorcontrib>Benaglia, Tatiana</creatorcontrib><creatorcontrib>Chauveau, Didier</creatorcontrib><creatorcontrib>Hunter, David R.</creatorcontrib><title>An EM-Like Algorithm for Semi- and Nonparametric Estimation in Multivariate Mixtures</title><title>Journal of computational and graphical statistics</title><description>We propose an algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm. The vectors are assumed to have independent coordinates conditional upon knowing from which mixture component they come, but otherwise their density functions are completely unspecified. Sometimes, the density functions may be partially specified by Euclidean parameters, a case we call semiparametric. Our algorithm is much more flexible and easily applicable than existing algorithms in the literature; it can be extended to any number of mixture components and any number of vector coordinates of the multivariate observations. Thus it may be applied even in situations where the model is not identifiable, so care is called for when using it in situations for which identifiability is difficult to establish conclusively. 
Our algorithm yields much smaller mean integrated squared errors than an alternative algorithm in a simulation study. In another example using a real dataset, it provides new insights that extend previous analyses. Finally, we present two different variations of our algorithm, one stochastic and one deterministic, and find anecdotal evidence that there is not a great deal of difference between the performance of these two variants. The computer code and data used in this article are available online.</description><subject>Algorithms</subject><subject>Coordinate systems</subject><subject>Datasets</subject><subject>Density estimation</subject><subject>EM algorithm</subject><subject>EM-Type Algorithms</subject><subject>Estimating techniques</subject><subject>Estimation methods</subject><subject>Euclidean space</subject><subject>Identifiability</subject><subject>Kernel density estimation</subject><subject>Modeling</subject><subject>Multivariate analysis</subject><subject>Multivariate mixture</subject><subject>Nonparametric mixture</subject><subject>Parametric models</subject><subject>Product labeling</subject><subject>Sample size</subject><subject>Simulation</subject><subject>Stochastic models</subject><subject>Studies</subject><issn>1061-8600</issn><issn>1537-2715</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><recordid>eNp1kL1PwzAQxSMEEqUwMyFZ7Gl9Tm0nbFVVPqQWBspsOYlTHJK42A7Q_x6XIDamO7179073i6JLwBOALJ3WxdZNCMbZBHPg9CgaAU14TDjQ49BjBnHKMD6NzpyrMcbAMj6KNvMOLdfxSr8pNG-2xmr_2qLKWPSsWh0j2ZXo0XQ7aWWrvNUFWjqvW-m16ZDu0LpvvP6QVkuv0Fp_-d4qdx6dVLJx6uK3jqOX2-VmcR-vnu4eFvNVXCQp8XGSQMm4woTSLGW0KmdFjikJUg6plCylkGYMlKogA8ryKgfCZ0laFUGBTCXj6HrI3Vnz3ivnRW1624WTgiSU0xlhEEzTwVRY45xVldjZ8IDdC8DiQE4cyIkDOfFDLmxcDRu188b-2QllFAjlYX4zzHUXQLXy09imFF7uG2MrK7tCO5H8F_4Nhgl-Nw</recordid><startdate>20090601</startdate><enddate>20090601</enddate><creator>Benaglia, Tatiana</creator><creator>Chauveau, 
Didier</creator><creator>Hunter, David R.</creator><general>Taylor & Francis</general><general>JCGS Management Committee of the American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America</general><general>Taylor & Francis Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope></search><sort><creationdate>20090601</creationdate><title>An EM-Like Algorithm for Semi- and Nonparametric Estimation in Multivariate Mixtures</title><author>Benaglia, Tatiana ; Chauveau, Didier ; Hunter, David R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Algorithms</topic><topic>Coordinate systems</topic><topic>Datasets</topic><topic>Density estimation</topic><topic>EM algorithm</topic><topic>EM-Type Algorithms</topic><topic>Estimating techniques</topic><topic>Estimation methods</topic><topic>Euclidean space</topic><topic>Identifiability</topic><topic>Kernel density estimation</topic><topic>Modeling</topic><topic>Multivariate analysis</topic><topic>Multivariate mixture</topic><topic>Nonparametric mixture</topic><topic>Parametric models</topic><topic>Product labeling</topic><topic>Sample size</topic><topic>Simulation</topic><topic>Stochastic models</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Benaglia, Tatiana</creatorcontrib><creatorcontrib>Chauveau, Didier</creatorcontrib><creatorcontrib>Hunter, David R.</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>Journal of computational and graphical statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Benaglia, 
Tatiana</au><au>Chauveau, Didier</au><au>Hunter, David R.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An EM-Like Algorithm for Semi- and Nonparametric Estimation in Multivariate Mixtures</atitle><jtitle>Journal of computational and graphical statistics</jtitle><date>2009-06-01</date><risdate>2009</risdate><volume>18</volume><issue>2</issue><spage>505</spage><epage>526</epage><pages>505-526</pages><issn>1061-8600</issn><eissn>1537-2715</eissn><abstract>We propose an algorithm for nonparametric estimation for finite mixtures of multivariate random vectors that strongly resembles a true EM algorithm. The vectors are assumed to have independent coordinates conditional upon knowing from which mixture component they come, but otherwise their density functions are completely unspecified. Sometimes, the density functions may be partially specified by Euclidean parameters, a case we call semiparametric. Our algorithm is much more flexible and easily applicable than existing algorithms in the literature; it can be extended to any number of mixture components and any number of vector coordinates of the multivariate observations. Thus it may be applied even in situations where the model is not identifiable, so care is called for when using it in situations for which identifiability is difficult to establish conclusively. Our algorithm yields much smaller mean integrated squared errors than an alternative algorithm in a simulation study. In another example using a real dataset, it provides new insights that extend previous analyses. Finally, we present two different variations of our algorithm, one stochastic and one deterministic, and find anecdotal evidence that there is not a great deal of difference between the performance of these two variants. 
The computer code and data used in this article are available online.</abstract><cop>Alexandria</cop><pub>Taylor & Francis</pub><doi>10.1198/jcgs.2009.07175</doi><tpages>22</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1061-8600
ispartof	Journal of computational and graphical statistics, 2009-06, Vol.18 (2), p.505-526
issn	1061-8600 1537-2715
language	eng
recordid	cdi_jstor_primary_25651257
source	JSTOR Archival Journals and Primary Sources Collection; Taylor and Francis Science and Technology Collection
subjects	Algorithms; Coordinate systems; Datasets; Density estimation; EM algorithm; EM-Type Algorithms; Estimating techniques; Estimation methods; Euclidean space; Identifiability; Kernel density estimation; Modeling; Multivariate analysis; Multivariate mixture; Nonparametric mixture; Parametric models; Product labeling; Sample size; Simulation; Stochastic models; Studies
title	An EM-Like Algorithm for Semi- and Nonparametric Estimation in Multivariate Mixtures
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T08%3A40%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_infor&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20EM-Like%20Algorithm%20for%20Semi-%20and%20Nonparametric%20Estimation%20in%20Multivariate%20Mixtures&rft.jtitle=Journal%20of%20computational%20and%20graphical%20statistics&rft.au=Benaglia,%20Tatiana&rft.date=2009-06-01&rft.volume=18&rft.issue=2&rft.spage=505&rft.epage=526&rft.pages=505-526&rft.issn=1061-8600&rft.eissn=1537-2715&rft_id=info:doi/10.1198/jcgs.2009.07175&rft_dat=%3Cjstor_infor%3E25651257%3C/jstor_infor%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c382t-331d67e02559865fd4cb05267eb18aa68518961eef19156bfb127438fcef119e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=235754261&rft_id=info:pmid/&rft_jstor_id=25651257&rfr_iscdi=true