Wallenius Bayes
This paper introduces a new event model appropriate for classifying (binary) data generated by a “destructive choice” process, such as certain human behavior. In such a process, making a choice removes that choice from future consideration yet does not influence the relative probability of other cho...
Published in: | Machine learning 2018-06, Vol.107 (6), p.1013-1037 |
---|---|
Main Authors: | Junqué de Fortuny, Enric ; Martens, David ; Provost, Foster |
Format: | Article |
Language: | English |
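Subjects: | Artificial Intelligence ; Bayesian analysis ; Computer Science ; Control ; Human behavior ; Mathematical models ; Mechatronics ; Natural Language Processing (NLP) ; Robotics ; Simulation and Modeling ; Text editing |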
cited_by | cdi_FETCH-LOGICAL-c359t-4b20dbd36d5572d920eb492096f3d5681a0f3a47927cc68be8c1122606c691b73 |
---|---|
cites | cdi_FETCH-LOGICAL-c359t-4b20dbd36d5572d920eb492096f3d5681a0f3a47927cc68be8c1122606c691b73 |
container_end_page | 1037 |
container_issue | 6 |
container_start_page | 1013 |
container_title | Machine learning |
container_volume | 107 |
creator | Junqué de Fortuny, Enric ; Martens, David ; Provost, Foster |
description | This paper introduces a new event model appropriate for classifying (binary) data generated by a “destructive choice” process, such as certain human behavior. In such a process, making a choice removes that choice from future consideration yet does not influence the relative probability of other choices in the choice set. The proposed Wallenius event model is based on a somewhat forgotten non-central hypergeometric distribution introduced by Wallenius (Biased sampling: the non-central hypergeometric probability distribution. Ph.D. thesis, Stanford University, 1963). We discuss its relationship with models of how human choice behavior is generated, highlighting a key (simple) mathematical property. We use this background to describe specifically why traditional multivariate Bernoulli naive Bayes and multinomial naive Bayes each are suboptimal for such data. We then present an implementation of naive Bayes based on the Wallenius event model, and show experimentally that for data where we would expect the features to be generated via destructive choice behavior Wallenius Bayes indeed outperforms the traditional versions of naive Bayes for prediction based on these features. Furthermore, we also show that it is competitive with non-naive methods (in particular, support-vector machines). In contrast, we also show that Wallenius Bayes underperforms when the data generating process is not based on destructive choice. |
doi_str_mv | 10.1007/s10994-018-5699-z |
format | article |
publisher | New York: Springer US |
rights | The Author(s) 2018 |
toplevel | peer_reviewed |
oa | free_for_read |
tpages | 25 |
fulltext | fulltext |
identifier | ISSN: 0885-6125 |
ispartof | Machine learning, 2018-06, Vol.107 (6), p.1013-1037 |
issn | 0885-6125 ; 1573-0565 |
language | eng |
recordid | cdi_proquest_journals_2010929887 |
source | Springer Nature |
subjects | Artificial Intelligence ; Bayesian analysis ; Computer Science ; Control ; Human behavior ; Mathematical models ; Mechatronics ; Natural Language Processing (NLP) ; Robotics ; Simulation and Modeling ; Text editing |
title | Wallenius Bayes |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T08%3A11%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Wallenius%20Bayes&rft.jtitle=Machine%20learning&rft.au=Junqu%C3%A9%20de%20Fortuny,%20Enric&rft.date=2018-06-01&rft.volume=107&rft.issue=6&rft.spage=1013&rft.epage=1037&rft.pages=1013-1037&rft.issn=0885-6125&rft.eissn=1573-0565&rft_id=info:doi/10.1007/s10994-018-5699-z&rft_dat=%3Cproquest_cross%3E2010929887%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c359t-4b20dbd36d5572d920eb492096f3d5681a0f3a47927cc68be8c1122606c691b73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2010929887&rft_id=info:pmid/&rfr_iscdi=true |
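
The “destructive choice” process described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not compute the Wallenius likelihood; it only mimics the generating process under the assumption that each option carries a fixed positive weight. The function name `destructive_choice` and the example weights are hypothetical.

```python
import random


def destructive_choice(weights, n_draws, rng=None):
    """Simulate a destructive choice process (illustrative sketch only).

    weights: dict mapping each option to a positive weight (assumed given).
    n_draws: number of choices to make; each chosen option is removed.
    rng:     optional random.Random instance, for reproducibility.
    """
    if n_draws > len(weights):
        raise ValueError("cannot make more choices than there are options")
    rng = rng or random.Random()
    remaining = dict(weights)  # copy so the caller's dict is untouched
    chosen = []
    for _ in range(n_draws):
        options = list(remaining)
        # Pick one option with probability proportional to its weight
        # among the options still available.
        pick = rng.choices(options, weights=[remaining[o] for o in options], k=1)[0]
        chosen.append(pick)
        # Destructive step: the pick leaves the choice set. The weights of
        # the other options are unchanged, so their relative probabilities
        # stay the same.
        del remaining[pick]
    return chosen


if __name__ == "__main__":
    picks = destructive_choice({"a": 5.0, "b": 1.0, "c": 1.0}, 2, random.Random(0))
    print(picks)
```

Each call returns the chosen options in the order they were picked, with no option appearing twice. This differs from a multinomial draw, where the same option could be picked repeatedly.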