
Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees

Understanding the relationship between short-term subway ridership and its influential factors is crucial to improving the accuracy of short-term subway ridership prediction. Although there has been a growing body of studies on short-term ridership prediction approaches, limited effort has been made to in...


Bibliographic Details
Published in: Sustainability 2016-10, Vol.8 (11), p.1100-1100
Main Authors: Ding, Chuan; Wang, Donggen; Ma, Xiaolei; Li, Haiying
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3
cites cdi_FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3
container_end_page 1100
container_issue 11
container_start_page 1100
container_title Sustainability
container_volume 8
creator Ding, Chuan
Wang, Donggen
Ma, Xiaolei
Li, Haiying
description Understanding the relationship between short-term subway ridership and its influential factors is crucial to improving the accuracy of short-term subway ridership prediction. Although there is a growing body of studies on short-term ridership prediction approaches, limited effort has been made to investigate short-term subway ridership prediction that accounts for bus transfer activities and temporal features. To fill this gap, a relatively recent data mining approach called gradient boosting decision trees (GBDT) is applied to short-term subway ridership prediction and used to capture the associations with the independent variables. Taking three subway stations in Beijing as case studies, the short-term subway ridership and the alighting passengers from adjacent bus stops are obtained from transit smart card data. To optimize the model performance with different combinations of regularization parameters, a series of GBDT models is built with various learning rates and tree complexities by fitting a maximum of trees. The optimal model performance confirms that the gradient boosting approach can incorporate different types of predictors, fit complex nonlinear relationships, and automatically handle the multicollinearity effect with high accuracy. In contrast to other machine learning methods, or "black-box" procedures, the GBDT model can identify and rank the relative influences of bus transfer activities and temporal features on short-term subway ridership. These findings suggest that the GBDT model has considerable advantages in improving short-term subway ridership prediction in a multimodal public transportation system.
doi_str_mv 10.3390/su8111100
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1859474635</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>4279789581</sourcerecordid><originalsourceid>FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3</originalsourceid><addsrcrecordid>eNpdkUFLAzEUhIMoWLQH_0HAix5Wk81utjlqtbVQsNj2vGSTrE3ZbmpeVqm_3tSKiHN5D-ZjGBiELii5YUyQW-gGNIqQI9RLSUETSnJy_Oc_RX2ANYlijArKe-h95o22Ktj2Fc9XzodkYfwGz7vqQ-7wi9XGw8pusWw1nnnrvA32cw9PAuBJWzedaYOVDR5JFZwHvIS9O_ZS2-jge-fgO_zBKAvWtXjhjYFzdFLLBkz_556h5ehxMXxKps_jyfBumigmaEgylVaKF5USsT1nmotKZwXVhOuK5rySBc0kK3JVZLkkSpPUDDinutZCMFrX7AxdHXK33r11BkK5saBM08jWuA5KOshFVmSc5RG9_IeuXefb2C5SmUgHgrM0UtcHSnkH4E1dbr3dSL8rKSn3I5S_I7AvIrR53A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1849289632</pqid></control><display><type>article</type><title>Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees</title><source>Publicly Available Content Database</source><creator>Ding, Chuan ; Wang, Donggen ; Ma, Xiaolei ; Li, Haiying</creator><creatorcontrib>Ding, Chuan ; Wang, Donggen ; Ma, Xiaolei ; Li, Haiying</creatorcontrib><description>Understanding the relationship between short-term subway ridership and its influential factors is crucial to improving the accuracy of short-term subway ridership prediction. Although there has been a growing body of studies on short-term ridership prediction approaches, limited effort is made to investigate the short-term subway ridership prediction considering bus transfer activities and temporal features. To fill this gap, a relatively recent data mining approach called gradient boosting decision trees (GBDT) is applied to short-term subway ridership prediction and used to capture the associations with the independent variables. 
Taking three subway stations in Beijing as the cases, the short-term subway ridership and alighting passengers from its adjacent bus stops are obtained based on transit smart card data. To optimize the model performance with different combinations of regularization parameters, a series of GBDT models are built with various learning rates and tree complexities by fitting a maximum of trees. The optimal model performance confirms that the gradient boosting approach can incorporate different types of predictors, fit complex nonlinear relationships, and automatically handle the multicollinearity effect with high accuracy. In contrast to other machine learning methods-or "black-box" procedures-the GBDT model can identify and rank the relative influences of bus transfer activities and temporal features on short-term subway ridership. These findings suggest that the GBDT model has considerable advantages in improving short-term subway ridership prediction in a multimodal public transportation system.</description><identifier>ISSN: 2071-1050</identifier><identifier>EISSN: 2071-1050</identifier><identifier>DOI: 10.3390/su8111100</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Accuracy ; Algorithms ; Decision trees ; Forecasting ; Neural networks ; Passengers ; Public transportation ; Smart cards ; Sustainability ; Variables</subject><ispartof>Sustainability, 2016-10, Vol.8 (11), p.1100-1100</ispartof><rights>Copyright MDPI AG 
2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3</citedby><cites>FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3</cites><orcidid>0000-0001-9560-8585</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/1849289632/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1849289632?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,25753,27924,27925,37012,37013,44590,75126</link.rule.ids></links><search><creatorcontrib>Ding, Chuan</creatorcontrib><creatorcontrib>Wang, Donggen</creatorcontrib><creatorcontrib>Ma, Xiaolei</creatorcontrib><creatorcontrib>Li, Haiying</creatorcontrib><title>Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees</title><title>Sustainability</title><description>Understanding the relationship between short-term subway ridership and its influential factors is crucial to improving the accuracy of short-term subway ridership prediction. Although there has been a growing body of studies on short-term ridership prediction approaches, limited effort is made to investigate the short-term subway ridership prediction considering bus transfer activities and temporal features. To fill this gap, a relatively recent data mining approach called gradient boosting decision trees (GBDT) is applied to short-term subway ridership prediction and used to capture the associations with the independent variables. 
Taking three subway stations in Beijing as the cases, the short-term subway ridership and alighting passengers from its adjacent bus stops are obtained based on transit smart card data. To optimize the model performance with different combinations of regularization parameters, a series of GBDT models are built with various learning rates and tree complexities by fitting a maximum of trees. The optimal model performance confirms that the gradient boosting approach can incorporate different types of predictors, fit complex nonlinear relationships, and automatically handle the multicollinearity effect with high accuracy. In contrast to other machine learning methods-or "black-box" procedures-the GBDT model can identify and rank the relative influences of bus transfer activities and temporal features on short-term subway ridership. These findings suggest that the GBDT model has considerable advantages in improving short-term subway ridership prediction in a multimodal public transportation system.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Decision trees</subject><subject>Forecasting</subject><subject>Neural networks</subject><subject>Passengers</subject><subject>Public transportation</subject><subject>Smart 
cards</subject><subject>Sustainability</subject><subject>Variables</subject><issn>2071-1050</issn><issn>2071-1050</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNpdkUFLAzEUhIMoWLQH_0HAix5Wk81utjlqtbVQsNj2vGSTrE3ZbmpeVqm_3tSKiHN5D-ZjGBiELii5YUyQW-gGNIqQI9RLSUETSnJy_Oc_RX2ANYlijArKe-h95o22Ktj2Fc9XzodkYfwGz7vqQ-7wi9XGw8pusWw1nnnrvA32cw9PAuBJWzedaYOVDR5JFZwHvIS9O_ZS2-jge-fgO_zBKAvWtXjhjYFzdFLLBkz_556h5ehxMXxKps_jyfBumigmaEgylVaKF5USsT1nmotKZwXVhOuK5rySBc0kK3JVZLkkSpPUDDinutZCMFrX7AxdHXK33r11BkK5saBM08jWuA5KOshFVmSc5RG9_IeuXefb2C5SmUgHgrM0UtcHSnkH4E1dbr3dSL8rKSn3I5S_I7AvIrR53A</recordid><startdate>20161028</startdate><enddate>20161028</enddate><creator>Ding, Chuan</creator><creator>Wang, Donggen</creator><creator>Ma, Xiaolei</creator><creator>Li, Haiying</creator><general>MDPI AG</general><scope>AAYXX</scope><scope>CITATION</scope><scope>4U-</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7ST</scope><scope>7U6</scope><scope>C1K</scope><orcidid>https://orcid.org/0000-0001-9560-8585</orcidid></search><sort><creationdate>20161028</creationdate><title>Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees</title><author>Ding, Chuan ; Wang, Donggen ; Ma, Xiaolei ; Li, Haiying</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Decision trees</topic><topic>Forecasting</topic><topic>Neural networks</topic><topic>Passengers</topic><topic>Public 
transportation</topic><topic>Smart cards</topic><topic>Sustainability</topic><topic>Variables</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ding, Chuan</creatorcontrib><creatorcontrib>Wang, Donggen</creatorcontrib><creatorcontrib>Ma, Xiaolei</creatorcontrib><creatorcontrib>Li, Haiying</creatorcontrib><collection>CrossRef</collection><collection>University Readers</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Environment Abstracts</collection><collection>Sustainability Science Abstracts</collection><collection>Environmental Sciences and Pollution Management</collection><jtitle>Sustainability</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ding, Chuan</au><au>Wang, Donggen</au><au>Ma, Xiaolei</au><au>Li, Haiying</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees</atitle><jtitle>Sustainability</jtitle><date>2016-10-28</date><risdate>2016</risdate><volume>8</volume><issue>11</issue><spage>1100</spage><epage>1100</epage><pages>1100-1100</pages><issn>2071-1050</issn><eissn>2071-1050</eissn><abstract>Understanding the relationship between short-term subway ridership and its influential factors is crucial to improving the accuracy of short-term subway ridership 
prediction. Although there has been a growing body of studies on short-term ridership prediction approaches, limited effort is made to investigate the short-term subway ridership prediction considering bus transfer activities and temporal features. To fill this gap, a relatively recent data mining approach called gradient boosting decision trees (GBDT) is applied to short-term subway ridership prediction and used to capture the associations with the independent variables. Taking three subway stations in Beijing as the cases, the short-term subway ridership and alighting passengers from its adjacent bus stops are obtained based on transit smart card data. To optimize the model performance with different combinations of regularization parameters, a series of GBDT models are built with various learning rates and tree complexities by fitting a maximum of trees. The optimal model performance confirms that the gradient boosting approach can incorporate different types of predictors, fit complex nonlinear relationships, and automatically handle the multicollinearity effect with high accuracy. In contrast to other machine learning methods-or "black-box" procedures-the GBDT model can identify and rank the relative influences of bus transfer activities and temporal features on short-term subway ridership. These findings suggest that the GBDT model has considerable advantages in improving short-term subway ridership prediction in a multimodal public transportation system.</abstract><cop>Basel</cop><pub>MDPI AG</pub><doi>10.3390/su8111100</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0001-9560-8585</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2071-1050
ispartof Sustainability, 2016-10, Vol.8 (11), p.1100-1100
issn 2071-1050
2071-1050
language eng
recordid cdi_proquest_miscellaneous_1859474635
source Publicly Available Content Database
subjects Accuracy
Algorithms
Decision trees
Forecasting
Neural networks
Passengers
Public transportation
Smart cards
Sustainability
Variables
title Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T16%3A47%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20Short-Term%20Subway%20Ridership%20and%20Prioritizing%20Its%20Influential%20Factors%20Using%20Gradient%20Boosting%20Decision%20Trees&rft.jtitle=Sustainability&rft.au=Ding,%20Chuan&rft.date=2016-10-28&rft.volume=8&rft.issue=11&rft.spage=1100&rft.epage=1100&rft.pages=1100-1100&rft.issn=2071-1050&rft.eissn=2071-1050&rft_id=info:doi/10.3390/su8111100&rft_dat=%3Cproquest_cross%3E4279789581%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c391t-4c2bc67bc910563d69bd471d06db156ba714a375c745a0cd02e8661dfd9931ff3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1849289632&rft_id=info:pmid/&rfr_iscdi=true