SCMarker: Ab initio marker selection for single cell transcriptome profiling

Single-cell RNA-sequencing data generated by a variety of technologies, such as Drop-seq and SMART-seq, can reveal simultaneously the mRNA transcript levels of thousands of genes in thousands of cells. It is often important to identify informative genes or cell-type-discriminative markers to reduce...

Bibliographic Details
Published in: PLoS computational biology 2019-10, Vol.15 (10), p.e1007445
Main Authors: Wang, Fang; Liang, Shaoheng; Kumar, Tapsi; Navin, Nicholas; Chen, Ken
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c633t-6ffa33cbe9168435d27ddf87683da4006618587174b17dc8429f1bd34bac734d3
cites cdi_FETCH-LOGICAL-c633t-6ffa33cbe9168435d27ddf87683da4006618587174b17dc8429f1bd34bac734d3
container_end_page
container_issue 10
container_start_page e1007445
container_title PLoS computational biology
container_volume 15
creator Wang, Fang
Liang, Shaoheng
Kumar, Tapsi
Navin, Nicholas
Chen, Ken
description Single-cell RNA-sequencing data generated by a variety of technologies, such as Drop-seq and SMART-seq, can reveal simultaneously the mRNA transcript levels of thousands of genes in thousands of cells. It is often important to identify informative genes or cell-type-discriminative markers to reduce dimensionality and achieve informative cell typing results. We present an ab initio method that performs unsupervised marker selection by identifying genes that have subpopulation-discriminative expression levels and are co- or mutually-exclusively expressed with other genes. Consistent improvements in cell-type classification and biologically meaningful marker selection are achieved by applying SCMarker on various datasets in multiple tissue types, followed by a variety of clustering algorithms. The source code of SCMarker is publicly available at https://github.com/KChen-lab/SCMarker.
doi_str_mv 10.1371/journal.pcbi.1007445
format article
identifier PMID: 31658262
publisher United States: Public Library of Science
rights 2019 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
orcid https://orcid.org/0000-0002-3510-4550
https://orcid.org/0000-0003-2825-3387
https://orcid.org/0000-0003-4013-5279
peer_reviewed true
oa free_for_read
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2019-10, Vol.15 (10), p.e1007445
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_2314933803
source Publicly Available Content Database; PubMed Central
subjects Algorithms
B cells
Base Sequence - genetics
Bioinformatics
Biology
Biology and Life Sciences
Biomarkers
Cancer
Cluster Analysis
Clustering
Computational Biology - methods
Datasets
Gene expression
Gene Expression Profiling - methods
Gene sequencing
Genes
Genetic research
Genomes
Head & neck cancer
Humans
Medicine and Health Sciences
Messenger RNA
Metastasis
Principal components analysis
Research and analysis methods
RNA
RNA - genetics
Sequence Analysis, RNA - methods
Single-Cell Analysis - methods
Software
Source code
Technology
Transcription
Transcriptome - genetics
title SCMarker: Ab initio marker selection for single cell transcriptome profiling