Autoencoding Galaxy Spectra. II. Redshift Invariance and Outlier Detection
We present an unsupervised outlier detection method for galaxy spectra based on the spectrum autoencoder architecture spender, which reliably captures spectral features and provides highly realistic reconstructions for SDSS galaxy spectra. We interpret the sample density in the autoencoder latent s...
Published in: | The Astronomical journal 2023-08, Vol.166 (2), p.75 |
---|---|
Main Authors: | Liang, Yan; Melchior, Peter; Lu, Sicong; Goulding, Andy; Ward, Charlotte |
Format: | Article |
Language: | English |
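Subjects: | Astronomy; Astrostatistics; Data analysis; Galaxies; Invariants; Outliers (statistics); Probability distribution; Red shift; Spectra; Spectroscopy; Stars & galaxies |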
cited_by | cdi_FETCH-LOGICAL-c416t-63fc2dbc645f3b0c2a3c23c4e2baf1f2cb178b5ff69f9078b4555dcef089ecb63 |
---|---|
cites | cdi_FETCH-LOGICAL-c416t-63fc2dbc645f3b0c2a3c23c4e2baf1f2cb178b5ff69f9078b4555dcef089ecb63 |
container_end_page | |
container_issue | 2 |
container_start_page | 75 |
container_title | The Astronomical journal |
container_volume | 166 |
creator | Liang, Yan; Melchior, Peter; Lu, Sicong; Goulding, Andy; Ward, Charlotte |
description | We present an unsupervised outlier detection method for galaxy spectra based on the spectrum autoencoder architecture spender, which reliably captures spectral features and provides highly realistic reconstructions for SDSS galaxy spectra. We interpret the sample density in the autoencoder latent space as a probability distribution, and identify outliers as low-probability objects with a normalizing flow. However, we found that the latent-space position is not, as expected from the architecture, redshift invariant, which introduces stochasticity into the latent space and the outlier detection method. We solve this problem by adding two novel loss terms during training, which explicitly link latent-space distances to data-space distances, preserving locality in the autoencoding process. Minimizing the additional losses leads to a redshift-invariant, nondegenerate latent-space distribution with clear separations between common and anomalous data. We inspect the spectra with the lowest probability and find them to include blends with foreground stars, extremely reddened galaxies, galaxy pairs and triples, and stars that are misclassified as galaxies. We release the newly trained spender model and the latent-space probability for the entire SDSS-I galaxy sample to aid further investigations. |
doi_str_mv | 10.3847/1538-3881/ace100 |
format | article |
publisher | Madison: The American Astronomical Society |
rights | 2023. The Author(s). Published by the American Astronomical Society. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). |
date | 2023-08-01 |
tpages | 10 |
lds50 | peer_reviewed |
oa | free_for_read |
orcidid | 0000-0002-4557-6682 ; 0000-0002-8873-5065 ; 0000-0002-1001-1235 ; 0000-0002-8814-1670 ; 0000-0003-4700-663X |
fulltext | fulltext |
identifier | ISSN: 0004-6256 |
ispartof | The Astronomical journal, 2023-08, Vol.166 (2), p.75 |
issn | 0004-6256 1538-3881 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_fbc3ccb893ce413e9e681d1a7bc433b5 |
source | Directory of Open Access Journals |
subjects | Astronomy; Astrostatistics; Data analysis; Galaxies; Invariants; Outliers (statistics); Probability distribution; Red shift; Spectra; Spectroscopy; Stars & galaxies |
title | Autoencoding Galaxy Spectra. II. Redshift Invariance and Outlier Detection |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T00%3A03%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Autoencoding%20Galaxy%20Spectra.%20II.%20Redshift%20Invariance%20and%20Outlier%20Detection&rft.jtitle=The%20Astronomical%20journal&rft.au=Liang,%20Yan&rft.date=2023-08-01&rft.volume=166&rft.issue=2&rft.spage=75&rft.pages=75-&rft.issn=0004-6256&rft.eissn=1538-3881&rft_id=info:doi/10.3847/1538-3881/ace100&rft_dat=%3Cproquest_doaj_%3E2843094773%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c416t-63fc2dbc645f3b0c2a3c23c4e2baf1f2cb178b5ff69f9078b4555dcef089ecb63%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2843094773&rft_id=info:pmid/&rfr_iscdi=true |