Estimating photometric redshift from mock flux for CSST survey by using weighted Random Forest
Published in: | Monthly notices of the Royal Astronomical Society 2024-02, Vol.527 (4), p.12140-12153 |
---|---|
Format: | Article |
Language: | English |
Summary: | Accurate estimation of photometric redshifts (photo-z) is crucial in studies of both galaxy evolution and cosmology using current and future large sky surveys. In this study, we employ Random Forest (RF), a machine learning algorithm, to estimate photo-z and investigate the systematic uncertainties affecting the results. Using galaxy flux and colour as input features, we construct a mapping between input features and redshift using a training set of simulated data, generated from the Hubble Space Telescope Advanced Camera for Surveys (HST-ACS) and COSMOS catalogue, with the expected instrumental effects of the planned China Space Station Telescope (CSST). To improve the accuracy and confidence of predictions, we incorporate inverse variance weighting and perturb the catalogue using input feature errors. Our results show that weighted RF can achieve a photo-z accuracy of $\sigma_{\rm NMAD}=0.025$ and an outlier fraction of $\eta=2.045$ per cent, significantly better than the values of $\sigma_{\rm NMAD}=0.043$ and $\eta=6.45$ per cent obtained by the widely used Easy and Accurate Zphot from Yale (EAZY) software, which uses a template-fitting method. Furthermore, we have calculated the importance of each input feature for different redshift ranges and found that the most important input features reflect the approximate position of the break features in galaxy spectra, demonstrating the algorithm’s ability to extract physical information from data. Additionally, we have established confidence indices and error bars for each predicted value based on the shape of the redshift probability distribution function; the results suggest that selecting only high-confidence sources can further reduce the outlier fraction. |
---|---|
ISSN: | 0035-8711, 1365-2966 |
DOI: | 10.1093/mnras/stad3976 |
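
The summary quotes two quality metrics, $\sigma_{\rm NMAD}$ and the outlier fraction $\eta$, without defining them. As a rough illustration, the sketch below computes both under the convention commonly used in the photo-z literature: $\sigma_{\rm NMAD} = 1.48 \times {\rm median}(|\Delta z - {\rm median}(\Delta z)| / (1 + z_{\rm true}))$ with $\Delta z = z_{\rm phot} - z_{\rm true}$, and $\eta$ as the fraction of sources with $|\Delta z| / (1 + z_{\rm true}) > 0.15$. These definitions, the 0.15 threshold, and the function name `photoz_metrics` are assumptions made for illustration; the record does not state the exact form used in the article.

```python
from statistics import median


def photoz_metrics(z_phot, z_true, outlier_cut=0.15):
    """Return (sigma_nmad, outlier_fraction) for paired redshift lists.

    Assumed conventions (not confirmed by the catalogue record):
      sigma_nmad = 1.48 * median(|dz - median(dz)| / (1 + z_true))
      outlier    = |dz| / (1 + z_true) > outlier_cut
    where dz = z_phot - z_true.
    """
    if len(z_phot) != len(z_true) or len(z_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Raw residuals between predicted and reference redshifts.
    dz = [zp - zt for zp, zt in zip(z_phot, z_true)]
    med_dz = median(dz)

    # Normalised median absolute deviation, scaled by 1.48 so that it
    # matches the standard deviation for a Gaussian distribution.
    scaled = [abs(d - med_dz) / (1.0 + zt) for d, zt in zip(dz, z_true)]
    sigma_nmad = 1.48 * median(scaled)

    # Fraction of sources whose normalised residual exceeds the cut.
    n_out = sum(
        1 for d, zt in zip(dz, z_true) if abs(d) / (1.0 + zt) > outlier_cut
    )
    return sigma_nmad, n_out / len(dz)
```

Calling `photoz_metrics(z_phot, z_true)` on two equal-length sequences of predicted and reference redshifts returns both numbers as fractions, so an $\eta$ of 2.045 per cent would correspond to a returned value of about 0.02045.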