Transient-optimized real-bogus classification with Bayesian convolutional neural networks - sifting the GOTO candidate stream - UTU Research Information System

A1 Peer-reviewed original article in a scientific journal

Authors: Killestein TL, Lyman J, Steeghs D, Ackley K, Dyer MJ, Ulaczyk K, Cutter R, Mong YL, Galloway DK, Dhillon V, O'Brien P, Ramsay G, Poshyachinda S, Kotak R, Breton RP, Nuttall LK, Palle E, Pollacco D, Thrane E, Aukkaravittayapun S, Awiphan S, Burhanudin U, Chote P, Chrimes A, Daw E, Duffy C, Eyles-Ferris R, Gompertz B, Heikkila T, Irawati P, Kennedy MR, Levan A, Littlefair S, Makrygianni L, Sanchez DM, Mattila S, Maund J, McCormac J, Mkrtichian D, Mullaney J, Rol E, Sawangwit U, Stanway E, Starling R, Strom PA, Tooke S, Wiersema K, Williams SC

Publisher: OXFORD UNIV PRESS

Publication year: 2021

Journal: Monthly Notices of the Royal Astronomical Society

Journal name in database: MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY

Journal acronym: MON NOT R ASTRON SOC

Volume: 503

Issue: 4

First page: 4838

Last page: 4854

Number of pages: 17

ISSN: 0035-8711

DOI: https://doi.org/10.1093/mnras/stab633

Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/58753479

Abstract

Large-scale sky surveys have played a transformative role in our understanding of astrophysical transients, only made possible by increasingly powerful machine learning-based filtering to accurately sift through the vast quantities of incoming data generated. In this paper, we present a new real-bogus classifier based on a Bayesian convolutional neural network that provides nuanced, uncertainty-aware classification of transient candidates in difference imaging, and demonstrate its application to the datastream from the GOTO wide-field optical survey. Not only are candidates assigned a well-calibrated probability of being real, but also an associated confidence that can be used to prioritize human vetting efforts and inform future model optimization via active learning. To fully realize the potential of this architecture, we present a fully automated training set generation method which requires no human labelling, incorporating a novel data-driven augmentation method to significantly improve the recovery of faint and nuclear transient sources. We achieve competitive classification accuracy (FPR and FNR both below 1 percent) compared against classifiers trained with fully human-labelled data sets, while being significantly quicker and less labour-intensive to build. This data-driven approach is uniquely scalable to the upcoming challenges and data needs of next-generation transient surveys. We make our data generation and model training codes available to the community.
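
The quantities named in the abstract (a per-candidate probability of being real, an associated confidence, and the FPR and FNR) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 threshold, and the choice of one minus the binary entropy as a confidence measure are all assumptions made for illustration. It takes a list of scores sampled for one candidate, as a Bayesian network could produce from repeated stochastic forward passes.

```python
import math


def summarize_samples(samples):
    """Return (mean real-probability, confidence) for one candidate.

    `samples` holds scores in [0, 1] from repeated stochastic forward
    passes (hypothetical input; not an API from the paper).
    Confidence here is 1 minus the binary entropy of the mean score,
    an assumed, commonly used choice: 0 at p = 0.5, near 1 at p = 0 or 1.
    """
    p = sum(samples) / len(samples)
    eps = 1e-12  # clamp to avoid log(0)
    p_c = min(max(p, eps), 1.0 - eps)
    entropy = -(p_c * math.log2(p_c) + (1.0 - p_c) * math.log2(1.0 - p_c))
    return p, 1.0 - entropy


def fpr_fnr(labels, scores, threshold=0.5):
    """False positive and false negative rates at a score threshold.

    `labels` uses 1 for real and 0 for bogus; both classes must be present.
    """
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    n_neg = sum(1 for y in labels if y == 0)
    n_pos = sum(1 for y in labels if y == 1)
    return fp / n_neg, fn / n_pos
```

Under these assumptions, candidates with low confidence would be the natural ones to route to human vetting first, which is the kind of prioritization the abstract describes.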

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.

stab633.pdf