A4 Refereed article in a conference publication

Predicting profitability of peer-to-peer loans with recovery models for censored data




Authors: Markus Viljanen, Ajay Byanjankar, Tapio Pahikkala

Editors: Ireneusz Czarnowski, Robert J. Howlett, Lakhmi C. Jain

Conference name: International Conference on Intelligent Decision Technologies

Publisher: Springer

Publication year: 2020

Journal: International Conference on Intelligent Decision Technologies

Book title: Intelligent Decision Technologies: Proceedings of the 12th KES International Conference on Intelligent Decision Technologies (KES-IDT 2020)

Journal name in source: Smart Innovation, Systems and Technologies

Series title: Smart Innovation, Systems and Technologies

Volume: 193

First page: 15

Last page: 25

ISBN: 978-981-15-5924-2

ISSN: 2190-3018

DOI: https://doi.org/10.1007/978-981-15-5925-9_2


Abstract

Peer-to-peer lending is a new lending approach gaining in popularity.
These loans can offer high interest rates, but they are also exposed to
credit risk. In fact, high default rates and low recovery rates are the
norm. Potential investors want to know the expected profit on these
loans, which means they need to model both defaults and recoveries.
However, real-world data sets are censored in the sense that they have
many ongoing loans, where future payments are unknown. This makes
predicting the exact profit in recent loans particularly difficult. In
this paper, we present a model that works for censored loans based on
monthly default and recovery rates. We use the Bondora data set, which
has a large number of censored and defaulted loans. We show that loan
characteristics predicting lower defaults and higher recoveries are
usually, but not always, similar. Our predictions have some correlation
with the platform’s model, but they are substantially different. Using a
more accurate model, it is possible to select loans that are expected
to be more profitable. Our model is unbiased, with a relatively low
prediction error. Experiments in selecting portfolios of loans with
lower or higher Loss Given Default (LGD) demonstrate that our model is
useful, whereas predictions based on the platform’s model or credit
ratings are not better than random. 
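
To illustrate how monthly default rates can be estimated when many loans are still ongoing, the following is a minimal sketch of a life-table style estimator. It is not the model from the paper: the function name `monthly_default_rates`, the `(months_observed, defaulted)` input format, and the toy data are all assumptions made for this example. Each loan counts as "at risk" only in the months it was actually observed, so censored loans contribute information without being treated as fully repaid.

```python
def monthly_default_rates(loans, horizon):
    """Estimate the default rate for each loan month from censored data.

    loans: list of (months_observed, defaulted) tuples (hypothetical format).
        months_observed is how many months the loan has been followed.
        defaulted is True if the loan defaulted in its last observed month,
        and False if it is censored (still ongoing) after that month.
    horizon: number of loan months to estimate rates for.
    """
    at_risk = [0] * horizon
    defaults = [0] * horizon
    for months_observed, defaulted in loans:
        # A loan is at risk in every month it was observed, up to the horizon.
        for month in range(min(months_observed, horizon)):
            at_risk[month] += 1
        # A default is attributed to the last observed month.
        if defaulted and 1 <= months_observed <= horizon:
            defaults[months_observed - 1] += 1
    # Rate = defaults / loans at risk; None where no loan was observed.
    return [d / n if n else None for d, n in zip(defaults, at_risk)]


# Toy data: one default in month 3, three censored loans.
toy_loans = [(3, True), (5, False), (2, False), (3, False)]
rates = monthly_default_rates(toy_loans, horizon=4)

# Probability of surviving all four months without default.
survival = 1.0
for rate in rates:
    survival *= 1.0 - rate
```

In the toy data, three loans are still under observation in the third month and one of them defaults, so the estimated rate for that month is one third. A recovery rate per month after default could be estimated in the same way, under the same assumptions.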



Last updated on 2024-11-26 at 17:20