A repo dedicated to predicting bank failures which tests feature selection algorithms and prediction model architectures.

wyattschwanbeck/US_Bank_Failure_Predictions

US Bank Failure Prediction Summary

This project was created in December 2023 as the final project for my Master's degree. The full PDF, which provides greater detail, is included along with all source code for data gathering from the SEC's EDGAR API (C#), feature selection, and machine learning models (Python). Over 2,200 variables were batch analyzed using feature selection algorithms. Models were then trained on each selected set of variables using quarterly data from 2000 to 2022. A bank's quarterly report is labeled Y=1 if its reporting date falls within 6 months of the bank's reported failure date.
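The labeling rule above can be sketched in a few lines. This is a minimal illustration rather than the repo's actual code: the function name `label_report`, the 183-day approximation of "6 months", and the choice to label only reports dated on or before the failure date are all assumptions, and the dates are made up.

```python
from datetime import date, timedelta

# Assumption: "within 6 months" is approximated as a 183-day window.
WINDOW = timedelta(days=183)

def label_report(report_date, failure_date):
    """Return 1 if the report falls within the window before failure, else 0.

    failure_date is None for banks that never failed (hypothetical convention).
    """
    if failure_date is None:
        return 0
    gap = failure_date - report_date
    # Assumption: only reports dated on or before the failure date can be positive.
    return 1 if timedelta(0) <= gap <= WINDOW else 0

# Hypothetical bank that failed on 2010-05-14, with three quarterly reports.
failure = date(2010, 5, 14)
reports = [date(2009, 9, 30), date(2009, 12, 31), date(2010, 3, 31)]
labels = [label_report(r, failure) for r in reports]
print(labels)  # [0, 1, 1]
```

With a quarterly reporting cadence, a window of this size typically marks the last two reports before a failure as positive.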

Selected Features

The best-performing feature set turned out to be an ensemble of 36 ratio variables that were similarly selected by both Random Forest Boruta and Random Forest Regression.

| FDIC Var Name | Description |
| --- | --- |
| LNOT3T12R | ALL OTHER LNS & LS * 3-12 MONS RATIO |
| RBC1AAJ | LEVERAGE RATIO-PCA |
| ORER | OTHER REAL ESTATE OWNED RATIO |
| NARERESR | NONACCRUAL-RE*1-4 FAMILY RATIO |
| LNRESNCR | LOAN LOSS RESERVE/N/C LOANS |
| VOLIABR | VOLATILE LIABILITIES RATIO |
| SCRDEBTR | DEBT SECURITIES RATIO |
| ROA | Return on assets (ROA) |
| INTINCY | INTEREST INCOME TO EARNING ASSETS RATIO |
| RBCRWAJ | TOTAL RBC RATIO-PCA |
| IDDEPINR | IDDEPINR |
| LNOT3LESR | ALL OTHER LNS & LS*3 MO OR LESS RATIO |
| IDT1RWAJR | TIER 1 RISK-BASED CAPITAL RATIO |
| NTRR | NONTRANSACTION-TOTAL RATIO |
| LIABR | TOTAL LIABILITIES RATIO |
| ROEINJR | RETAINED EARNINGS/AVG BK EQUITY |
| RB2LNRESR | ALLOWANCE FOR L&L IN TIER 2 RATIO |
| NAASSETR | NONACCRUAL-AG LNS*SMALL BKS RATIO |
| LNATRESRR | ALLOW FOR LOANS + ALLOC TRN RISK RATIO |
| EQR | EQUITY CAPITAL RATIO |
| NCRER | N/C REAL ESTATE LNS/REAL ESTATE |
| NTLNLSR | NET CHARGE-OFFS/LOANS & LEASES |
| EEFFR | EFFICIENCY RATIO |
| RBCT1JR | TIER 1 RBC ADJUSTED LLR - PCA RATIO |
| P3ASSETR | 30-89 DAYS P/D TOTAL ASSETS RATIO |
| EQUPTOTR | UP-NET & OTHER CAPITAL RATIO |
| ROAPTX | Pretax return on assets |
| CD3T12SR | TIME DEP $250,000 OR LESS REMAINING MATURITY OR REPRICING 3-12 MONTHS RATIO |
| NPERFV | NONPERF ASSETS/TOTAL ASSETS |
| LNCI1R | C&I LOANS-UNDER-100-$ RATIO |
| P3RER | 30-89 DAYS P/D-REAL ESTATE LOANS RATIO |
| DEPINSR | ESTIMATED INSURED DEPOSITS RATIO |
| COREDEPR | CORE DEPOSITS RATIO |
| IDLNCORR | NET LOANS AND LEASES TO CORE DEPOSITS RATIO |
| DEPLGAMTR | AMT DEP ACC GREATER THAN $250,000 RATIO |
| NARECNOTR | NONACCRUAL OTHER CONSTR & LAND RATIO |
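One possible reading of the ensemble is that it keeps the variables both selectors agree on. The sketch below shows that reading only. The helper `ensemble_features` and both input lists are hypothetical, and the full PDF is the authority on how the two selections were actually combined.

```python
def ensemble_features(boruta_selected, rf_regression_selected):
    """Return features chosen by both selectors, in the first list's order."""
    second = set(rf_regression_selected)
    return [name for name in boruta_selected if name in second]

# Hypothetical selector outputs, using variable names from the table above.
boruta = ["EQR", "ROA", "ORER", "LIABR"]
rf_regression = ["ROA", "EQR", "NPERFV", "LIABR"]

shared = ensemble_features(boruta, rf_regression)
print(shared)  # ['EQR', 'ROA', 'LIABR']
```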

Selected Model and Results

The highest-performing model in testing was the probabilistic neural network (PNN). Rather than a single point estimate, it produces a distribution of estimates for each prediction, from which both a mean prediction and a standard deviation can be calculated. This model type handled the imbalance between failing and non-failing banks well (10:1 in training).
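As a rough illustration of how a distribution of estimates becomes a mean prediction plus an uncertainty figure, the sketch below summarizes repeated draws for a single bank-quarter. It assumes the model can be sampled several times for the same input. The function `summarize_predictions` and the sample values are made up, and no PNN is trained here.

```python
from statistics import mean, stdev

def summarize_predictions(samples):
    """Return (mean, sample standard deviation) of sampled failure probabilities."""
    return mean(samples), stdev(samples)

# Hypothetical draws from the model's output distribution for one bank-quarter.
samples = [0.70, 0.80, 0.75, 0.85, 0.90]
avg, spread = summarize_predictions(samples)
print(f"mean={avg:.2f}, std={spread:.3f}")
```

A wide spread relative to the mean suggests the model is less certain about that bank, which can help prioritize manual review.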

Notably, Republic First Bank (dba Republic Bank) failed in April 2024, and this model correctly identified it as likely to fail at 75%, 83%, and 87% over the reports from 12/30/2022 to 06/30/2023. Further, Citizens Bank was classified as likely to fail during this time at 88%-93% and did fail in November 2023.

Future Development

There is an opportunity to improve this model's precision, although it is imperative that recall remain maximized given the critical nature of bank failures. The top ~20% of false positives (non-failing banks flagged as likely to fail) may merit further scrutiny, as some may in fact be operating on the brink of failure.
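The trade-off described above can be made concrete with a small sketch that computes precision and recall and then pulls out the highest-scoring false positives for review. The function names, the 0.5 decision threshold, and the example banks and scores are all assumptions made for illustration.

```python
import math

def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = failed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def top_false_positives(names, y_true, scores, threshold=0.5, fraction=0.2):
    """Return the highest-scoring share of non-failing banks flagged as failing."""
    flagged = [(s, n) for n, t, s in zip(names, y_true, scores)
               if t == 0 and s >= threshold]
    flagged.sort(reverse=True)  # highest predicted failure score first
    keep = math.ceil(len(flagged) * fraction)
    return [n for _, n in flagged[:keep]]

# Hypothetical banks: two real failures (A, B) and six non-failing banks.
names = ["A", "B", "C", "D", "E", "F", "G", "H"]
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.90, 0.40, 0.95, 0.80, 0.70, 0.60, 0.55, 0.10]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
review = top_false_positives(names, y_true, scores)
print(precision, recall, review)
```

Lowering the threshold raises recall at the cost of more false positives, which is why the review list is ranked by score rather than treated as a flat set.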
