This project was created in December 2023 as the final project for my Master's degree. The full PDF, which provides greater detail, is included along with all source code for data gathering from SEC's EDGAR API (C#), feature selection, and machine learning models (Python). Over 2,200 variables were batch-analyzed using feature selection algorithms. Each resulting set of variables was then used to train models on quarterly data from 2000 to 2022, with failed banks labeled Y=1 if the reporting date is within 6 months of the reported failure date.
The best-performing feature selection approach turned out to be an ensemble dataset composed of the 36 ratio variables listed below, similarly selected by Random Forest Boruta and Random Forest Regression.
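The labeling rule described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `label_report`, the 183-day approximation of "6 months", and the example dates are all assumptions.

```python
from datetime import date

# Assumption: "within 6 months" is approximated as 183 days before failure.
WINDOW_DAYS = 183


def label_report(report_date, failure_date):
    """Return 1 if the report falls within the window before failure, else 0."""
    if failure_date is None:
        # The bank never failed, so every report is a negative example.
        return 0
    days_to_failure = (failure_date - report_date).days
    # Reports dated after the failure, or too long before it, are labeled 0.
    return 1 if 0 <= days_to_failure <= WINDOW_DAYS else 0


# Hypothetical bank that failed on 2010-07-15.
failure = date(2010, 7, 15)
near_failure = label_report(date(2010, 3, 31), failure)   # 106 days before
long_before = label_report(date(2009, 6, 30), failure)    # more than a year before
never_failed = label_report(date(2010, 3, 31), None)
```

A fixed day count keeps the sketch simple; a calendar-month window would shift the boundary by a few days for some quarters.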
FDIC Var Name | Description |
--- | --- |
LNOT3T12R | ALL OTHER LNS & LS * 3-12 MONS RATIO |
RBC1AAJ | LEVERAGE RATIO-PCA |
ORER | OTHER REAL ESTATE OWNED RATIO |
NARERESR | NONACCRUAL-RE*1-4 FAMILY RATIO |
LNRESNCR | LOAN LOSS RESERVE/N/C LOANS |
VOLIABR | VOLATILE LIABILITIES RATIO |
SCRDEBTR | DEBT SECURITIES RATIO |
ROA | Return on assets (ROA) |
INTINCY | INTEREST INCOME TO EARNING ASSETS RATIO |
RBCRWAJ | TOTAL RBC RATIO-PCA |
IDDEPINR | IDDEPINR |
LNOT3LESR | ALL OTHER LNS & LS*3 MO OR LESS RATIO |
IDT1RWAJR | TIER 1 RISK-BASED CAPITAL RATIO |
NTRR | NONTRANSACTION-TOTAL RATIO |
LIABR | TOTAL LIABILITIES RATIO |
ROEINJR | RETAINED EARNINGS/AVG BK EQUITY |
RB2LNRESR | ALLOWANCE FOR L&L IN TIER 2 RATIO |
NAASSETR | NONACCRUAL-AG LNS*SMALL BKS RATIO |
LNATRESRR | ALLOW FOR LOANS + ALLOC TRN RISK RATIO |
EQR | EQUITY CAPITAL RATIO |
NCRER | N/C REAL ESTATE LNS/REAL ESTATE |
NTLNLSR | NET CHARGE-OFFS/LOANS & LEASES |
EEFFR | EFFICIENCY RATIO |
RBCT1JR | TIER 1 RBC ADJUSTED LLR - PCA RATIO |
P3ASSETR | 30-89 DAYS P/D TOTAL ASSETS RATIO |
EQUPTOTR | UP-NET & OTHER CAPITAL RATIO |
ROAPTX | Pretax return on assets |
CD3T12SR | TIME DEP $250,000 OR LESS REMAINING MATURITY OR REPRICING 3-12 MONTHS RATIO |
NPERFV | NONPERF ASSETS/TOTAL ASSETS |
LNCI1R | C&I LOANS-UNDER-100-$ RATIO |
P3RER | 30-89 DAYS P/D-REAL ESTATE LOANS RATIO |
DEPINSR | ESTIMATED INSURED DEPOSITS RATIO |
COREDEPR | CORE DEPOSITS RATIO |
IDLNCORR | NET LOANS AND LEASES TO CORE DEPOSITS RATIO |
DEPLGAMTR | AMT DEP ACC GREATER THAN $250,000 RATIO |
NARECNOTR | NONACCRUAL OTHER CONSTR & LAND RATIO |
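One way to read "similarly selected" is that a variable is kept when both selectors agree on it. The sketch below illustrates that reading only; the helper `shared_features`, the `top_k` cutoff, and the example selections are hypothetical and may differ from how the 36 variables were actually combined.

```python
def shared_features(boruta_selected, rf_ranked, top_k):
    """Keep Boruta-confirmed variables that also rank in the RF top_k."""
    top_rf = set(rf_ranked[:top_k])
    # Preserve the order in which Boruta reported the variables.
    return [name for name in boruta_selected if name in top_rf]


# Hypothetical selector outputs using variable names from the table above.
boruta_selected = ["ROA", "EQR", "LIABR", "EEFFR"]
rf_ranked = ["EQR", "ROA", "NTRR", "COREDEPR", "EEFFR"]  # most important first

ensemble = shared_features(boruta_selected, rf_ranked, top_k=3)
```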
The highest-performing model in testing was the probabilistic neural network (PNN), which produces a distribution of estimates from which a standard deviation can be calculated along with the mean prediction. This model type handled the imbalance between failing and non-failing banks (10:1 in training) particularly well.
Notably, Republic First Bank (dba Republic Bank) failed in April 2024 and was correctly identified by this model as likely to fail, at 75%, 83%, and 87% from 12/30/2022 to 06/30/2023. Further, Citizens Bank was classified as likely to fail during this period at 88%-93% and did fail in November 2023.
There is an opportunity to improve this model's precision, although it is imperative that recall remain maximized due to the critical nature of bank failures. The top ~20% of false positives (non-failing banks flagged as likely to fail) may merit further scrutiny, as some may in fact be operating on the brink of failure.
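As a rough illustration of how a mean prediction and standard deviation can be read off a distribution of estimates, the sketch below summarizes a list of sampled failure probabilities for one bank. The sample values and the `summarize` helper are assumptions; the way the PNN actually produces its samples is not shown here.

```python
from statistics import mean, stdev


def summarize(samples):
    """Return (mean, sample standard deviation) of the sampled estimates."""
    return mean(samples), stdev(samples)


# Hypothetical sampled failure probabilities for one bank in one quarter.
samples = [0.70, 0.80, 0.90]
mean_prediction, spread = summarize(samples)
```

A small spread around a high mean suggests a confident failure call, while a wide spread marks a prediction that deserves a closer look.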
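The trade-off above can be made concrete with a small sketch that computes precision and recall and then pulls out the highest-scoring fifth of the false positives for manual review. The function names, the 0.5 decision threshold, and the example scores are all hypothetical.

```python
import math


def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = failed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def top_false_positives(bank_ids, y_true, scores, threshold=0.5, fraction=0.2):
    """Return the highest-scoring fraction of non-failing banks flagged as failing."""
    flagged = [(s, b) for b, t, s in zip(bank_ids, y_true, scores)
               if t == 0 and s >= threshold]
    flagged.sort(reverse=True)  # highest predicted failure probability first
    keep = math.ceil(len(flagged) * fraction)
    return [b for _, b in flagged[:keep]]


# Hypothetical scores: bank "A" failed, the rest did not.
bank_ids = ["A", "B", "C", "D", "E", "F"]
y_true = [1, 0, 0, 0, 0, 0]
scores = [0.90, 0.95, 0.80, 0.70, 0.60, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
review_list = top_false_positives(bank_ids, y_true, scores)
```

In this toy example recall stays at 1.0 while precision is low, which mirrors the situation described: every failure is caught, and the review list narrows the false positives to the ones most worth a second look.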