By Jacob Kosoff, Head of Model Risk Management & Validation, Regions Bank
Validation and governance of non-traditional models with increase in automation and robotics. – Day 2 at 2:00pm.
May 15, 2019 is Day Two of Risk Americas 2019. That day, Jacob Kosoff, Head of Model Risk Management and Validation at Regions Bank, will join a panel on the validation and governance of non-traditional models, including automation in credit decisioning. The session will discuss the unique risks of non-traditional models, including machine learning (ML) models.
Model risks unique to or heightened by ML models
ML models pose unique challenges across data gathering, model development, and model monitoring. These come on top of the transparency challenges that must be met before business users and third parties will accept and use ML models.
Model Data Risks
ML models are designed to tease out detailed and complex relationships given large amounts of data and variables as well as modern computing power. The first step in evaluating whether to use an ML model is to determine whether the data can support a model of such complexity. ML models also exacerbate data bias and limitations, so the model developer must carefully analyze data labels and the appropriateness of input variables for supervised ML.
For example, for credit risk or loss forecasting models, the data can be biased by a benign economic environment as well as a market of expanding debt, where the long-term risks may not be apparent in the data, and the data could even provide misleading information on the direction of risk.
When using ML for credit risk underwriting models, keep in mind that customers originated during good times may turn into undesirable customers during a downturn. Overall, without controls on an ML system, the model risks can outweigh the benefits, and the system would likely generate additional, previously unforeseen risks.
Explainability and Parsimony in Conceptual Soundness
Conceptual soundness of a model can easily be compromised by a lack of explainability. In traditional models, the effects of input variables on output decisions are very transparent, and the attribution problems caused by multicollinearity among inputs are well investigated and handled. ML models without proper regularization can still perform very well in terms of prediction error, yet when highly collinear inputs are involved, they can be fraught with nonsensical variable effects. Explainability is particularly important for making sure that models will work in regions where training data are sparse or where the model must extrapolate. Recent developments in model explainability techniques should be applied carefully.
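As a rough illustration of the collinearity concern above, the sketch below screens model inputs for highly correlated pairs before any variable effects are interpreted. It is a minimal example rather than a prescribed validation test: the feature names, the sample values, and the 0.9 threshold are all hypothetical, and pairwise correlation is only one of several possible screens.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (a constant input would
    produce a zero denominator).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def collinear_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for every input pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical inputs: credit_limit moves almost in lockstep with balance.
features = {
    "balance": [1.0, 2.0, 3.0, 4.0, 5.0],
    "credit_limit": [2.1, 3.9, 6.2, 8.0, 9.9],
    "tenure": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = collinear_pairs(features)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r}")
```

A flagged pair does not by itself make a model unsound; it marks inputs whose individual effects deserve extra scrutiny when explainability techniques are applied.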
ML model development can tease out complex relationships and provide a significant lift above more traditional regression-based models, but the model fit, robustness and business sense must be transparent and verified. Indeed, modeling with ML methods requires at least as much rigor and explainability as with any simpler regression model.
For robustness after the model is fit, performance monitoring assures that models are working as intended, including when the environment changes. Monitoring for ML models should involve monitoring of overall performance as well as the relationships of the most important variables to outcomes to ensure that the underlying data and relationships are not changing with new observations.
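One way to check whether the underlying data are changing with new observations is a population stability index (PSI) on each important input. The article does not name a specific metric, so PSI is offered here only as an assumed example; the bin edges and sample values are hypothetical, and alert thresholds would be set by each institution.

```python
import math

def bin_shares(values, edges, eps=1e-6):
    """Share of observations per bin; empty bins are floored at eps.

    Values outside [edges[0], edges[-1]] are ignored. Assumes at least
    one value falls inside the range.
    """
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_bin = edges[i] <= v < edges[i + 1]
            on_top_edge = i == last and v == edges[-1]
            if in_bin or on_top_edge:
                counts[i] += 1
                break
    total = sum(counts)
    return [max(c / total, eps) for c in counts]

def psi(expected, actual, edges):
    """Population stability index between a baseline and a recent sample."""
    e = bin_shares(expected, edges)
    a = bin_shares(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical score distributions at development time and in production.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
recent = [0.6, 0.7, 0.8, 0.9, 0.7, 0.8, 0.9, 0.95]

print("PSI, baseline vs itself:", psi(baseline, baseline, edges))
print("PSI, baseline vs recent:", psi(baseline, recent, edges))
```

The same function can be run on model outputs as well as inputs, so a shift in overall performance can be traced back to the variables whose distributions moved.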
When models are frequently retrained with new data, performance evaluation of the retraining is not sufficient. The changes and stability of variable effects should be part of performance monitoring. For example:
- When updates are not automated, a transparent challenger model could be run as part of ongoing monitoring so that when it shows significant lift in fit and performance, it can replace the champion model.
- For ML models with automated updates, these material updates should be included in monitoring with sufficient transparency into the changes for explainability and to support validation testing of the changes.
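The stability check on variable effects described above could be sketched as a comparison of variable importances before and after a retrain. This is an illustrative assumption, not a method taken from the article: the importance values, the variable names, and the 0.10 tolerance are hypothetical, and the importances could come from whichever attribution technique the institution has validated.

```python
def normalize(importances):
    """Convert raw importances to shares of total absolute importance."""
    total = sum(abs(v) for v in importances.values())
    return {name: abs(v) / total for name, v in importances.items()}

def effect_shifts(previous, current, tolerance=0.10):
    """Return variables whose importance share moved by more than tolerance.

    Variables present in only one model version are treated as having
    zero importance in the other, so additions and removals are flagged.
    """
    prev, curr = normalize(previous), normalize(current)
    report = {}
    for name in sorted(set(prev) | set(curr)):
        delta = curr.get(name, 0.0) - prev.get(name, 0.0)
        if abs(delta) > tolerance:
            report[name] = round(delta, 3)
    return report

# Hypothetical importances from the prior model and an automated retrain.
previous = {"utilization": 0.50, "income": 0.30, "tenure": 0.20}
current = {"utilization": 0.25, "income": 0.30, "tenure": 0.20, "inquiries": 0.25}

shifts = effect_shifts(previous, current)
print(shifts)
```

A retrain can hold overall accuracy steady while shifting weight between variables, which is why a report like this is reviewed alongside the performance metrics rather than instead of them.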
ML models have the advantage of capturing complex relationships and can be updated automatically or frequently to retain the best model fit, but this in turn requires close monitoring to ensure that the model continues to identify the correct relationships in the data.
All models are susceptible to changes in the environment, including changes in products, processes, or human behavior. However, ML models are more susceptible if appropriate transparency and monitoring are not in place.
The opinions expressed in the article are statements of the author and are intended only for informational purposes, and are not opinions of any financial institution and any representation to the contrary is expressly disclaimed.