Model inventory management: Should a financial model ‘know’ its own ID?

By Jon Hill, Former MD, Global Head of Model Governance, Credit Suisse

About Jon:

Jon Hill, Ph.D., is a former Managing Director at Credit Suisse with over twenty years of experience in various areas of quantitative finance, specializing most recently in model risk management, governance and validation. He is currently the Global Head of Model Risk Governance Standards at Credit Suisse, leading a team of 14 model risk managers in New York, London, Zurich, Mumbai and Singapore. Jon’s team is responsible for the ongoing identification, measurement, risk rating, inventory and monitoring of CS corporate model risk across all business units, regions and legal entities, and for validation of medium-risk models.

Prior to joining Credit Suisse in January 2017, he was the founder and global head of Morgan Stanley’s market and operational risk validation team; his team of seven Ph.D.- and Masters-level quants in New York and Budapest was responsible for the second-line-of-defense validation of Morgan Stanley’s global market risk models, including Value at Risk (VaR), Stressed VaR, Incremental Risk Charge and Comprehensive Risk Measure, as well as all firm-wide Operational Risk models.

Prior to Morgan Stanley, Jon was a member of the model validation group at Citigroup for six years, concentrating on equity, fixed income, foreign exchange, credit and market risk models. Before joining the Citigroup model validation team he worked for eight years on model development and general quantitative risk analytics as a member of the Quantitative Analysis Group at Salomon Smith Barney, which later merged with Citibank to form Citigroup.

Jon holds a Ph.D. in Biophysics from the University of Utah. He is a frequent speaker and chairperson at professional model risk management conferences.

What, for you, are the benefits of attending a Course like ‘Model Risk Management’? What can attendees expect to learn from your session?

The greatest benefit is getting to hear from experienced professional model practitioners from many of the leading financial and insurance firms and learning about their approaches to this challenging discipline and what innovations they may be pursuing, such as applying data mining and machine learning techniques to different areas of model risk management. Opportunities for ad hoc discussions during the luncheons and breaks can be particularly rewarding – this is often how you may first learn about some of the leading-edge advancements being made in the field of model validation and risk management.

Not all model risks arise from within models – some less recognized forms of model risk exist outside of and between models within a firm’s model ecosystem. My presentation will focus attention on one of the less familiar types of risk external to models: I call it model inventory risk. I became painfully aware of how backward model inventory management practices are at most firms when I was recently assigned, on short notice, the task of pulling together a firm’s CCAR model inventory complete with all upstream and downstream dependencies. It was a very manual and tedious process involving pulling validation data from two different model inventory databases, collecting attestations of completeness and accuracy from a dozen model supervisors and trying to nail down multiple layers of upstream and downstream dependencies. But that difficult experience made me begin to wonder whether, in this age of automation, there shouldn’t be a better way than an antiquated manual process that seems stuck in the late 20th century.

It is an uncomfortable truth that today most financial firms cannot claim to have a complete and accurate inventory of all their active models, even though this is a regulatory requirement under increasing scrutiny in recent bank exams. It is even more uncomfortable that these firms cannot answer with much accuracy such questions as “How many times was this model actually used during the last year?”, “Which models exhibit significant seasonality?”, “In what geographic regions, or legal entities, is this model used?”, or “Were any unvalidated models used during the last year?”. Many of these questions can only be answered by querying model owners and users and, if they answer at all, receiving what are at best informed estimates and at worst just guesses.

I will try to make the case in my presentation that this lack of transparency about model inventory and usage can be traced to a single fundamental industry-wide shortcoming in model risk management: quantitative models at almost all leading financial firms today simply do not ‘know who they are’. By this I mean their model IDs are not embedded in the model source code. This single, industry-wide blind spot in model discipline exposes almost all firms to inventory risk, which is actually one of the few risk types that can be eliminated with proper controls. In my presentation I will define precisely what I mean by model inventory risk, identify the resulting liabilities if it remains unmitigated, and then propose an innovation for remediating this form of risk.

This is an area of Model Risk Management (MRM) that is rarely if ever addressed within institutions or at professional conferences. Because I have an eccentric fondness for talking about relatively obscure topics that no one else seems inclined to discuss, I offer this presentation.

Can you give our readers some insight into your concept of a model knowing its own ID? Why would this be beneficial and how can organizations get started?

I find it ironic that my iPhone, washing machine and automobile all ‘know’ their own unique serial numbers. Today these are embedded in permanent memory (such as ROM) in the device’s electronics, but even before electronics they were physically stamped on the device. Yet at every financial institution I am aware of, it is still not common practice for quantitative model IDs to be embedded in the model’s actual source code. Model IDs typically appear on the first pages of developer and validation documents but are not embedded in the model’s source code as a global parameter.

This leads to an obvious, puzzling question: why is the financial model software world so far behind hardware manufacturers in this respect? Why weren’t model IDs embedded in model software from the start? I think the answer is probably that financial models were originally identified only by name, some of which could be very convoluted and obscure. It wasn’t until firms began developing centralized database repositories for model documentation (and in some cases source code) in the early 2000s, motivated by regulatory pressure, that it became necessary to assign unique alphanumeric identifiers to models as indices into the model databases.

A firm can start down this path by asking (or requiring) model developers/owners to embed model IDs and version numbers by adding a few executable lines to the main routine of the source code – this small change will not impact performance and can be performed as part of a model’s regular review cycle. Once done, the model ID and model version can be attached to any results produced by the model, eliminating potential confusion about which model and version produced which result; this type of confusion can be a real problem in many firms that do not have rigorous version control discipline.
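To make this concrete, a minimal sketch in Python might look like the following; the model ID, version string, routine name and result structure are purely illustrative, and the same idea applies in whatever language a model is actually written in:

```python
# Hypothetical example: the ID and version assigned in the firm's model
# inventory database are embedded as global constants in the source code.
MODEL_ID = "EQ-VOL-0142"    # inventory identifier (illustrative value)
MODEL_VERSION = "3.2.1"     # version under change control (illustrative value)

def price_portfolio(positions, market_data):
    """Main entry point of a hypothetical pricing model."""
    # ... the existing model logic would go here; a placeholder is used for brevity
    results = {position: 0.0 for position in positions}
    # Stamp the output with the producing model and version so downstream
    # consumers always know exactly which model, and which version, produced it.
    return {"model_id": MODEL_ID, "model_version": MODEL_VERSION, "results": results}
```

Because the ID and version are attached to every result the model emits, provenance travels with the output rather than living only on the cover page of a document.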

What impact could this have on an organization?

Embedding IDs and versions in source code opens a door to answering the vexing types of questions posed in my answer to the second question. But to answer these usage questions accurately for all models employed by a firm, a second step is required: adding a tracking function to each model. The tracking function I will propose in my presentation is analogous to the transponders that all civilian aircraft carry, which allow traffic controllers to identify any aircraft and track its location and speed. Properly designed, a model transponder function would be called once each time a model is executed and would send a message to a central repository reporting the model ID, version, date, time, model type and the machine’s MAC address, which identifies the hardware on which the model ran and, via the firm’s asset records, its physical location.
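As a rough sketch of what such a transponder call might look like in Python (the endpoint URL, field names and transport are assumptions for illustration only; a production version would need authentication, buffering and careful failure handling so that reporting never blocks or interrupts the model run):

```python
import json
import uuid
import datetime
import urllib.request

TRACKING_ENDPOINT = "https://mrm-usage.example.com/ping"  # illustrative URL

def transponder_ping(model_id: str, model_version: str, model_type: str) -> None:
    """Report a single model execution to the central usage repository."""
    mac = uuid.getnode()  # 48-bit hardware (MAC) address of the executing machine
    payload = {
        "model_id": model_id,
        "model_version": model_version,
        "model_type": model_type,
        "timestamp_utc": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "mac_address": f"{mac:012x}",
    }
    request = urllib.request.Request(
        TRACKING_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(request, timeout=2)  # fire-and-forget, short timeout
    except OSError:
        pass  # a failed ping must never interrupt the model run itself
```

The call would be added to the model’s main routine alongside the embedded ID and version, so every execution leaves a record without changing the model’s behavior.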

If this basic indicative data is stored in a centralized database for each model in use over the course of six months or a year, a wealth of detailed ‘which, when and where’ information about model usage will be available to model risk managers. Histograms of frequency of execution and plots of frequency as a function of time can be produced, the latter offering a means of identifying seasonality or other time-variant patterns of usage. As I will explain in the presentation, it would also be possible to identify all the layers of upstream model dependencies based on actual execution sequence rather than potentially error-prone attestations by model owners and users. If model performance metrics (such as R-squared and goodness-of-fit statistics) are captured by the transponder function, the information can be used to support the ongoing monitoring requirements of SR 11-7. Such a warehouse of detailed usage data will offer opportunities for data mining and machine learning to identify detailed patterns of model usage that may offer revelations not apparent under inspection by human agents. It is a common experience that mining extensive and detailed datasets often uncovers some unexpected nuggets of insight.
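As an illustration of how straightforward the analytics become once these pings are warehoused, a few lines of Python could already answer the frequency and seasonality questions; the file and column names below are assumptions for illustration only:

```python
import pandas as pd

# Assume the transponder pings have been warehoused with at least these columns
# (all names illustrative): model_id, model_version, timestamp_utc, mac_address.
usage = pd.read_parquet("model_usage_pings.parquet")
usage["timestamp_utc"] = pd.to_datetime(usage["timestamp_utc"])

# Frequency of execution per model over the observation window.
runs_per_model = usage.groupby("model_id").size().sort_values(ascending=False)

# Executions per model per calendar month: a simple view of time-variant usage,
# which makes seasonality in (for example) commodity models easy to spot.
usage["month"] = usage["timestamp_utc"].dt.to_period("M")
runs_per_month = usage.groupby(["model_id", "month"]).size()

print(runs_per_model.head())
print(runs_per_month.head())
```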

With this background I can now explain how this data could be used to address the vexing question of “What geographic regions or legal entities used this model over the last year?”, which is virtually impossible to answer with today’s inventories except by polling all downstream model users because, unfortunately, model owners typically do not know who all of the downstream users of their models are. Imagine if, for any model, a map of the countries of the world could be populated with blue or red flashing dots showing the locations of all the computers that ran that model during the past year. Moving this plot forward or backward through one-month time slices could reveal seasonal patterns of usage, such as might occur with seasonal commodities like energy or agricultural derivatives. I doubt there is any financial firm today that can accurately produce such time-dependent displays of geographic model usage.

How do you think organizations would benefit from automating the inventory attestation process?

CCAR/DFAST banks are required by regulators to submit an annual attestation to the completeness and accuracy of the firm’s model inventory. At most banks this is still accomplished by MRM through a very manual process of submitting lists of models to model supervisors or owners and requesting confirmation, often by return email, that the list is accurate and complete. In my own experience, due to errors or oversights in the inventory database, the response could be something like “12 of these models belong to my team but not the 13th – I have no idea who the owner should be”, or “the list does not include these two models that I am also responsible for”. Several more iterations may be required to establish ownership of the orphaned and omitted models.

Imagine performing this exercise at a large firm that may have more than 2,000 models and you get some idea of the effort required for what is still very much a tedious, manual 20th century process.

Suppose that instead a firm is able to attest that every model in inventory has an embedded ID and an operational transponder function. This attestation would be part of the annual model review carried out by the second line of defense. This approach to attestation could apply not only to production models but also to EUC models (such as spreadsheets) and even vendor models, as I will explain in my presentation. If the list of model IDs observed in actual usage reconciles successfully against the list of IDs in the inventory, that comparison would constitute proof that the inventory is complete and accurate, without requiring any of the manual process currently employed. Models not used during the period covered would still have to be flagged for further investigation.
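A minimal sketch of that reconciliation, assuming the inventory and the usage repository can each be exported as a collection of model IDs (all names and IDs below are illustrative):

```python
def reconcile_inventory(inventory_ids, observed_ids):
    """Compare the official model inventory against IDs observed in usage pings."""
    inventory = set(inventory_ids)
    observed = set(observed_ids)

    unregistered = observed - inventory  # models that ran but are not in inventory
    dormant = inventory - observed       # inventoried models with no recorded runs

    return {
        "no_unregistered_models": not unregistered,
        "unregistered_models": sorted(unregistered),
        "dormant_models": sorted(dormant),  # flag these for further investigation
    }

# Illustrative usage:
# reconcile_inventory({"EQ-0142", "IR-0077"}, {"EQ-0142", "CR-0009"})
# -> {'no_unregistered_models': False,
#     'unregistered_models': ['CR-0009'], 'dormant_models': ['IR-0077']}
```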

The direct benefits to a firm would be greater accuracy with reduced workload (and potentially reduced MRM staffing requirements) and demonstrably improved firm-wide model discipline.

How do you see the model risk landscape evolving over the next 6-12 months?

I first started performing model validations in 2003. A great deal has changed in finance over the last 15 years, but validation methodology has barely evolved. We are still performing validations today very much the same way that we did in 2003, with only some incremental improvements along the way, such as improved high-level modeling languages and pre-packaged toolkits for replication and testing. But it is still very much a 20th century manual effort performed by skilled quants who assess the model and its documentation, confirm implementation through independent replication, perform additional testing and write a detailed validation document which can be as lengthy as a master’s thesis. Three-hundred-page validation documents are not unknown (the longest I have personally encountered was well over 500 pages!).

But change is in the air, and over the coming year I expect to see continued progress in some of the innovations that are already taking place at a few bleeding-edge firms. I particularly expect to see more applications of Machine Learning (ML) and Big Data (BD) methodologies to the automation of some of the more tedious and time-consuming parts of the validation process, such as creation of benchmarks and generation of test suites customized for various classes of models. Deep Learning (DL) techniques can accumulate a wealth of detailed knowledge about a firm’s models over time and progressively refine test suites to identify suspected weaknesses. Progress in this field is very rapid, so I would advise anyone new to the field of model risk management, or anyone concerned about staying current, to become familiar with and stay abreast of developments in ML/DL and BD.

If I may be allowed to try to look even further down the road, over the next 3-5 years I anticipate a major disruption in the ways that model validations will be performed, with much of the time-consuming manual effort (including review of developers’ documentation) being taken over by ML/DL, even as far as the automated generation of parts of the final validation document. I don’t know if we will ever see the day when we can submit a detailed developer’s document to a Deep Learning model and receive a detailed validation document complete with independent testing and a pass/fail recommendation an hour later, but 10 years ago I didn’t think I would live to see self-driving cars and now many prototypes are being tested on our roadways.

This is what I expect the future of MRM will look like and these trends are already gaining momentum no matter how uncomfortable and ingrained in the old ways of doing validations we may be. My advice to anyone new to the field is to attend at least one model risk conference, such as the one CFP is sponsoring, annually, stay abreast of the trends I have outlined and plan to align your skillset development program accordingly. There will always be roles for talented and knowledgeable quants, but the ways in which we will be able to add value in the field of MRM are going to change dramatically over the next 3-5 years.

Hear insights like this and more at the Model Risk Management Course…