15 August 2013

Default Risk Modeling: Little Experiment (Part 3: Modeling)

I’ll continue the exercise of modeling the U.S. corporate default rates started on 28 July 2013. You may want to first recall Part 2 of this series where we summarized variable definitions, discussed modeling method and looked into the correlations of the different default risk indicators.

(Warning: Even though I’m trying to include valuable insights about the current state of affairs in the articles of this series, they may not be that interesting a read for those with a low Nerdometer Score, XD.)

Modeling process

Broadly speaking, statistical modeling consists of the following steps:

  1. Business understanding (what we are modeling and why; what’s the logic of the underlying business; what data is available etc.);
  2. Defining theoretical model (model form etc.);
  3. Data collection, analysis and preparation;
  4. Model development;
  5. Model analysis and selection of the final model.
We have already passed steps (1)-(3). Now we are about to develop the model, which involves choosing the final set of variables and calibrating the model parameters. Having said that, it is to be remembered that modeling is an iterative process; usually, after taking a look at the preliminary results, one has to return to the data and prepare it differently, or even search for additional information. In the end, one is most probably able to develop more than one alternative model, so a choice has to be made. That’s perhaps the most difficult part – especially if the stakes are high, for example if a financial regulator wanted to create a benchmark model for the banking industry to calculate capital requirements.

If you want a quick and easy way to model, you let the software (such as R, the free software environment for statistical computing and graphics that I’m using at the moment) decide which variables to include and which model is the best. You apply stepwise regression and see what comes out. Technically, it can be reduced to a more or less “push the button” exercise.
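
As an illustration, the “push the button” exercise looks roughly like this in R – a minimal sketch, assuming the indicators and the realized default rate (a column I call DF here) sit in a data frame named defaults; the names are mine, and R’s step() prunes by AIC rather than by P-values, but the spirit is the same:

    # Fit the all-inclusive linear model, then let R prune it automatically
    defaults$FedFundsRate_Q <- factor(defaults$FedFundsRate_Q)  # categorical indicator
    full_model <- lm(DF ~ ., data = defaults)                   # all indicators in
    auto_model <- step(full_model, direction = "backward")      # software decides
    summary(auto_model)                                         # betas, P-values, R-Squared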

Yet more experienced modelers say that letting the software have free rein to include and exclude model variables does not result in better models, only less well understood ones. Maybe, maybe not – it depends on how well one understands the inputs… Let’s see where we’ll end up this time.

If everything is a priority (you will go crazy)

We have carefully selected a set of indicators, each of which explains one or another aspect of the default risk. What if we just included them all in the model? Here’s what we got:
 

Despite the good in-sample coefficient of determination (denoted as "R-Squared"), the outcome looks quite like a modeling disaster – doesn’t it? Model betas with illogical signs are simply disturbing.

I guess an explanation of the table above would do no harm to those who haven’t seen output like this before. Skip the following few paragraphs if you can read it anyway, and continue from the next subheading.

In the first column (the column with the grey background) there is the name of the variable that is supposed to explain the U.S. corporate default risk, i.e. the default risk indicator as defined in Part 2 of this series. Note that FedFundsRate_Q is a categorical variable; that is: if it takes the value “0” it is basically ignored in the model calculation, if it takes the value “1” it is treated in the model as FedFundsRate_Q1, and if it takes the value “2” it is treated as FedFundsRate_Q2. Just to mention, (Intercept) is not a variable defined by us; it indicates the starting point for the model’s calculations – that’s why it’s in brackets. (According to the model table above, we’d start calculations from an initial default rate of 1.19%, see column “Estimate”.)
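
For the curious, this is how R turns such a categorical variable into the FedFundsRate_Q1 and FedFundsRate_Q2 dummies, with level “0” as the reference absorbed into the intercept – a small standalone sketch with made-up values:

    FedFundsRate_Q <- factor(c(0, 1, 2, 1, 0))   # illustrative values of the indicator
    model.matrix(~ FedFundsRate_Q)               # columns: (Intercept), FedFundsRate_Q1, FedFundsRate_Q2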

You may recall that we chose linear regression as the model’s functional form; a Y = β0 + β1*X1 + β2*X2 + … type of calculation. Needless to say, Y denotes the default rate that we are aiming to predict. The specific numerical value of a variable in the first column is its X (X1 for the first variable, X2 for the second, etc.).

In the column “Estimate” you see the so-called model beta coefficients (β1, β2 etc.), or model betas, or model parameters – whatever you might prefer to call them. I often simply say “betas”. In our case, a variable’s beta tells you how much the U.S. corporate default rate would change if the value of variable X changed by one unit. For example: if VIX increased by one unit on the Yahoo! Finance chart, our preliminary model in the table above would suggest that the forecasted U.S. corporate default rate in the coming year will be 0.084% higher than we expected for the starters, all other things equal. Whoops, VIX just increased by 1.5. Are U.S. corporate default rates this year now going to be 0.084%*1.5=0.126% higher than I thought – 2.13% instead of 2.00%? Your math is correct, but read on.
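
In R, that back-of-the-envelope adjustment is just the beta times the change in the input (the 0.084 beta comes from the table above; the 2.00% starting point is only an example):

    beta_vix  <- 0.084   # beta of VIX in the all-inclusive model, in % per unit of VIX
    base_rate <- 2.00    # default rate we expected "for the starters", in %
    delta_vix <- 1.5     # VIX just increased by 1.5 units
    base_rate + beta_vix * delta_vix   # 2.126, i.e. roughly 2.13%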

That’s where it gets illogical with some variables in the all-inclusive model:

  • If the corporate bond spread increased by 1%, would it lead to a decline in default rates of 0.09%? No, of course not. There is simply something wrong with the model.
  • If the default rate had increased during the last year when compared to the year before, would it mean that the expected default rate is lower for this year? No – yet again, there is no such causal relationship.
And so it goes on… Some variables simply carry the same information as others. If the correlations between a model’s input variables are stronger than those between an explanatory variable and the result variable, one most probably ends up with nonsense conclusions.

The P-values and significance codes in the last two columns are telling exactly that: if a variable’s P-value is bigger than the targeted significance level (usually 0.05, i.e. 5%), that variable is statistically insignificant. Feel free to “google” Std.Errors and t-values if interested in the math behind the P-values. The short summary is that we currently have too many variables in the model that are interfering with each other.
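
A quick way to see that interference for yourself is to look at the correlations between the inputs and, if you have the car package at hand, the variance inflation factors – a sketch continuing with the objects from the earlier snippet:

    # Pairwise correlations between the (numeric) explanatory variables
    num_cols <- sapply(defaults, is.numeric)
    round(cor(defaults[, num_cols & names(defaults) != "DF"]), 2)

    # Variance inflation factors flag variables that carry much the same information
    library(car)   # install.packages("car") if needed
    vif(full_model)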

Model 1: Model chosen by the software

So what if we let the software decide which variables to take and which ones to leave? After the removal of all the statistically insignificant variables from the previous all-inclusive model, the “backward” method provides us with the following solution:



Out of the seven initial input variables, the algorithm leaves only three significant ones. Adjusted R-Squared, which penalizes the ordinary R-Squared for excessive variables included in the model, improves from 0.913 to 0.924. There don’t seem to be any obvious issues with the behavior of the input variables. Ok: the model’s intercept is insignificant, but this only means that the model calculations can start from a zero default rate instead of the model-implied 0.06%...

That looks pretty promising to start with. Let’s check it more closely, guided by the model criteria defined in Part 2 as well as by common sense. (That’s the definite advantage of simpler models: you can apply common sense when judging the model calculations.)

First of all, the model is supposed to have reasonable predictive power; the predicted default rates and the actually realized default rates ought to be similar at all times. Here is the graphical representation of the model predictions versus the realized default rates over my observation period (1991-2012):
Visually, it seems that we have achieved pretty accurate results, at least in-sample. A quick layman’s calculation suggests that the model predictions differ from the realized default rates on average by 21.8%. I mean: if the realized default rate had been 2% and we had made that average mistake, we’d have expected a default rate of 2% +/- 21.8%*2%, i.e. 2.44% or 1.56% depending on which direction we erred.

Not an excellent result, but it would do. After all, if we assumed that the realized default rate was always at the sample average level of 1.88%, the mean difference between the predicted and realized default rates would be 87.3% instead of 21.8%... So clearly, the model adds value: it is ca. four times more precise than a naive assumption of the average default rate.

Here is another comparison:
Creditors sometimes “calibrate” their internal point-in-time PD models so that, on average, the forecasted default rate for the next year is equal to the realized default rate during the year before. In our case, this approach would correspond to a naive model where we assumed that DF = DF_t.1. In our sample, it would give an average forecast error of 61.4%, which is nearly three times bigger than the 21.8% of our model (even if somewhat better than the naive assumption of a long-term average default rate).
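
For the record, the “layman’s” error measure used throughout is simply the mean absolute percentage error; here is a sketch of the three comparisons, again building on the earlier snippets (the 21.8%, 87.3% and 61.4% quoted above are the values from my sample – your mileage may vary with different data):

    # Mean absolute percentage error: |predicted - realized| / realized, averaged
    mape <- function(actual, predicted) mean(abs(predicted - actual) / actual)

    mape(defaults$DF, fitted(auto_model))                      # the model: ~21.8%
    mape(defaults$DF, rep(mean(defaults$DF), nrow(defaults)))  # naive long-term average: ~87.3%
    mape(defaults$DF[-1], defaults$DF[-nrow(defaults)])        # naive DF = DF_t.1: ~61.4%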

And yet, there are a few issues that would need further consideration:
  • It’s a bit disturbing that, when using this software-selected model, we’d have had no clue about the default rate picking up in 1995 (reflections of the emerging-markets crash of 1994, I guess). We’d also have been quite wrong about the years 2000 and 2010. Of course, a model, as a simplification of reality, will never be perfect, but can something be done about this particular issue?
  • FedFundsRate_Q definitely is an important variable, but perhaps it’s still too strong in the model. We have to admit that its definition involved a degree of subjectivity, and it’s not at all certain that it would work that well out-of-sample. Clearly, it needs to be tested.
  • Due to the strong correlation between them, the two variables Credit_standards_t.1_4Q and VIX_t.1_Dec are interfering with each other; even if neither is insignificant, VIX_t.1_Dec is “borrowing” some of its relative importance from Credit_standards_t.1_4Q. We’d need to check the stability of the model betas.
  • In banks’ internal and external stress tests it is more or less universally assumed that macro variables such as GDP have a significant impact on the default rates. Can we indeed dismiss GDP and PMI as relatively insignificant?
Let’s keep the software-selected model as a benchmark and see if we can possibly come up with something better manually.

Model 2: Custom model

After “running” a number of single-variable regressions and alternative models with manually selected variables, it turns out that, at least for the time being, the software has indeed made the best choice given the input data. An advantage – and in other situations a disadvantage – of involving a human being over a machine is that a thinking person is not limited to the given inputs and formulas.

I looked more closely into the data and found that there is at least one indicator which would have signaled all the major discrepancies of the software-selected model at once (the underestimations in 1995 and 2000, and the overestimation in 2010). Yeah, I have to “eat my words” about dismissing interaction terms, but in fact this indicator is a combination of two variables, even if a fairly simple one:
  • Change in the realized default rates (last year when compared to the year before), and
  • GDP forecast for the year considered (when compared to the real GDP growth rate in the year before).
More specifically, the combined qualitative variable DF_t.1_GDP takes the value “1” (i.e. indicates higher default risk for the coming year) if: (a) the last year’s realized default rate has increased when compared to the year before, and (b) GDP growth forecast for the forecast period is lower than the actual GDP growth last year. It’s “0” otherwise.
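
As a construction sketch (the column names for the default rate change and the two GDP figures are mine, purely for illustration):

    # DF_t.1_GDP = 1 if last year's default rate rose versus the year before AND
    # the GDP growth forecast for the coming year is below last year's actual growth
    defaults$DF_t.1_GDP <- ifelse(defaults$DF_change_t.1 > 0 &
                                  defaults$GDP_forecast < defaults$GDP_actual_t.1,
                                  1, 0)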

Namely, I observed that the first increase in default rates tends to be a good warning signal that the credit cycle has turned from good to bad, but at the same time it is late in indicating an improved environment. The supposedly forward-looking GDP forecast ought to correct for this limitation – even if, on a stand-alone basis, it was left aside due to its weaker overall explanatory power compared to the other variables.

So that’s what I got as a custom model:
On average, the model-predicted PD differs from the realized default rate by 19.4%, and we’d have been able to catch the default rate pick-up in 1995. The other largest forecast errors are somewhat smaller as well:
 
They are more or less normally distributed with zero mean, as they ought to be (the solid line shows a normal curve that has mean zero and the same standard deviation as the distribution of forecast errors):
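
The error chart can be reproduced roughly like this in base R graphics (a sketch; custom_model stands for the fitted Model 2 object, whatever you have named it):

    errors <- residuals(custom_model)            # in-sample forecast errors
    hist(errors, freq = FALSE, breaks = 10,
         main = "Forecast errors", xlab = "Predicted minus realized default rate")
    curve(dnorm(x, mean = 0, sd = sd(errors)),   # normal curve: mean zero, same
          add = TRUE)                            # standard deviation as the errors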

 

When it comes to the model design, it’s now quite in line with the commonly assumed causes of corporate bankruptcies:
  • Asset prices – as a forward-looking measure, we are taking into account the fear index VIX;
  • Credit availability – yes, we have that variable;
  • Real effects in the macroeconomy – even if, compared to the other indicators, GDP displays low importance in explaining the default probability within a 12-month horizon, it’s now included;
  • Policy implications – yes, and FedFundsRate_Q remains very significant even if we reduced its dominance a bit.
Given the limited scope of our experiment, we did not manage to take into account failing start-up projects and other firm-specific parameters; that would require much more extensive research (which, ironically, would not add much at the economy-wide level).

Model 3: Alternative custom model

But I’ll tell you a secret: there is actually at least one model with better in-sample performance that, ironically, puts even more weight on what the Fed does or does not do with the interest rates. Yes, and for a minimal model, I left out the credit standards variable as well. The model is as follows:

Hmm, better explanatory power based on the Adjusted R-Squared and a smaller average forecast error, even if it misses the small pick-up in default rates in 1995 (our layman’s forecast error rate is 16.6%, which compares to 19.4% in the previous model):
Notice what did the trick: replacing the variable DF_t.1_GDP with another one, DF_t.1_FF. The other adjustments did not change the model’s precision all that much. Yeah, I said that the first increase in default rates tends to be a good warning that the credit cycle has turned from good to bad but that it’s late in indicating an improved environment. Instead of GDP, I now used the same Fed funds rate to catch the change in the credit cycle. Specifically, the “magic” variable takes the value (a construction sketch follows the list below):
  • “1” if (a) the last year’s realized default rate has increased when compared to the year before (that part is unchanged from DF_t.1_GDP), and (b) the Fed funds rate is above the trend line (as defined in Part 2) at the beginning of the forecast period – with a time lag of one year;
  • “0” otherwise.
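
A sketch, analogous to the DF_t.1_GDP construction above (FedFunds_above_trend_t.1 is an illustrative 0/1 column marking whether the Fed funds rate, lagged by one year, sits above its trend line):

    # DF_t.1_FF = 1 if last year's default rate rose AND the Fed funds rate was
    # above its trend line one year before the start of the forecast period
    defaults$DF_t.1_FF <- ifelse(defaults$DF_change_t.1 > 0 &
                                 defaults$FedFunds_above_trend_t.1 == 1,
                                 1, 0)
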
With this model, I’d at least suspect an overfitting problem, given that we have only three (ok, four with concessions) variables, two of which are qualitative and reflect just one factor (the Fed funds rate). Can forecasting U.S. corporate default rates indeed be this simple? We have to test it.

Summary of models


We now have three alternative models to choose from:

 

Once again, each model’s prediction would be given as follows:

Next year’s default rate (expressed in percent) = Intercept + Beta coefficient of Variable 1 (as given by the model) * Value of Variable 1 (input) + Beta coefficient of Variable 2 (as given by the model) * Value of Variable 2 (input) + …
[the same for each model and for each model variable]
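
With the fitted model objects in R, the same calculation is a one-liner; for example, for the software-selected Model 1 (the input values below are purely illustrative):

    new_inputs <- data.frame(VIX_t.1_Dec = 18,
                             Credit_standards_t.1_4Q = 5,
                             FedFundsRate_Q = factor(1, levels = c(0, 1, 2)))
    predict(auto_model, newdata = new_inputs)   # forecasted default rate, in percent
    # ...and the same pattern, with their own inputs, for Models 2 and 3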

If you don’t dare to wait, you can already search for the input data and make your three alternative predictions for 2013 and for the first half of 2014.

Last but not least, we have to make that choice between the models. As implied above, it’s not that you just take the model with the best in-sample performance (which would be Model 3). It’s also not that you take the one which gives you the most desirable predictions for the future (something people with conflicting interests would probably do). You have to test the alternatives. That sounds like a long(er) story, and thus we’ll leave it for Part 4, together with the forecasting part.

…And I wouldn’t have believed what the results of this little experiment would reveal about the current state of affairs…


______

Making otherwise proprietary financial expertise available to those who bother to pay attention – as best I can…
