Very efficient approximations exist for the SABR model, like the original Hagan et al. formula [1] or variants of it [2], but these analytic formulas are in general not arbitrage-free. Solving the corresponding partial differential equation leads to an arbitrage-free solution but is computationally demanding. The basic idea here is to use a neural network or gradient boosted trees to interpolate (predict) the difference between the analytic approximation and the exact result from the partial differential equation for a wide range of model parameters.
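To make the "analytic approximation" concrete, here is a minimal sketch of the lognormal volatility formula from Hagan et al. [1]. The function name and argument order are illustrative assumptions, and alpha, beta, nu and rho denote the usual SABR parameters (initial volatility, CEV exponent, vol-of-vol and correlation). It is not the QuantLib implementation.

```python
import math

def hagan_lognormal_vol(strike, forward, expiry, alpha, beta, nu, rho):
    """Hagan et al. lognormal implied volatility approximation (illustrative sketch)."""
    one_minus_beta = 1.0 - beta
    log_fk = math.log(forward / strike)
    fk_pow = (forward * strike) ** (0.5 * one_minus_beta)

    # Time-dependent correction term of the expansion.
    correction = 1.0 + expiry * (
        one_minus_beta**2 / 24.0 * alpha**2 / fk_pow**2
        + 0.25 * rho * beta * nu * alpha / fk_pow
        + (2.0 - 3.0 * rho**2) / 24.0 * nu**2
    )

    # Denominator expansion in log-moneyness.
    denom = fk_pow * (
        1.0
        + one_minus_beta**2 / 24.0 * log_fk**2
        + one_minus_beta**4 / 1920.0 * log_fk**4
    )

    z = nu / alpha * fk_pow * log_fk
    if abs(z) < 1e-8:
        z_over_x = 1.0  # limit of z / x(z) at the money or for nu = 0
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        z_over_x = z / x

    return alpha / denom * z_over_x * correction

# Example call with made-up parameters (forward scaled to 1.0).
vol = hagan_lognormal_vol(strike=1.1, forward=1.0, expiry=1.0,
                          alpha=0.3, beta=0.7, nu=0.4, rho=-0.5)
```

The quantity to be learned is then the difference between the volatility implied by the PDE price and a value like `vol`, which is a much smaller and smoother target than the volatility itself.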

The first step is to reduce the number of dimensions of the parameter space by utilizing the scaling symmetry of the SABR model [3]

so that we can focus on the case without loss of generality

.

This in turn also limits the “natural” parameter space for which will be set to

.

Next on the list is to set up an efficient PDE solver to prepare the training data. The QuantLib solver already supports the two standard error reduction techniques, namely adaptive grid refinement around important points and cell averaging around special points of the payoff. The latter ensures a smooth second-order convergence in the spatial direction [4]. The Hundsdorfer-Verwer ADI scheme is also of second order in the time direction, and additional Rannacher smoothing steps at the beginning ensure a smooth convergence in the time direction as well [5]. Hence Richardson extrapolation can be used to improve the convergence order of the overall algorithm. An example pricing for

is shown in the diagram below to demonstrate the efficiency of the Richardson extrapolation. The original grid size for scaling factor 1.0 is .
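The mechanics of Richardson extrapolation can be illustrated without a PDE solver. The sketch below uses the composite trapezoidal rule as a stand-in for any scheme with smooth second-order convergence. It is not the QuantLib finite difference solver, and the function names are made up for this example.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n intervals; the error is O(h^2)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def richardson(coarse, fine, order=2, refinement=2.0):
    """Combine a coarse-grid result with one from a grid refined by `refinement`."""
    factor = refinement**order
    return (factor * fine - coarse) / (factor - 1.0)

exact = math.e - 1.0                        # integral of exp(x) over [0, 1]
coarse = trapezoid(math.exp, 0.0, 1.0, 10)
fine = trapezoid(math.exp, 0.0, 1.0, 20)
extrapolated = richardson(coarse, fine)

err_fine = abs(fine - exact)
err_extrapolated = abs(extrapolated - exact)
print(err_fine, err_extrapolated)
```

The extrapolated value cancels the leading error term, so its error is orders of magnitude smaller than that of the fine grid alone. This only works if the convergence is smooth, which is why cell averaging and Rannacher smoothing matter for the PDE solver.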

The training data was generated by a five-dimensional quasi-Monte Carlo Sobol sequence for the parameter ranges

.
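A Sobol sample over a box of parameter ranges can be generated with `scipy.stats.qmc`. The sketch below is only an illustration: the dimension names and the bounds are hypothetical placeholders and are not the ranges used for the training set.

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical names and bounds, NOT the ranges used in this post.
names = ["alpha", "beta", "nu", "rho", "expiry"]
lower = np.array([0.05, 0.0, 0.0, -0.9, 0.1])
upper = np.array([1.0, 1.0, 1.0, 0.9, 2.0])

sampler = qmc.Sobol(d=len(names), scramble=False)
unit_samples = sampler.random_base2(m=10)        # 2**10 = 1024 points in [0, 1)^5
samples = qmc.scale(unit_samples, lower, upper)  # map the unit cube to the box

print(samples.shape)
```

Drawing a power of two points via `random_base2` keeps the balance properties of the Sobol sequence intact. Each row of `samples` would then be passed to the PDE solver to produce one training target.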

The strikes are equally distributed between the and quantile of the risk-neutral density distribution w.r.t. the ATM volatility of the SABR model. The training set includes 617K sample values. The network is trained to fit the difference between the correct SABR volatility from the solution of the partial differential equation and the Le Floc’h-Kennedy approximation. A large neural network is not needed to interpolate the parameter space, e.g. the following TensorFlow/Keras model definition with 46K parameters has been used in the examples below

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(20, activation='linear', input_shape=(7, )))
model.add(Dense(100, activation='linear'))
model.add(Dense(400, activation='sigmoid'))
model.add(Dense(10, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mae', optimizer='adam')
```
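The quoted parameter count can be checked by hand: every Dense layer contributes one weight per input/output pair plus one bias per output. The helper below is a small illustration of this arithmetic and not part of Keras.

```python
# Input width followed by the widths of the five Dense layers.
layer_sizes = [7, 20, 100, 400, 10, 1]

def dense_param_count(sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

total = dense_param_count(layer_sizes)
print(total)  # 160 + 2100 + 40400 + 4010 + 11 = 46681
```

The layer with 400 units accounts for most of the roughly 46K parameters.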

As always, it is important for the predictive power of the neural network to normalize the input data, e.g. by using sklearn.preprocessing.MinMaxScaler. The out-of-sample mean absolute error of the neural network is around 0.00025 in annualized volatility, far better than the Le Floc’h-Kennedy or Hagan et al. approximation.
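A minimal sketch of the normalization step with `MinMaxScaler` is shown below. The random arrays are placeholders for the real feature matrices, and the variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
# Placeholder data: 1000 training rows and 200 test rows with 7 features each.
train_X = rng.uniform(-2.0, 5.0, size=(1000, 7))
test_X = rng.uniform(-2.0, 5.0, size=(200, 7))

scaler = MinMaxScaler()                        # default feature_range is (0, 1)
train_scaled = scaler.fit_transform(train_X)   # learn min/max on the training set only
test_scaled = scaler.transform(test_X)         # reuse the same min/max out of sample

restored = scaler.inverse_transform(train_scaled)  # map back to the original units
```

Fitting the scaler on the training set only and reusing it for new inputs keeps the out-of-sample error estimate honest. The same fitted scaler has to be stored alongside the network and applied at prediction time.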

The diagram below shows the difference between the correct volatility and the different approximations using the parameter example from the previous post. One could also use gradient tree boosting algorithms like XGBoost or LightGBM. For example the models

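```python
import xgboost as xgb
import lightgbm as lgb

# train_X, train_Y and the validation set lgb_eval are prepared beforehand
xgb_model = xgb.XGBRegressor(nthread=-1, max_depth=50, n_estimators=100, eval_metric="mae")

# Note: LightGBM 4.x removed early_stopping_rounds from train();
# use callbacks=[lgb.early_stopping(20)] there instead.
gbm_model = lgb.train(
    {'objective': 'mae', 'num_leaves': 500},
    lgb.Dataset(train_X, train_Y),
    num_boost_round=2000,
    valid_sets=lgb_eval,
    early_stopping_rounds=20,
)
```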

result in similar out-of-sample mean absolute errors of 0.00030 for XGBoost and 0.00035 for LightGBM. At first glance the interpolation looks smooth, as can be seen in the diagram below using the same SABR model parameters, but zooming in exposes non-differentiable points, which defeats the purpose of stable greeks.
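The reason is that a tree ensemble is a piecewise constant function of its inputs. The sketch below demonstrates this with scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost or LightGBM, fitted to a smooth toy smile. Both the stand-in model and the toy target are assumptions for illustration only and have nothing to do with the SABR training set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Smooth toy target standing in for a volatility smile (hypothetical).
x_train = np.linspace(0.5, 1.5, 200).reshape(-1, 1)
y_train = 0.2 + 0.5 * (x_train.ravel() - 1.0) ** 2

model = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
model.fit(x_train, y_train)

# Bump-and-revalue slope with a small bump, as one would compute a greek.
x = np.linspace(0.5, 1.5, 1000).reshape(-1, 1)
bump = 1e-6
slope = (model.predict(x + bump) - model.predict(x)) / bump

zero_fraction = np.mean(slope == 0.0)   # share of points with exactly zero slope
true_slope = x.ravel() - 1.0            # derivative of the toy target
print(zero_fraction)
```

Between two split thresholds the prediction does not change at all, so the finite difference is exactly zero almost everywhere and jumps wherever a threshold is crossed. A neural network with smooth activation functions does not have this problem.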

The average run time for the different approximations is shown in the table below.

[1] P. Hagan, D. Kumar, A. Lesniewski, D. Woodward: Managing Smile Risk.

[2] F. Le Floc’h, G. Kennedy: Explicit SABR Calibration through Simple Expansions.

[3] H. Park: Efficient valuation method for the SABR model.

[4] K. in ’t Hout: Numerical Partial Differential Equations in Finance Explained.

[5] K. in ’t Hout, M. Wyns: Convergence of the Hundsdorfer–Verwer scheme for two-dimensional convection-diffusion equations with mixed derivative term.