QuantLib 1.6.2 Multithreading Patch for JVM/.NET

Update 23.11.2015: A modified version of the patch is now part of the official QuantLib Release 1.7.

The implementation of QuantLib’s observer pattern does not tolerate a parallel garbage collector running in a different thread. This can lead to random crashes or producing “pure virtual function calls” when using QuantLib in JVM and .NET languages (e.g. Java/Scala and C#/F#) via the SWIG interface. An original description of this problem can be found e.g. here and within the references.

With the version 1.58 and higher the boost library has extended the interface of class boost::enable_shared_from_this<T> by the method

weak_ptr<T> weak_from_this() noexcept;

which gives access to the underlying weak pointer used to implement the “shared pointer from this” functionality. This method now allows for a much cleaner and easier to install patch to make QuantLib’s observer pattern thread safe. Especially it is no longer necessary to define preprocessor defines or to edit boost files.

The implementation is backed by boost::signals2 to get the observer pattern working in a thread safe manner. The boost::signals2 library works purely based on shared_ptr. Therefore the first step is to add an Observer::Proxy class to QuantLib’s Observer, which can be registered with boost::signals2 and which forwards the update calls to the corresponding observer. In addition the class Observer is now derived from boost::enable_shared_from_this.

class Observer : public boost::enable_shared_from_this<Observer> {
          friend class Observable;
  public:
    // public interface remains untouched

    virtual ~Observer() {
        boost::lock_guard<boost::recursive_mutex> lock(mutex_);
        if (proxy_) {
            proxy_->deactivate();

            for (iterator i=observables_.begin(); 
                 i!=observables_.end(); ++i) 
                (*i)->unregisterObserver(proxy_);
        }
    }
  private:
    class Proxy {
      public:
        Proxy(Observer* const observer)
         : active_  (true),
           observer_(observer) {
        }

        void update() const {
            boost::lock_guard<boost::recursive_mutex> lock(mutex_);
            if (active_) {
                const boost::weak_ptr<Observer> o
                    = observer_->weak_from_this();
                if (!o._empty()) {
                    boost::shared_ptr<Observer> obs(o.lock());
                    if (obs)
                        obs->update();
                }
                else {
                    observer_->update();
                }
            }
        }

        void deactivate() {
            boost::lock_guard<boost::recursive_mutex> lock(mutex_);
            active_ = false;
        }

    private:
        bool active_;
        mutable boost::recursive_mutex mutex_;
        Observer* const observer_;
    };

    boost::shared_ptr<Proxy> proxy_;
    mutable boost::recursive_mutex mutex_;

    std::set<boost::shared_ptr<Observable> > observables_;
    typedef std::set<boost::shared_ptr >::iterator 
        iterator;
};

The critical path for the race condition in the original implementation is when the destructor of an observer is called while an observable triggers an update call. Now the destructor of an observer deactivates the proxy if the destructor is able to get the same mutex lock as the update method of the proxy acquires. If the proxy gets deactivated it will not forward any update calls to the observer which is about to be deleted.

If the update call manages to acquire the mutex lock then it will delay any concurrent observer destructor calls. Therefore it is safe trying to get the underlying weak pointer from the observer linked to the proxy in line 27. This weak pointer can be empty if no shared pointer has been created from the observer in the past. If a weak pointer exists then the proxy tries to get the shared pointer and calls the update methods from there. This part follows the same methodology as outlined in [1}. The rest of the implementation inherits the thread safeness from the boost signals2 library.

The patch for QuantLib 1.6.2 can be found here. It also contains Riccardo’s thread-safe singleton patch and it requires boost version 1.58 or higher.

[1] Shuo Chen, Where Destructors meet Threads

Heston Model Calibration using Adjoint Algorithmic Differentiation

Algorithmic Differentiation becomes more and more popular in financial engineering since the method was first brought to the attention of a wider audience in [1]. Key factors for the popularity are

  • Adjoint Algorithmic Differentiation (AAD): computational cost to calculate all first order partial derivatives of a function (or a computer program) with this method is loosely speaking three to six times larger than the cost of evaluating the function itself.  If the function has a large number of first order partial derivatives then this method clearly overtakes the finite difference method, for which the computational cost is proportional to the number of partial derivatives.
  • Easy to use C++ libraries, among others e.g. CppAD or ADOL-C
  • Accuracy: The partial derivatives are the direct result of a function evaluation and do not depend on an arbitrary bumping parameter.

Adjoint Algorithmic Differentiation was first mentioned in conjunction with QuantLib in Sebastian and Jan’s talk at the 2013 User Group Meeting [2]. The topic has gain further momentum with Peter’s initial blog on Adjoint Greeks and with Alexander’s announcement that CompatibL is working on an AAD port of QuantLib. Alexander will give a talk at this years Global Derivatives Conference on the techniques involved to port QuantLib.

Almost all efficient local optimisation algorithms used for model calibration like the Levenberg-Marquardt algorithm are based on gradient methods and therefore need to calculate the Jacobian matrix of the target function Z. The target function for the Heston model calibration is defined by the goodness of fit measure

\displaystyle \zeta = \min_\Theta \|Z\|^2 = \min_\Theta \sum_{i=1}^N Z_i^2 = \min_\Theta \sum_{i=1}^N \left( \sigma_i^{model}(K_i, T_i) - \sigma_i^{market}(K_i, T_i, \Theta) \right)^2

for the model parameters

\Theta = \{ x_1, .., x_5\} = \{ \nu_0, \kappa, \theta, \sigma, \rho\}

The model prices of the calibration options are evaluated using Gauss-Laguerre integration of the characteristic functions \phi_j(k) (j=1,2):

\begin{array}{rcl} \displaystyle \Pi_j &=& \displaystyle \frac{1}{2} + \frac{1}{\pi} \int_0^\infty\Re\left[\frac{\phi_j(u)\mathrm{e}^{-iu \ln K}}{iu}\right]\mathrm{d}u \\[0.9em] \nonumber \displaystyle P_j(\varphi) &=& \displaystyle \frac{1-\varphi}{2} + \varphi \Pi_j \\[0.9em] \nonumber \displaystyle \mathrm{NPV} &=& \varphi\left[\mathrm{e}^{-qT}SP_1(\varphi) - \mathrm{e}^{-rT}KP_2(\varphi)\right] \end{array}

with the binary variable \varphi=+1 for a call and \varphi=-1 for a put. The first step in order to use AAD for the model calibration is an implementation of the Gauss-Laquerre integration based on the CppAD library. The only change needed is to replace the data type Real by CppAD::AD<Real> in

template <class F>
CppAD::AD<Real> GaussianADQuadrature::operator()(const F& f) 
    const {
    CppAD::AD<Real> sum = 0.0;
    for (Integer i = order()-1; i >= 0; --i) {
        sum += w_[i] * f(x_[i]);
    }
    return sum;
}

Using this AAD version of the Gauss-Laguerre integration the method

CppAD::AD<Real>; 
AnalyticHestonADEngine::Fj_Helper::operator()(Real phi) const;

can be ported in a similar manner and the AnalyticHestonADEngine::doCalculation method now reads

std::vector<CppAD::AD<Real> > params;
params += spotPrice, v0, kappa, theta, sigma, rho;
CppAD::Independent(params);

std::vector<CppAD::AD<Real> > y(1);

// untouched code ...

const std::vector<Real> moreResults 
    = CppAD::ADFun<Real>(params, y)
          .Reverse(1, std::vector<Real>(1, 1.0));

results.value = CppAD::Value(y[0]);
results.additionalResults["v0"]    = moreResults[1];
results.additionalResults["kappa"] = moreResults[2];
results.additionalResults["theta"] = moreResults[3];
results.additionalResults["sigma"] = moreResults[4];
results.additionalResults["rho"]   = moreResults[5];

All first order Greeks for the calibration instruments can now be calculated using AAD. The Jacobian of the target function Z w.r.t. the first order Greeks is given by

\begin{array}{rcl} \displaystyle\frac{\partial Z_i}{\partial x_j} &=& \displaystyle \frac{\partial \sigma_i^{model}(\Theta)}{\partial x_j} = \displaystyle \frac{1}{\frac{\partial NPV_i}{\partial \sigma_i}}\frac{\partial NPV_i}{\partial x_j} \\[0.9em] \nonumber &=& \displaystyle \frac{1}{\nu_{BS}}\frac{\partial NPV_i}{\partial x_j} \end{array}

The advantage of using AAD for the Heston model is not calculation speed but precision. In fact the AAD version of the Heston model calibration is slower than the finite difference based method but the AAD method does not need an arbitrary, fine-tuned bumping parameter or any higher order finite difference schemes to come up with high precision first order derivatives. The diagram below shows the relative difference between the AAD value and several finite difference approximations for \frac{\partial NPV}{\partial \nu_0} of an ATM option with two years to maturity and

\nu_0=0.1, \kappa=1.0, \theta=0.1, \sigma=0.5, \rho=-0.75

Only the six point central finite difference scheme with optimal bumping size gives the AAD value within/close to machine precision. The two point forward scheme, which is used by default in the MINPACK implementation of the Levenberg-Marquardt algorithm, reproduces only the first eight digits. A more detailed analysis for the forward scheme can be found in [3] or [4]. Especially the latter paper calculates the values for the optimal choice of the bumping size for the different schemes. Please find the source code for the AAD pricing engine here.plot

The rate of convergence can be improved by using the Richardson extrapolation. For the diagram below the same analysis was repeated including a Richardson extrapolation step.

plotRE

[1] Giles, M. and Glasserman, P., (2006) Smoking adjoints: fast Monte Carlo Greeks. Risk, 19:88–92. 1

[2] Schlenkrich, S. and Riehme J., (2013) Design Patterns for Algorithmic Differentiation

[3] Kopecky, K. (2007) Numerical Differentiation, Lecture Notes

[4] Numerical Differentiation in Integration, Lecture Notes

QuantLib-1.5 SWIG Patch for JVM/.NET

Update 23.11.2015: A modified version of the patch is now part of the official QuantLib Release 1.7.

Update 22.09.2015: Please find the latest and improved version of the patch for QuantLib 1.6.2 here.

The usage of QuantLib in JVM and .NET languages (e.g. Java/Scala and C#/F#) via the SWIG interface has a known shortcoming. The implementation of QuantLib’s observer pattern does not tolerate a parallel garbage collector running in a different thread. As a result programs are randomly crashing or producing “pure virtual function calls”. A detailed description of this problem can be found e.g. here and within the references.

Please find here a patch for QuantLib 1.5 to fix this issue.

Intraday, high Resolution Day Counters

At the time being QuantLib’s smallest time resolution is a single day and QuantLib does not support “intraday pricing”. Close to an option’s maturity date this inaccuracy leads to wrong greeks. Especially at maturity date QuantLib gives back zero option npv and greeks. A temporary solution would be to shift the expiry date of the option one day forward but this still leads to wrong valuation of the price and the pin risk of the option.

The root of the problem is DayCounter’s yearFraction method, which acts on dates and the Date class does not support any time resolution smaller than days. On the other hand the Date class is probably the most widely used QuantLib class and therefore any solution must be backwards compatible. One solution for this problem has recently been discussed here.

Instead of adding new derived Date classes another solution will be to change the Date class itself to cope with intraday time information. The advantage of this approach is that all overloaded operators remain consistent. The date time arithmetic has been implemented using the boost::posix_time::ptime class. If one is interested in the correct treatment of time zones and day light saving one should use boost::local_time::local_date_time but this comes with a performance penalty. In order to allow backwards compatibility all signatures and behavior of the existing Date and DayCounter classes should stay the same. Additional methods are inter alia a new high-resolution date constructor

Date::Date(Day d, Month m, Year y,
           Size hours, Size minutes, Size seconds,
           Size millisec = 0, Size microsec = 0);

and methods to get the differences in days between two points in time including the fractions of the days. The maximal resolution of the methods is either micro or nano seconds depending on the underlying boost installation.

   Time Date::fractionOfDay() const
   Time daysBetween(const Date&, const Date&);

Last bit missing now is to change the corresponding day counters, e.g. the yearFraction method of the Actual365Fixed day counter becomes

Time Actual365Fixed::Impl::yearFraction(
    const Date& d1, const Date& d2, 
    const Date&, const Date&) const {

        return daysBetween(d1, d2)/365.0;
}

The new Date and DayCounter implementation is available here. It acts as a drop-in replacement for the existing classes, meaning the QuantLib test suite runs properly without any changes. Only the day counters Actual360, Actual365Fixed, ActualActual allow a strictly monotone definition of time and only these once have been adapted. The patch also contains an example of an intraday pricing of an ATM option during the last two and a half hours of the last trading day based on the Heston model and the finite difference pricing engine FDHestonVanillaEngine. The resulting Gamma and NPV of the option are shown in the diagram below.

gamma

QuantLib-SWIG Patch for JVM/.NET Languages

Update 23.11.2015: The latest version is now part of the official QuantLib Release 1.7.

Update 22.09.2015: Please find the latest and improved version of the patch for QuantLib 1.6.2 here.

The usage of QuantLib in JVM and .NET languages (e.g. Java/Scala and C#/F#) via the SWIG interface has a known shortcoming. The implementation of QuantLib’s observer pattern does not tolerate a parallel garbage collector running in a different thread. As a result programs are randomly crashing or producing “pure virtual function calls”. A detailed description of this problem can be found e.g. here and within the references.

Please find here a patch for QuantLib 1.4 to fix this issue. It contains

Installation instructions are included in the readme.txt file.

Probability Distribution of the Heston Model, Part II

Starting point for a semi-analytical solution of the Fokker-Planck forward equation for the Heston model is the exact sampling algorithm of Broadie and Kaya [1] (for the notation please see [2])

\begin{array}{rcl} x_t &=& x_0 + m(t) + \sigma(t)Z \nonumber \\ \sigma^2(t) &=& (1-\rho^2)\int_0^t \nu_s ds \nonumber \\ m(t) &=& (r_t-q_t)t - \frac{1}{2}\int_0^t \nu_s ds + \rho\int_0^t \sqrt{\nu_s}dW_s^{(1)} \nonumber \end{array}

The probability distribution function can be described as

p(x_t, \nu_t, t) = p(\nu_t, t) p(x_t,t\mid \nu = \nu_t)

and p(\nu_t, t) is  given by a noncentral chi squared distribution. The distribution p(x_t, t \mid \nu = \nu_t) can be calculated using the exact simulation algorithm. In this algorithm the variable x_t is given as a function of two random variables \int_0^t \nu_s ds and Z.

The distribution of x_t can now be derived using the general transformation theorem for random variables: Let be a random variable with probability density function f. The transformed random variable Y=h(X) has the probability density function

p(y) = f(h^{-1}(y)) \left| \det \left( \frac{\partial h^{-1}_i(y)}{\partial y_j} \right)\right|

First step is now to rewrite the exact simulation method in terms of the two random variables

X_1 = \int_0^t \nu_s ds \ , X_2=Z .

The simulation scheme then becomes

\begin{array}{rcl} x_t &=& x_0+a(t) -\frac{1}{2}X_1 + \frac{\rho\kappa}{\sigma}X_1+\sigma(t) X_2 \nonumber \\\sigma^2(t)&=&(1-\rho^2)X_1 \nonumber \\a(t)&=&(r_t-q_t)t+\frac{\rho}{\sigma}\left( \nu_t-\nu_0-\kappa\theta t \right)\end{array}

or in terms of the transformed random variable

\begin{array}{rcl} Y_1 =h_1(X_1,X_2) &=& x_0+a(t)-\frac{1}{2}X_1 + \frac{\rho\kappa}{\sigma}X_1+\sqrt{\left(1-\rho^2\right)X_1}X_2 \nonumber \\ Y_2=h_2(X_1,X_2)&=& X_1 \nonumber\end{array}.

Let \phi(x_1) be the density function of X_1

\phi(x_1)=\frac{2}{\pi}\int_0^\infty \cos ux_1 \mathrm{Re}(\Phi(u))\mathrm{d}u

and X_2 follows by definition a normal distribution. The joint probability density function of (X_1, X_2) is then

f\left( \begin{matrix} x_1 \\ x_2 \end{matrix}\right)=\phi(x_1) \frac{1}{\sqrt{2\pi}}e^{-\frac{x_2^2}{2}}

with

\begin{array}{rcl}f\left(h^{-1}(y)\right)&=&f\left(\begin{matrix}y_2 \\ \frac{1}{\sqrt{\left(1-\rho^2\right)y_2}}\left(y_1-x_0-a(t)+\frac{1}{2}y_2-\frac{\rho\kappa}{\sigma}y_2\right)\end{matrix}\right)  \nonumber \\  \left|\det \left( \frac{\partial h^{-1}_i(y)}{\partial y_j} \right)\right| &=& \frac{1}{\sqrt{\left(1-\rho^2\right)y_2}}  \end{array}.

This yields to the semi-analytical formula for the solution of the Fokker-Planck equation because by definition p(x_t,t \mid \nu = \nu_t) is the distribution density function of Y_1, which is given by

p(x_t,t \mid \nu = \nu_t) = \int_0^\infty \mathrm{d}y_2 p(y_1, y_2)\mid_{y_1=x_t} = \int_0^\infty \mathrm{d}y_2 \left[f\left(h^{-1}(y)\right)\left|\det \left( \frac{\partial h^{-1}_i(y)}{\partial y_j} \right)\right| \right]_{y_1=x_t}

The integration over y_2 can be carried out using e.g. the Simpson integral rule together with the Cornish-Fisher expansion, which gives an upper bound for the truncation of the upper limit of the integration.

The contour plots below show the probability density function of the Heston model for some example parametrisations.

plot
\begin{array}{|c|c|c|c|c|c|c|c|c|} \hline {\rm Parameters} & x_0 & \nu_0 & r & q & \kappa & \theta & \sigma & \rho \\ \hline \hline  a & 4.6052 & 0.4 & 5\% & 2.5\% & 1.0 & 0.4 & 0.8 & -75\%  \\ \hline  b & 4.6052 & 0.4 & 5\% & 2.5\% & 1.0 & 0.4 & 0.8 & \ \ 75\%  \\ \hline  c & 4.6052 & 0.4 & 5\% & 2.5\% & 1.0 & 0.4 & 0.4 & -75\%  \\ \hline  d & 4.6052 & 0.4 & 5\% & 2.5\% & 1.0 & 0.4 & 0.4 & \ \ \ \ 0\%  \\ \hline  \end{array}

The example code is available here and depends on the upcoming QuantLib version 1.4.

[1] M. Broadie, Ö. Kaya, Exact Simulation of Stochastic Volatility and other Affine Jump Diffusion Processes

[2] K. Spanderen, Probability Distribution of the Heston Model, Part I

[3] R.U. Seydel, Tools for Computational Finance, pp 86

Probability Distribution of the Heston Model, Part I

The Heston model is defined by the following stochastic differential equation of the log spot x_t = \ln S_t

\begin{array}{rcl} dx_t &=& \left(r_t - q_t - \frac{\nu_t}{2}\right)dt + \sqrt\nu_t dW^{x}_t \nonumber \\ d\nu_t&=& \kappa\left(\theta-\nu_t \right ) dt + \sigma\sqrt\nu_t dW^{\nu}_t \nonumber \\ \rho dt &=& dW^{x}_tdW^{\nu}_t \end{array}

To a significant extent the popularity of the Heston model is based on the fact that semi-closed formulas for vanilla European options exist using the characteristic function of the model. The time evolution of the probability density function p(x_t, v_t, t) is given by the corresponding Fokker-Planck equation [1]

\frac{\partial p}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}(\nu p) + \frac{\partial^2}{\partial x \partial \nu} (\rho\sigma\nu p) + \left(\frac{\nu}{2} -r_t+q_t\right)\frac{\partial}{\partial x} p + \frac{\sigma^2}{2}\frac{\partial^2}{\partial \nu^2}(\nu p) - \frac{\partial}{\partial \nu}\left(\kappa\left(\theta-\nu \right ) p\right)

with the initial condition

p(x,\nu,t=0) = \delta(x-x_0)\delta(\nu-\nu_0)

The reduced probability density function

p(x_t, t \mid x_0,\nu_0) = \int_0^\infty p(x_t, \nu_t, t) d\nu

for this initial value problem can be calculated using a semi-closed integral formula [2]

\begin{array}{rcl} \Gamma &=& \kappa+i\rho\sigma p_x \\ \Omega &=& \sqrt{\Gamma^2 + \sigma^2\left(p_x^2-ip_x \right )} \\ p(x_t, t \mid \nu_0) &=& \int_{-\infty}^\infty \frac{dp_x}{2\pi}\exp\left( ip_x (x_t-x_0 -(r-q)t) -\nu_0 \frac{p_x^2-ip_x}{\Gamma + \Omega \coth\left(\Omega t/2 \right )} \right) \\ && \ \ \ \ \ \ \ \ \times \exp\left(-\frac{2\theta\kappa}{\sigma^2}\ln\left(\cosh\frac{\Omega t}{2} + \frac{\Gamma}{\Omega} \sinh \frac{\Omega t}{2}\right )+\frac{\kappa\Gamma\theta t}{\sigma^2} \right) \\ &=& \int_{-\infty}^\infty \frac{dp_x}{2\pi} \tilde{p}(p_x,t \mid \nu_0) \end{array}

This gives the opportunity to write a pricing engine for arbitrary European payoffs. The value of an European option with payoff function P(S_t) \in \mathbb{L}^2 at maturity t is given by

\begin{array}{rcl} \text{npv} &=& \int_{-\infty}^\infty dx_t\int_{0}^\infty d\nu_t P(S_0 e^{x_t+(r_t-q_t)t})e^{-r_t t} p(x_t,\nu_t,t) \\ &=& e^{-r_t t}\int_{-\infty}^\infty dx_t P(S_0 e^{x_t+(r_t-q_t)t})p(x_t,t \mid x_0,\nu_0) \end{array}

The calculation needs two nested integrations which can be carried out efficiently using e. g. the Gauss-Lobatto algorithm. The solution of the equation

| \tilde{p}(p_x, t \mid \nu_0)| = \text{QL\_EPSILON}

determines the upper boundary for the integration over p_x. The boundaries \left[ -x_{min}, x_{max}\right] for the integration over x_t are chosen such that the interval covers ten times the expected variance

-x_{min} = x_{max}=10\sqrt{\int_0^{t}E\left[ \nu_t \right ] dt} = 10\sqrt{\theta t + \frac{1}{\kappa}\left(\nu_0-\theta\right)\left(1-e^{-\kappa t}\right)}

Obviously the nested integration makes this algorithm more tricky than the standard ways to price plain vanilla European options but it is not limited to vanilla payoffs. The implementation of this algorithm can be found here within the QuantLib trunk on Github.

Broadie and Kaya [1] have outlined an algorithm to sample from the full probability density function p(x_t, \nu_t, t \mid x_0, \nu_0) instead of the reduced density function p(x_t, t \mid x_0, \nu_0). Starting point for this algorithm is the exact solution of the Heston stochastic differential equation

\begin{array}{rcl} x_t &=& x_o + (r_t-q_t)t - \frac{1}{2}\int_0^t \nu_s ds + \rho\int_0^t \sqrt{\nu_s} dW_s^{(1)} + \sqrt{1-\rho^2}\int_0^t \sqrt{\nu_s} dW_s^{(2)} \nonumber \\ \nu_t &=& \nu_0 + \kappa\theta t - \kappa \int_0^t \nu_s ds + \sigma \int_0^t\sqrt{\nu_s}dW_s^{(1)} \nonumber \end{array}

The probability density function of the variance process \nu_t is given by a noncentral chi-squared distribution

\nu_t = \frac{\sigma^2\left( 1-e^{-\kappa t}\right)}{4\kappa}\chi_d^{'2}\left(\frac{4\kappa e^{-\kappa t}}{\sigma^2\left(1-e^{-\kappa t}\right)}\nu_0\right), d=\frac{4\theta\kappa}{\sigma^2}

The distribution \Psi(0,t) of the integral \int_0^t \nu_s ds conditional on \nu_t and \nu_0 can be calculated via the characteristic function

\begin{array}{rcl} \text{Pr}(\Psi(t) \le x)&=& \frac{2}{\pi}\int_0^\infty \frac{\sin ux}{u}\text{Re}(\Phi(u)) du \\ \\ \Phi(a)&=& \frac{\gamma(a)e^{-\frac{1}{2}(\gamma(a)-\kappa)t} \left(1-e^{-\kappa t}\right)} {\kappa\left(1-e^{\gamma(a)t}\right)} \exp\left( \frac{\nu_t+\nu_0}{\sigma^2} \left[ \frac{\kappa\left(1+e^{-\kappa t}\right)}{1-e^{-\kappa t}} - \frac{\gamma(a)\left(1+e^{-\gamma(a)t}\right)}{1-e^{-\gamma(a)t}} \right] \right) \\ && \times \frac{I_{0.5d-1} \left( \sqrt{\nu_0\nu_t} \frac{4\gamma (a) e^{-0.5\gamma(a)t}}{\sigma^2\left(1-e^{-\gamma(a)t}\right)}\right)}{ I_{0.5d-1} \left( \sqrt{\nu_0\nu_t} \frac{4\kappa e^{-0.5\kappa t}}{\sigma^2\left(1-e^{-\kappa t}\right)}\right)} \\ && \times \frac{\exp\left((0.5d-1) \left[-\frac{1}{2}\gamma(a)t + \ln \frac{\gamma(a)}{1-e^{-\gamma(a)t}} \right]\right)}{\left(\frac{\gamma(a) e^{-0.5\gamma(a)t} }{ 1-e^{-\gamma(a)t}}\right)^{0.5d-1}} \\ \\ \gamma(a)&=&\sqrt{\kappa^2-2 i \sigma^2 a} \end{array}

The modified Bessel function of first kind I_{0.5d-1}(z) can be evaluated using series expansion for small and medium |z| or asymptotic approximation for large |z| [5]. Unfortunately Boost provides only real versions of the Bessel functions and the copyright status of older complex valued Fortran77 routine is vague. Therefore QuantLib comes with its own implementation.

Please notice that \Phi(a) is already a continuous version of the characteristic function and therefore the integration does not need to track the branches of arg(z) when calculating the complex valued Bessel function [4].

The integration over the characteristic function is best done using either Gauss-Laguerre, Gauss-Lobatto or trapezoidal rule. The two later algorithms need to truncate the integration at some upper bound. First guess for a truncation limit can be taken from the Cornish-Fisher expansion  for some very small \epsilon. The moment-generating function \Phi(a) can be used to get the first, second and third moment of the distribution via finite difference quotient.

m_n = E(X^n) = \frac{d^n}{dy^n}\Phi(x+iy)\Big|_{x=y=0}

The next term is now fairly easy to calculate

\int_0^t \sqrt{\nu_s} dW_s^{(1)} = \frac{1}{\sigma}\left( \nu_t - \nu_0 - \kappa\theta t+\kappa \int_0^t \nu_s ds \right)

The log spot process can now be sampled using a standard normal random variable Z and

\begin{array}{rcl} x_t &=& x_0 + m(t) + \sigma(t)Z \nonumber \\ \sigma^2(t) &=& (1-\rho^2)\int_0^t \nu_s ds \nonumber \\ m(t) &=& (r_t-q_t)t - \frac{1}{2}\int_0^t \nu_s ds + \rho\int_0^t \sqrt{\nu_s}dW_s^{(1)} \nonumber \end{array}

This sampling algorithm is exact even for very large time steps and therefore gives some advantages for quasi random Monte-Carlo methods but the inversion of the integration of the characteristic function is also very slow. The algorithm is implemented within the HestonProcess class.

[1] I. Clark, Foreign Exchange Option Pricing: A Practitioners Guide, p. 113

[2] A. Dragulescu, V. Yakovenko, Probability distribution of returns in the Heston model with stochastic volatility

[3] M. Broadie, Ö. Kaya, Exact Simulation of Stochastic Volatility and other Affine Jump Diffusion Processes

[4] R. Lord, Efficient pricing algorithms for exotic derivatives, p. 40

[5] J.R. Culham, Bessel Functions of the First and Second Kind