In my previous post, I explored the Lord-Kahl method to compute call option prices under the Heston model. One of the advantages of this method is the ability to go beyond machine-epsilon accuracy and compute very far out-of-the-money prices or very short maturities. The standard methods to compute the Heston price are based on a sum/difference of terms that are both far from 0, and are therefore limited to an absolute accuracy around machine epsilon, even if the integration itself is very precise.
However, the big trick in it is to find the optimal alpha used in the integration. A suboptimal alpha will often lead to high inaccuracy, because of strong oscillations that appear in the integrand. So the method is robust only if the root finding (for the optimal alpha) is robust.
The original paper looks at the Riccati equation for B, where B is the following term in the characteristic function:
$$\phi(u) = e^{iuf+A(u,t)+B(u,t)\sigma_0}$$
The solution defines the \(\alpha_{max}\) where the characteristic function explodes. The Riccati equation is complex, but not complicated:
$$ \frac{dB}{dt} = \hat{\alpha}(u)-\beta(u) B+\gamma B^2 $$
I initially did not understand its role: it is there to compute \(\alpha_{max}\), so that, later, one can compute \(\alpha_{optimal}\) with a good bracketing. The bracketing is particularly important in order to use a decent solver, like the Brent solver; otherwise, one is left with, mostly, Newton's method. It turns out that I explored a reduced function, which is much simpler than the Riccati equation and seems to work in all the cases I have found/tried: solve $$1/B = 0.$$ If B explodes, \(\phi\) will explode. The trick, like when solving the Riccati equation, is to have either a good starting point (for Newton) or, better, a bracketing. It turns out that Lord and Kahl give a bracketing for \(1/B\), even if they don't present it like this: their \(\tau_{D+}\) on page 10 is the lower bracket, and \(\tau_+\) the upper bracket. \(\tau_+\) will make \(1/B\) explode, exactly. One could also find the next periods by adding \(4\pi/t\) instead of the \(2\pi/t\) they use to move from \(\tau_{D+}\) to \(\tau_+\), but this is of little interest since we don't want to go past the first explosion.
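As a rough illustration of the bracketed root search (a minimal sketch, not the actual Lord-Kahl implementation: invB stands for a hypothetical routine evaluating \(1/B\) as a function of alpha, and lower/upper for the \(\tau_{D+}\)/\(\tau_+\) style bracket), the Apache Commons Math Brent solver can be applied directly:

```java
import org.apache.commons.math3.analysis.UnivariateFunction;
import org.apache.commons.math3.analysis.solvers.BrentSolver;

public class AlphaMaxSketch {
    // Finds the explosion point by solving 1/B = 0 on a bracketing interval.
    static double findAlphaMax(UnivariateFunction invB, double lower, double upper) {
        // Brent's method requires a sign change between lower and upper.
        BrentSolver solver = new BrentSolver(1e-12);
        return solver.solve(1000, invB, lower, upper);
    }
}
```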
It's quite interesting to see that my simple approach is actually closely related to the more involved Riccati approach. The starting point could be the same, although it is much more robust to just use the Brent solver on the bracketed maximum. I actually believe that the Riccati equation explodes at the same points, except maybe for some rare combinations of Heston parameters.
From a coding perspective, I found that Apache Commons Math is a decent library for complex arithmetic and for solving/minimizing functions. The complex part was better than some in-house implementations: for example, the square root was more precise in Commons Math, and the solvers are robust. It even made me think that it is often a mistake to reinvent the wheel. It's good to choose the best implementations/algorithms available, but reinventing a Brent solver? A linear interpolator? The Commons Math library also imposes a good structure, whereas in-house code tends to be messy (no real interfaces, or many different ones). I believe the right approach is to use and embrace/extend Apache Commons Math; if some algorithms are badly coded or do not perform well, write your own using the same kind of interfaces as Commons Math (or some other good maths library).

The next part of this series on the Lord-Kahl method is here.
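For example, the complex square root that was less precise in-house is directly available on the Commons Math Complex class (a trivial usage sketch, unrelated to the actual pricer code):

```java
import org.apache.commons.math3.complex.Complex;

public class ComplexSqrtDemo {
    public static void main(String[] args) {
        // Complex is immutable; each operation returns a new instance.
        Complex z = new Complex(-4.0, 1e-10);
        System.out.println(z.sqrt()); // principal branch, approximately 2.5e-11 + 2i
        System.out.println(z.exp());
        System.out.println(z.log());
    }
}
```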
I just tried to implement the Lord-Kahl algorithm to compute the Heston call price. The big difficulty of their method is to find the optimal alpha: that's what makes it work or break. The tricky part is that the function of alpha we want to minimize has multiple discontinuities (it is periodic in some way). This is why the authors rely on the computation of an \(\alpha_{max}\): bracketing is very important, otherwise your optimizer will jump over a discontinuity without even noticing it, while you really want to stay in the region before the first discontinuity.
To find \(\alpha_{max}\), they solve a nonlinear differential equation, which I would need a few more readings to really understand. The problem looked simple: if you graph the function to minimize, finding the first discontinuity seems easy, so I just tried to do it directly. Numerically, I was surprised that it was not so simple. I did find a solution that, amazingly, seems to work in all the examples of the paper, but it is partly luck. I use Newton-Raphson to find the discontinuity, on a reduced function where the discontinuity really lies; I solve on the inverse of the exploding term so that I can just solve for 0. Earlier on I had reduced the function too much and it did not work, which is why I believe the approach is not very robust. Newton-Raphson is quite simple, it is also simple to understand why it breaks when it breaks, and it does not need a bracketing (which is what I was looking for in the first place). Once I have found the discontinuity, I can just use Brent on the right interval and it works well.
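A bare-bones Newton-Raphson with a finite-difference derivative is enough for this kind of search; this is only a sketch under my own assumptions (g stands for the hypothetical reduced function whose zero marks the discontinuity, and a real implementation should detect divergence):

```java
public class NewtonSketch {
    interface Func {
        double value(double x);
    }

    // Newton-Raphson with a central finite-difference derivative.
    static double newton(Func g, double x0, double tol, int maxIter) {
        double x = x0;
        for (int i = 0; i < maxIter; i++) {
            double fx = g.value(x);
            if (Math.abs(fx) < tol) {
                return x;
            }
            double h = Math.max(1e-7, 1e-7 * Math.abs(x));
            double dfx = (g.value(x + h) - g.value(x - h)) / (2 * h);
            x = x - fx / dfx;
        }
        return x; // may not have converged; a real implementation should signal this
    }
}
```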
In the end, it's neat to be able to compute option prices below machine epsilon, but in practice it's probably not that useful. For calibration, those options should have a very small (insignificant) weight. The only use case I found is really for graphing, so that you don't get a flat extrapolation too quickly, especially for short maturities. I was also curious about the accuracy of some approximations of the implied volatility in the wings, to see if I could use them instead of all this machinery.
In any case, I did not think that such a simple problem would be so challenging numerically.
Marsaglia, in his paper on the normal distribution, made the same mistake I initially did while trying to verify the accuracy of the normal density. In his table comparing, for some values of x, the true value computed by Maple to the values computed by the Sun or Ooura erfc, he does not really use the same input for the comparison. One example is the last number: 16.6. 16.6 does not have an exact representation in double precision, even though it is displayed as 16.6 because of the truncation at machine-epsilon precision. Using the Python mpmath library, one can see that:

```
>>> mpf(-16.6)
mpf('-16.6000000000000014210854715202004')
```

This is the more accurate representation if one goes beyond double precision (here 30 digits). And the value of the cumulative normal distribution is:

```
>>> ncdf(-16.6)
mpf('3.4845465199503256054808152068743e-62')
```

It is different from:

```
>>> ncdf(mpf("-16.6"))
mpf('3.48454651995040810217553910503186e-62')
```

where in this case it is really evaluated around -16.6 (up to 30 digits of precision). Marsaglia gives this second number as the reference, but all the other algorithms actually take the first number as input. It is more meaningful to compare results using the exact same input. Using human-readable but computer-truncated numbers is not the best: the cumulative normal distribution will often be computed on the output of some calculation where one does not have an exact human-readable input.

The standard code for the Ooura and Schonfelder (as well as Marsaglia) algorithms for the cumulative normal distribution does not use Cody's trick to evaluate exp(-x*x). This function appears in all those implementations because it is part of the dominant term in the usual expansions. Out of curiosity, I replaced this part with the Cody trick. For Ooura I also made minor changes to make it work directly on the CND instead of going through the error function erfc indirection. Here are the results without the Cody trick (except for Cody), and with it: all 3 algorithms are now of similar accuracy (note the difference of scale compared to the previous graph), with Schonfelder being a bit worse, especially for x >= -20. If one uses only easily representable numbers in double precision (for example -37, -36.75, -36.5, ...) then, of course, the importance of the Cody trick won't be visible; here is how the 3 algorithms fare with or without the Cody trick: Schonfelder now looks worse than it actually is compared to Cody and Ooura.

To conclude, if someone claims that a cumulative normal distribution is accurate to double precision and it does not use any trick to compute exp(-x*x), then beware: it is probably quite a bit less accurate than double precision.
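The same kind of check can be done in Java without any extra library: the BigDecimal(double) constructor exposes the exact binary value of a double (a small illustration, not related to any of the CND implementations above):

```java
import java.math.BigDecimal;

public class ExactDouble {
    public static void main(String[] args) {
        // Prints the exact decimal expansion of the double closest to 16.6,
        // which is slightly larger than 16.6 itself.
        System.out.println(new BigDecimal(16.6));
    }
}
```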
In my previous post, I stated that some library (SPECFUN by W.D. Cody) computes \(e^{-\frac{x^2}{2}}\) the following way:
```
xsq = fint(x * 1.6) / 1.6;
del = (x - xsq) * (x + xsq);
result = exp(-xsq * xsq * 0.5) * exp(-del * 0.5);
```
where fint(z) computes the floor of z.
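In Java, this could look like the following (a direct transcription for illustration, with fint replaced by Math.floor; the method names are mine, not Cody's):

```java
public final class CodyExp {
    // exp(-0.5*x*x) evaluated with the Cody splitting x*x = xsq*xsq + del (x >= 0).
    static double codyExp(double x) {
        double xsq = Math.floor(x * 1.6) / 1.6; // multiple of 1/1.6, close to x
        double del = (x - xsq) * (x + xsq);     // small remainder: x*x - xsq*xsq
        return Math.exp(-xsq * xsq * 0.5) * Math.exp(-del * 0.5);
    }

    // Naive evaluation, for comparison.
    static double naiveExp(double x) {
        return Math.exp(-0.5 * x * x);
    }
}
```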
Why 1.6?
An integer divided by 1.6 will be exactly representable in double precision: 1.6 because of 16 (dividing by 1.6 is equivalent to multiplying by 10 and dividing by 16, which is an exact operation). It also gives something very close to a rounding function: x=2.6 will make xsq=2.5, x=2.4 will make xsq=1.875, x=2.5 will make xsq=2.5. The maximum difference between x and xsq will be 0.625.
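A quick check of those example values (just an illustrative snippet):

```java
public class RoundingCheck {
    public static void main(String[] args) {
        for (double x : new double[] {2.4, 2.5, 2.6}) {
            double xsq = Math.floor(x * 1.6) / 1.6;
            System.out.println(x + " -> " + xsq); // 2.4 -> 1.875, 2.5 -> 2.5, 2.6 -> 2.5
        }
    }
}
```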
(a-b) * (a+b) decomposition
del is of the order of 2*x*(x-xsq). When (x-xsq) is very small, del will in most cases be small as well. When x is too high (beyond 39), the result will always be 0, because there is no number small enough to represent exp(-0.5*39*39) in double precision, while (x-xsq) can be as small as machine epsilon (around 2E-16). By splitting x*x into xsq*xsq and del, the exp function works on a more refined value of the remainder del, which in turn leads to an increase in accuracy.
Real world effect
Let’s make x move by machine epsilon and see how the result varies using the naive implementation exp(-0.5*x*x) and using the refined Cody way. We take x=20, and add machine epsilon a number of times (frac).
The staircase happens because if we add machine epsilon to 20, the result is still 20, until we have added it enough times to reach the next representable double. What's interesting is that the Cody staircase is regular, with stairs of similar height, while the naive implementation has stairs of uneven height.
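A sketch of this experiment, reusing the codyExp/naiveExp methods from the earlier snippet (the number of steps and the output format are arbitrary):

```java
public class StaircaseDemo {
    public static void main(String[] args) {
        double eps = Math.ulp(1.0); // machine epsilon, about 2.22e-16
        double x0 = 20.0;
        for (int frac = 0; frac <= 50; frac++) {
            double x = x0 + frac * eps;
            System.out.println(frac + "\t" + CodyExp.naiveExp(x) + "\t" + CodyExp.codyExp(x));
        }
    }
}
```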
To calculate the actual error, we must rely on higher precision arithmetic.
Update March 22, 2013
I thus looked for a higher-precision exp implementation that can go beyond double precision. I initially found an online calculator (not so great to run tests with), and after more searching, I found a very simple way: the mpmath Python library. I did some initial tests with the calculator and thought Cody was in reality not much better than the naive implementation. The problem is that my tests were wrong, because the online calculator expects an input in terms of human-readable digits, and I did not always use the correct number of digits. For example, the double closest to -37.7 is not exactly -37.7.
Here is a plot of the relative error of our methods compared to the high-accuracy Python implementation, using as input strict double numbers around x=20. The horizontal axis is x-20, the vertical axis is the relative error.
We can see that Cody is really much more accurate (more than 20x). The difference is smaller when x is smaller, but there is still a factor of 10 around x=-5.7:
Any calculation using a Cody-like Gaussian density implementation will likely not be as careful as this, so one can doubt the usefulness in practice of such accuracy tricks. The Cody implementation uses 2 exponentials, which can be costly to evaluate; however, Gary Kennedy commented that we can cache the exp of xsq thanks to fint, and therefore have both accuracy and speed.
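One way to implement that suggestion (a sketch under my own assumptions, not Gary Kennedy's code): since xsq only takes the discrete values k/1.6 for integer k, the first exponential can be tabulated once and indexed by k:

```java
public final class CachedCodyExp {
    // exp(-0.5 * (k/1.6)^2) for k = 0..MAX_K; k = 63 corresponds to x close to 39.4,
    // beyond which the result underflows to 0 anyway.
    private static final int MAX_K = 63;
    private static final double[] EXP_XSQ = new double[MAX_K + 1];
    static {
        for (int k = 0; k <= MAX_K; k++) {
            double xsq = k / 1.6;
            EXP_XSQ[k] = Math.exp(-0.5 * xsq * xsq);
        }
    }

    // Cody-style evaluation of exp(-0.5*x*x) for x >= 0, with the first exp cached.
    static double value(double x) {
        int k = (int) Math.floor(x * 1.6);
        if (k > MAX_K) {
            return 0.0;
        }
        double xsq = k / 1.6;
        double del = (x - xsq) * (x + xsq);
        return EXP_XSQ[k] * Math.exp(-0.5 * del);
    }
}
```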
Some library computes the Gaussian density function $$e^{-\frac{x^2}{2}}$$ the following way:
```
xsq = fint(x * 1.6) / 1.6;
del = (x - xsq) * (x + xsq);
result = exp(-xsq * xsq * 0.5) * exp(-del * 0.5);
```
where fint(z) computes the floor of z.
Basically, x*x is rewritten as xsq*xsq+del. I have seen that trick once before, but I just can’t figure out where and why (except that it is probably related to high accuracy issues).
This is very much what's in the Carr-Lee paper "Robust Replication of Volatility Derivatives", but it wasn't so easy to obtain in practice:
The formulas as written in the paper are not usable as is: they can be simplified (not too difficult, but intimidating at first).
The numerical integration is not trivial: a simple Gauss-Laguerre is not precise enough (maybe it would be with an implementation using more points), and Gauss-Kronrod is not either (maybe if we split the domain into different regions). Funnily, a simple adaptive Simpson works OK (but my boundaries are very basic: 1e-5 to 1e5); see the sketch below.
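For reference, here is a minimal recursive adaptive Simpson integrator (a generic textbook version, not the code actually used on the Carr-Lee integrand):

```java
import org.apache.commons.math3.analysis.UnivariateFunction;

public final class AdaptiveSimpson {
    // Integrate f on [a, b] to roughly the requested absolute tolerance.
    static double integrate(UnivariateFunction f, double a, double b, double tol) {
        double c = 0.5 * (a + b);
        double fa = f.value(a), fb = f.value(b), fc = f.value(c);
        double whole = simpson(a, b, fa, fc, fb);
        return recurse(f, a, b, fa, fc, fb, whole, tol, 50);
    }

    private static double simpson(double a, double b, double fa, double fc, double fb) {
        return (b - a) / 6.0 * (fa + 4.0 * fc + fb);
    }

    private static double recurse(UnivariateFunction f, double a, double b,
                                  double fa, double fc, double fb,
                                  double whole, double tol, int depth) {
        double c = 0.5 * (a + b);
        double d = 0.5 * (a + c);
        double e = 0.5 * (c + b);
        double fd = f.value(d);
        double fe = f.value(e);
        double left = simpson(a, c, fa, fd, fc);
        double right = simpson(c, b, fc, fe, fb);
        // Richardson-style error estimate; stop when good enough or too deep.
        if (depth <= 0 || Math.abs(left + right - whole) <= 15.0 * tol) {
            return left + right + (left + right - whole) / 15.0;
        }
        return recurse(f, a, c, fa, fd, fc, left, 0.5 * tol, depth - 1)
             + recurse(f, c, b, fc, fe, fb, right, 0.5 * tol, depth - 1);
    }
}
```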
A volatility swap is a forward contract on future realized volatility. The pricing of such a contract used to be particularly challenging, often relying either on an imprecise but popular expansion in the variance, or on a model-specific approach (like Heston, or local volatility with jumps). Carr and Lee have recently proposed a way to price those contracts in a model-independent way in their paper "Robust Replication of Volatility Derivatives". Here is the difference between the value of a synthetic volatility swap payoff at maturity (a newly issued one, with no accumulated variance) and a straddle.
Those are very close payoffs!
I wonder how good the discrete Derman approach is compared to a standard integration for such a payoff, as well as how important the extrapolation of the implied volatility surface is. The real payoff (very easy to obtain through the Carr-Lee Bessel formula):
I found a nice finite difference scheme where the solving part can be parallelized on 2 processors at each time-step.
I was a bit surprised to notice that the parallelized algorithm ran, in some cases, twice as slow as the same algorithm not parallelized. I tried ForkJoinPool, ThreadPoolExecutor, and my own notify/wait based parallelization. All resulted in similar performance compared to just calling thread1.run() and thread2.run() directly.
I am still a bit puzzled by the results. Increasing the duration of the task by increasing the number of discretization points does not really improve the parallelization. The task is relatively fast to perform and is repeated many times (around 1000), so we synchronize around 1000 times, which is likely why parallelization does not pay off here: the synchronization overhead eats any benefit of the parallelization. But I expected better. Using a thread pool of 1 thread is also much slower than calling run() twice (and, fortunately, slower than the pool of 2 threads).
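For concreteness, here is roughly the structure I mean (a simplified skeleton, where solveLowerHalf/solveUpperHalf are hypothetical stand-ins for the two halves of the work at each time step):

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TwoTaskTimeStepping {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        int timeSteps = 1000;
        for (int t = 0; t < timeSteps; t++) {
            // Two independent halves of the solve, submitted at each time step.
            Callable<Void> lower = () -> { solveLowerHalf(); return null; };
            Callable<Void> upper = () -> { solveUpperHalf(); return null; };
            // invokeAll blocks until both tasks are done: one synchronization per step.
            pool.invokeAll(Arrays.asList(lower, upper));
        }
        pool.shutdown();
    }

    // Hypothetical placeholders for the actual work on each half of the grid.
    static void solveLowerHalf() { }
    static void solveUpperHalf() { }
}
```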
I still have not abandoned Scala despite my previous post, mainly because I already have quite a bit of code and am too lazy to port it. Furthermore, the issues I detailed were not serious enough to motivate a switch. But these days I am more and more fed up with Scala, especially because of the Eclipse plugin. I tried the newer beta and the older stable version; the conclusion is the same. It's welcome, but: code completion is not great compared to Java (for example, one does not seem to be able to see the constructor parameters, and the method parameters cannot be automatically populated), and the plugin makes Eclipse very slow - everything seems at least 3-5x slower, and on-the-fly compilation is much, much slower than Java's. It's nice to type less, but if overall writing is slower because of the above issues, it does not help.

Beyond curiosity about new language features, I don't see any point in Scala today, even if some of the ideas are interesting. I am sure it will be forgotten/abandoned in a couple of years. Today, if I were to try a new language, I would give Google Go a try: I don't think another big language can make it/be useful on the JVM (besides a scripting kind of language, like JavaScript or Jython). Google Go focuses on the right problem: concurrency. It is also not constrained by JVM limitations (on the other hand, one cannot use Java libraries, but open-source code is usually not too difficult to port from one language to another). It has one of the fastest compilers. It makes interesting practical choices: no inheritance.
In my Linux quest, I changed distribution again on my home desktop, from OpenSuse 11.1 with KDE to Ubuntu 13.04 (not yet released - so alpha) with Unity. Why?
KDE was crashing a bit too often for my taste (more than once a week), although I enjoyed the KDE environment.
It was not easy to transfer files to my Android 4.2 phone. Ubuntu 13.04 is not fully there yet either, but it is on its way.
zypper is a bit too specific for my taste. I would be ok with yum+rpm or apt-get, but another tool just for a single distribution, not really.
Plus I’m just curious what’s next for Ubuntu, and Linux makes it very simple to change distributions, and reinstall applications with the same settings. So it’s never a big task to change distribution.
I somehow like how Ubuntu feels, not sure what it is exactly, maybe the Debian roots.
When people say OpenSuse is rock solid, I don't have that impression, at least on the desktop. It might have been true in the past, but back then most distros were very stable: I don't remember having big stability issues until maybe late 2010. In the early 2000s, a laptop would work very well with Linux, suspend included. I remember that my Dell Inspiron 8200 worked perfectly with Mandrake and WindowMaker. Nowadays, it never seems to work that well, just OK: Optimus comes to mind (especially with an external screen), suspend problems, wifi (not anymore).
So far I can see that Ubuntu 13.04 is prettier than past versions, and the installer is great. I encrypted my two hard disks; it was just a matter of ticking a box - very nice. The Unity launcher, while interesting, is still not the greatest tool to find an installed application (compared to the KDE launcher or GNOME Shell). I have not noticed any stability issues so far, even though I sometimes get popup messages telling me something crashed (typical for an alpha version). If I just ignore the messages, everything seems fine. OpenSuse-KDE was logging me out (session crash), or just stopped being responsive completely (hard reset necessary).