I had the opportunity to receive a free book on climate change through the company I am working for. I had not heard of that book before; it is called Saving Us and is written by an actual climate scientist (Katharine Hayhoe).
Unfortunately, being written by a climate scientist does not mean that it is a scientific book, and it’s not. The author does not spend much effort explaining the physics or the reports, but focuses on how to convince people that this is an important problem to tackle. It is mildly interesting at first, as it presents the problem from a psychological angle. But it quickly becomes repetitive. The method is always the same: connect with what’s important to an audience who initially rejects climate warming (mostly the phenomenon itself, sometimes the proposals around it), and make them understand that climate warming plays a role in their life, in the very matters they care about. It is presented as a series of personal experiences of the author, a list of examples.
There is a small part on taxing carbon; it’s only 3 pages, and quite optimistic. It amounts to a socialist approach (which I am not necessarily opposed to). There is no mention of carbon trading, and no mention of what can go wrong with taxing or trading carbon.
On the global warming subject, I much preferred the more controversial book Unsettled by Steve Koonin, sometimes quoted by climate warming deniers (who may not have read it at all) and often denounced by climate warming supporters (who may have read it too closely). The arguments made were clearer and more scientific. Even if that book was not that great either, I found it helped in discovering the subject from various angles.
Overall it’s a bit strange that this kind of book is given away by companies. On one hand, I can see how it resonates with business values such as how to deal with antagonistic people, or how to talk to the people around you. On the other hand, it feels like a waste of time and money. There are much better books or courses that can teach you more in a shorter time. For example, I attended, by accident, two 4-hour courses around Myers-Briggs, and found them to have a much bigger impact.
It is well known that vanilla option prices must increase when we increase the implied volatility. Recently, a post on the Wilmott forums wondered about the true accuracy of Peter Jaeckel’s implied volatility solver, and whether it was truly IEEE 754 compliant. In fact, the author noticed some inaccuracy in the option price itself. Unfortunately I cannot reply to the forum, as its login process does not seem to be working anymore, so I am left to blog about it.
It can be quite challenging to evaluate the accuracy. For example, one potential error that the author makes is to use numbers that are not exactly representable in the IEEE 754 standard, such as 4.45 or 0.04. In Julia, and I believe in Python's mpmath as well, there is a difference between BigFloat(4.45) and BigFloat("4.45"):
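A quick REPL check illustrates the distinction (a sketch):

julia> BigFloat(4.45) == BigFloat("4.45")  # left-hand side converts the nearest Float64 to 4.45
false

julia> abs(BigFloat(4.45) - BigFloat("4.45")) > 0  # the representation error is nonzero
true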
If we work only with the usual 64-bit floats, we likely want to estimate the accuracy of an implementation using BigFloat(4.45) as input rather than BigFloat("4.45"). To compute a vanilla option price, the naive (textbook) way would be to go through the cumulative normal distribution function, which looks like
using Distributions # for Normal and cdf

# Undiscounted Black-Scholes call price; voltte is the total volatility, i.e. vol * sqrt(T).
function bs(strike, forward, voltte)
    d1 = (log(forward / strike) + voltte^2 / 2) / voltte
    return forward * cdf(Normal(), d1) - strike * cdf(Normal(), d1 - voltte)
end
It turns out that this produces a relatively high error for out-of-the-money prices. In fact, the prices are far from monotonic: if we increase the vol by 1 ulp a hundred times, we obtain:
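The sweep itself can be sketched as follows (using the bs function above with the example parameters K=2, F=1, vol=0.125 discussed below, and assuming a maturity of 1 so that the total vol equals the vol):

function ulp_sweep(strike, forward, vol, n)
    prices = Float64[]
    for _ in 1:n
        push!(prices, bs(strike, forward, vol))
        vol = nextfloat(vol) # increase the vol by exactly 1 ulp
    end
    return prices
end

issorted(ulp_sweep(2.0, 1.0, 0.125, 100)) # expected to be false for the naive formula, per the plot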
Peter Jaeckel proposes several improvements. The first, simple one is to use the scaled complementary error function erfcx rather than the cumulative normal distribution function directly. A second improvement is to use Taylor expansions for small prices. It turns out that on the example with strike K=2.0, forward F=1.0, vol=0.125, the Taylor expansion is being used.
The error is much lower, with Taylor being even a little bit better. But the price is still not monotonic.
What would the implied vol corresponding to those prices look like? If we apply Peter Jaeckel’s implied vol solver to the naive Black-Scholes formula, we end up with an accuracy limited by the naive formula.
If we apply it to the exact prices, or to the erfcx or Taylor ones, we still end up with a non-monotonic implied vol:
There is no visible difference in accuracy if we start from the erfcx prices and apply the SR Householder solver of Jherek Healy:
It should not be too surprising: if we compose the Black-Scholes formula with the associated solver from Peter Jaeckel, we do not obtain the identity function. Because the option prices are not monotonic with the vol, several vols may correspond to the same price, and the solver cannot recover the input vol exactly, which leads to the numerical noise we see on the plots. In contrast, it is remarkable that if we compose exp with log, we obtain a nice monotonic shape:
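The same kind of 1-ulp sweep gives a quick way to check this (a sketch; the result is expected to match the monotonic shape of the plot):

function roundtrip_exp_log(x0, n)
    x = x0
    vals = Float64[]
    for _ in 1:n
        push!(vals, exp(log(x)))
        x = nextfloat(x) # bump x by exactly 1 ulp
    end
    return issorted(vals)
end

roundtrip_exp_log(0.125, 100)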
The prerequisite for a more accurate solver would be a Black-Scholes function which is actually monotonic.
When computing the derivative of a function by finite differences, which step size is optimal? The answer depends on the kind of difference (forward, backward or central) and on the order of the derivative (typically first or second in finance).
For the first derivative, the result is very quick to find (it’s on Wikipedia). For the second derivative, it’s more challenging. The lecture notes of Karen Kopecky provide an answer. I wondered whether Copilot or ChatGPT would find a good solution to the question:
“What is the optimal step size to compute the second derivative of a function by centered numerical differentiation?”
Here is the answer of copilot:
and the one of chatGPT:
and a second answer from chatGPT:
Copilot always fails to provide a valid answer. ChatGPT v4 proves to be quite impressive: it is able to reproduce some of the reasoning presented in the lecture notes.
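For reference, the standard error-balancing argument goes roughly as follows (a sketch consistent with, though not necessarily identical to, the lecture notes). The centered second-order difference satisfies
$$ \frac{f(x+h)-2f(x)+f(x-h)}{h^2} = f''(x) + \frac{h^2}{12} f^{(4)}(\xi), $$
and its numerator is computed with a roundoff error of order \( 4\epsilon |f(x)| \), where \( \epsilon \) is the machine epsilon. Balancing the two contributions,
$$ E(h) \approx \frac{h^2}{12}\left|f^{(4)}\right| + \frac{4\epsilon |f|}{h^2}, \qquad E'(h^\star)=0 \implies h^\star = \left(\frac{48\,\epsilon\,|f|}{\left|f^{(4)}\right|}\right)^{1/4} \propto \epsilon^{1/4}. $$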
Interestingly, with centered differences, for a given accuracy, the optimal step size is different for the first and for the second derivative. It may, at first, seem like a good idea to use a different step size for each. In reality, when both the first and second derivative of the function are needed (for example the Delta and Gamma greeks of a financial product), it is rarely a good idea to use a different step size for each. Firstly, it will be slower, since the function will be evaluated at more points. Secondly, if there is a discontinuity in the function or in its derivatives, the first derivative may be estimated without going over the discontinuity while the second derivative may be estimated going over the discontinuity, leading to puzzling results.
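A single-step bump reusing the same three evaluations for both greeks could look like this (a sketch; the function and argument names are illustrative):

function delta_gamma(price, spot, h)
    up, mid, down = price(spot + h), price(spot), price(spot - h)
    delta = (up - down) / (2h)         # centered first derivative
    gamma = (up - 2mid + down) / h^2   # centered second derivative, same three points
    return delta, gamma
end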
The COS method is a fast way to price vanilla European options under stochastic volatility models with a known characteristic function. There are alternatives, explored in previous blog posts. A main advantage of the COS method is its simplicity. But this comes at the expense of finding the correct values for the truncation level and the (associated) number of terms.
A related issue of the COS method, or of its fancier wavelet cousin the SWIFT method, is that it may require a huge number of points (>65K) to reach a reasonable accuracy for some somewhat extreme choices of Heston parameters. I provide an example in a recent paper (see Section 5).
Gero Junike recently wrote several papers on how to find good estimates for those two parameters. Gero derives a slightly different formula for the put option, by centering the distribution on \( \mathbb{E}[\ln S] \). It is closer to my own improved COS formula, where I center the integration on the forward. The estimate for the truncation is larger than the one we are used to (for example, the estimate based on 4 cumulants of Mike Staunton), and the number of points is very conservative.
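For reference, the classical cumulant-based truncation rule of Fang and Oosterlee (not necessarily identical to Mike Staunton's variant) reads
$$ [a,b] = \left[c_1 - L\sqrt{c_2 + \sqrt{c_4}},\; c_1 + L\sqrt{c_2 + \sqrt{c_4}}\right], \qquad L \approx 10, $$
where \( c_n \) denotes the n-th cumulant of \( \ln S_T \).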
The bigger issue with this new estimate is that it relies on the integration of a function of the characteristic function, very much like the original problem we are trying to solve (the price of a vanilla option). This is in order to estimate \( \|f^{(20)}\|_{\infty} \). Interestingly, evaluating this integral is not always trivial: the double exponential quadrature in Julia fails. I found that reusing the transform from \( (0,\infty) \) to \( (-1,1) \) of Andersen and Lake, along with a Gauss-Legendre quadrature on 128 points, seemed to be OK (at least for some values of the Heston parameters; it may be fragile, I am not sure).
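Such a quadrature can be sketched as follows, using a generic algebraic map \( x = \frac{1+t}{1-t} \) (not necessarily the exact Andersen-Lake change of variables) and the gausslegendre function of FastGaussQuadrature.jl:

using FastGaussQuadrature # gausslegendre

function integrate_half_line(g; n = 128)
    t, w = gausslegendre(n)         # Gauss-Legendre nodes and weights on (-1, 1)
    x = (1 .+ t) ./ (1 .- t)        # mapped nodes on (0, ∞)
    jacobian = 2 ./ (1 .- t) .^ 2   # dx/dt
    return sum(w .* jacobian .* g.(x))
end

integrate_half_line(x -> exp(-x)) # ≈ 1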
While very conservative, it seems to produce the desired accuracy on the extreme example mentioned in the paper: it leads to N=756467 points and an upper truncation at b=402.6 for a relative tolerance of 1E-4. Of course, on such examples the COS method is not fast anymore. For comparison, the Joshi-Yang technique with 128 points produces the same accuracy in 235 μs as the COS method does in 395 ms on this example, that is a factor of more than 1000 (on many other examples the COS method behaves significantly better, of course).
Furthermore, as stated in Gero Junike’s paper, the estimate fails for less smooth distributions such as the one of the Variance Gamma (VG) model.
I never paid too much attention to it, but the term-structure of variance swaps is not always realistic under the Schobel-Zhu stochastic volatility model.
This is not fundamentally the case with the Heston model: the Heston model is merely extremely limited, able to produce only a flat shape or a downward-sloping exponential shape.
Under the Schobel-Zhu model, the price of a newly issued variance swap reads
$$ V(T) = \left[\left(v_0-\theta\right)^2-\frac{\eta^2}{2\kappa}\right]\frac{1-e^{-2\kappa T}}{2\kappa T}+2\theta(v_0-\theta)\frac{1-e^{-\kappa T}}{\kappa T}+\theta^2+\frac{\eta^2}{2\kappa},$$
where \( \eta \) is the vol of vol.
When \( T \to \infty \), we have \( V(T) \to \theta^2 + \frac{\eta^2}{2\kappa} \).
Unless \( \kappa \) is very large, or the vol of vol is very small, the second term \( \frac{\eta^2}{2\kappa} \) will often dominate. In plain words, the prices of long-term variance swaps are almost purely dictated by the vol of vol when the speed of mean reversion is lower than 1.0.
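A small numerical illustration of the formula above, with made-up parameters (the point being the dominance of \( \frac{\eta^2}{2\kappa} \) over \( \theta^2 \)):

function vswap_schobel_zhu(T; v0, θ, κ, η)
    ((v0 - θ)^2 - η^2 / (2κ)) * (1 - exp(-2κ * T)) / (2κ * T) +
        2θ * (v0 - θ) * (1 - exp(-κ * T)) / (κ * T) + θ^2 + η^2 / (2κ)
end

vswap_schobel_zhu(10.0; v0 = 0.2, θ = 0.2, κ = 0.5, η = 0.3) # long-term limit: θ² = 0.04 vs η²/(2κ) = 0.09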
Below is an example of a fit to the market. If the vol of vol and kappa are exogenous and we calibrate only the initial vol v0 and the long-term vol theta to the term-structure of variance swaps, then we end up with upward shapes for the term-structure, regardless of theta. Only when we add the vol of vol to the calibration do we find a reasonable solution. The solution is however not really better than the one corresponding to the Heston fit with only 2 parameters. It thus looks overparameterized.
There are some cases where the Schobel-Zhu model allows for a better fit than Heston, and makes use of the flexibility due to the vol of vol.
It is awkward that the variance swap term-structure depends on the vol of vol in the Schobel-Zhu model.
In the previous post, I presented a new stochastic expansion for the prices of Asian options. The stochastic expansion is generalized to basket options in the paper, and can thus be applied to the problem of pricing vanilla options with cash dividends.
I have updated the paper with comparisons to more direct stochastic expansions for pricing vanilla options with cash dividends, such as the one of Etoré and Gobet, and my own refinement on it.
The second and third order basket expansions turn out to be significantly more accurate than the previous, more direct stochastic expansions, even on extreme examples.
Recently, I had the idea of applying the same stochastic expansion technique to the prices of Asian options, and more generally to the prices of vanilla basket options. It works surprisingly well for Asian options. A main advantage is that the expansion is straightforward to implement: there is no numerical solving or numerical quadrature needed.
It works a little bit less well for basket options. Even though I found a better proxy for those, the expansions behave less well with large volatility (really, large total variance), regardless of the proxy. I notice now that this was also true for the case of discrete dividends, where the accuracy clearly deteriorates somewhat significantly in “extreme” examples such as a vol of 80% for an option maturity of 1 year (see Figure 3).
I did not compare the direct stochastic expansion for discrete dividends with the one going through the vanilla basket expansion; maybe in a next post.
In the Black-Scholes model with a term-structure of volatilities, the Log-Euler Monte-Carlo scheme is not necessarily exact.
This happens if you have two assets \(S_1\) and \(S_2\), with two different time varying volatilities \(\sigma_1(t), \sigma_2(t) \). The covariance from the Ito isometry from \(t=t_0\) to \(t=t_1\) reads $$ \int_{t_0}^{t_1} \sigma_1(s)\sigma_2(s) \rho ds, $$ while a naive log-Euler discretization may use
$$ \rho \bar\sigma_1(t_0) \bar\sigma_2(t_0) (t_1-t_0). $$
In practice, the \( \bar\sigma_i(t_0) \) are calibrated such that the vanilla option prices are exact, meaning
$$ \bar{\sigma}_i^2(t_0)(t_1-t_0) = \int_{t_0}^{t_1} \sigma_i^2(s) ds.$$
As such, the covariance of the log-Euler scheme does not match the covariance from the Ito isometry unless the volatilities are constant in time. This means that the prices of European vanilla basket options are going to be slightly off, even though they are not path-dependent.
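A small numerical illustration with made-up piecewise-constant volatilities on \( [t_0, t_1] \) (the names and values are purely illustrative):

t0, t1, ρ = 0.0, 1.0, 0.5
σ1(t) = t < 0.5 ? 0.1 : 0.4 # hypothetical volatility term-structures
σ2(t) = t < 0.5 ? 0.3 : 0.2
times = [0.0, 0.5, 1.0]
dt = diff(times)
exact = ρ * sum(σ1.(times[1:end-1]) .* σ2.(times[1:end-1]) .* dt) # Ito isometry: 0.0275
σ1bar = sqrt(sum(σ1.(times[1:end-1]) .^ 2 .* dt) / (t1 - t0)) # calibrated to vanilla options
σ2bar = sqrt(sum(σ2.(times[1:end-1]) .^ 2 .* dt) / (t1 - t0))
naive = ρ * σ1bar * σ2bar * (t1 - t0) # naive log-Euler covariance: ≈ 0.0372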
The fix is of course to use the true covariance matrix between \(t_0\) and \(t_1\).
The issue may also happen if instead of using the square root of the covariance matrix in the Monte-Carlo discretization, the square root of the correlation matrix is used.
In my previous blog post, I looked at the roughness of the SVCJ stochastic volatility model with jumps (in the volatility). In this model, the jumps occur randomly, but at discrete times. And with the typical parameters used in the literature, the jumps are not so frequent. It is thus more interesting to look at the roughness of pure jump processes, such as the CGMY process.
The CGMY process is more challenging to simulate. I used the approach based on the characteristic function described in Simulating Levy Processes from Their Characteristic Functions and Financial Applications. Ballotta and Kyriakou add some variations based on FFT pricing of the characteristic function in Monte Carlo simulation of the CGMY process and option pricing, and take much care about a proper truncation range. Indeed, I found that the truncation range was key to simulate the process properly and not always trivial to set up, especially for \(Y \in (0,1) \). I however did not implement any automated range guess, as I am merely interested in very specific use cases, and I used the COS method instead of FFT.
I also directly analyze the roughness of a sub-sampled asset path (the exponential of the CGMY process), and not of some volatility path, as I was curious whether pure jump processes would mislead roughness estimators. I simulated the paths corresponding to parameters given in Fang's thesis The COS Method - An Efficient Fourier Method for Pricing Financial Derivatives: C = 1, G = 5, M = 5, Y = 0.5 or Y = 1.5.
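For reference, the characteristic function driving such a simulation has a standard closed form (a sketch; cgmy_cf is an illustrative name):

using SpecialFunctions # gamma

# E[exp(iu X_t)] for the CGMY process without drift.
cgmy_cf(u, t; C, G, M, Y) =
    exp(t * C * gamma(-Y) * ((M - im * u)^Y - M^Y + (G + im * u)^Y - G^Y))

cgmy_cf(1.0, 1.0; C = 1.0, G = 5.0, M = 5.0, Y = 0.5) # parameters from Fang's thesis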
And the corresponding Hurst index estimate via the Cont-Das method:
Similarly the Gatheral-Rosenbaum estimate has no issue finding H=0.5.
I was wondering whether adding jumps to stochastic volatility, as is done in the SVCJ model of Duffie, Pan and Singleton “Transform Analysis and Asset Pricing for Affine Jump-Diffusions”, and also in Broadie and Kaya “Exact simulation of stochastic volatility and other affine jump diffusion processes”, would lead to rougher paths, or whether it would mislead the roughness estimators.
The answer to the first question can almost be answered visually:
The parameters used are the ones from Broadie and Kaya: v0=0.007569, kappa=3.46, theta=0.008, rho=-0.82, sigma=0.14 (Heston), jump correlation -0.38, jump intensity 0.47, jump vol 0.0001, jump mean 0.05, jump drift -0.1.
The Rough Heston with H=0.1 is much “noisier”. There is no apparent difference between SVCJ and Heston in the path of the variance.
The estimator of Cont and Das (on a subsampled path) leads to a Hurst exponent H=0.503, in line with a standard Brownian motion.
The estimator from Rosenbaum and Gatheral leads to a Hurst exponent (slope of the regression) H=0.520 with well behaved regressions:
On this example, there are relatively few jumps during the 1 year duration. If we multiply the jump intensity by 1000 and reduce the jump mean accordingly, the conclusions are the same. Jumps and roughness are fundamentally different.
Of course, this does not mean that the short-term realized volatility does not look rough, as evidenced in the Cont and Das paper:
I computed the realized volatility on a subsampled path, using disjoint windows of 1h length.
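The windowing can be sketched as follows (realized_vol and its arguments are illustrative names; dt is the sampling interval in years):

function realized_vol(logprices, obs_per_window, dt)
    r = diff(logprices) # log-returns on the sampling grid
    nwindows = length(r) ÷ obs_per_window
    map(1:nwindows) do k
        window = r[(k - 1) * obs_per_window + 1 : k * obs_per_window]
        sqrt(sum(abs2, window) / (obs_per_window * dt)) # annualized realized vol per window
    end
end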
It is not really rough; estimators will have a tough time producing stable estimates on it.
This is very visible with the Rosenbaum-Gatheral way of estimating H, we see that the observations do not fall on a line at all but flatten:
The pure Heston model leads to similar observations.