Measuring Roughness with Julia

I received a few e-mails asking me for the code I used to measure roughness in my preprint on the roughness of the implied volatility. Unfortunately, the code I wrote for this paper is not in a good state: it is all in one long file, line by line, not necessarily in order of execution, with comments that are only meaningful to me.

In this post I will present the code relevant to measuring the roughness of the Oxford-Man Institute realized volatility with Julia. I won’t go into generating Heston or rough volatility model implied volatilities, and will focus only on measuring roughness from some CSV-like input. I downloaded oxfordmanrealizedvolatilityindices.csv from the Oxford-Man Institute website (unfortunately now discontinued, the data was bought by Refinitiv, but the file is still available in some github repos) to my home directory

using DataFrames, CSV, Statistics, Plots, StatsPlots, Dates, TimeZones
df = DataFrame(CSV.File("/home/fabien/Downloads/oxfordmanrealizedvolatilityindices.csv"))
df1 = df[df.Symbol .== ".SPX", :]   # keep only the S&P 500 rows
# time in days since the first observation; the unnamed date column is read as Column1
tm = [abs((Date(ZonedDateTime(String(d),"y-m-d H:M:S+z"))-Date(ZonedDateTime(String(df1.Column1[1]),"y-m-d H:M:S+z"))).value) for d in df1.Column1[:]];
ivm = df1.rv5[:]                    # 5-minute realized variance estimates
using Roots, Statistics
# Normalized p-th variation statistic of the log-volatility over a window,
# used to estimate the roughness index as in "Rough Volatility: Fact or Artefact?".
function wStatA(ts, vs, K0, L, step, p)
    bvs = vs # big.(vs) if extended precision is needed
    bts = ts # big.(ts)
    value = sum(abs(log(bvs[k+K0]) - log(bvs[k]))^p / sum(abs(log(bvs[l+step]) - log(bvs[l]))^p for l = k:step:k+K0-step) * abs(bts[k+K0] - bts[k]) for k = 1:K0:L-K0+1)
    return value
end
# Sliding-window estimate of the roughness index H = 1/p, where p solves W(p) = T on each window.
# Windows where no root is bracketed are marked with 0 and excluded from the mean.
function meanRoughness(tm, ivm, K0, L)
    cm = zeros(length(tm) - L)
    for i = 1:length(cm)
        local ivi = ivm[i:i+L]
        local ti = tm[i:i+L]
        T = abs(ti[end] - ti[1])
        try
            cm[i] = 1.0 / find_zero(p -> wStatA(ti, ivi, K0, L, 1, p) - T, (1.0, 100.0))
        catch e
            if isa(e, ArgumentError)
                cm[i] = 0.0
            else
                throw(e)
            end
        end
    end
    meanValue = mean(filter(x -> x > 0, cm))
    stdValue = std(filter(x -> x > 0, cm))
    return meanValue, stdValue, cm
end
K0 = 70   # window parameters used in the post: K0 and L = K0^2
meanValue, stdValue, cm = meanRoughness(tm, ivm, K0, K0^2)
density(cm, label="H", ylabel="Density")
The last plot, a density estimate of H over the sliding windows, may look slightly different depending on the date (and thus the number of observations) of the CSV file (the ones I found on github are not as recent as the one I used initially in the paper).

Black with Bachelier

I was experimenting with the recent SABR basket approximation of Hagan. The approximation only works for the normal SABR model, meaning beta=0 in SABR combined with the Bachelier option formula.

I was wondering how good the approximation would be for two flat smiles (in terms of Black volatilities). I then noticed something that escaped me before: the normal SABR model is able to fit the pure Black model (with constant vols) extremely well. A calibration near the money stays valid very out-of-the-money and the error in Black volatilities is very small.

For very low strikes (25%), the error in vol is below one basis point. In fact, for a 1-year option with a 20% vol, the option value at such strikes is extremely small: we are in deep extrapolation territory.

It is remarkable that a Bachelier smile formula with 3 free parameters can fit the Black vols so well. Compared to stochastic collocation on a cubic polynomial (also 3 free parameters), the fit is 100x better with the normal SABR.

Error in fit against 10 vanilla options of maturity 1 year and volatility 20%, within the strike range [0.89, 1.5].
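
For reference, the normal SABR (beta = 0) Bachelier volatility in Hagan's approximation takes only a few lines of Julia. This is just a sketch of the formula used for such a comparison, and the function name is mine:

    # Hagan's approximation of the Bachelier implied volatility for the normal SABR model (beta = 0)
    function normalSabrVol(alpha, rho, nu, F, K, T)
        zeta = nu / alpha * (F - K)
        xhat = log((sqrt(1 - 2 * rho * zeta + zeta^2) + zeta - rho) / (1 - rho))
        factor = abs(zeta) < 1e-8 ? 1.0 : zeta / xhat   # at-the-money limit zeta -> 0
        return alpha * factor * (1 + (2 - 3 * rho^2) / 24 * nu^2 * T)
    end

Calibrating (alpha, rho, nu) near the money and converting the resulting Bachelier vols back to Black vols produces the kind of error plot shown above.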

Clenshaw-Curtis Quadrature Implementation by FFT in Practice

The Clenshaw-Curtis quadrature is known to be competitive with Gauss quadratures. It has several advantages:

  • the weights are easy and fast to compute.
  • adaptive / doubling quadratures are possible when the Chebyshev polynomial of the second kind is used for the quadrature.
  • the Chebyshev nodes may also be used to interpolate some costly function.

The wikipedia article has a relatively detailed description of how to compute the quadrature weights corresponding to the Chebyshev polynomial of the second kind (where the points -1 and 1 are included), via a type-I DCT. It does not describe the weights corresponding to the Chebyshev polynomials of the first kind (where -1 and 1 are excluded, like the Gauss quadratures). The numbersandshapes blog post describes it very nicely. There are some publications around the computation of Clenshaw-Curtis or Fejér rules; a recent one is Fast construction of Fejér and Clenshaw–Curtis rules for general weight functions.

I was looking for a simple implementation of the quadrature or its weights, leveraging some popular FFT library such as fftw (the Fastest Fourier Transform in the West). I expected it to be a simple search. Instead, it was surprisingly difficult to find, even though the code in Julia consists only of a few lines:

  • First kind (following the numbersandshapes blog post)

    using FFTW                        # provides idct and ifft used below
    N = 16                            # number of quadrature points, for example
    chebnodes(T, n::Int) = @. (cos(((2:2:2n) - 1) * T(pi) / 2n))
    x = chebnodes(Float64, N)
    m = vcat(sqrt(2), 0.0, [(1 + (-1)^k) / (1 - k^2) for k = 2:N-1])
    ws = sqrt(2 / N) * idct(m)

  • Second kind (the classic Clenshaw-Curtis, following some article from Trefethen)

    cheb2nodes(T, n::Int) = @. (cos(((0:n-1)) * T(pi) / (n-1)))
    x = cheb2nodes(Float64, N)
    c = vcat(2, [2 / (1 - i^2) for i = 2:2:(N-1)]) # Standard Chebyshev moments
    c = vcat(c, c[Int(floor(N / 2)):-1:2])         # Mirror for DCT via FFT 
    ws = real(ifft(c))                             # Interior weights
    ws[1] /= 2
    ws = vcat(ws, ws[1])                           # Boundary weights
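
A quick sanity check, assuming FFTW is loaded as above: apply the last computed nodes and weights to a smooth integrand and compare against the closed-form integral.

    f(x) = exp(x)
    exact = exp(1) - exp(-1)             # integral of exp over [-1, 1]
    println(sum(ws .* f.(x)) - exact)    # quadrature error of the rule built above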

There are some obvious possible improvements: the nodes are symmetric and only need to be computed up to N/2, but the above is quite fast already. The Julia package FastTransforms.jl provides the quadrature weights although the API is not all that intuitive:

  • First kind:
     fejer1weights(FastTransforms.chebyshevmoments1(Float64,N))
  • Second kind:
     clenshawcurtisweights(FastTransforms.chebyshevmoments1(Float64,N))

Interestingly, the first kind is twice as fast with FastTransforms, which suggests a similar use of symmetry as for the second kind above. But the second kind is nearly twice as slow.

Although Clenshaw-Curtis quadratures are appealing, the Gauss-Legendre quadrature is often slightly more accurate in many practical use cases, and there are also fast enough ways to compute its weights. For example, in the context of a two-asset basket option price with a vol of 20%, strike = spot = 100, maturity 1 year, and various correlations, we obtain the error plots below.

Correlation = -90%, -50%, 50%, 90% from top to bottom.
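
For completeness, the Gauss-Legendre nodes and weights are just as easy to obtain; this sketch assumes the FastGaussQuadrature.jl package.

    using FastGaussQuadrature
    xg, wg = gausslegendre(16)                           # nodes and weights on [-1, 1]
    println(sum(wg .* exp.(xg)) - (exp(1) - exp(-1)))    # error on the same smooth integrand as above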

Ghost Vacations

During my vacation, I don’t know why, but I looked at a stability issue with ghost points and the explicit method. I was initially trying out ghost points with the explicit Runge-Kutta Chebyshev/Legendre/Gegenbauer technique and noticed explosions in some cases.

I narrowed it down to a stability issue of the standard explicit Euler method with ghost (or fictitious) points. The technique is described in the book “Paul Wilmott on Quantitative Finance” (also in Paul Wilmott Introduces Quantitative Finance), which I find quite good, although I have some friends who are not very fond of it. The technique may be used to compute the price of a continuously monitored barrier option when the barrier does not fall on the grid, or more generally for time-dependent barriers. I however look at it in the simpler context of a barrier that is constant in time.

The ghost point technique (from Paul Wilmott on Quantitative Finance).
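
To make the setup concrete, here is a minimal sketch (not the code from the preprint) of the explicit Euler scheme for the heat equation with a lower barrier B that falls between the first two grid nodes; the value at the first node, the ghost point, is obtained by linear extrapolation through (B, 0):

    # One-dimensional heat equation u_t = u_xx, with u(B) = 0 enforced through a ghost point.
    function explicitEulerGhost!(u, x, B, dt, nsteps)
        h = x[2] - x[1]
        unew = similar(u)
        for _ = 1:nsteps
            u[1] = u[2] * (x[1] - B) / (x[2] - B)      # ghost value: linear extrapolation through (B, 0)
            for i = 2:length(u)-1
                unew[i] = u[i] + dt / h^2 * (u[i-1] - 2u[i] + u[i+1])
            end
            u[2:end-1] .= unew[2:end-1]                # boundary values are kept fixed
        end
        return u
    end
    x = collect(range(0.0, 1.0, length = 101))         # spacing h = 0.01
    u = explicitEulerGhost!(max.(x .- 0.5, 0.0), x, 0.004, 4e-5, 2500)   # barrier B = 0.004, off the grid

Moving B between x[1] and x[2] in this sketch is a simple way to experiment with the instability: the closer B gets to x[2], the larger the extrapolation coefficient, and the smaller the time-step needs to be to keep the scheme stable.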

I found out that the explicit Euler scheme is unstable if we let the grid boundaries be arbitrary. In practice, for more exotic derivatives, the grid upper bound will typically be based on a number of standard deviations from the spot or from the strike price. The spot price moves continually, and the strike price is fairly flexible for OTC options. So the upper bound can really be anything, and the explicit Euler scheme may require way too many time-steps to be practical with the ghost point technique. This is all described in the preprint Instabilities of explicit finite difference schemes with ghost points on the diffusion equation.

What does this mean?

The ghost point technique is not appropriate for the explicit Euler scheme (and thus for explicit Runge-Kutta schemes as well), unless the ghost point is fixed to lie in the middle of two grid nodes, or on the grid, one space-step outside. This means the grid needs to be set up in a particular fashion. But if we need to set up the grid such that the barrier falls exactly somewhere, then, for a constant barrier option, no ghost point is needed: one just needs to place the barrier on the grid and use a Dirichlet boundary condition.

Explicit scheme requires a very large number of time-steps.

Crank-Nicolson oscillations near maturity with the ghost point.

Maximum Implied Variance Slope

The paper The Moment Formula for Implied Volatility at Extreme Strikes by Roger Lee redefined how practitioners extrapolate the implied volatility, by showing that the total implied variance can be at most linear in the wings, with a slope of at most 2.

Shortly after, the SVI model of Jim Gatheral, with its linear wings, started to become popular.

In a recent paper in collaboration with Winfried Koller, we show that the asymptotic bounds are usually overly optimistic. This is somewhat expected, as the bound only holds as the log-moneyness goes to infinity. What is less expected is that, even at very large strikes (nearly up to the limit of what SVI allows numerically), we may still have arbitrages, even though the bounds are respected.

In practice, this manifests itself as call option prices that increase with the strike far away in the wings (increasing call prices are an arbitrage opportunity), which could be problematic for any kind of replication-based pricing, such as variance swap pricing.
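
To make the check concrete, here is a small sketch with made-up SVI parameters (not the calibrated ones from the paper): it computes undiscounted Black call prices in the far wing from the SVI total variance and tests whether they ever increase with the strike.

    using Distributions                      # for the normal CDF
    # SVI total variance at log-moneyness k; Lee's bound requires b * (1 + abs(rho)) <= 2
    svi(k; a, b, rho, m, sig) = a + b * (rho * (k - m) + sqrt((k - m)^2 + sig^2))
    function blackCall(k, w)                 # undiscounted call, forward = 1, total variance w
        d1 = -k / sqrt(w) + sqrt(w) / 2
        return cdf(Normal(), d1) - exp(k) * cdf(Normal(), d1 - sqrt(w))
    end
    k = 10.0:0.5:20.0                        # far out-of-the-money wing
    c = blackCall.(k, svi.(k; a = 0.01, b = 0.4, rho = -0.3, m = 0.0, sig = 0.1))
    any(diff(c) .> 0)                        # true would flag increasing call prices, i.e. an arbitrage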

Increasing call prices for various calibrated SVI parameters respecting the asymptotic boundary constraints.

The paper on arxiv.

The Return of the Arbitrage in the Perfect Volatility Surface

In a Wilmott article from 2018 (Wilmott magazine no. 97) titled “Arbitrage in the perfect volatility surface”, Uwe Wystup points out some interesting issues on seemingly innocuous FX volatility surfaces:

  • a cubic spline tends to produce artificial peaks/modes in the density.
  • SVI is not arbitrage-free even on seemingly trivial input.

The examples provided are indeed great and the remarks very valid. There is more to it however:

  • a cubic spline on strikes or log-moneyness does not produce the artificial peak.
  • SVI with a=0 is arbitrage-free on this example.

For the first point, the denominator of the Dupire formula expressed in terms of the total variance as a function of log-moneyness gives some clues, as it constitutes a vega-scaled version of the probability density, with a direct link to the total variance and its derivatives. In particular, it is a simple function of the value, first and second derivatives of the total variance, without involving any non-linear function, and the second derivative only appears as a linear term. As such, a low-order polynomial representation of the variance in log-moneyness may be adequate. In contrast, the delta-based representation introduces a strong non-linearity through the cumulative normal distribution function.
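
For reference, the factor in question (Gatheral's g function, i.e. the Dupire denominator written in total-variance form) only involves the total variance w at log-moneyness y and its first two derivatives wp and wpp, with the second derivative appearing linearly; this is a direct transcription of the well-known formula:

    g(y, w, wp, wpp) = (1 - y * wp / (2 * w))^2 - wp^2 / 4 * (1 / w + 1 / 4) + wpp / 2

Its non-negativity is essentially the butterfly no-arbitrage condition.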

Splines on AUD/NZD 1w options.

For the second point, it is more of a curiosity. Everybody knows that SVI is not always arbitrage-free, and examples of arbitrages abound. Finding such a minimal example is nice. I noticed that many of the real-life butterfly spread arbitrages are avoided if we follow Zeliade’s recommendation to set a >= 0, where a is the base ordinate parameter of SVI. Yes, the fit is slightly worse, but the increased calibration robustness and the decrease in the number of smiles with arbitrage are probably worth it.

SVI on AUD/NZD 1w options.

Even if I am no particular fan of SVI or of the cubic spline to represent implied volatilities, the subject merited some clarifications.

Extract of the Wilmott article.

There is a third point which I found quite surprising. The axes are inconsistent between the first two plots of Figure 1. 10 delta means that we are out of the money. For a call, the strike is thus larger than the forward, and the corresponding log-moneyness must be positive. But on the figure, the 10-delta call volatility of the first plot does not correspond to the highest log-moneyness point of the second plot. One of the axes is reversed. This is particularly surprising for an expert like Uwe.

Ooohhhh!

It is reassuring that this kind of mistake also happens to the best. Every FX quant has Uwe Wystup’s book, next to the Clark and the Castagna.

Princeton Fintech and Quant conference of December 2022

I recently presented my latest published paper On the Bachelier implied volatility at extreme strikes at the Princeton Fintech and Quant conference. The presenters came from quite varied backgrounds. The first presentations were much more business oriented, with lots of AI keywords but relatively little technical content, while the last presentation was about parallel programming. Many were a pitch to recruit employees.

The diversity was interesting: it was refreshing to hear about quantitative finance from vastly different perspectives. The presentation from Morgan Stanley about their Scala annotation framework to ease parallel programming was enlightening. The main issue they were trying to solve is the necessity for boilerplate code to handle concurrency, caching and robustness, which significantly obfuscates the business logic in the code. This is an old problem. Decades ago, rule engines were the trend for similar reasons. Using the example of pricing bonds, the presenters put the issues forward very well, issues that I found very relevant.

While writing my own presentation, I realized I had not tried to plot the Bachelier implied volatility fractals. The one corresponding to the Householder method is pretty, and not too far off from the Black-Scholes case, possibly because the first, second and third derivatives with respect to the vol are similar between the two models.

Iterations of the Householder method on the problem of solving the Bachelier implied volatility for a given strike and maturity. Each point corresponds to an initial guess in the complex plane.

Of course, this is purely for fun, as it has no practical importance: we can invert the Bachelier price with near machine accuracy through a fast analytic formula.
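
For reference, a minimal sketch of the Householder iteration in question (of order 3, using the price and its first three derivatives with respect to the vol); this is not the analytic inversion formula mentioned above, and the function names are mine:

    using Distributions
    function bachelierPrice(sigma, F, K, T)
        d = (F - K) / (sigma * sqrt(T))
        return (F - K) * cdf(Normal(), d) + sigma * sqrt(T) * pdf(Normal(), d)
    end
    # One Householder step towards the vol sigma such that bachelierPrice(sigma, F, K, T) = price.
    function householderStep(sigma, price, F, K, T)
        d = (F - K) / (sigma * sqrt(T))
        f = bachelierPrice(sigma, F, K, T) - price
        fp = sqrt(T) * pdf(Normal(), d)              # vega
        fpp = fp * d^2 / sigma                       # second derivative in sigma
        fppp = fp * d^2 * (d^2 - 3) / sigma^2        # third derivative in sigma
        h = -f / fp
        return sigma + h * (1 + h * fpp / (2 * fp)) / (1 + h * fpp / fp + h^2 * fppp / (6 * fp))
    end

The fractal above comes from applying this type of iteration to initial guesses in the complex plane.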

Slides of the presentation.

Desktop Linux in 2022

I have been a long-time user of Fedora at work and have been quite happy with it. Around 6 months ago, I moved to Manjaro under VirtualBox in a Windows host, because the company imposes the use of a VPN client that does not run on Linux. It’s much less great, for several reasons:

  • VirtualBox makes everything graphics-related much slower. It is still way better than WSL2, which I found to be a poor man’s Linux, where graphics goes through some half-broken X11 client for Windows. With VirtualBox, programming is ok; watching videos is not (you need to use the host for that). Something relatively neat is that the VM can be run on any other host, including a Linux host, by just copying the VM file.
  • I picked Manjaro out of curiosity, and because it was perhaps faster/more optimized than Fedora. I don’t have too many issues with it. But I found out that AUR packages would frequently break on upgrades. Furthermore, it is updated less frequently than Fedora, which is a bit surprising for a rolling release. I also installed it with ext4. With hindsight, I think both choices were poor: Fedora with btrfs is the way to go, or some Ubuntu variant.
  • Work also imposes various antivirus programs, which tend to always keep the CPU busy.

Overall, I am still much more productive than with raw Windows, which is surprisingly slow for many software-development-related tasks. Having to deal more with Windows showed me how slow the system is, and how poor the experience is. Before using it again (after many years completely off it), I had started thinking it was good (well, it does crash much less than back in the day).

I otherwise use Fedora with either KDE or Gnome on my personal desktop and laptop, with very few issues overall. Gnome seems much better since version 42, even if, overall, Gnome Shell has not been the most stable in the past. Steam runs great. I also encountered some grub bootloader bug recently, which disappeared once I upgraded my BIOS.

Roughness of the Implied Volatility

This is a follow-up of my previous post on rough volatility. I recently tried to reproduce the results of the paper Rough Volatility: Fact or Artefact? as I was curious to apply the technique to different inputs. The 5-minute SPX realized volatility is freely available in CSV format at the Oxford-Man Institute of Quantitative Finance, and it is thus relatively straightforward to reproduce the numbers presented in the paper.

Using a sampling of K=75 and L=75x75, I obtain a roughness index H=0.181. The paper uses K=70 and L=70x70, and Figure 19 of the paper states 0.187. It turns out that there are more than L observations in the time-series, and, with K=70, I end up with a roughness index H=0.222 when I start from the first observation (year 2000) and use observations up to index L+1. But it is possible to slide this window and compute the roughness index at each starting point. The results are enlightening.

SPX historical vol roughness index sliding estimate with K=70.

Density of the SPX historical vol roughness index estimate with K=70. The mean is 0.158 with a standard deviation of 0.034.

The mean estimated roughness is 0.158 with a standard deviation of 0.034. The results of the paper are within one standard deviation. But this shows it may be dangerous to use a single window for the estimate. If we use a smaller window size K=50, L=50x50, we can see regimes of roughness in time.

SPX roughness index sliding estimate with K=50.

Density of the SPX roughness index estimate with K=50. The mean is 0.145 with a standard deviation of 0.048.

I also double-checked that the SPX closing prices resulted in an estimate H=0.5, for the two window sizes. The SPX price process follows the standard Brownian motion statistics.
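
This check reuses df1, tm and the meanRoughness function from the code in the first post above; I assume here that the closing price column of the Oxford-Man CSV is named close_price.

    pm = df1.close_price[:]                                      # SPX closing prices (assumed column name)
    meanValue, stdValue, cm = meanRoughness(tm, pm, 70, 70^2)    # H should come out close to 0.5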

The paper explains that using the 5-minute realized volatility is not necessarily a good proxy to measure the true underlying instant volatility roughness. I then wondered what happens if we use the daily quotes of the short-term at-the-money (ATM) implied volatility as a proxy. Indeed, in a given stochastic volatility model, the short-term ATM implied volatility is approximately the instant volatility of the stochastic volatility process. It turns out that there is some freely available data from the Nasdaq for MSFT, but only for the period 2015-01-02 to 2018-12-31 (four years). This constitutes 1006 samples. For the 30-day implied volatility, with K=28, the data implies a roughness index close to 0.5 (0.448).

Density of the MSFT 30-day implied volatility roughness index estimate. The mean is 0.448 with a standard deviation of 0.059.

Density of the MSFT 10-day implied volatility roughness index estimate. The mean is 0.350 with a standard deviation of 0.049.

Roughness is apparent in the 10-day implied volatility, with H=0.35. For the 60-day implied vol, I measure a roughness H=0.41 with a standard deviation of 0.06, so again not significantly different from the standard Brownian motion roughness. The roughness of the implied volatility skew, also available in this data, is more pronounced.

Another interesting analysis is the roughness of the VIX index, which can be mapped to the price of a newly issued 30-day variance swap on the SPX index. I am actually slightly surprised that the various papers on rough volatility have not looked at it already. I just took the quotes from Yahoo. With K=51, I find H=0.347 with a standard deviation of 0.026. This is close to the 10-day implied volatility roughness measured previously.

VIX roughness index sliding estimate with K=51.

Density of the VIX roughness index estimate with K=51. The mean is 0.347 with a standard deviation of 0.026.

What about the Heston model? What is the estimate of the roughness index on a simulated stochastic variance path? Using one time-step per day for 5 years, and K=25, we find H=0.505 with a standard deviation of 0.032 for the roughness of the instantaneous variance, as well as for the one-month at-the-money implied volatility (using v(t) and constant Heston parameters). This is, as expected, the standard roughness of the underlying Brownian motion.
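
A minimal sketch of this experiment, with made-up Heston parameters and a full-truncation Euler discretization, reusing the meanRoughness estimator from the first post:

    using Random
    # Simulate the Heston instantaneous variance dv = kappa*(theta - v)*dt + sigma*sqrt(v)*dW
    function hestonVariancePath(v0, kappa, theta, sigma, dt, n; rng = MersenneTwister(42))
        v = zeros(n); v[1] = v0
        for i = 2:n
            vp = max(v[i-1], 0.0)                     # full truncation
            v[i] = v[i-1] + kappa * (theta - vp) * dt + sigma * sqrt(vp * dt) * randn(rng)
        end
        return v
    end
    n = 5 * 252                                       # one step per day for 5 years
    v = hestonVariancePath(0.04, 1.0, 0.04, 0.4, 1.0 / 252, n)
    meanValue, stdValue, cm = meanRoughness(collect(1:n), max.(v, 1e-12), 25, 25^2)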

To conclude, the Hurst index H of the implied volatility and of the VIX index appears to be not as low as that of the 5-minute historical volatility. The estimated roughness H=0.35 is much closer to the standard Brownian motion roughness. In fact, for the 30-day implied volatility (with daily quotes), the measure is even closer to 0.5.

Looking at the 5-minute historical volatility may make sense to price intraday options, but it is not clear how important it is for traditional derivatives pricing (with maturities longer than a week). Does the historical micro-structure really matter there?

After writing all that, I found the Livieri et al. (2017) paper titled Rough volatility: evidence from option prices, which also measures the roughness of implied volatilities. Interestingly, I did not reproduce some of the claims in that paper: I find a large bias in the implied volatility proxy even for a low H value. The large bias, combined with the uncertainty in the measurement, makes the implied volatility proxy not very practical for estimating the true Hurst index of the underlying volatility process. Another discrepancy with the paper is around the Heston model: I do not find any apparent roughness at all in the Heston model.

I added this post, with more details, as a note on arxiv, just in case someone needs to reference it. I intend to place some of the Julia code I wrote in the context of the paper on github, although I have not found the time yet.

Volatility: Rough or Not? A Short Review

It is well-known that the assumption of constant volatility in the Black-Scholes model for pricing financial contracts is wrong and may lead to serious mispricing, especially for any exotic derivative contracts. A classic approach is to use a deterministic local volatility model to take into account the variation both in the time dimension and in the underlying asset price dimension. But the model is still not realistic in terms of forward smile (the implied volatilities of forward starting options). A stochastic volatility component must be added to correct for it. More recently, the concept of rough volatility emerged in many academic papers. Instead of using a classic Brownian motion for the stochastic volatility process, a fractional Brownian motion is used. The idea of using a fractional Brownian motion for financial time-series can be traced back to Mandelbrot, but it is only relatively recently that it has been proposed for the volatility process (and not the stock process).

After many published papers on the subject, Rama Cont and Purba Das recently added their preprint Rough Volatility: Fact or Artefact? on SSRN, raising the question of whether using a rough volatility process makes sense from a practical point of view. Indeed, they show that the measured Hurst index is different for the realized volatility (even based on very short sampling intervals) and for the instantaneous volatility. The authors make the conjecture that it is all due to the discretization error (a.k.a. the microstructure noise), and that, in fact, a standard Brownian motion is compatible with the low Hurst index of the realized volatility.

As mentioned in their conclusion, they are not alone in suggesting that the rough volatility assumption may not be justified: L.C.G. Rogers also published a note in 2019, Things we think we know, with the same conclusion.

In a 2019 paper, Masaaki Fukasawa, Tetsuya Takabatake and Rebecca Westphal also attempt to answer the question Is volatility rough?. They point out the same problem with the estimation method of Gatheral and Rosenbaum, namely that their measure may actually not estimate the true Hurst index, which could be why it is always close to H=0.1, regardless of the market index. It is shown that, for specific models, the Gatheral and Rosenbaum measure depends on the period length considered (5 minutes vs. 1 minute lead to different Hurst indices). The authors therefore devise a more robust (and much more complex) way to estimate the Hurst index of time series, which is not so sensitive to the period length considered. They find that the market volatility, measured through the realized variance proxy, is rough, with H=0.04, rougher than what Jim Gatheral suggested.

To conclude, the study of Gatheral and Rosenbaum, which motivated the use of rough volatility, was based on “biased” estimates of the Hurst index of the realized volatility of high-frequency data, as evidenced in the three papers cited above. Clearly then, this justification does not hold. But Fukasawa et al. suggest the volatility is even rougher, and it is slightly surprising that Rama Cont and Purba Das do not take those results more into account. They discuss them briefly, but then go on to propose another biased estimate (one which seems to lead to the same results as Gatheral and Rosenbaum). To their credit, Rama Cont and Purba Das show that the microstructure noise is far from IID, while Fukasawa et al. assume it is close to IID in their method. Perhaps their indirect conclusion is that the Fukasawa et al. estimates are also biased (i.e. measuring something else), although the results presented in their paper seem solid.

Now, there are also other interesting features of rough volatility models, such as the variation of the at-the-money skew in time, which mimics the market implied skews in practice, a feature not easily achieved by standard stochastic volatility models (this typically requires a multidimensional stochastic volatility process to approach it crudely).
