Where is the S&P 500 going to end?

Yesterday, American stocks went a bit crazy, along with the VIX, which jumped from 17.50 to 38. It’s not exactly clear why: the news mentions that the Fed might raise its interest rates, bond yields have recently been increasing substantially, and the market may simply be correcting itself after stocks grew steadily for months in a low-VIX environment.

I don’t exactly follow the SPX/SPW options daily, but I had taken a snapshot two weeks ago, when the market was quiet. We can imply the probability density from the market option prices. It’s not an exact science; here I do this with a least-squares spline on the implied volatilities (the least squares smooths out the noise). I will show another approach in a subsequent post.
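
For reference, this relies on the standard Breeden-Litzenberger relation, independently of the spline choice: once the implied volatility smile is interpolated, the call price \( C(K,T) \) is differentiated twice with respect to the strike to obtain the density $$ \varphi(K) = e^{rT} \frac{\partial^2 C}{\partial K^2}(K,T)\,, $$ where \( r \) is the interest rate for the maturity \( T \).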

probability density of the SPX implied from 1-month SPW options.

We can clearly see two bumps, and a smaller third one further away in the left tail. This means that market participants mostly expect the SPX to be around 2780 (slightly below where the SPX future was trading) one month from now. Some market participants are more cautious and believe it could hover around 2660. Finally, a few pessimists think more of 2530.

Option traders like to look at the implied volatilities (below).

implied volatilities of 1-month SPW options.

On the graph above, it’s really not clear that there is some sort of trimodal distribution. There is, however, an extremely sharp edge, which we normally see only for much shorter maturities. The vols above are interpolated with a cubic spline, not with a least-squares spline, in order to show the edge more clearly.

This is a bit reminiscent of the Brexit bets where Iain Clark shows two modes in the probability density of the GBP/USD.

Discrete Sine Transform via the FFT

Several months ago, I had a quick look at a recent paper describing how to use wavelets to price options under stochastic volatility models with a known characteristic function. The more classic method is to use some numerical quadrature directly on the Fourier integral, as described in this paper for example. When I read the paper, I was skeptical about the wavelet approach, since it looked complicated and involved many additional parameters.

I recently took a closer look, and it turns out my judgment was wrong. The additional parameters can easily, automatically, and reliably be set by very simple rules that work well (a subsequent paper by the same author, which applies the method to Bermudan options, clarifies this). The method is also not as complex as I first imagined. More importantly, the FFT makes it fast. It is quite amazing to see the power of the FFT in action: it really is because of the FFT that the Shannon wavelet method is practical.

Anyway, one of the things that needs to be computed is the payoff coefficients, and one expression is just the sum of a discrete cosine transform (DCT) and a discrete sine transform (DST). I was then wondering about a simple way to use the FFT for the sine transform. There are many papers on how to use the FFT to compute the cosine transform; a technique that is efficient and simple is the one of Makhoul.

The coefficients that need to be computed for all k can be represented by the following equation $$ V_{k} = \sum_{j=0}^{N-1} \left[ a_j \cos\left(\pi k\frac{j+\frac{1}{2}}{N}\right) + b_j \sin\left(\pi k\frac{j+\frac{1}{2}}{N}\right) \right] $$ with \( N = 2^{\bar{J}-1} \) for some positive integer \( \bar{J} \). Makhoul’s algorithm to compute the DCT of size N with one FFT of size N consists of the following steps (a Go sketch follows the list):

  • initialize the FFT coefficients \(c_j\) with: $$ c_j = a_{2j}\,, \quad c_{N-1-j} = a_{2j+1} \quad \text{for } j = 0,\ldots,\frac{N}{2}-1\,, $$
  • and then from the result of the FFT \( \hat{c} \), the DCT coefficients \( \hat{a} \) are $$ \hat{a}_k = \Re\left[ \hat{c}_k e^{-i \pi\frac{k}{2N}} \right]\,. $$

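A sketch of the corresponding Go function could look as follows (the name FastDCT2ByFFT is mine, not Makhoul’s, and it relies on the TransformRadix2Complex helper listed further below, which requires a power-of-2 length):

// FastDCT2ByFFT computes the type-2 discrete cosine transform of vector in place,
// using a single complex FFT of the same size and Makhoul's reordering.
func FastDCT2ByFFT(vector []float64) error {
	n := len(vector)
	halfLen := n / 2
	z := make([]complex128, n)
	// Even-indexed entries go to the front, odd-indexed entries to the back (reversed).
	for i := 0; i < halfLen; i++ {
		z[i] = complex(vector[i*2], 0)
		z[n-1-i] = complex(vector[i*2+1], 0)
	}
	if n&1 == 1 {
		z[halfLen] = complex(vector[n-1], 0)
	}
	err := TransformRadix2Complex(z)
	// Post-rotation: a_k = Re[ z_k * exp(-i*pi*k/(2n)) ].
	for k := range z {
		s, c := math.Sincos(float64(k) * math.Pi / float64(2*n))
		vector[k] = real(z[k])*c + imag(z[k])*s
	}
	return err
}
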
Makhoul does not specify the equivalent formula for the DST, but we can do something similar.

  • We first initialize the FFT coefficients \( c_j \) with: $$ c_j = b_{2j}\,, \quad c_{N-1-j} = -b_{2j+1} \quad \text{for } j = 0,\ldots,\frac{N}{2}-1\,, $$
  • and then from the result of the FFT \(\hat{c}\), the DST coefficients \(\hat{b}\) are $$\hat{b}_k = -\Im\left[ \hat{c}_k e^{-i \pi\frac{k}{2N}} \right]\,. $$

For maximum performance, the two FFTs can reuse the same sine and cosine tables. And the last step of the DCT and DST can be combined together.

Another approach would be to compute a single FFT of size 2N, as we can rewrite the coefficients as $$ V_{k} = \Re\left[e^{-i \pi \frac{k}{2N}} \sum_{j=0}^{2N-1} (a_j + i b_j) e^{-2i\pi k\frac{j}{2N}} \right] $$ with \(a_j = b_j = 0 \) for \( j \geq N \).

In fact, the two are almost equivalent, since an FFT of size 2N is decomposed internally into two FFTs of size N.

With this in mind, we can improve the above a bit and merge the two transforms:

  • We first initialize the FFT coefficients \( c_j \) with: $$ c_j = a_{2j} + i b_{2j}\,, \quad c_{N-1-j} = a_{2j+1} - i b_{2j+1} \quad \text{for } j = 0,\ldots,\frac{N}{2}-1\,, $$
  • and then from the result of the FFT \(\hat{c}\), the coefficients are $$V_k = \Re\left[ \hat{c}_k e^{-i \pi\frac{k}{2N}} \right]\,. $$

And we have computed the coefficients with a single FFT of size N, as in the sketch below.
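
Here is a sketch in Go of this merged transform (the function name FastCosSineByFFT is mine; it assumes that a and b have the same power-of-2 length and reuses the TransformRadix2Complex helper listed further below):

// FastCosSineByFFT computes V_k = sum_j a_j*cos(pi*k*(j+1/2)/n) + b_j*sin(pi*k*(j+1/2)/n)
// for k = 0,...,n-1 with a single complex FFT of size n.
func FastCosSineByFFT(a, b []float64) ([]float64, error) {
	n := len(a)
	z := make([]complex128, n)
	// Interleave: even-indexed entries at the front, odd-indexed entries (conjugated) at the back.
	for j := 0; j < n/2; j++ {
		z[j] = complex(a[2*j], b[2*j])
		z[n-1-j] = complex(a[2*j+1], -b[2*j+1])
	}
	err := TransformRadix2Complex(z)
	v := make([]float64, n)
	// Post-rotation: V_k = Re[ z_k * exp(-i*pi*k/(2n)) ].
	for k := range z {
		s, c := math.Sincos(float64(k) * math.Pi / float64(2*n))
		v[k] = real(z[k])*c + imag(z[k])*s
	}
	return v, err
}
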

In terms of Go code, the classic (naive) type-2 sine transform reads

// DST2 computes the type-2 discrete sine transform of vector in place,
// using the direct O(n^2) double sum.
func DST2(vector []float64) error {
	n := len(vector)
	result := make([]float64, n)
	for k := range vector {
		sum := 0.0
		for j, vj := range vector {
			sum += math.Sin((float64(j)+0.5)*math.Pi*float64(k)/float64(n)) * vj
		}
		result[k] = sum
	}
	copy(vector, result)
	return nil
}

And the fast type-2 sine transform reads

// FastDST2ByFFT computes the type-2 discrete sine transform of vector in place,
// using a single complex FFT of the same size and Makhoul's reordering.
func FastDST2ByFFT(vector []float64) error {
	n := len(vector)
	halfLen := n / 2
	z := make([]complex128, n)
	// Even-indexed entries go to the front, negated odd-indexed entries to the back.
	for i := 0; i < halfLen; i++ {
		z[i] = complex(vector[i*2], 0)
		z[n-1-i] = complex(-vector[i*2+1], 0)
	}
	if n&1 == 1 {
		z[halfLen] = complex(vector[n-1], 0)
	}
	err := TransformRadix2Complex(z)
	// Post-rotation: b_k = -Im[ z_k * exp(-i*pi*k/(2n)) ].
	for i := range z {
		temp := float64(i) * math.Pi / float64(n*2)
		s, c := math.Sincos(temp)
		vector[i] = real(z[i])*s - imag(z[i])*c
	}
	return err
}

The function TransformRadix2Complex, used above, stems from a standard implementation of the discrete Fourier transform (DFT) of the given complex vector, storing the result back into the vector. The vector’s length must be a power of 2, and it uses the Cooley-Tukey decimation-in-time radix-2 algorithm.

func TransformRadix2Complex(z []complex128) error {
	n := len(z)
	levels := uint(31 - bits.LeadingZeros32(uint32(n))) // Equal to floor(log2(n))
	if (1 << levels) != n {
		return fmt.Errorf("Length is not a power of 2")
	}

	// Trigonometric tables
	cosTable, sinTable := MakeCosSinTable(n)

	// Bit-reversed addressing permutation
	for i := range z {
		j := int(bits.Reverse32(uint32(i)) >> (32 - levels))
		if j > i {
			z[i], z[j] = z[j], z[i]
		}
	}

	// Cooley-Tukey decimation-in-time radix-2 FFT
	for size := 2; size <= n; size *= 2 {
		halfsize := size / 2
		tablestep := n / size
		for i := 0; i < n; i += size {
			j := i
			k := 0
			for ; j < i+halfsize; j++ {
				l := j + halfsize
				tpre := real(z[l])*cosTable[k] + imag(z[l])*sinTable[k]
				tpim := -real(z[l])*sinTable[k] + imag(z[l])*cosTable[k]
				z[l] = z[j] - complex(tpre, tpim)
				z[j] += complex(tpre, tpim)
				k += tablestep
			}
		}
		if size == n { // Prevent overflow in 'size *= 2'
			break
		}
	}

	return nil
}

func MakeCosSinTable(n int) ([]float64, []float64) {
	cosTable := make([]float64, n/2)
	sinTable := make([]float64, n/2)
	for i := range cosTable {
		s, c := math.Sincos(2 * math.Pi * float64(i) / float64(n))
		cosTable[i] = c
		sinTable[i] = s
	}
	return cosTable, sinTable
}
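
As a quick sanity check, the naive and FFT-based transforms can be compared on random input; this is a throwaway snippet of mine, with an arbitrary seed, size, and print format:

func main() {
	n := 16
	rng := rand.New(rand.NewSource(42))
	naive := make([]float64, n)
	fast := make([]float64, n)
	for i := 0; i < n; i++ {
		naive[i] = rng.Float64() - 0.5
		fast[i] = naive[i]
	}
	DST2(naive)
	FastDST2ByFFT(fast)
	for k := 0; k < n; k++ {
		// The two columns should agree up to round-off error.
		fmt.Printf("%d %v %v %e\n", k, naive[k], fast[k], math.Abs(naive[k]-fast[k]))
	}
}
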

Quantitative Finance Books Citing My Papers

I would never really have expected it when I started writing papers, but little by little, a growing list of books is citing my papers:

There are also some Springer books, which are typically collections of papers on a specific subject (and which I find less interesting).

Google phones are overrated

It is a relatively common belief that the vanilla Android experience is better, as it runs smoother. Samsung’s TouchWiz is often blamed for making things slower without being more practical.

I have had a Nexus 6 for a couple of years, and I noticed slowdowns after each update, up to the point where it sometimes took a few seconds to open the phone app or to display the keyboard. I freed up storage and removed some apps, but this did not make any difference. This is more or less the same experience that people have with Samsung devices, if Reddit comments are to be believed.

The other myth is that Google phones are updated for longer. Prior to the Nexus 6, I had a Galaxy Nexus, which Google stopped supporting after less than two years. The Nexus 6 received security updates until this October, that is, nearly two years for me and two and a half years for early buyers.

In comparison, Apple updates its phones for much longer, and a four-year-old iPhone 5s still runs smoothly. Fortunately for Android phones, there are the alternative ROMs. Being desperate with the state of my Nexus phone, I installed Paranoid Android, and I am surprised at how good it is. My phone feels like new again, and as good as any flagship I would purchase today. To my great surprise, I did not find any clear step-by-step installation guide for Paranoid Android; I just followed the detailed steps for LineageOS (the CyanogenMod descendant), but with the Paranoid Android zip file instead of the LineageOS one. I have some trouble understanding how an open-source ROM can be much nicer than the pure Google Android ROM, but it is. I have had no crashes or reboots (which had also become more common over the years), and it is still updated regularly. Google does not set a good standard by not supporting its own devices better.

There is, however, one thing that Google does well: the camera app. The HDR+ mode makes my Nexus 6 take great pictures, even compared to new Android phones or iPhones. I am often amazed at the pictures I manage to take, and others are also amazed at how good they look (especially since they often look better on the Nexus 6 screen than on a computer monitor). Initially, I remember the camera being not so great, but an update that came in 2016 transformed the phone into a fantastic shooter. It allowed me to take very nice pictures in difficult conditions, see for example this Google Photos gallery. Fortunately, it’s still possible to install the app from the Google Play store on Paranoid Android. Google should really make it open-source and easier to adapt to other Android phones.

A photo made with the Google camera app on the Nexus 6.

SVN is dead

A few years ago, when Git was rising fast and SVN was already no longer hip, a friend thought that SVN was better suited than Git for many organizations, with the following classic arguments, which were sound at the time:

  1. Who needs decentralization for a small team or a small company working together?
  2. SVN is proven, works well, and is simple to use and to put in place.

In reality, neither argument is so strong, and it is now clear that Git is much more established.

Regarding the first argument: a long time ago, some people had trouble with CVS, as it introduced the idea of merging instead of locking files. Git represents a similar paradigm shift, from the centralized to the decentralized. It can scare people in not-so-rational ways. You could lock files with CVS as you did with Visual SourceSafe or any other crappy old source control system; it’s just that people favored merges because they were effectively more convenient and more productive. You can also use Git with a centralized workflow. Another, scarier paradigm shift with Git is moving away from the idea that branches are separate folders. With Git you just switch branches, since doing so is instantaneous, even though, again, you could use it in the old-fashioned SVN way.

Now onto the second argument: SVN is proven and works well. But so is COBOL. Today it should be clear that SVN is essentially dead. Most big projects have moved or are moving to Git. Tools work better with Git; even Eclipse works natively with Git but requires buggy plugins for SVN. New developers don’t know SVN.

I have heard other, much worse arguments against Git since. For example, some people believed that, with Git, they could lose some of their code changes. This was partially due to sensational news articles such as the GitLab.com data loss, where in reality an administrator deleted a directory and had non-working backups. As a result, some Git repositories were deleted, but it was really a common data-loss situation, unrelated to the use of Git as the version control system. This Stack Overflow question gives a nice overview of data-loss risks with Git.

What I feel is true however is that Git is more complex than SVN, because it is more powerful and more flexible. But if you adopt a simple workflow, it’s not necessarily more complicated.

The Neural Network in Your CPU

Machine learning and artificial intelligence are the current hype (again). In their new Ryzen processors, AMD advertises Neural Net Prediction. It turns out this was already used in their older (2012) Piledriver architecture, found for example in the AMD A10-4600M. It is also present in recent Samsung processors, such as the one powering the Galaxy S7. What is it really?

The basic idea can be traced to a paper from Daniel Jimenez and Calvin Lin, “Dynamic Branch Prediction with Perceptrons”, described more precisely in the subsequent paper “Neural methods for dynamic branch prediction”. Branches typically occur in if-then-else statements. Branch prediction consists of guessing which code branch, the then or the else, the code will execute, thus allowing the predicted branch to be precomputed in parallel for faster evaluation.

Jimenez and Lin rely on a simple single-layer perceptron neural network whose inputs are the branch outcome histories (global, or hybrid local and global) and whose output predicts whether the branch will be taken. In reality, because there is a single layer, the output y is simply a weighted sum of the inputs (x, and the constant 1):

$$ y = w_0 + \sum_{i=1}^n x_i w_i $$

\( x_i = \pm 1 \) for a taken or not-taken branch in the history. \( y > 0 \) predicts that the branch will be taken.

Ideally, each static branch is allocated its own perceptron. In practice, a hash of the branch address is used.

The training consists of updating each weight according to the actual branch outcome t: \( w_i = w_i + 1 \) if \( x_i = t \), otherwise \( w_i = w_i - 1 \). But this is done only if the output magnitude \( |y| \) is below the training (stopping) threshold or if the branch was mispredicted. The threshold prevents overtraining and allows the predictor to adapt quickly to changing behavior.
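
To make the mechanics concrete, here is a minimal Go sketch of a single perceptron predictor; the table of perceptrons indexed by a hash of the branch address is omitted, and the history length and threshold choices here are illustrative, not the paper’s tuned values:

// PerceptronPredictor is a minimal sketch of a perceptron branch predictor
// for a single branch (no table of perceptrons, no address hashing).
type PerceptronPredictor struct {
	weights   []int // weights[0] is the bias weight (constant input 1)
	history   []int // previous outcomes encoded as +1 (taken) / -1 (not taken)
	threshold int   // training (stopping) threshold
}

func NewPerceptronPredictor(historyLen int) *PerceptronPredictor {
	p := &PerceptronPredictor{
		weights:   make([]int, historyLen+1),
		history:   make([]int, historyLen),
		threshold: 2*historyLen + 14, // illustrative; the paper derives a tuned value
	}
	for i := range p.history {
		p.history[i] = -1
	}
	return p
}

// Predict returns the perceptron output y; y >= 0 is read as "taken".
func (p *PerceptronPredictor) Predict() int {
	y := p.weights[0]
	for i, x := range p.history {
		y += p.weights[i+1] * x
	}
	return y
}

// Train updates the weights from the actual outcome and shifts it into the history.
func (p *PerceptronPredictor) Train(y int, taken bool) {
	t := -1
	if taken {
		t = 1
	}
	mispredicted := (y >= 0) != taken
	if mispredicted || abs(y) <= p.threshold {
		p.weights[0] += t // the constant input x_0 = 1
		for i, x := range p.history {
			if x == t {
				p.weights[i+1]++
			} else {
				p.weights[i+1]--
			}
		}
	}
	copy(p.history[1:], p.history[:len(p.history)-1])
	p.history[0] = t
}

func abs(a int) int {
	if a < 0 {
		return -a
	}
	return a
}
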

The perceptron is one of those algorithms created by a psychologist. In this case, the culprit is Frank Rosenblatt. Another, more recent algorithm created by a psychologist is the particle swarm optimization of James Kennedy. As with particle swarm optimization, there is not a single well-defined perceptron, but many variations around some key principles. A reference seems to be the perceptron of H.D. Block, probably because he describes the perceptron in terms closer to code, while Rosenblatt was really describing a perceptron machine.

The perceptron from H.D. Block is slightly more general than the perceptron used for branch prediction (a rough sketch follows the list):

  • the output can be -1, 0, or 1. The output is zero if the weighted sum is below a threshold (a different constant from the training threshold of the branch prediction perceptron).
  • reinforcement is not done on inactive connections, that is, for \( x_i = 0 \).
  • a learning rate \( \alpha \) is used to update the weights: \( w_i \leftarrow w_i + \alpha t x_i \).
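
Here is a rough Go sketch of one step of such a Block-style perceptron, as I read the description above; the choice to reinforce only on a wrong output and the symmetric dead zone are my own reading, not necessarily Block’s exact formulation:

// blockPerceptronStep performs one update of a Block-style perceptron.
// x holds the inputs (0 for inactive connections), t is the target (-1 or +1),
// alpha is the learning rate, and activationThreshold defines the dead zone of the output.
func blockPerceptronStep(w, x []float64, t, alpha, activationThreshold float64) float64 {
	sum := 0.0
	for i := range x {
		sum += w[i] * x[i]
	}
	output := 0.0
	if sum > activationThreshold {
		output = 1
	} else if sum < -activationThreshold {
		output = -1
	}
	if output != t { // reinforce only when the output is wrong (one common choice)
		for i := range x {
			if x[i] != 0 { // no reinforcement on inactive connections
				w[i] += alpha * t * x[i]
			}
		}
	}
	return output
}
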

The perceptron used for branch prediction is quite different from the deep learning neural network fad, which has many more layers and sometimes feedback loops. The challenge with those is the training: when many layers are added to the perceptron, the gradients of each layer’s activation function multiply in the backpropagation algorithm. This makes the “effective” gradient at the first layers very small, which translates into tiny changes in the weights, making training not only very slow but also likely to get stuck in a sub-optimal local minimum. Besides the vanishing gradient problem, there is also the catastrophic interference problem to pay attention to. Those issues are dealt with today through specific strategies to train and structure the network, combined with raw computational power that was unavailable in the 90s.

Benham disc in web canvas

Around 15 years ago, I wrote a small Java applet to try and show the Benham disc effect. Even back then, applets were already passé and Flash would have been more appropriate. These days, no browser supports Java applets anymore, and very few web users have Java installed. Flash has also mostly disappeared. The web canvas is today’s standard for embedding animations in a web page. This effect shows color perception arising from a succession of black and white pictures. It is a computer reproduction of the Benham disc, with ideas borrowed from Pour La Science, April/June 2003. With a delay between 40 and 60 ms, the inner circle should appear red, the one in the middle blue, and the outer one green. When you reverse the rotation direction, the blue and red circles should be inverted.


Blogs on Quantitative Finance

There are not many blogs on quantitative finance that I read. Blogs are not so popular anymore with the advent of the various social networks (Facebook, Stack Overflow, Google Plus, Reddit, …). Here is a small list:

Another way to find out what’s going on in the quantitative finance world is to regularly scan recent papers on arXiv and SSRN, or the suggestions of Google Scholar.

Typo in Hyman non-negative constraint - 28 years later

In their paper “Nonnegativity-, Monotonicity-, or Convexity-Preserving Cubic and Quintic Hermite Interpolation”, Dougherty, Edelman and Hyman present a simple filter on the first derivatives to maintain positivity of a cubic spline interpolant.

Unfortunately, in their main formula for non-negativity, they made a typo: equation (3.3) is not consistent with equation (3.1): \( \Delta x_{i-1/2} \) is swapped with \( \Delta x_{i+1/2} \).

It was not obvious to find out which equation was wrong, since there is no proof in the paper. Fortunately, the proof is in the reference paper “Monotone piecewise bicubic interpolation” by Carlson and Fritsch, and it is then clear that equation (3.1) is the correct one.

Here is a counter-example for equation (3.3), for a cubic Hermite spline with natural boundary conditions:

	x := []float64{-2.427680355823974, -2.3481443181227126, -2.268608280421452, -2.189072242720191, -2.1095362050189297, -2.0300001673176684, -1.9504641296164076, -1.8709280919151465, -1.7913920542138855, -1.7118560165126244, -1.6323199788113634, -1.5527839411101023, -1.4732479034088413, -1.3937118657075802, -1.3141758280063192, -1.234639790305058, -1.155103752603797, -1.075567714902536, -0.996031677201275, -0.9164956395000139, -0.8369596017987528, -0.7574235640974918, -0.6778875263962307, -0.5983514886949697, -0.5188154509937086, -0.4392794132924476, -0.3597433755911865, -0.28020733788992525, -0.20067130018866441, -0.12113526248740358, -0.04159922478614231, 0.03793681291511897, 0.1174728506163798, 0.19700888831764063, 0.2765449260189019, 0.3560809637201632, 0.435617001421424, 0.5151530391226848, 0.5946890768239461, 0.6742251145252074, 0.7537611522264682}
y := []float64{0.22303659708933243, 0.22655584607177778, 0.16305913092318047, 0.10448859033589929, 0.16002349837315494, 0.15632461201014838, 0.20216975718723287, 0.15780097999496998, 0.1293436080647528, 0.13101037880853464, 0.13496819587884762, 0.1241082758487616, 0.13677319399415233, 0.11493379360854317, 0.10181073092072719, 0.10390553149978735, 0.09705141597989163, 0.09469310717587004, 0.09397968424452849, 0.08115526370674754, 0.07021707117654993, 0.07896827656600293, 0.0733165278111029, 0.06626131630031683, 0.06036157053511007, 0.05598416466538939, 0.049613807117723785, 0.04821691404672292, 0.04274830629369516, 0.04053002586588819, 0.03690680495068648, 0.03232925805830648, 0.02686034599416645, 0.02152456554605186, 0.01281016667726421, 0.005672278716525797, 0.0021938540693167367, 0.0015582304062002356, 0.0002050615897307207, 3.232807285235909e-07, 4.059553937573009e-06}
z := 0.7410184269884271

In the above example, even though the step size \(\Delta x\) is constant, the error is visible at the last node: there, only one inequality applies, and it differs when the typo is present.

It is quite annoying to stumble upon such typos, especially when the equations are derived from a correct paper. We then wonder where the other typos in the paper are, and our trust in the equations is greatly weakened. Unfortunately, such mistakes happen to everybody, including myself, and they are rarely caught by reviewers.

Implied Volatility from Black-Scholes price

Dan Stefanica and Rados Radoicic propose a quite good initial guess in their very recent paper “An Explicit Implied Volatility Formula”. Their formula is simple, fast to compute, and results in an implied volatility guess with a relative error of less than 10%.

It is more robust than the rational fraction from Minqiang Li: his rational fraction is only valid for a fixed range of strikes and maturities. The new approximation is mathematically proved to be accurate across all strikes and all maturities. One only needs to be careful in the numerical implementation with the case where the price is very small (a Taylor expansion of the variable C is useful in this case).

As mentioned in an earlier post, Peter Jäckel solved the real problem by providing the code for a fast, very accurate, and robust solver along with his paper Let’s be rational. This new formula, used as an initial guess to Minqiang Li’s SOR-TS solver, provides an interesting alternative: the resulting code is very simple and efficient. The accuracy, relative or absolute, can be tuned to speed up the calculation.
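
To illustrate the overall structure (with a plain Newton polish rather than Li’s SOR-TS iteration), here is a minimal Go sketch; impliedVolGuess stands for any of the initial guesses above and is assumed to be provided elsewhere:

// ImpliedVolNewton refines an initial volatility guess by Newton iterations on the
// undiscounted Black (forward) call price. This is a plain Newton polish, not the
// SOR-TS iteration; impliedVolGuess is a placeholder for an external approximation.
func ImpliedVolNewton(price, forward, strike, tte, relTolerance float64) float64 {
	sigma := impliedVolGuess(price, forward, strike, tte)
	for i := 0; i < 20; i++ {
		sqrtT := math.Sqrt(tte)
		d1 := (math.Log(forward/strike) + 0.5*sigma*sigma*tte) / (sigma * sqrtT)
		d2 := d1 - sigma*sqrtT
		callPrice := forward*normCDF(d1) - strike*normCDF(d2)
		vega := forward * normPDF(d1) * sqrtT
		diff := callPrice - price
		if math.Abs(diff) < relTolerance*price || vega == 0 {
			break
		}
		sigma -= diff / vega
	}
	return sigma
}

func normPDF(x float64) float64 {
	return math.Exp(-0.5*x*x) / math.Sqrt(2*math.Pi)
}

func normCDF(x float64) float64 {
	return 0.5 * math.Erfc(-x/math.Sqrt2)
}
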

Below is an example of the performance on a few different cases for strike 150, forward 100, time to maturity 1.0 year and a relative tolerance of 1E-8 using Go 1.8.

Original volatility   Method        Implied volatility      Time
64%                   Jäckel        0.6400000000000002      1005 ns
64%                   Rational      0.6495154924570236        72 ns
64%                   SR            0.6338265040549524       200 ns
64%                   Rational-Li   0.6400000010047917       436 ns
64%                   SR-Li         0.6400000001905617       568 ns
16%                   Rational      0.1575005551326285        72 ns
16%                   SR            0.15117970813645165      200 ns
16%                   Jäckel        0.16000000000000025     1323 ns
16%                   Rational-Li   0.16000000000219483      714 ns
16%                   SR-Li         0.16000000000018844     1030 ns
4%                    Rational      0.1528010258201771        72 ns
4%                    SR            0.043006234681405076     190 ns
4%                    Jäckel        0.03999999999999886     1519 ns
4%                    Rational-Li   0.040000000056277685   10235 ns
4%                    SR-Li         0.040000000000453895    2405 ns

The 4% case is an example of a particularly challenging setting taken from a Wilmott forum thread. It results in a very small call option price (9E-25).
