Google phones are overrated

It is a relatively common belief that the vanilla Android experience is better, as it runs smoother. Samsung's TouchWiz is often blamed for making things slower without making them more practical.

I have had a Nexus 6 for a couple of years and I noticed the slowdowns after each update, up to a point where it sometimes took a few seconds to open the phone app, or to display the keyboard. I freed up storage, removed some apps but this did not make any difference. This is more or less the same experience that people have with the Samsung devices if reddit comments are to be believed.

The other myth is that Google phones will be updated for a longer time. Prior to the Nexus 6, I had a Galaxy Nexus, which Google stopped supporting after less than 2 years. The Nexus 6 received security updates until this October, that is nearly 2 years for me and 2.5 years for early buyers.

In comparison, Apple updates its phones for much longer, and a 4-year-old iPhone 5s still runs smoothly. Fortunately for Android phones, there are alternative ROMs. Desperate with the state of my Nexus phone, I installed Paranoid Android, and I am surprised at how good it is. My phone feels like new again, and as good as any flagship I would purchase today.

To my great surprise, I did not find any clear step-by-step installation guide for Paranoid Android. I just followed the detailed steps for LineageOS (the CyanogenMod descendant), but with the Paranoid Android zip file instead of the LineageOS one. I have some trouble understanding how an open-source ROM can be so much nicer than the pure Google Android ROM, but it is. I have had no crash or reboot (these had become more common over the years on the stock ROM), and it is still updated regularly. Google does not set a good standard by not supporting its own devices better.

There is however one thing that Google does well: the camera app. The HDR+ mode makes my Nexus 6 take great pictures, even compared to new Android phones or iPhones. I am often amazed at the pictures I manage to take, and others are also amazed at how good they look (especially since they often look better on the Nexus 6 screen than on a computer monitor). I remember the camera not being so great initially, but some update that came in 2016 transformed the phone into a fantastic shooter. It allowed me to take very nice pictures in difficult conditions, see for example this Google Photos gallery. Fortunately it is still possible to install the app from the Google Play store on Paranoid Android. They should really make it open-source and easier to adapt to other Android phones.

A photo made with the Google camera app on the Nexus 6.

SVN is dead

A few years ago, when Git was rising fast and SVN was already no longer fashionable, a friend thought that SVN was better suited than Git for many organizations, with the following classical arguments, which were sound at the time:

  1. Who needs decentralization for a small team or a small company working together?
  2. SVN is proven, works well and is simple to use and put in place.

Each argument is in reality not so strong. It becomes clear now that Git is much more established.

Regarding the first argument, a long time ago some people had trouble with CVS as it introduced the idea of merging instead of locking files. Git represents a similar paradigm shift, from the centralized to the decentralized. It can scare people in not so rational ways. You could lock files with CVS as you did with Visual SourceSafe or any other crappy old source control system. It's just that people favored merges as they were effectively more convenient and more productive. Similarly, you can use Git with a centralized workflow. Another, scarier paradigm shift with Git is to move away from the idea that branches are separate folders. With Git you just switch branches, as it is instantaneous, even though, again, you could use it in the old-fashioned SVN way.

Now onto the second argument: SVN is proven to work well. But so is COBOL. Today it should be clear that SVN is essentially dead. Most big projects are moving or have moved to Git. Tools work better with Git: even Eclipse works natively with Git but requires buggy plugins for SVN. New developers don't know SVN.

I have heard other, much worse arguments against Git since. For example, some people believed that, with Git, they could lose some of their code changes. This was partially due to sensational news articles, such as those about the GitLab.com data loss, where in reality an administrator deleted a directory and the backups were not working. As a result, some Git repositories were deleted, but this is a common data loss situation, unrelated to the use of Git as version control system. This Stack Overflow question gives a nice overview of data loss risks with Git.

What I feel is true however is that Git is more complex than SVN, because it is more powerful and more flexible. But if you adopt a simple workflow, it’s not necessarily more complicated.

The Neural Network in Your CPU

Machine learning and artificial intelligence are the current hype (again). In their new Ryzen processors, AMD advertises the Neural Net Prediction. It turns out this was already used in their older (2012) Piledriver architecture, found for example in the AMD A10-4600M. It is also present in recent Samsung processors such as the one powering the Galaxy S7. What is it really?

The basic idea can be traced to a paper from Daniel Jimenez and Calvin Lin, “Dynamic Branch Prediction with Perceptrons”, more precisely described in the subsequent paper “Neural methods for dynamic branch prediction”. Branches typically occur in if-then-else statements. Branch prediction consists in guessing which code branch, the then or the else, the code will execute, so that the processor can start computing that branch ahead of time, in parallel, for faster evaluation.

Jimenez and Lin rely on a simple single-layer perceptron neural network whose inputs are the branch outcome histories (global, or hybrid local and global) and whose output predicts which branch will be taken. In reality, because there is a single layer, the output y is simply a weighted sum of the inputs (x, and the constant 1):

$$ y = w_0 + \sum_{i=1}^n x_i w_i $$

\( x_i = \pm 1 \) for a taken or not-taken branch in the history. \( y > 0 \) predicts that the branch is taken.

Ideally, each static branch is allocated its own perceptron. In practice, a hash of the branch address is used.

The training consists in updating each weight according to the actual branch outcome \( t = \pm 1 \): \( w_i = w_i + 1 \) if \( x_i = t \), otherwise \( w_i = w_i - 1 \) (equivalently, \( w_i = w_i + t x_i \)). But this is done only if the magnitude of the output \( |y| \) is lower than the training (stopping) threshold, or if the branch was mispredicted. The threshold prevents overtraining and allows the predictor to adapt quickly to changing behavior.
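The prediction and training rules above can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not how any actual CPU implements it: the history length, table size and threshold value are arbitrary, the hash is a plain modulo of a hypothetical branch address, and only a global history is used.

```go
package main

import "fmt"

// Illustrative sizes only; real predictors tune these to the hardware budget.
const (
	historyLen = 8  // number of past branch outcomes used as inputs
	tableSize  = 64 // number of perceptrons, selected by hashing the address
	theta      = 29 // training threshold (assumed value)
)

// Predictor holds one weight vector per table entry and a global history.
type Predictor struct {
	weights [tableSize][historyLen + 1]int // weights[k][0] is the bias w_0
	history [historyLen]int                // +1 = taken, -1 = not taken
}

// NewPredictor starts with zero weights and a "not taken" history.
func NewPredictor() *Predictor {
	p := &Predictor{}
	for i := range p.history {
		p.history[i] = -1
	}
	return p
}

// Output computes y = w_0 + sum_i x_i w_i for the perceptron selected
// by a (very naive) hash of the branch address.
func (p *Predictor) Output(addr uint64) int {
	w := &p.weights[addr%tableSize]
	y := w[0]
	for i, x := range p.history {
		y += w[i+1] * x
	}
	return y
}

// Predict returns true when the branch is predicted taken (y > 0).
func (p *Predictor) Predict(addr uint64) bool { return p.Output(addr) > 0 }

// Train applies w_i += t*x_i when the branch was mispredicted or |y| <= theta,
// then shifts the actual outcome into the history.
func (p *Predictor) Train(addr uint64, taken bool) {
	t := -1
	if taken {
		t = 1
	}
	y := p.Output(addr)
	if (y > 0) != taken || abs(y) <= theta {
		w := &p.weights[addr%tableSize]
		w[0] += t // the bias input is the constant 1
		for i, x := range p.history {
			w[i+1] += t * x
		}
	}
	copy(p.history[1:], p.history[:historyLen-1])
	p.history[0] = t
}

func abs(v int) int {
	if v < 0 {
		return -v
	}
	return v
}

func main() {
	p := NewPredictor()
	wrong := 0
	for n := 0; n < 300; n++ {
		taken := n%2 == 0 // alternating branch: taken, not taken, ...
		if n >= 250 && p.Predict(0x40) != taken {
			wrong++
		}
		p.Train(0x40, taken)
	}
	fmt.Println("mispredictions in the last 50 branches:", wrong)
}
```

An alternating branch is linearly separable in its own history, so after a short warm-up the weights stop changing and the predictor follows the pattern.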

The perceptron is one of those algorithms created by a psychologist. In this case, the culprit is Frank Rosenblatt. Another more recent algorithm created by a psychologist is the particle swarm optimization from James Kennedy. As in the case of particle swarm optimization, there is not a single well defined perceptron, but many variations around some key principles. A reference seems to be the perceptron from H.D. Block, probably because he describes the perceptron in terms closer to code, while Rosenblatt was really describing a perceptron machine.

The perceptron from H.D. Block is slightly more general than the perceptron used for branch prediction:

  • the output can be -1, 0 or 1. The output is zero if the weighted average is below a threshold (a different constant from the training threshold of the branch prediction perceptron).
  • reinforcement is not done on inactive connections, that is for \( x_i = 0 \).
  • a learning rate \( \alpha \) is used to update the weight: \( w_i \leftarrow w_i + \alpha t x_i \)
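The three differences listed above can be read as the following Go sketch. It is an interpretation of the bullet points, not a transcription of Block's paper: in particular, it assumes that "below a threshold" refers to the magnitude of the weighted sum, and the function names are made up for the example.

```go
package main

import "fmt"

// blockOutput returns -1, 0 or 1. The output is 0 when the weighted sum
// stays within [-threshold, threshold] (assumed reading of the first bullet).
func blockOutput(w, x []float64, threshold float64) int {
	s := 0.0
	for i := range w {
		s += w[i] * x[i]
	}
	switch {
	case s > threshold:
		return 1
	case s < -threshold:
		return -1
	}
	return 0
}

// reinforce applies w_i <- w_i + alpha*t*x_i. Inactive connections
// (x_i = 0) are left unchanged since their increment is zero.
func reinforce(w, x []float64, t, alpha float64) {
	for i := range w {
		w[i] += alpha * t * x[i]
	}
}

func main() {
	w := []float64{0, 0}
	x := []float64{1, 0} // second connection is inactive
	reinforce(w, x, 1, 0.5)
	fmt.Println(w, blockOutput(w, x, 0.25), blockOutput(w, x, 1))
}
```

With \( x_i = \pm 1 \), \( \alpha = 1 \) and a zero output threshold, the update reduces to the branch prediction rule \( w_i \leftarrow w_i + t x_i \).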

The perceptron used for branch prediction is quite different from the neural networks of the deep learning fad, which have many more layers, sometimes with feedback loops. The challenge with those is the training: when many layers are added to the perceptron, the gradients of each layer's activation function multiply in the backpropagation algorithm. This makes the “effective” gradient at the first layers very small, which translates to tiny changes in the weights, making training not only very slow but also likely to get stuck in a sub-optimal local minimum. Besides the vanishing gradient problem, there is also the catastrophic interference problem to pay attention to. Those issues are today dealt with through specific strategies to train and structure the network, combined with raw computational power that was unavailable in the 90s.
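To make the multiplication of gradients concrete, consider a deliberately simplified chain of \( n \) layers with a single unit each, \( a_k = \sigma(z_k) \) with \( z_k = w_k a_{k-1} \), where \( \sigma \) is the logistic function (these are assumptions made for the illustration, not a description of any specific network). The chain rule gives

$$ \frac{\partial a_n}{\partial w_1} = a_0 \, \sigma'(z_1) \prod_{k=2}^{n} w_k \, \sigma'(z_k) $$

Since \( \sigma'(z) \leq \frac{1}{4} \), weights with \( |w_k| \leq 1 \) lead to \( \left| \frac{\partial a_n}{\partial w_1} \right| \leq |a_0| \, 4^{-n} \). With only 10 layers, the factor \( 4^{-10} \) is already below \( 10^{-6} \).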

Benham disc in web canvas

Around 15 years ago, I wrote a small Java applet to try and show the Benham disk effect. Even back then applets were already passé and Flash would have been more appropriate. These days, no browser supports Java applets anymore, and very few web users have Java installed. Flash has also mostly disappeared. The web canvas is today's standard way to embed animations in a web page.

This effect shows color perception from a succession of black and white pictures. It is a computer reproduction of the Benham disc with ideas borrowed from “Pour La Science Avril/Juin 2003”. Using a delay between 40 and 60 ms, the inner circle should appear red, the one in the middle blue and the outer one green. When you reverse the rotation direction, the blue and red circles should be swapped.


Blogs on Quantitative Finance

There are not many blogs on quantitative finance that I read. Blogs are not so popular anymore with the advent of the various social networks (facebook, stackoverflow, google plus, reddit, …). Here is a small list:

Another way to find out what's going on in the quantitative finance world is to regularly scan recent papers on arXiv or SSRN, or the suggestions of Google Scholar.

Typo in Hyman non-negative constraint - 28 years later

In their paper “Nonnegativity-, Monotonicity-, or Convexity-Preserving Cubic and Quintic Hermite Interpolation”, Dougherty, Edelman and Hyman present a simple filter on the first derivatives to maintain positivity of a cubic spline interpolant.

Unfortunately, in their main formula for non-negativity, they made a typo: equation (3.3) is not consistent with equation (3.1), as \( \Delta x_{i-1/2} \) is interchanged with \( \Delta x_{i+1/2} \).

It was not obvious which equation was wrong, since there is no proof in the paper. Fortunately, the proof is in the reference paper “Monotone piecewise bicubic interpolation” from Carlson and Fritsch, and it is then clear that equation (3.1) is the correct one.

Here is a counterexample for equation (3.3), for a Hermite cubic spline with natural boundary conditions:

 x := []float64{-2.427680355823974, -2.3481443181227126, -2.268608280421452, -2.189072242720191, -2.1095362050189297, -2.0300001673176684, -1.9504641296164076, -1.8709280919151465, -1.7913920542138855, -1.7118560165126244, -1.6323199788113634, -1.5527839411101023, -1.4732479034088413, -1.3937118657075802, -1.3141758280063192, -1.234639790305058, -1.155103752603797, -1.075567714902536, -0.996031677201275, -0.9164956395000139, -0.8369596017987528, -0.7574235640974918, -0.6778875263962307, -0.5983514886949697, -0.5188154509937086, -0.4392794132924476, -0.3597433755911865, -0.28020733788992525, -0.20067130018866441, -0.12113526248740358, -0.04159922478614231, 0.03793681291511897, 0.1174728506163798, 0.19700888831764063, 0.2765449260189019, 0.3560809637201632, 0.435617001421424, 0.5151530391226848, 0.5946890768239461, 0.6742251145252074, 0.7537611522264682}
y := []float64{0.22303659708933243, 0.22655584607177778, 0.16305913092318047, 0.10448859033589929, 0.16002349837315494, 0.15632461201014838, 0.20216975718723287, 0.15780097999496998, 0.1293436080647528, 0.13101037880853464, 0.13496819587884762, 0.1241082758487616, 0.13677319399415233, 0.11493379360854317, 0.10181073092072719, 0.10390553149978735, 0.09705141597989163, 0.09469310717587004, 0.09397968424452849, 0.08115526370674754, 0.07021707117654993, 0.07896827656600293, 0.0733165278111029, 0.06626131630031683, 0.06036157053511007, 0.05598416466538939, 0.049613807117723785, 0.04821691404672292, 0.04274830629369516, 0.04053002586588819, 0.03690680495068648, 0.03232925805830648, 0.02686034599416645, 0.02152456554605186, 0.01281016667726421, 0.005672278716525797, 0.0021938540693167367, 0.0015582304062002356, 0.0002050615897307207, 3.232807285235909e-07, 4.059553937573009e-06}
z := 0.7410184269884271

In the above example, even though the step size \( \Delta x \) is constant, the error is visible at the last node, since only one inequality applies there, and it is a different one with the typo.
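One way to check which bound goes with which interval is the Bézier form of each cubic Hermite piece: on \( [x_i, x_{i+1}] \) the control points are \( y_i \), \( y_i + \Delta x_{i+1/2}\, y'_i / 3 \), \( y_{i+1} - \Delta x_{i+1/2}\, y'_{i+1} / 3 \) and \( y_{i+1} \), and the piece is non-negative as soon as they all are. The Go sketch below applies the resulting bounds, \( -3 y_i / \Delta x_{i+1/2} \leq y'_i \leq 3 y_i / \Delta x_{i-1/2} \). It is a derivation of a sufficient condition under that argument, not a transcription of the equations of either paper, and the function names are only for illustration.

```go
package main

import "fmt"

// clampNonNegative limits the derivatives d at the nodes so that every cubic
// Hermite piece through (x[i], y[i]) with y[i] >= 0 has non-negative Bezier
// control points, which is sufficient for a non-negative interpolant.
func clampNonNegative(x, y, d []float64) {
	n := len(x)
	for i := 0; i < n; i++ {
		if i < n-1 { // piece to the right: y_i + h*d_i/3 >= 0
			lower := -3 * y[i] / (x[i+1] - x[i])
			if d[i] < lower {
				d[i] = lower
			}
		}
		if i > 0 { // piece to the left: y_i - h*d_i/3 >= 0
			upper := 3 * y[i] / (x[i] - x[i-1])
			if d[i] > upper {
				d[i] = upper
			}
		}
	}
}

// hermite evaluates the piecewise cubic Hermite interpolant at z.
func hermite(x, y, d []float64, z float64) float64 {
	i := 0
	for i < len(x)-2 && z > x[i+1] {
		i++
	}
	h := x[i+1] - x[i]
	t := (z - x[i]) / h
	t2, t3 := t*t, t*t*t
	return (2*t3-3*t2+1)*y[i] + (t3-2*t2+t)*h*d[i] +
		(-2*t3+3*t2)*y[i+1] + (t3-t2)*h*d[i+1]
}

func main() {
	x := []float64{0, 1, 2}
	y := []float64{1, 0.1, 1}
	d := []float64{0, -3, 0} // too steep at the middle node
	fmt.Println("before:", hermite(x, y, d, 1.1))
	clampNonNegative(x, y, d)
	fmt.Println("after: ", hermite(x, y, d, 1.1), "d =", d)
}
```

At the first node only the lower bound exists and at the last node only the upper bound, which is why swapping the two interval lengths matters there even on a uniform grid.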

It is quite annoying to stumble upon such typos, especially when the equations are derived from a paper that is correct. We then wonder where the other typos in the paper are, and our trust in the equations is greatly weakened. Unfortunately, such mistakes happen to everybody, including myself, and they are rarely caught by reviewers.

Implied Volatility from Black-Scholes price

Dan Stefanica and Rados Radoicic propose a quite good initial guess in their very recent paper An Explicit Implied Volatility Formula. Their formula is simple, fast to compute and results in an implied volatility guess with a relative error of less than 10%.

It is more robust than the rational fraction from Minqiang Li: his rational fraction is only valid for a fixed range of strikes and maturities. The new approximation is mathematically proven to be accurate across all strikes and all maturities. One only needs to be careful in the numerical implementation with the case where the price is very small (a Taylor expansion of the variable C will be useful in this case).

As mentioned in an earlier post, Peter Jäckel solved the real problem by providing the code for a fast, very accurate and robust solver along with his paper Let's be rational. This new formula, used as initial guess to Minqiang Li's SOR-TS solver, provides an interesting alternative: the resulting code is very simple and efficient. The accuracy, relative or absolute, can be lowered to speed up the calculation if needed.

Below is an example of the performance on a few different cases for strike 150, forward 100, time to maturity 1.0 year and a relative tolerance of 1E-8 using Go 1.8.

| Original volatility | Method | Implied volatility | Time |
|---|---|---|---|
| 64% | Jäckel | 0.6400000000000002 | 1005 ns |
| 64% | Rational | 0.6495154924570236 | 72 ns |
| 64% | SR | 0.6338265040549524 | 200 ns |
| 64% | Rational-Li | 0.6400000010047917 | 436 ns |
| 64% | SR-Li | 0.6400000001905617 | 568 ns |
| 16% | Rational | 0.1575005551326285 | 72 ns |
| 16% | SR | 0.15117970813645165 | 200 ns |
| 16% | Jäckel | 0.16000000000000025 | 1323 ns |
| 16% | Rational-Li | 0.16000000000219483 | 714 ns |
| 16% | SR-Li | 0.16000000000018844 | 1030 ns |
| 4% | Rational | 0.1528010258201771 | 72 ns |
| 4% | SR | 0.043006234681405076 | 190 ns |
| 4% | Jäckel | 0.03999999999999886 | 1519 ns |
| 4% | Rational-Li | 0.040000000056277685 | 10235 ns |
| 4% | SR-Li | 0.040000000000453895 | 2405 ns |

The case 4% was an example of a particularly challenging setting in a Wilmott forum. It results in a very small call option price (9E-25).
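To get a feel for how small this price is, here is a minimal Go sketch of the undiscounted Black call price for the setting above (forward 100, strike 150, maturity 1 year), together with a plain bisection to recover the volatility. The bisection is only a slow reference solver chosen for its simplicity: it is none of the methods benchmarked in the table, and the bracket and iteration count are arbitrary assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// normCDF uses Erfc so that the far left tail keeps its relative accuracy.
func normCDF(x float64) float64 {
	return 0.5 * math.Erfc(-x/math.Sqrt2)
}

// blackCall is the undiscounted Black call price for forward f, strike k,
// volatility vol and time to maturity t.
func blackCall(f, k, vol, t float64) float64 {
	s := vol * math.Sqrt(t)
	d1 := math.Log(f/k)/s + 0.5*s
	d2 := d1 - s
	return f*normCDF(d1) - k*normCDF(d2)
}

// impliedVolBisection inverts blackCall by bisection on [1e-6, 5].
// The price is increasing in vol, so the bracket shrinks around the root.
func impliedVolBisection(price, f, k, t float64) float64 {
	lo, hi := 1e-6, 5.0
	for i := 0; i < 200 && hi-lo > 1e-14; i++ {
		mid := 0.5 * (lo + hi)
		if blackCall(f, k, mid, t) < price {
			lo = mid
		} else {
			hi = mid
		}
	}
	return 0.5 * (lo + hi)
}

func main() {
	f, k, t := 100.0, 150.0, 1.0
	for _, vol := range []float64{0.64, 0.16, 0.04} {
		price := blackCall(f, k, vol, t)
		fmt.Printf("vol=%.2f price=%.3e implied=%.12f\n",
			vol, price, impliedVolBisection(price, f, k, t))
	}
}
```

The cumulative normal is computed through the complementary error function on purpose: with a naive `1 - N(-x)` style formula, a price of this magnitude would be lost to rounding.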

The VIX starts smiling

The VIX implied volatilities used to look like a logarithmic function of the strikes. I don’t look at them often, but today, I noticed that the VIX had the start of a smile shape.

1m VIX implied volatilities on March 21, 2017 with strictly positive volume.

Very few strikes trade below the VIX future level (12.9). All of this is likely because the VIX is unusually low: not many people are looking to trade it much lower.

Update March 22: Actually the smile in VIX is not particularly new, it is visible for the short maturities in Jim Gatheral's 2013 presentation Joint modeling of SPX and VIX. Interestingly, the issue with SVI is also visible in those slides in the shortest maturity.

When SVI Breaks Down

In order to fit the implied volatility smile of equity options, one of the most popular parameterizations is Jim Gatheral's SVI, which I have written about before here.

It turns out that in the current market conditions, SVI does not work well for short maturities. SPX options expiring on March 24, 2017 (one week) offer a good example. I took care to include only options with non-zero volume, that is, options that are actually traded.

SPXW implied volatilities on March 16, 2017 with strictly positive volume.

SVI does not fit well near the money (the SPX index is at 2385) and has an obviously wrong right wing. This is not due to the choice of weights used for the calibration (I used the volume as weight; equal weights would be even worse). Interestingly, SABR (with beta=1) does much better, even though it has two fewer parameters than SVI. Also, a simple least squares parabola does not work at all here, as it ends up fitting only the left wing.
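As a reminder of what is being fitted, the raw SVI parameterization is usually written in terms of total implied variance, \( w(k) = a + b \left( \rho (k - m) + \sqrt{(k-m)^2 + \sigma^2} \right) \), with \( k \) the log-moneyness. A minimal Go sketch follows; the parameter values in `main` are made up for the illustration and are not the calibrated ones behind the plots.

```go
package main

import (
	"fmt"
	"math"
)

// SVI holds the five raw SVI parameters (a, b, rho, m, sigma).
type SVI struct {
	A, B, Rho, M, Sigma float64
}

// TotalVariance returns w(k) = a + b*(rho*(k-m) + sqrt((k-m)^2 + sigma^2)),
// where k is the log-moneyness log(strike/forward).
func (p SVI) TotalVariance(k float64) float64 {
	d := k - p.M
	return p.A + p.B*(p.Rho*d+math.Sqrt(d*d+p.Sigma*p.Sigma))
}

// ImpliedVol converts the total variance to a Black volatility for maturity t.
func (p SVI) ImpliedVol(k, t float64) float64 {
	return math.Sqrt(p.TotalVariance(k) / t)
}

// RightWingSlope is the asymptotic slope of w(k) for large positive k.
func (p SVI) RightWingSlope() float64 {
	return p.B * (1 + p.Rho)
}

func main() {
	// Hypothetical parameters, only to exercise the formula.
	p := SVI{A: 0.01, B: 0.1, Rho: -0.5, M: 0, Sigma: 0.1}
	t := 1.0 / 52 // one week
	for _, k := range []float64{-0.05, 0, 0.05} {
		fmt.Printf("k=%+.2f w=%.5f vol=%.4f\n", k, p.TotalVariance(k), p.ImpliedVol(k, t))
	}
	fmt.Println("right wing slope:", p.RightWingSlope())
}
```

The total variance is asymptotically linear in \( k \) on both sides, so the right wing is entirely driven by \( b(1+\rho) \).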

If we include all options with non zero open interest and use ask/(bid-ask) weights, SVI is even worse:

SPXW implied volatilities on March 16, 2017 with strictly positive open interest.

We can do a bit better with SVI by squaring the weights and it shows the problem of SVI more clearly:

SPXW implied volatilities on March 16, 2017 with strictly positive open interest and SVI weights squared.

These days, the VIX index is particularly low. Today, it is at 11.32. Usually, a low VIX translates to growing stocks, as investors are confident in a low volatility environment. The catch with those extremely low ATM vols is that the curvature is more pronounced, and so people are not so confident after all.

Brownian Bridge and Discrete Random Variables

The new Heston discretisation scheme I wrote about a few weeks ago makes use of a discrete random variable matching the first five moments of the normal distribution, instead of the usual normally distributed random variable computed via the inverse cumulative distribution function. Their discrete random variable is:

$$ \xi = \begin{cases} \sqrt{1-\frac{\sqrt{6}}{3}} & \text{ if } U_1 < 3\,, \\ -\sqrt{1-\frac{\sqrt{6}}{3}} & \text{ if } U_1 > 4\,, \\ \sqrt{1+\sqrt{6}} & \text{ if } U_1 = 3\,, \\ -\sqrt{1+\sqrt{6}} & \text{ if } U_1 = 4\,, \end{cases} $$

with \( U_1 \) drawn uniformly from \( \{0,1,\dots,7\} \).
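The moment matching is easy to verify by enumerating the eight equally likely values of \( U_1 \). The Go sketch below does just that; the function names are only for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// fiveMoment maps u in {0,...,7} to the discrete variable xi defined above.
func fiveMoment(u int) float64 {
	small := math.Sqrt(1 - math.Sqrt(6)/3)
	large := math.Sqrt(1 + math.Sqrt(6))
	switch {
	case u < 3:
		return small
	case u == 3:
		return large
	case u == 4:
		return -large
	default: // u > 4
		return -small
	}
}

// moment returns E[xi^p] when u is uniform on {0,...,7}.
func moment(p int) float64 {
	sum := 0.0
	for u := 0; u < 8; u++ {
		sum += math.Pow(fiveMoment(u), float64(p))
	}
	return sum / 8
}

func main() {
	// The standard normal moments of order 1 to 5 are 0, 1, 0, 3, 0.
	for p := 1; p <= 5; p++ {
		fmt.Printf("E[xi^%d] = %.15f\n", p, moment(p))
	}
}
```

Drawing such a variable only requires three random bits, which explains why it is so much faster than inverting the normal cumulative distribution function.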

The advantage of the discrete variable is that it is much faster to generate. But there are some interesting side-effects. The first clue I found is a loss of accuracy on forward-start vanilla options.

By accident, I found a much more interesting side-effect: you cannot use the Brownian-Bridge variance reduction with the discrete random variable. This is very well illustrated by the case of a digital option in the Black model, for example with volatility 10% and a 3-month maturity, zero interest rate and dividends. For the following graph, I use 16000 Sobol paths composed of 100 time-steps.

Digital Call price with different random variables.

The “-BB” suffix stands for the Brownian-Bridge path construction, “Five” for five moments discrete variable and “AS241” for the inverse cumulative distribution function (continuous approach). As you can see, the price is discrete, and follows directly from the discrete distribution. The use of any random number generator with a large enough number of paths would lead to the same conclusion.

This is because with the Brownian-Bridge technique, the last point in the path, corresponding to the maturity, is sampled first, and the other path points are then filled in between the first and last points. But the digital option depends only on the value of the path at maturity, that is, on this last point. As this point follows our discrete distribution, the price of the digital option is a step function.

In contrast, for the incremental path construction, each point is computed from the previous point. The last point will thus include the variation of all points in the path, which will be very close to normal, even with a discrete distribution per point.
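The difference between the two constructions can be seen by simply counting the distinct values taken by the Brownian path at maturity. The Go sketch below is an illustration under simplifying assumptions: it uses Go's pseudo-random generator instead of Sobol, looks only at the terminal point (which is all the digital payoff needs), and the function names are made up for the example.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// fiveMoment maps u in {0,...,7} to the five-moment discrete variable.
func fiveMoment(u int) float64 {
	small := math.Sqrt(1 - math.Sqrt(6)/3)
	large := math.Sqrt(1 + math.Sqrt(6))
	switch {
	case u < 3:
		return small
	case u == 3:
		return large
	case u == 4:
		return -large
	default:
		return -small
	}
}

// distinctTerminals simulates the terminal Brownian value W_T with both
// constructions and returns the number of distinct values observed.
func distinctTerminals(seed int64, paths, steps int, maturity float64) (bridge, incremental int) {
	rng := rand.New(rand.NewSource(seed))
	seenBridge := map[float64]bool{}
	seenIncr := map[float64]bool{}
	dt := maturity / float64(steps)
	for p := 0; p < paths; p++ {
		// Brownian bridge: W_T is built first, from a single draw.
		seenBridge[math.Sqrt(maturity)*fiveMoment(rng.Intn(8))] = true
		// Incremental: W_T accumulates one draw per time-step.
		w := 0.0
		for i := 0; i < steps; i++ {
			w += math.Sqrt(dt) * fiveMoment(rng.Intn(8))
		}
		seenIncr[w] = true
	}
	return len(seenBridge), len(seenIncr)
}

func main() {
	b, i := distinctTerminals(1, 1000, 100, 0.25)
	fmt.Println("distinct terminal values with Brownian bridge:", b)
	fmt.Println("distinct terminal values with incremental construction:", i)
}
```

With the bridge, the terminal value can take at most four values whatever the number of paths, hence the staircase in the digital price.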

The takeaway for pricing more exotic derivatives (including forward-start options) with discrete random variables and the incremental path construction is that several intermediate time-steps (between payoff observations) are a must-have, however small the original time-step size is.

Furthermore, one can notice the discrete staircase even with a relatively small time-step, for example 1/32 (meaning 8 intermediate time-steps in our digital option example). I suppose this is a direct consequence of the digital payoff discontinuity. In Talay “Efficient numerical schemes for the approximation of expectations of functionals of the solution of a SDE, and applications” (which you can read by adding .sci-hub.cc to the URL host name), second order convergence is proven only if the payoff function and its derivatives up to order 6 are continuous. It seems natural that a discrete random variable imposes continuity conditions on the payoff that are not necessary with a continuous, smooth random variable: either the payoff or the distribution needs to be smooth.

© 2006-16 Fabien Creative Commons License This work is licensed under a Creative Commons Attribution 4.0 International License.