The Role of Erfcx and Erfc in the Black-Scholes Formula
This note compares special-function choices at the level that matters to the implied volatility solvers: the normalized Black beta price, not standalone erfcx(x).
For x = log(F/K) <= 0 and total volatility s = sigma sqrt(T), the beta-space OTM call price is
B(x, s) = 0.5 * (exp(x/2) * erfc(q1) - exp(-x/2) * erfc(q2))
q1 = -(x/s + s/2) / sqrt(2)
q2 = -(x/s - s/2) / sqrt(2)
The corresponding forward-normalized OTM-call price is c = B(x, s) / sqrt(ex), with ex = exp(x).
Implementations Compared
Default/current: Jaeckel Region I/II expansions and falls back to the Apache Commons mixederfc/erfcxformula in the middle region.Jaeckel acc libm+Commons: the same Jaeckel region-dispatched, included into the temporary harness, using systemerf/erfcand the local Apache Commonserfcx.Jaeckel acc Cody+Cody: the same Jaeckel region-dispatched formula, using Codyerf,erfc, anderfcxfromjaeckel = 0.2.0.Jaeckel acc Johnson+Johnson: the same Jaeckel region-dispatched formula, using Johnson/Faddeevaerf,erfc, anderfcxthrougherrorfunctions = 0.2.0.Jaeckel acc libm+CodyandJaeckel acc libm+Johnson: the same formula with systemerf/erfcfixed and onlyerfcxswapped, to isolate the scaled-erfc backend effect.Jaeckel I/II + pure Commons: Jaeckel Region I/II expansions are kept, but the Region III/middle fallback is replaced by the always-erfcx identity with the local Apache Commonserfcx.IG statmod upper: the inverse-Gaussian upper-tail representation of the Black beta price, using a scalar Rust port of the CRANstatmod1.5.2.pinvgaussmean-one upper-tail log-probability core, with the finallog(1-exp(r))evaluated by a Rustlog1mexphelper near cancellation. This is not a fullstatmodport: it omits R vectorization, NA/Inf handling, finite/infinite-mean special cases, and R’s exactpnormimplementation. The public-wrapper small-CV gamma approximation is also omitted because, under the Black mapping here, it would requiredispersion=-2/x < 1e-14, orx < -2e14, far outside the finiteFloat64x=log(ex)domain. No purpose-built Rust inverse-Gaussian crate was found incargo search inverse-gaussian/cargo search invgauss;statrs0.18.0 source also did not contain an inverse-Gaussian/Wald distribution implementation.Commons erfcx-mixed: the Cody/Jaeckel threshold formula with the local Apache Commons Numberserfcxport and systemerfc.Cody erfcx-mixed: the same threshold formula usingjaeckel = 0.2.0erfcx_cody.Johnson erfcx-mixed: the same threshold formula usingerrorfunctions = 0.2.0Johnson/Faddeevaerfcx.libm erfc direct: the direct formula above, using system Cerfcthrough FFI. Ruststddoes not provideerfcorerfcx.Commons/Cody/Johnson pure-erfcx: the always-erfcx identity0.5 * exp(-0.5*(h^2+t^2)) * (erfcx(q1) - erfcx(q2)), included to show why the mixed formula avoids some high-volatility tail problems.
Mixed Erfc/Erfcx Fallback Logic
The mixed formula starts from the direct beta-space Black expression:
2B = exp(x/2) * erfc(q1) - exp(-x/2) * erfc(q2)
with
h = x / s
t = s / 2
q1 = -(h + t) / sqrt(2)
q2 = -(h - t) / sqrt(2)
For either term, the stable tail identity is:
exp( x/2) * erfc(q1) = exp(-0.5 * (h^2 + t^2)) * erfcx(q1)
exp(-x/2) * erfc(q2) = exp(-0.5 * (h^2 + t^2)) * erfcx(q2)
because q1^2 = 0.5 * (h + t)^2, q2^2 = 0.5 * (h - t)^2, and x = h*s = 2*h*t. The mixed implementation uses that identity term by term instead of forcing both terms through one representation.
The branch threshold is Cody’s 0.46875, the same split used by the classic error-function approximations. Arguments below it are kept in the ordinary erfc representation; arguments above it use scaled erfcx so that the small tail probability is not represented as a nearly-underflowed erfc multiplied by a compensating exponential.
if q1 < 0.46875 and q2 < 0.46875:
2B = exp(x/2) * erfc(q1)
- exp(-x/2) * erfc(q2)
if q1 < 0.46875 and q2 >= 0.46875:
2B = exp(x/2) * erfc(q1)
- exp(-0.5 * (h^2 + t^2)) * erfcx(q2)
if q1 >= 0.46875 and q2 < 0.46875:
2B = exp(-0.5 * (h^2 + t^2)) * erfcx(q1)
- exp(-x/2) * erfc(q2)
if q1 >= 0.46875 and q2 >= 0.46875:
2B = exp(-0.5 * (h^2 + t^2)) * (erfcx(q1) - erfcx(q2))
On the valid Black domain, the third formal case is actually unreachable. Since t = s/2 >= 0,
q2 - q1 = sqrt(2) * t >= 0,
so q1 >= rho implies q2 >= rho as well. The erfcx, erfc branch would require q1 >= rho and q2 < rho at the same time; that would need negative t, not merely positive h = x/s. It is kept in the branch table only as the symmetric algebraic case.
So “mixed” does not mean that erfc is a fallback after erfcx fails. It means each of the two Black terms is routed to the representation that is locally better conditioned. Negative or central arguments stay with erfc, where erfc(q) is order one and cheap. Positive tail arguments switch to erfcx, where the scaled value remains order 1/q and avoids underflow.
This explains two patterns in the tables below. In the low-volatility OTM tail, direct erfc loses many ulps because it subtracts two tiny tail probabilities after ordinary scaling; the mixed erfcx versions are much better. In the high-volatility upper-price window, the pure-erfcx identity is worse because it exposes the cancellation in erfcx(q1) - erfcx(q2) even when ordinary erfc would have kept one side well behaved.
Reference And Local Windows
Reference values are Julia BigFloat at 512-bit precision, rounded back to Float64. ULP error is the positive-Float64 bit distance to that rounded reference. Each window contains 512 consecutive Float64 values in either s or x, with the other coordinate fixed.
| id | zone | axis | x | s |
|---|---|---|---|---|
| central-atm | central | s | -1.000000e-2 | 2.000000e-1 |
| tiny-near-atm | tiny-near-atm | s | -1.000000e-8 | 1.000000e-4 |
| erfc-threshold | branch-transition | s | -1.000000e-1 | 1.364500e-1 |
| moderate-otm | moderate-otm | s | -1.000000e0 | 5.000000e-1 |
| low-vol-otm | lower-tail | s | -1.000000e-1 | 2.000000e-2 |
| deep-otm | lower-tail | s | -3.000000e0 | 7.000000e-1 |
| far-otm | lower-tail | s | -6.000000e0 | 1.000000e0 |
| high-vol | upper-price | s | -1.000000e-2 | 1.000000e1 |
| very-high-vol | upper-price | s | -1.000000e0 | 5.000000e1 |
| near-overflow-vol | upper-price | s | -1.000000e0 | 7.500000e1 |
| x-central | x-window | x | -1.000000e-2 | 2.000000e-1 |
| x-lower-tail | x-window | x | -4.000000e0 | 8.000000e-1 |
Global Accuracy
| backend | samples | invalid | exact_bits | p50 | p95 | p99 | max | max_rel | worst_window |
|---|---|---|---|---|---|---|---|---|---|
| Default/current | 6144 | 0 | 2301 | 1 | 15 | 24 | 40 | 7.4e-15 | far-otm |
| Jaeckel acc libm+Commons | 6144 | 0 | 2301 | 1 | 15 | 24 | 40 | 7.4e-15 | far-otm |
| Jaeckel acc Cody+Cody | 6144 | 0 | 2291 | 1 | 14 | 22 | 40 | 7.4e-15 | far-otm |
| Jaeckel acc Johnson+Johnson | 6144 | 0 | 2296 | 1 | 15 | 25 | 42 | 7.8e-15 | far-otm |
| Jaeckel acc libm+Cody | 6144 | 0 | 2291 | 1 | 14 | 22 | 40 | 7.4e-15 | far-otm |
| Jaeckel acc libm+Johnson | 6144 | 0 | 2296 | 1 | 15 | 25 | 42 | 7.8e-15 | far-otm |
| Jaeckel I/II + pure Commons | 6144 | 0 | 1277 | 4 | 686 | 1202 | 1570 | 2.9e-13 | near-overflow-vol |
| Commons erfcx-mixed | 6144 | 0 | 1212 | 5 | 2823 | 3019 | 3068 | 5.2e-13 | tiny-near-atm |
| IG statmod upper | 6144 | 0 | 1158 | 7 | 6372 | 6568 | 7666 | 1.5e-12 | low-vol-otm |
| Cody erfcx-mixed | 6144 | 0 | 1180 | 6 | 2823 | 3019 | 3068 | 5.2e-13 | tiny-near-atm |
| Johnson erfcx-mixed | 6144 | 0 | 1161 | 6 | 2823 | 3019 | 3068 | 5.2e-13 | tiny-near-atm |
| libm erfc direct | 6144 | 0 | 1116 | 10 | 3049 | 5952 | 10329 | 2.0e-12 | low-vol-otm |
| Commons pure-erfcx | 6144 | 0 | 255 | 13 | 1776 | 1972 | 2021 | 3.4e-13 | tiny-near-atm |
| Cody pure-erfcx | 6144 | 0 | 174 | 15 | 9968 | 10164 | 10213 | 1.7e-12 | tiny-near-atm |
| Johnson pure-erfcx | 6144 | 0 | 257 | 19 | 59120 | 59316 | 59365 | 1.0e-11 | tiny-near-atm |
The headline is different from the raw erfcx comparison. In the normalized Black formula, the largest errors in these local windows are not the raw negative-erfcx overflow tail. They are tiny near-ATM and low-volatility OTM cancellation zones. The default is much better there because it uses Jaeckel Region I/II expansions before falling back to the mixed erfc/erfcx formula. Once that region-dispatched formula is kept, backend choice barely moves the result: all Jaeckel-accurate variants have global p50 = 1 ulp and max = 40 to 42 ulps on this sample set.
Jaeckel Accurate Backend Sensitivity
The Jaeckel accurate path is not just the mixed formula with a different erfcx. It routes first through:
- Region I: asymptotic expansion times the normalized vega. This path is independent of
erf,erfc, anderfcx. - Region II: small-
texpansion times the normalized vega. This path useserfcxonly in the centraly_primebranch; deeper left-taily_primeis rational. - Region III/middle: the mixed Cody/Jaeckel
erfc/erfcxformula described above.
That dispatch is why the special-function backend differences collapse compared with the generic mixed or pure-erfcx formulas.
Selected Jaeckel-accurate rows:
| backend | global p50 | p95 | p99 | max | tiny max | low-vol max | erfc-threshold max | far-otm max |
|---|---|---|---|---|---|---|---|---|
| Default/current | 1 | 15 | 24 | 40 | 1 | 17 | 2 | 40 |
| Jaeckel acc libm+Commons | 1 | 15 | 24 | 40 | 1 | 17 | 2 | 40 |
| Jaeckel acc Cody+Cody | 1 | 14 | 22 | 40 | 1 | 17 | 2 | 40 |
| Jaeckel acc Johnson+Johnson | 1 | 15 | 25 | 42 | 1 | 17 | 2 | 42 |
| Jaeckel acc libm+Cody | 1 | 14 | 22 | 40 | 1 | 17 | 2 | 40 |
| Jaeckel acc libm+Johnson | 1 | 15 | 25 | 42 | 1 | 17 | 2 | 42 |
The paired Cody rows match the libm+Cody rows in aggregate, and the paired Johnson rows match the libm+Johnson rows in aggregate. In these windows, the small residual backend sensitivity comes from erfcx, not from swapping the ordinary erf/erfc implementation. Cody is fractionally better than Commons by p95/p99 in the far-OTM-dominated aggregate; Johnson is fractionally worse, with a 42-ulp maximum instead of 40. Those differences are tiny next to the thousands of ulps lost by the non-Jaeckel mixed/direct formulas in the tiny near-ATM and low-vol OTM windows.
Jaeckel I/II With Pure-Erfcx Fallback
A tempting simplification is to keep Jaeckel Regions I and II, but replace the Region III/middle fallback with the pure-erfcx identity:
B = 0.5 * exp(-0.5 * (h^2 + t^2)) * (erfcx(q1) - erfcx(q2))
With Apache Commons erfcx, this is not as accurate as the default (Apache Commons erfc and erfcx with all Jaeckel zones) overall:
| backend | global p50 | p95 | p99 | max | tiny max | low-vol max | high-vol max | very-high max | near-overflow max |
|---|---|---|---|---|---|---|---|---|---|
| Default/current | 1 | 15 | 24 | 40 | 1 | 17 | 1 | 0 | 0 |
| Jaeckel acc libm+Commons | 1 | 15 | 24 | 40 | 1 | 17 | 1 | 0 | 0 |
| Jaeckel I/II + pure Commons | 4 | 686 | 1202 | 1570 | 1 | 17 | 60 | 745 | 1570 |
The simplification keeps the important Region II tiny near-ATM behavior and the Region I/II lower-tail behavior, so those windows match the default. The loss comes from middle/fallback cases, especially the high-volatility upper-price windows. There the pure-erfcx identity exposes cancellation in erfcx(q1) - erfcx(q2), while the mixed fallback keeps the locally better erfc representation for negative or central arguments.
It is not meaningfully faster in this diagnostic either. On the same optimized local-window timing run, the pure fallback row was 24.83 ns/call, compared with 24.80 ns/call for the same included Jaeckel formula with the mixed Commons fallback and 25.07 ns/call for the Default/current path. That difference is inside normal microbenchmark noise, and the generic pure-erfcx kernel was slower than the generic mixed Commons kernel (21.32 vs 19.80 ns/call).
Inverse Gaussian Formula Experiment
For x < 0, the Black beta price can be written as an inverse-Gaussian upper tail. In the statmod mean-one parameterization, let
q = -2x / s^2
dispersion = -2 / x
Y ~ IG(mean = 1, dispersion)
Then
B(x, s) = exp(x/2) * P(Y > q).
The statmod upper-tail formula is
P(Y > q) = Phi(-(q-1)/sqrt(dispersion*q))
- exp(2/dispersion) * Phi(-(q+1)/sqrt(dispersion*q)).
Substituting q=-2x/s^2 and dispersion=-2/x gives the ordinary Black expression in a different parameterization. The temporary Rust harness ports the statmod .pinvgauss core branch for this scalar mean-one upper tail, including its asymptotic right-tail override, and evaluates the normal log tails with the local Apache Commons erfcx support.
One Rust-side numerical improvement was kept: statmod writes the upper-tail combination as a + log1p(-exp(b-a)); the harness uses the equivalent a + log1mexp(b-a), switching to log(-expm1(b-a)) when b-a is close to zero. That slightly reduces cancellation in the tiny near-ATM and low-volatility OTM windows, but it does not change the conclusion.
The one public-wrapper branch that looks relevant at first glance is statmod’s small-CV gamma approximation:
if mean * dispersion < 1e-14:
pgamma(q, shape=1/(mean*dispersion), scale=(mean*dispersion)*mean)
For this Black identity mean=1 and dispersion=-2/x, so the branch would need x < -2e14. A finite Float64 normalized Black input derived from ex=exp(x)>0 has x >= log(5e-324) ~= -744.44. The gamma approximation therefore cannot trigger on the valid Black domain tested here; implementing it would not change these results.
After installing R, the temporary harness was checked against the actual R code by sourcing the CRAN statmod 1.5.2 R/invgauss.R file. The statmod package was not installed as an R library on this machine, so this is a source-level check of the same public R implementation rather than a namespace-loaded package check. On the 6144 Black-mapped local-window samples, R’s public pinvgauss(..., lower.tail=FALSE, log.p=TRUE) and the literal .pinvgauss core agree exactly in price space, confirming that no wrapper special case is active on this domain.
The remaining Rust-vs-R price differences come from the normal log-tail backend and the final cancellation. Comparing prices directly:
| comparison | exact | p50 | p95 | p99 | max | max_rel | worst |
|---|---|---|---|---|---|---|---|
| Rust log1mexp price vs R pinvgauss | 3020 | 1 | 648 | 2406 | 7348 | 1.4e-12 | low-vol-otm |
| Rust literal price vs R core literal | 4671 | 0 | 64 | 2461 | 7421 | 1.4e-12 | low-vol-otm |
| R wrapper vs R core literal price | 6144 | 0 | 0 | 0 | 0 | 0.0e0 | central-atm |
| R log1mexp core vs R literal price | 4115 | 0 | 648 | 648 | 648 | 1.1e-13 | tiny-near-atm |
Against the Julia BigFloat references, literal R/statmod does not improve the Black price accuracy versus the literal Rust port; it lands on the same conclusion as the Rust experiment:
| backend/formula | exact | p50 | p95 | p99 | max | max_rel | worst |
|---|---|---|---|---|---|---|---|
| Rust IG with log1mexp | 1158 | 7 | 6372 | 6568 | 7666 | 1.5e-12 | low-vol-otm |
| Rust literal statmod | 1174 | 7 | 7017 | 7212 | 7739 | 1.5e-12 | low-vol-otm |
| R pinvgauss literal statmod | 1691 | 8 | 7017 | 7214 | 7739 | 1.5e-12 | low-vol-otm |
| R core with log1mexp | 1665 | 8 | 6373 | 6570 | 7666 | 1.5e-12 | low-vol-otm |
So R fidelity is good for the relevant scalar statmod formula, and the only measurable Rust-side deviation kept here is the log1mexp substitution. It slightly improves the IG formula relative to literal R/statmod, but the formula remains far less accurate than the Jaeckel Region I/II expansions in the sensitive Black windows.
This representation is not Jaeckel-accurate on the sensitive windows:
| backend | global p50 | p95 | p99 | max | tiny max | low-vol max | high-vol max | very-high max | near-overflow max |
|---|---|---|---|---|---|---|---|---|---|
| Default/current | 1 | 15 | 24 | 40 | 1 | 17 | 1 | 0 | 0 |
| Commons erfcx-mixed | 5 | 2823 | 3019 | 3068 | 3068 | 593 | 1 | 0 | 0 |
| IG statmod upper | 7 | 6372 | 6568 | 7666 | 6608 | 7666 | 1 | 0 | 0 |
The inverse-Gaussian form is fine in the high-volatility upper-price windows, but it is worse than the mixed erfc/erfcx formula in the tiny near-ATM and low-volatility OTM cancellation windows. It is also slower in this scalar Rust port: 63.00 ns/call on the same local-window timing harness, compared with about 25 ns/call for the Jaeckel-dispatched default and about 20 ns/call for the generic Commons mixed formula.
Selected Windows

Black formula local window examples.
Selected 512-ULP window stats:
window = tiny-near-atm
| backend | p50 | p95 | p99 | max |
|---|---|---|---|---|
| Default/current | 0 | 1 | 1 | 1 |
| Commons erfcx-mixed | 2864 | 3048 | 3064 | 3068 |
| IG statmod upper | 6404 | 6588 | 6604 | 6608 |
| Cody erfcx-mixed | 2864 | 3048 | 3064 | 3068 |
| Johnson erfcx-mixed | 2864 | 3048 | 3064 | 3068 |
| libm erfc direct | 2864 | 3048 | 3064 | 3068 |
| Johnson pure-erfcx | 59161 | 59345 | 59361 | 59365 |
window = low-vol-otm
| backend | p50 | p95 | p99 | max |
|---|---|---|---|---|
| Default/current | 6 | 14 | 16 | 17 |
| Commons erfcx-mixed | 146 | 429 | 548 | 593 |
| IG statmod upper | 2299 | 5108 | 7130 | 7666 |
| Cody erfcx-mixed | 254 | 683 | 885 | 1345 |
| Johnson erfcx-mixed | 180 | 493 | 649 | 831 |
| libm erfc direct | 3526 | 6657 | 7242 | 10329 |
window = erfc-threshold
| backend | p50 | p95 | p99 | max |
|---|---|---|---|---|
| Default/current | 1 | 2 | 2 | 2 |
| Commons erfcx-mixed | 7 | 18 | 23 | 28 |
| IG statmod upper | 9 | 24 | 28 | 32 |
| Cody erfcx-mixed | 8 | 22 | 28 | 31 |
| Johnson erfcx-mixed | 20 | 43 | 57 | 74 |
| libm erfc direct | 7 | 21 | 27 | 34 |
window = high-vol
| backend | p50 | p95 | p99 | max |
|---|---|---|---|---|
| Default/current | 1 | 1 | 1 | 1 |
| Commons erfcx-mixed | 1 | 1 | 1 | 1 |
| IG statmod upper | 1 | 1 | 1 | 1 |
| Cody erfcx-mixed | 1 | 1 | 1 | 1 |
| Johnson erfcx-mixed | 1 | 1 | 1 | 1 |
| libm erfc direct | 1 | 1 | 1 | 1 |
| Johnson pure-erfcx | 39 | 56 | 56 | 57 |
Interpretation by region:
- Tiny near-ATM: backend choice is almost irrelevant for the mixed formula; all mixed/direct forms lose about 3000 ulps. The default Region II expansion stays within 1 ulp.
- Low-volatility OTM tail: direct
erfcis worst because it subtracts two close tail probabilities. The erfcx-mixed formulas are much better, and Commons is best among the three erfcx backends in this window. The default expansion is better again. - Branch transition: this is where backend differences are visible but modest. Commons and Cody are close; Johnson is weaker but still far from the direct-erfc lower-tail failure.
- Upper-price high-volatility: the mixed formula and direct-erfc formula are all within 1 ulp in this window. The pure-erfcx identity is worse because it exposes cancellation/rounding in
erfcx(q1) - erfcx(q2)instead of routing negative-side arguments througherfc.
Timing
Timing was measured on the same 6144 local-window samples, best of five passes with 2000 sweeps per pass, using cargo run --release.
| backend | ns/call |
|---|---|
| Default/current | 25.07 |
| Jaeckel acc libm+Commons | 24.80 |
| Jaeckel acc Cody+Cody | 26.55 |
| Jaeckel acc Johnson+Johnson | 25.39 |
| Jaeckel acc libm+Cody | 23.44 |
| Jaeckel acc libm+Johnson | 24.28 |
| Jaeckel I/II + pure Commons | 24.83 |
| Commons erfcx-mixed | 19.80 |
| IG statmod upper | 63.00 |
| Cody erfcx-mixed | 17.92 |
| Johnson erfcx-mixed | 19.27 |
| libm erfc direct | 26.89 |
| Commons pure-erfcx | 21.32 |
| Cody pure-erfcx | 19.20 |
| Johnson pure-erfcx | 19.59 |
These timings are for the formula kernels on this artificial local-window mix, not for the full IV solvers. The temporary Jaeckel backend rows include a small runtime selector used only by the diagnostic harness. The default is slower than the generic mixed formulas because it does more routing and uses the Region I/II expansion machinery, but the accuracy gain in tiny near-ATM and low-volatility OTM zones is large.
Takeaways
- A raw
erfcx(x)benchmark overstates the relevance of the negative-erfcx tail for the stable Black price. The mixed formula avoids much of that tail by usingerfcon negative-side arguments. - The most important Black-formula accuracy zones here are tiny near-ATM and low-volatility OTM, where formula structure dominates backend choice.
- For a generic erfcx-mixed Black formula, Apache Commons is the best of the three tested erfcx backends on the low-vol OTM window, Cody is usually close, and Johnson is somewhat weaker near the branch transition.
- For the full Jaeckel accurate formula, all tested backend combinations are essentially tied: global p50 = 1 ulp, max = 40 to 42 ulps, and the sensitive tiny near-ATM window stays within 1 ulp for every backend.
- In these sampled windows, swapping Cody or Johnson
erf/erfctogether witherfcxgives the same aggregate result as keeping libmerf/erfcand swapping onlyerfcx. - Keeping Jaeckel Regions I/II but using a pure Commons-erfcx fallback is simpler, but not accuracy-equivalent to the default: it matches the default in tiny near-ATM and lower-tail windows, then loses up to 1570 ulps in the high-volatility upper-price fallback region.
- The pure-erfcx fallback is not a clear speed win either; in this release-mode diagnostic it was effectively tied with the mixed Jaeckel fallback, while the generic pure-erfcx kernel was slower than the generic mixed Commons kernel.
- The inverse-Gaussian/statmod representation is a valid Black formula, but not an accuracy replacement for Jaeckel’s expanded evaluator here: even with a Rust
log1mexpcancellation tweak, it loses up to 6608 ulps in the tiny near-ATM window and 7666 ulps in the low-volatility OTM window, and the straightforward scalar port is much slower. - The default remains the accuracy-favored choice because the Jaeckel Region I/II expansions reduce errors from thousands of ulps to tens of ulps or better in the sensitive windows.
References
- For the Cody Netlib Slatec code Rational Chebyshev approximations for the error function by W. J. Cody, Math. Comp., 1969, PP. 631-638
- For the Apache Commons Number code, this issue description gives a good overview of the implementation, which is a mix of Boost rational functions and Cody rational function to obtain the best accuracy.
- Steven G. Johnson mentions in his code the following:
(…) I originally used an erfcx routine derived from DERFC in SLATEC, but I have since replaced it with a much faster routine written by me which uses a combination of continued-fraction expansions and a lookup table of Chebyshev polynomials. For speed, I implemented a similar algorithm for Im[w(x)] of real x, since this comes up frequently in the other error functions.
- TensorFlow makes use of a Chebyshev approximation from M. Shepherd and J. Laframboise, Chebyshev approximation of (1 + 2 * x) * exp(x^2) * erfc(x) (1981)