How to calculate the inverse of the normal cumulative distribution function in Python?

Question

How do I calculate the inverse of the cumulative distribution function (CDF) of the normal distribution in Python?

Which library should I use? Possibly scipy?

Do you mean the inverse Gaussian distribution (en.wikipedia.org/wiki/Inverse_Gaussian_distribution), or the inverse of the cumulative distribution function of the normal distribution (en.wikipedia.org/wiki/Normal_distribution), or something else? — Warren Weckesser, Dec 17, 2013 at 6:30
@WarrenWeckesser the second one: inverse of the cumulative distribution function of the normal distribution — Yueyoum, Dec 17, 2013 at 6:32
@WarrenWeckesser i mean the python version of "normsinv" function in excel. — Yueyoum, Dec 17, 2013 at 6:39

Warren Weckesser · Accepted Answer · 2022-08-15 14:05:56Z

NORMSINV (mentioned in a comment) is the inverse of the CDF of the standard normal distribution. Using scipy, you can compute this with the ppf method of the scipy.stats.norm object. The acronym ppf stands for percent point function, which is another name for the quantile function.

In [20]: from scipy.stats import norm

In [21]: norm.ppf(0.95)
Out[21]: 1.6448536269514722

Check that it is the inverse of the CDF:

In [34]: norm.cdf(norm.ppf(0.95))
Out[34]: 0.94999999999999996

By default, norm.ppf uses mean=0 and stddev=1, which is the "standard" normal distribution. You can use a different mean and standard deviation by specifying the loc and scale arguments, respectively.

In [35]: norm.ppf(0.95, loc=10, scale=2)
Out[35]: 13.289707253902945
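
As a minimal sketch of how loc and scale relate to the standard normal quantile, the snippet below assumes scipy is installed; the variable names are illustrative and not from the original answer. It also shows that a round trip through norm.cdf recovers the probability only when the same loc and scale are passed to both calls.

```python
from scipy.stats import norm

# Illustrative values (assumptions for this sketch)
p, mean, std = 0.95, 10, 2

# Quantile of a normal distribution with the given mean and standard deviation
x = norm.ppf(p, loc=mean, scale=std)

# The same value obtained by shifting and scaling the standard normal quantile
x_manual = mean + std * norm.ppf(p)

# The round trip needs the same loc and scale in cdf as were used in ppf
p_back = norm.cdf(x, loc=mean, scale=std)

print(x, x_manual, p_back)
```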

If you look at the source code for scipy.stats.norm, you'll find that the ppf method ultimately calls scipy.special.ndtri. So to compute the inverse of the CDF of the standard normal distribution, you could use that function directly:

In [43]: from scipy.special import ndtri

In [44]: ndtri(0.95)
Out[44]: 1.6448536269514722

ndtri is much faster than norm.ppf:

In [46]: %timeit norm.ppf(0.95)
240 µs ± 1.75 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [47]: %timeit ndtri(0.95)
1.47 µs ± 1.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
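
ndtri is a NumPy universal function, so it also accepts arrays. The following sketch (variable names are illustrative assumptions, and numpy is assumed to be installed alongside scipy) computes several standard normal quantiles in one call and rescales them by hand, since ndtri itself takes no loc or scale arguments.

```python
import numpy as np
from scipy.special import ndtri

# Probabilities at which to evaluate the standard normal quantile function
probabilities = np.array([0.025, 0.5, 0.975])

# Element-wise inverse CDF of the standard normal distribution
z = ndtri(probabilities)

# Rescale manually for a normal distribution with a different mean and std
mean, std = 10.0, 2.0
x = mean + std * z

print(z)
print(x)
```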

I always think "percent point function" (ppf) is a terrible name. Most people in statistics just use "quantile function". — William Zhang, Oct 4, 2014 at 0:44
Don't you need to specify the mean and the std on both ppf and cdf? — bones.felipe, Jan 29, 2021 at 19:23
@bones.felipe, the "standard" normal distribution has mean 0 and standard deviation 1. These are the default values for the location and scale of the scipy.stats.norm methods. — Warren Weckesser, Jan 29, 2021 at 19:55
Right, I thought I saw this norm.cdf(norm.ppf(0.95, loc=10, scale=2)) and I thought it was weird norm.cdf did not have loc=10 and scale=2 too, I guess it should. — bones.felipe, Jan 30, 2021 at 5:33

Xavier Guihot · Accepted Answer · 2019-03-19 21:58:57Z

Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module.

It can be used to get the inverse cumulative distribution function (inv_cdf - inverse of the cdf), also known as the quantile function or the percent-point function for a given mean (mu) and standard deviation (sigma):

from statistics import NormalDist

NormalDist(mu=10, sigma=2).inv_cdf(0.95)
# 13.289707253902943

This can be simplified for the standard normal distribution (mu = 0 and sigma = 1):

NormalDist().inv_cdf(0.95)
# 1.6448536269514715
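
One common use of the quantile function is inverse transform sampling: feeding uniform values on (0, 1) through inv_cdf yields normally distributed values. The sketch below is an illustration under stated assumptions, not part of the original answer; the seed, sample size, and variable names are arbitrary choices. Note that inv_cdf only accepts probabilities strictly between 0 and 1, so an exact 0.0 from the random generator is filtered out.

```python
import random
from statistics import NormalDist, fmean, stdev

# Target distribution (illustrative parameters)
target = NormalDist(mu=10, sigma=2)

# Uniform samples on [0, 1); the seed is fixed only to make the sketch repeatable
rng = random.Random(0)
uniform_samples = [rng.random() for _ in range(10_000)]

# inv_cdf requires 0 < p < 1, so drop the rare exact 0.0
uniform_samples = [u for u in uniform_samples if 0.0 < u < 1.0]

# Map each uniform value through the inverse CDF of the target distribution
normal_samples = [target.inv_cdf(u) for u in uniform_samples]

print(fmean(normal_samples), stdev(normal_samples))
```

The sample mean and standard deviation should land close to 10 and 2, with the usual sampling noise.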

Great tip! This allows me to drop the dependency on scipy, which I needed just for the single stats.norm.ppf method – Jethro Cao, Feb 21, 2020 at 16:56
can you use that to transform data with uniform distribution to normal ? – vanetoj, Mar 31, 2022 at 20:51

o0omycomputero0o · Accepted Answer · 2015-09-26 15:04:38Z

# Given a random variable X (house price) with population mean mu = 60 and standard deviation sigma = 40
import scipy as sc
import scipy.stats as sct
sc.version.full_version # 0.15.1

# a. Find P(X < 50)
sct.norm.cdf(x=50,loc=60,scale=40) # 0.4012936743170763

# b. Find P(X >= 50) using the survival function, sf = 1 - cdf
sct.norm.sf(x=50,loc=60,scale=40) # 0.5987063256829237

# c. Find P(60 <= X <= 80)
sct.norm.cdf(x=80,loc=60,scale=40) - sct.norm.cdf(x=60,loc=60,scale=40)

# d. What is the least that the most expensive 5% of houses cost? That is, find x where P(X >= x) = 0.05
sct.norm.isf(q=0.05,loc=60,scale=40)

# e. What is the most that the cheapest 5% of houses cost? That is, find x where P(X <= x) = 0.05
sct.norm.ppf(q=0.05,loc=60,scale=40)
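
For comparison, the same five quantities can be sketched with the standard library alone, assuming Python 3.8 or later. NormalDist has no sf or isf method, so the sketch uses the complements 1 - cdf(x) and inv_cdf(1 - q) in their place; the variable names are illustrative.

```python
from statistics import NormalDist

# House price model from the example: mean 60, standard deviation 40
prices = NormalDist(mu=60, sigma=40)

# a. P(X < 50)
p_below_50 = prices.cdf(50)

# b. P(X >= 50), the complement of the CDF (what scipy calls sf)
p_at_least_50 = 1 - prices.cdf(50)

# c. P(60 <= X <= 80)
p_60_to_80 = prices.cdf(80) - prices.cdf(60)

# d. x such that P(X >= x) = 0.05 (what scipy calls isf)
top_5_threshold = prices.inv_cdf(1 - 0.05)

# e. x such that P(X <= x) = 0.05
bottom_5_threshold = prices.inv_cdf(0.05)

print(p_below_50, p_at_least_50, p_60_to_80, top_5_threshold, bottom_5_threshold)
```

With these parameters the lower threshold comes out negative, which is a reminder that a normal model with such a large standard deviation puts some probability on negative prices.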

PS: You can assume 'loc' as 'mean' and 'scale' as 'standard deviation' — Suresh2692, Jul 5, 2017 at 11:11
