prob


prob, a C++ code which handles various discrete and continuous probability density functions (PDF).

For a discrete variable X, PDF(X) is the probability that the value X will occur; for a continuous variable, PDF(X) is the probability density of X, that is, the probability of a value between X and X+dX is PDF(X) * dX.
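
As a minimal sketch of the distinction, the following evaluates one discrete PDF (Poisson) and one continuous PDF (exponential), and approximates the probability of a small interval as PDF(X) * dX. The function names and the choice of distributions are illustrative assumptions; they are not taken from the prob source code.

```cpp
#include <cmath>

// Hypothetical helper names, chosen for illustration only.

// Discrete case: Poisson PDF, the probability that exactly x events occur
// when the expected number of events is lambda (lambda > 0).
double poisson_pdf_sketch(int x, double lambda)
{
  if (x < 0)
  {
    return 0.0;
  }
  // exp(-lambda) * lambda^x / x!, evaluated through logarithms.
  return std::exp(-lambda + x * std::log(lambda) - std::lgamma(x + 1.0));
}

// Continuous case: exponential PDF with rate lambda > 0.
// The returned value is a density, not a probability.
double exponential_pdf_sketch(double x, double lambda)
{
  if (x < 0.0)
  {
    return 0.0;
  }
  return lambda * std::exp(-lambda * x);
}

// Approximate probability of a value between x and x + dx: PDF(x) * dx.
double interval_probability_sketch(double x, double dx, double lambda)
{
  return exponential_pdf_sketch(x, lambda) * dx;
}
```

For a small dX, the value of interval_probability_sketch ( X, dX, lambda ) should be close to the exact interval probability exp(-lambda*X) - exp(-lambda*(X+dX)).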

The corresponding cumulative distribution functions, or "CDFs", are also handled. For a discrete or continuous variable, CDF(X) is the probability that the variable takes on a value less than or equal to X.
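
The definition can be sketched for both cases: a discrete CDF obtained by summing PDF values, and a continuous CDF available in closed form. The fair-die and normal examples, and all function names, are assumptions made for illustration rather than routines from prob.

```cpp
#include <cmath>

// Discrete example: a fair six-sided die, PDF(x) = 1/6 for x = 1..6, else 0.
double die_pdf_sketch(int x)
{
  return (1 <= x && x <= 6) ? 1.0 / 6.0 : 0.0;
}

// CDF(x) is the sum of PDF(k) over all possible values k <= x.
double die_cdf_sketch(int x)
{
  double cdf = 0.0;
  for (int k = 1; k <= x && k <= 6; ++k)
  {
    cdf += die_pdf_sketch(k);
  }
  return cdf;
}

// Continuous example: normal CDF with mean mu and standard deviation
// sigma > 0, written in terms of the complementary error function.
double normal_cdf_sketch(double x, double mu, double sigma)
{
  return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}
```

Here die_cdf_sketch ( 3 ) is the probability of rolling 1, 2 or 3, and normal_cdf_sketch ( mu, mu, sigma ) is 0.5 because the normal distribution is symmetric about its mean.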

In some cases, the inverse of the CDF can easily be computed. If


        X = CDF_INV ( P )
      
then we are asserting that the CDF evaluated at X equals P; in other words, the probability that the variable is less than or equal to X is P. If the CDF cannot be inverted analytically, the inverse can still be estimated by simple numerical methods. Depending on the PDF, these methods may be rapid and accurate, or not.
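
Both situations can be sketched briefly. The exponential CDF has a closed-form inverse, while the standard normal CDF does not, so its inverse is estimated here by bisection. Bisection is only one possible estimation method and is an assumption of this sketch, as are the function names; the text does not say which method prob uses.

```cpp
#include <cmath>

// Analytic inverse: for the exponential CDF 1 - exp(-lambda*x),
// X = -log(1 - P) / lambda, valid for 0 <= P < 1 and lambda > 0.
double exponential_cdf_inv_sketch(double p, double lambda)
{
  return -std::log1p(-p) / lambda;
}

// Standard normal CDF (mean 0, standard deviation 1).
double normal_01_cdf_sketch(double x)
{
  return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Numerical inverse by bisection, valid for 0 < P < 1.
double normal_01_cdf_inv_bisect_sketch(double p, double tol = 1.0e-12)
{
  double lo = -1.0;
  double hi = 1.0;
  // Widen the bracket until CDF(lo) <= P <= CDF(hi).
  while (normal_01_cdf_sketch(lo) > p) lo *= 2.0;
  while (normal_01_cdf_sketch(hi) < p) hi *= 2.0;
  // Halve the bracket, keeping the half that still contains the answer.
  while (hi - lo > tol)
  {
    double mid = 0.5 * (lo + hi);
    if (normal_01_cdf_sketch(mid) < p)
    {
      lo = mid;
    }
    else
    {
      hi = mid;
    }
  }
  return 0.5 * (lo + hi);
}
```

A quick consistency check is the round trip: applying the CDF to the value returned by the inverse should reproduce P to within the tolerance.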

For most distributions, the mean, also called the "average value" or "expected value", is available. For a discrete variable, MEAN is simply the sum of the products X * PDF(X) over all possible values X; for a continuous variable, MEAN is the integral of X * PDF(X) over the range. For the distributions covered here, the means are known beforehand, and no summation or integration is required.
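
To illustrate the discrete definition, the sketch below sums X * PDF(X) for a binomial distribution and compares the result with the known closed form N * P. The binomial example and the function names are assumptions for illustration, not prob routines.

```cpp
#include <cmath>

// Binomial PDF: probability of x successes in n trials, each succeeding
// with probability p, where 0 < p < 1.
double binomial_pdf_sketch(int x, int n, double p)
{
  if (x < 0 || n < x)
  {
    return 0.0;
  }
  // Log of the binomial coefficient C(n,x), via the log-gamma function.
  double log_cnx = std::lgamma(n + 1.0) - std::lgamma(x + 1.0)
                 - std::lgamma(n - x + 1.0);
  return std::exp(log_cnx + x * std::log(p) + (n - x) * std::log1p(-p));
}

// Mean from the definition: the sum of x * PDF(x) over x = 0..n.
double binomial_mean_by_sum_sketch(int n, double p)
{
  double mean = 0.0;
  for (int x = 0; x <= n; ++x)
  {
    mean += x * binomial_pdf_sketch(x, n, p);
  }
  return mean;
}

// Mean from the known formula; no summation is needed.
double binomial_mean_sketch(int n, double p)
{
  return n * p;
}
```

The two functions should agree to within rounding error, which is why a library can return the closed form directly.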

For most distributions, the variance is available. For a discrete variable, the variance is the sum of the products ( X - MEAN )^2 * PDF(X); for a continuous variable, the variance is the integral of ( X - MEAN )^2 * PDF(X) over the range. The square root of the variance is known as the standard deviation. For the distributions covered here, the variances are often known beforehand, and no summation or integration is required.
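
The continuous definition can be checked numerically. The sketch below integrates ( X - MEAN )^2 * PDF(X) for a normal distribution with the midpoint rule and compares the result with the known variance SIGMA^2. The integration range of 10 standard deviations on each side of the mean, the number of steps, and the function names are all assumptions of this sketch.

```cpp
#include <cmath>

// Normal PDF with mean mu and standard deviation sigma > 0.
double normal_pdf_sketch(double x, double mu, double sigma)
{
  const double pi = std::acos(-1.0);
  double z = (x - mu) / sigma;
  return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Variance from the definition: the integral of (x - mean)^2 * PDF(x),
// approximated by the midpoint rule on [mu - 10*sigma, mu + 10*sigma].
double normal_variance_by_quadrature_sketch(double mu, double sigma,
                                            int n = 20000)
{
  const double a = mu - 10.0 * sigma;
  const double h = 20.0 * sigma / n;
  double sum = 0.0;
  for (int i = 0; i < n; ++i)
  {
    double x = a + (i + 0.5) * h;
    double d = x - mu;
    sum += d * d * normal_pdf_sketch(x, mu, sigma);
  }
  return sum * h;
}

// Standard deviation: the square root of the variance.
double normal_std_by_quadrature_sketch(double mu, double sigma)
{
  return std::sqrt(normal_variance_by_quadrature_sketch(mu, sigma));
}
```

For MU = 1 and SIGMA = 2, the quadrature estimate should be very close to the known variance of 4.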

For many of the distributions, it is possible to repeatedly request "samples", that is, a pseudorandom sequence of realizations of the PDF. These samples are always associated with an integer seed, which controls the calculation. Using the same seed as input will guarantee the same sample value on output. Ultimately, a random number generator must be invoked internally. In most cases, the current code will call a routine called R8_RANDOM or I4_RANDOM, each of which in turn calls a routine called R8_UNIFORM_01. You may prefer a different random number generator for this purpose.
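
The role of the seed can be sketched with a small uniform generator and an exponential sampler built on it. The generator below is a Lehmer "minimal standard" generator, chosen only for illustration; the text does not say whether R8_UNIFORM_01 works this way, and the function names here are hypothetical.

```cpp
#include <cmath>

// Uniform value in (0,1) from a Lehmer generator:
// seed <- 16807 * seed mod (2^31 - 1).
// The seed must lie between 1 and 2147483646 and is updated on output.
double uniform_01_sketch(int &seed)
{
  const long long modulus = 2147483647LL;
  // 64-bit arithmetic avoids overflow in the product.
  seed = static_cast<int>((16807LL * seed) % modulus);
  return static_cast<double>(seed) / static_cast<double>(modulus);
}

// Exponential sample with rate lambda > 0, obtained by applying the
// inverse CDF, -log(1 - u) / lambda, to a uniform value u.
double exponential_sample_sketch(double lambda, int &seed)
{
  double u = uniform_01_sketch(seed);
  return -std::log1p(-u) / lambda;
}
```

Calling exponential_sample_sketch twice with the same input seed returns the same value, while passing the updated seed back in on each call produces a sequence of different samples.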

Licensing:

The information on this web page is distributed under the MIT license.

Languages:

prob is available in a C version, a C++ version, a Fortran77 version, a Fortran90 version, a MATLAB version, an Octave version, and a Python version.

Related Data and Programs:

prob_test

asa152, a C++ code which evaluates point and cumulative probabilities associated with the hypergeometric distribution; this is Applied Statistics Algorithm 152.

asa226, a C++ code which evaluates the CDF of the noncentral Beta distribution.

asa241, a C++ code which evaluates the percentage points of the normal distribution.

asa243, a C++ code which evaluates the CDF of the noncentral T distribution.

asa310, a C++ code which computes the CDF of the noncentral Beta distribution.

beta_nc, a C++ code which evaluates the CDF of the noncentral Beta distribution.

cdflib, a C++ code which evaluates the cumulative distribution function (CDF), inverse CDF, and certain other inverse functions, for distributions including beta, binomial, chi-square, noncentral chi-square, F, noncentral F, gamma, negative binomial, normal, Poisson, and Student's T, by Barry Brown, James Lovato, Kathy Russell.

discrete_pdf_sample_2d, a C++ code which demonstrates how to construct a Probability Density Function (PDF) from a table of sample data, and then to use that PDF to create new samples.

gsl_test, a C++ code which includes many routines for evaluating probability distributions.

log_normal, a C++ code which returns quantities associated with the log normal Probability Density Function (PDF).

log_normal_truncated_ab, a C++ code which returns quantities associated with the log normal Probability Density Function (PDF) truncated to the interval [A,B].

normal, a C++ code which samples the normal distribution.

random_data, a C++ code which generates sample points for various probability distributions, spatial dimensions, and geometries.

test_values, a C++ code which contains sample values for a number of distributions.

truncated_normal, a C++ code which works with the truncated normal distribution over [A,B], or [A,+oo) or (-oo,B], returning the probability density function (PDF), the cumulative distribution function (CDF), the inverse CDF, the mean, the variance, and sample values.

uniform, a C++ code which samples the uniform distribution.

ziggurat, a C++ code which generates points from a uniform, normal or exponential distribution, using the ziggurat method.

Reference:

  1. Roger Abernathy, Robert Smith,
    Algorithm 724,
    Program to Calculate F Percentiles,
    ACM Transactions on Mathematical Software,
    Volume 19, Number 4, December 1993, pages 481-483.
  2. Milton Abramowitz, Irene Stegun,
    Handbook of Mathematical Functions,
    National Bureau of Standards, 1964,
    ISBN: 0-486-61272-4,
    LC: QA47.A34.
  3. AG Adams,
    Algorithm 39: Areas Under the Normal Curve,
    Computer Journal,
    Volume 12, 1969, pages 197-198.
  4. Joachim Ahrens, Ulrich Dieter,
    Generating Gamma Variates by a Modified Rejection Technique,
    Communications of the ACM,
    Volume 25, Number 1, January 1982, pages 47-54.
  5. Joachim Ahrens, Ulrich Dieter,
    Computer Methods for Sampling from Gamma, Beta, Poisson and Binomial Distributions,
    Computing,
    Volume 12, 1974, pages 223-246.
  6. Joachim Ahrens, Klaus-Dieter Kohrt, Ulrich Dieter,
    Algorithm 599: Sampling from Gamma and Poisson Distributions,
    ACM Transactions on Mathematical Software,
    Volume 9, Number 2, June 1983, pages 255-257.
  7. Jerry Banks, editor,
    Handbook of Simulation,
    Wiley, 1998,
    ISBN: 0471134031,
    LC: T57.62.H37.
  8. JD Beasley, SG Springer,
    Algorithm AS 111: The Percentage Points of the Normal Distribution,
    Applied Statistics,
    Volume 26, 1977, pages 118-121.
  9. Frank Benford,
    The Law of Anomalous Numbers,
    Proceedings of the American Philosophical Society,
    Volume 78, 1938, pages 551-572.
  10. Jose Bernardo,
    Algorithm AS 103: Psi ( Digamma ) Function,
    Applied Statistics,
    Volume 25, Number 3, 1976, pages 315-317.
  11. Donald Best, Nicholas Fisher,
    Efficient Simulation of the von Mises Distribution,
    Applied Statistics,
    Volume 28, Number 2, pages 152-157.
  12. Donald Best, Roberts,
    Algorithm AS 91: The Percentage Points of the Chi-Squared Distribution,
    Applied Statistics,
    Volume 24, Number 3, 1975, pages 385-390.
  13. Paul Bratley, Bennett Fox, Linus Schrage,
    A Guide to Simulation,
    Second Edition,
    Springer, 1987,
    ISBN: 0387964673.
  14. William Cody,
    An Overview of Software Development for Special Functions,
    in Numerical Analysis Dundee, 1975,
    edited by GA Watson,
    Lecture Notes in Mathematics, 506,
    Springer, 1976.
  15. William Cody,
    Rational Chebyshev Approximations for the Error Function,
    Mathematics of Computation,
    Volume 23, Number 107, July 1969, pages 631-638.
  16. William Cody, Kenneth Hillstrom,
    Chebyshev Approximations for the Natural Logarithm of the Gamma Function,
    Mathematics of Computation,
    Volume 21, Number 98, April 1967, pages 198-203.
  17. BE Cooper,
    Algorithm AS 5: The Integral of the Non-Central T-Distribution,
    Applied Statistics,
    Volume 17, 1968, page 193.
  18. Luc Devroye,
    Non-Uniform Random Variate Generation,
    Springer, 1986,
    ISBN: 0387963057,
    LC: QA274.D48.
  19. Merran Evans, Nicholas Hastings, Brian Peacock,
    Statistical Distributions,
    Wiley, 2000,
    ISBN: 0471371246,
    LC: QA273.6E92.
  20. Nicholas Fisher,
    Statistical Analysis of Circular Data,
    Cambridge, 1993,
    ISBN: 0521568900,
    LC: QA276.F488.
  21. Nicholas Fisher, Toby Lewis, Brian Embleton,
    Statistical Analysis of Spherical Data,
    Cambridge, 2003,
    ISBN13: 978-0521456999,
    LC: QA276.F489.
  22. Darren Glass, Philip Lowry,
    Quasigeometric Distributions and Extra Inning Baseball Games,
    Mathematics Magazine,
    Volume 81, Number 2, April 2008, pages 127-137.
  23. John Hart, Ward Cheney, Charles Lawson, Hans Maehly, Charles Mesztenyi, John Rice, Henry Thatcher, Christoph Witzgall,
    Computer Approximations,
    Wiley, 1968,
    LC: QA297.C64.
  24. Geoffrey Hill,
    Algorithm 518: Incomplete Bessel Function I0: The Von Mises Distribution,
    ACM Transactions on Mathematical Software,
    Volume 3, Number 3, September 1977, pages 279-284.
  25. Ted Hill,
    The First Digit Phenomenon,
    American Scientist,
    Volume 86, July/August 1998, pages 358-363.
  26. Mark Johnson,
    Multivariate Statistical Simulation: A Guide to Selecting and Generating Continuous Multivariate Distributions,
    Wiley, 1987,
    ISBN: 0471822906,
    LC: QA278.J62.
  27. Norman Johnson, Samuel Kotz, Narayanaswamy Balakrishnan,
    Continuous Univariate Distributions,
    Second edition,
    Wiley, 1994,
    ISBN: 0471584940,
    LC: QA273.6.J6.
  28. Norman Johnson, Samuel Kotz, Adrienne Kemp,
    Univariate Discrete Distributions,
    Third edition,
    Wiley, 2005,
    ISBN: 0471272469,
    LC: QA273.6.J64.
  29. William Kennedy, James Gentle,
    Statistical Computing,
    Marcel Dekker, 1980,
    ISBN: 0824768981,
    LC: QA276.4 K46.
  30. Robert Knop,
    Algorithm 441: Random Deviates from the Dipole Distribution,
    Communications of the ACM,
    Volume 16, Number 1, January 1973, page 51.
  31. Kalimuthu Krishnamoorthy,
    Handbook of Statistical Distributions with Applications,
    Chapman and Hall, 2006,
    ISBN: 1-58488-635-8,
    LC: QA273.6.K75.
  32. Henry Kucera, Winthrop Francis,
    Computational Analysis of Present-Day American English,
    Brown University Press, 1967,
    LC: PE2839.K8.
  33. Kenneth Lange,
    Mathematical and Statistical Methods for Genetic Analysis,
    Springer, 1997,
    ISBN: 0387953892,
    LC: QH438.4.M33.L36.
  34. Alfred Lotka,
    The Frequency Distribution of Scientific Productivity,
    Journal of the Washington Academy of Sciences,
    Volume 16, Number 12, 1926, pages 317-324.
  35. KL Majumder, GP Bhattacharjee,
    Algorithm AS 63: The Incomplete Beta Integral,
    Applied Statistics,
    Volume 22, Number 3, 1973, pages 409-411.
  36. Kanti Mardia, Peter Jupp,
    Directional Statistics,
    Wiley, 2000,
    ISBN: 0471953334,
    LC: QA276.M335.
  37. Michael McLaughlin,
    A Compendium of Common Probability Distributions.
  38. Paul Nahin,
    Digital Dice: Computational Solutions to Practical Probability Problems,
    Princeton University Press, 2008,
    ISBN13: 978-0-691-12698-2,
    LC: QA273.25.N34.
  39. Keith Ord,
    Families of Frequency Distributions,
    Lubrecht & Cramer, 1972,
    ISBN: 0852641370.
  40. Donald Owen,
    Tables for Computing Bivariate Normal Probabilities,
    The Annals of Mathematical Statistics,
    Volume 27, Number 4, December 1956, pages 1075-1090.
  41. Frank Powell,
    Statistical Tables for Sociology, Biology and Physical Sciences,
    Cambridge University Press, 1982,
    ISBN: 0521284732,
    LC: QA276.25.S73.
  42. Sudarshan Raghunathan,
    Making a Supercomputer Do What You Want: High Level Tools for Parallel Programming,
    Computing in Science and Engineering,
    Volume 8, Number 5, September/October 2006, pages 70-80.
  43. Ralph Raimi,
    The Peculiar Distribution of First Digits,
    Scientific American,
    December 1969, pages 109-119.
  44. Reuven Rubinstein,
    Monte Carlo Optimization, Simulation and Sensitivity of Queueing Networks,
    Krieger, August 1992,
    ISBN: 0894647644,
    LC: QA298.R79.
  45. BE Schneider,
    Algorithm AS 121: Trigamma Function,
    Applied Statistics,
    Volume 27, Number 1, 1978, pages 97-99.
  46. BL Shea,
    Algorithm AS 239: Chi-squared and Incomplete Gamma Integral,
    Applied Statistics,
    Volume 37, Number 3, 1988, pages 466-473.
  47. Eric Weisstein,
    CRC Concise Encyclopedia of Mathematics,
    Second edition,
    CRC Press, 2002,
    ISBN: 1584883472,
    LC: QA5.W45.
  48. Michael Wichura,
    Algorithm AS 241: The Percentage Points of the Normal Distribution,
    Applied Statistics,
    Volume 37, Number 3, 1988, pages 477-484.
  49. Herbert Wilf,
    Some New Aspects of the Coupon Collector's Problem,
    SIAM Review,
    Volume 48, Number 3, September 2006, pages 549-565.
  50. ML Wolfson, HV Wright,
    Algorithm 160: Combinatorial of M Things Taken N at a Time,
    Communications of the ACM,
    Volume 6, Number 4, April 1963, page 161.
  51. JC Young, CE Minder,
    Algorithm AS 76: An Algorithm Useful in Calculating Non-Central T and Bivariate Normal Distributions,
    Applied Statistics,
    Volume 23, Number 3, 1974, pages 455-457.
  52. Daniel Zwillinger, Steven Kokoska,
    Standard Probability and Statistical Tables,
    CRC Press, 2000,
    ISBN: 1-58488-059-7,
    LC: QA273.3.Z95.

Source Code:


Last revised on 01 April 2020.