18 Probability in R

We will keep this short. Rather than covering all the concepts of probability, we will see how to calculate probabilities, densities and quantiles for nearly any type of distribution. R's workhorse here is a family of four functions for each distribution, collectively called the pqdr functions. The letters p, q, d and r are prefixes attached to an abbreviated distribution name: p gives the cumulative probability, q the quantile, d the density (or probability mass) and r random draws. Throughout, consider a random variable \(X\) that takes a value \(x\) with associated probability \(p\), i.e. \(P(X=x) = p\).

Distribution	P	Q	D	R
Beta	pbeta	qbeta	dbeta	rbeta
Binomial	pbinom	qbinom	dbinom	rbinom
Cauchy	pcauchy	qcauchy	dcauchy	rcauchy
Chi-Square	pchisq	qchisq	dchisq	rchisq
Exponential	pexp	qexp	dexp	rexp
F	pf	qf	df	rf
Gamma	pgamma	qgamma	dgamma	rgamma
Geometric	pgeom	qgeom	dgeom	rgeom
Hypergeometric	phyper	qhyper	dhyper	rhyper
Logistic	plogis	qlogis	dlogis	rlogis
Log Normal	plnorm	qlnorm	dlnorm	rlnorm
Negative Binomial	pnbinom	qnbinom	dnbinom	rnbinom
Normal	pnorm	qnorm	dnorm	rnorm
Poisson	ppois	qpois	dpois	rpois
Student t	pt	qt	dt	rt
Studentized Range	ptukey	qtukey	n/a	n/a
Uniform	punif	qunif	dunif	runif
Weibull	pweibull	qweibull	dweibull	rweibull
Wilcoxon Rank Sum Statistic	pwilcox	qwilcox	dwilcox	rwilcox
Wilcoxon Signed Rank Statistic	psignrank	qsignrank	dsignrank	rsignrank

All these functions are vectorised. Let us explore these one by one.
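As a small illustration of what "vectorised" means here, the sketch below evaluates the normal cdf at several points in a single call, and then recycles a vector of standard deviations against a single point. The variable names (`x`, `cdf_values`, `cdf_by_sd`) are chosen for illustration only.

```r
# Evaluate the normal cdf at several points in one call
x <- c(30, 40, 50, 60, 70)
cdf_values <- pnorm(x, mean = 50, sd = 10)
print(cdf_values)

# Parameters are vectorised too: one point, three different standard deviations
cdf_by_sd <- pnorm(40, mean = 50, sd = c(5, 10, 20))
print(cdf_by_sd)
```

The same pattern applies to the q, d and r functions, so there is rarely a need to loop over values.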

18.1 `p*()` set of functions

This set of functions gives the cumulative distribution function (cdf) of a distribution, i.e. the probability \(P(X \le x)\).

Example-1. What is the probability of a number being less than or equal to 25 in a normal distribution with mean = 50 and sd = 10?

pnorm(25, mean = 50, sd = 10)

## [1] 0.006209665

Conversely, the probability of a number being greater than or equal to 25 in the same distribution (for a continuous distribution this equals the probability of being strictly greater than 25) is:

# Either subtract the probability from 1
1 - pnorm(25, mean = 50, sd = 10)

## [1] 0.9937903

# Or set the lower.tail argument to FALSE
pnorm(25, mean = 50, sd = 10, lower.tail = FALSE)

## [1] 0.9937903
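Because `pnorm()` returns \(P(X \le x)\), the probability of falling inside an interval can be obtained as the difference of two cdf values. The following is a minimal sketch for the same distribution; the interval limits 40 and 60 (one standard deviation either side of the mean) are picked purely for illustration.

```r
# Probability that a value lies between 40 and 60 when mean = 50 and sd = 10
lower <- 40
upper <- 60
p_between <- pnorm(upper, mean = 50, sd = 10) - pnorm(lower, mean = 50, sd = 10)
print(p_between)

# The two ways of getting an upper-tail probability agree
upper_tail_a <- 1 - pnorm(25, mean = 50, sd = 10)
upper_tail_b <- pnorm(25, mean = 50, sd = 10, lower.tail = FALSE)
print(all.equal(upper_tail_a, upper_tail_b))
```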

Example-2: What is the probability of at most one head in two tosses of a fair coin (binomial distribution with size = 2 and prob = 0.5)?

pbinom(1, size = 2, prob = 0.5)

## [1] 0.75
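For a discrete distribution it matters whether the boundary value is included. `pbinom(1, ...)` returns \(P(X \le 1)\), whereas `lower.tail = FALSE` returns the strict upper tail \(P(X > x)\). The sketch below contrasts "at most one head" with "at least one head" for two tosses; both happen to equal 0.75 here only because the fair-coin distribution is symmetric.

```r
# P(X <= 1): at most one head in two tosses
p_at_most_one <- pbinom(1, size = 2, prob = 0.5)

# P(X >= 1) = P(X > 0): at least one head in two tosses
p_at_least_one <- pbinom(0, size = 2, prob = 0.5, lower.tail = FALSE)

print(c(at_most_one = p_at_most_one, at_least_one = p_at_least_one))
```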

18.2 `q*()` set of functions

This set of functions returns quantiles; the quantile function is the inverse of the cumulative distribution function. So if \(f\) is the cdf (cumulative distribution function) of a given probability distribution, then the quantile function \(F\) is the inverse of \(f\), i.e. \(F = f^{-1}\). These are related by

\[\begin{equation} p = f(x) \tag{18.1} \end{equation}\]

\[\begin{equation} x = F(p) = f^{-1}(p) \tag{18.2} \end{equation}\]

Example: In the same normal distribution as above (mean = 50 and sd = 10), what is the value below which 90% of the population lies?

qnorm(0.9, mean = 50, sd = 10)

## [1] 62.81552

As with the cdf, we can use the lower.tail argument to find the value above which a given proportion of the population lies.

qnorm(0.9, mean = 50, sd = 10, lower.tail = FALSE)

## [1] 37.18448
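The inverse relationship between the p and q functions can be checked directly. The sketch below feeds the result of `qnorm()` back into `pnorm()`, and also shows that an upper-tail quantile at 0.9 is the same as a lower-tail quantile at 0.1. For a discrete distribution the cdf is a step function, so `qbinom()` returns the smallest value whose cumulative probability reaches the requested level.

```r
# Round trip: quantile -> cumulative probability
x90 <- qnorm(0.9, mean = 50, sd = 10)
p_back <- pnorm(x90, mean = 50, sd = 10)
print(p_back)

# Upper-tail quantile at 0.9 equals lower-tail quantile at 0.1
q_upper <- qnorm(0.9, mean = 50, sd = 10, lower.tail = FALSE)
q_lower <- qnorm(0.1, mean = 50, sd = 10)
print(all.equal(q_upper, q_lower))

# Discrete case: smallest number of heads x with P(X <= x) >= 0.5
median_heads <- qbinom(0.5, size = 2, prob = 0.5)
print(median_heads)
```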

18.3 `d*()` set of functions

We saw that the p group gives the cdf and the q group the inverse cdf. The d group gives the probability density function of a continuous distribution, or the probability mass function of a discrete one. Put simply, it returns the height of the density (or, for a discrete distribution, the probability itself) at a given value of x.

For example, what is the probability of getting exactly 2 heads in two tosses of a fair coin (i.e. from a binomial distribution with size = 2 and probability p = 0.5)?

dbinom(2, 2, prob = 0.5)

## [1] 0.25
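To connect the d and p groups, the sketch below computes the full probability mass function for two coin tosses and confirms that its running total matches `pbinom()`. For a continuous distribution the value returned by `dnorm()` is a height rather than a probability; probabilities come from the area under the curve, which is approximated here with base R's `integrate()` over an interval chosen for illustration.

```r
# Discrete case: probabilities of 0, 1 and 2 heads
heads <- 0:2
pmf <- dbinom(heads, size = 2, prob = 0.5)
print(pmf)

# The cumulative sum of the pmf reproduces the cdf
cdf_from_pmf <- cumsum(pmf)
print(all.equal(cdf_from_pmf, pbinom(heads, size = 2, prob = 0.5)))

# Continuous case: the density height at the mean is not a probability
height_at_mean <- dnorm(50, mean = 50, sd = 10)
print(height_at_mean)

# Area under the density between 40 and 60 matches the cdf difference
area <- integrate(dnorm, lower = 40, upper = 60, mean = 50, sd = 10)$value
cdf_diff <- pnorm(60, mean = 50, sd = 10) - pnorm(40, mean = 50, sd = 10)
print(c(area = area, cdf_diff = cdf_diff))
```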

18.4 `r*()` set of functions

This set of functions generates random numbers from a statistical distribution. For example, to generate 10 random numbers from a normal distribution with mean = 50 and sd = 10, we can use rnorm. Since no seed is set here, the values will differ on each run.

rnorm(10, mean = 50, sd = 10)

##  [1] 33.10444 62.39496 48.91034 48.82758 51.83083 62.80555 32.72729 66.90184
##  [9] 55.03812 75.28337
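If the same random numbers are needed again, for instance to make an analysis reproducible, the random number generator can be seeded with `set.seed()` before drawing. A minimal sketch follows; the seed value 1234 is arbitrary.

```r
# Same seed, same draws
set.seed(1234)
first_draw <- rnorm(10, mean = 50, sd = 10)

set.seed(1234)
second_draw <- rnorm(10, mean = 50, sd = 10)

print(identical(first_draw, second_draw))
```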

We can check that the generated numbers follow the intended distribution by drawing a larger sample and plotting a histogram.

set.seed(1234)
hist(rnorm(10000, 50, 10), breaks = 50)

Figure 18.1: Histogram of random numbers generated from a normal distribution
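A numerical check can complement the visual one. The sketch below, which is one possible approach rather than the only one, compares the sample mean and standard deviation with the parameters used to generate the draws, compares an empirical proportion with the matching `pnorm()` value, and overlays the theoretical density on the histogram (this requires `freq = FALSE` so that the bars are on the density scale).

```r
set.seed(1234)
draws <- rnorm(10000, mean = 50, sd = 10)

# Sample summaries should be close to the generating parameters
sample_mean <- mean(draws)
sample_sd <- sd(draws)
print(c(mean = sample_mean, sd = sample_sd))

# Share of draws at or below 60, compared with the theoretical cdf
empirical_p <- mean(draws <= 60)
theoretical_p <- pnorm(60, mean = 50, sd = 10)
print(c(empirical = empirical_p, theoretical = theoretical_p))

# Histogram on the density scale with the theoretical curve on top
hist(draws, breaks = 50, freq = FALSE)
curve(dnorm(x, mean = 50, sd = 10), add = TRUE)
```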
