Basic Probability Distributions in - PowerPoint Presentation

alyssa
Uploaded On 2023-11-03

Presentation Transcript

1. Basic Probability Distributions in R Programming
By Dr. Mohamed Surputheen

2. Probability Distributions in R
Many statistical tools and techniques used in data analysis are based on probability. Probability measures how likely it is for an event to occur on a scale from 0 (the event never occurs) to 1 (the event always occurs). A probability distribution describes how a random variable is distributed; it tells us which values a random variable is most likely to take on and which values are less likely.
R comes with built-in implementations of many probability distributions. Each probability distribution in R is associated with four functions which follow a naming convention:
- The d-prefix function calculates the probability density function (PDF) of a continuous probability distribution, or the probability mass function (PMF) of a discrete probability distribution, at a specific value of the random variable.
- The p-prefix function calculates the cumulative distribution function (CDF) of a probability distribution, which gives the probability of observing a value less than or equal to a given value of the random variable.
- The q-prefix function calculates the quantile of a probability distribution, which is the inverse of the CDF.
- The r-prefix function generates random numbers from a probability distribution.
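As a quick illustration of the d/p/q/r naming convention, the sketch below applies all four prefixes to the normal distribution (suffix `norm`). The variable names and the seed value are arbitrary choices made here, not part of the original slides.

```r
# The four prefixes applied to the standard normal distribution (mean 0, sd 1).
density_at_0 <- dnorm(0)     # d: PDF evaluated at x = 0
prob_below_0 <- pnorm(0)     # p: CDF, i.e. P(X <= 0)
median_value <- qnorm(0.5)   # q: quantile, the x with P(X <= x) = 0.5

set.seed(1)                  # arbitrary seed, used only for reproducibility
draws <- rnorm(5)            # r: five random values

print(c(density = density_at_0, cdf = prob_below_0, quantile = median_value))
print(draws)
```

The same pattern holds for other distributions by swapping the suffix, for example `binom` or `pois`.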

3. Binomial Distribution
The binomial distribution is a discrete probability distribution that describes the number of successes in a fixed number of independent trials, each with two possible outcomes (success or failure) and a constant probability of success.
A binomial experiment has the following properties:
- The experiment consists of n identical and independent trials.
- Each trial results in one of two outcomes: success or failure.
- P(success) = p and P(failure) = q = 1 - p for all trials.
- The random variable of interest, X, is the number of successes in the n trials.
X then has a binomial distribution with parameters n and p.

4. Binomial Distribution
If the probability of success in each trial is p, then the probability of getting exactly x successes in n trials is given by the binomial PMF:
$P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, \dots, n$
where
- n is the total number of trials,
- x is the number of successes,
- p is the probability of success on each trial, and q = 1 - p.
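The binomial PMF can be checked numerically against R's built-in dbinom(). The sketch below writes the formula out by hand; `binom_pmf` is a hypothetical helper name, and n = 5, p = 0.5 are illustrative values only.

```r
# Hypothetical helper: binomial PMF written out directly from the formula
# P(X = x) = choose(n, x) * p^x * (1 - p)^(n - x).
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 5          # illustrative number of trials
p <- 0.5        # illustrative success probability
x <- 0:n        # every possible number of successes

manual  <- binom_pmf(x, n, p)
builtin <- dbinom(x, size = n, prob = p)

print(cbind(x, manual, builtin))   # the two columns should agree
print(sum(manual))                 # a PMF sums to 1 over all x
```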

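5. Mean, Variance and Standard Deviation of the Binomial Distribution
X is the sum of n independent trials, each contributing mean p and variance pq, so:
The mean: $E(X) = p + p + \dots + p = np$
The variance: $V(X) = pq + pq + \dots + pq = npq$
The standard deviation: $\sqrt{V(X)} = \sqrt{npq}$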
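One way to convince yourself of these formulas is to compute the mean and variance directly from the PMF and compare them with n*p and n*p*q. This is a minimal sketch; the values n = 10 and p = 0.3 are assumptions chosen only for illustration.

```r
n <- 10          # illustrative number of trials
p <- 0.3         # illustrative success probability
q <- 1 - p

x   <- 0:n
pmf <- dbinom(x, size = n, prob = p)

mean_x <- sum(x * pmf)                 # E(X) computed from the PMF
var_x  <- sum((x - mean_x)^2 * pmf)    # V(X) computed from the PMF
sd_x   <- sqrt(var_x)

print(c(mean_from_pmf = mean_x, n_p = n * p))
print(c(var_from_pmf = var_x, n_p_q = n * p * q))
print(c(sd_from_pmf = sd_x, sqrt_n_p_q = sqrt(n * p * q)))
```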

6. Probability Computations Related to Binomial Distributions
R has several functions related to the binomial distribution. Here are some commonly used ones:
1. dbinom(x, size, prob) - Probability mass function (PMF) of the binomial distribution. Calculates the probability of getting exactly x successes in size trials, given a probability prob of success on each trial.
2. pbinom(q, size, prob) - Cumulative distribution function (CDF) of the binomial distribution. Calculates the probability of getting at most q successes in size trials, given a probability prob of success on each trial.
3. qbinom(p, size, prob) - Inverse CDF (quantile function) of the binomial distribution. Returns the smallest number of successes x such that the CDF at x is greater than or equal to p, given size trials and a probability prob of success on each trial. In other words, this function takes a probability and returns the smallest value whose cumulative probability reaches it.
4. rbinom(n, size, prob) - Random number generator for the binomial distribution. Generates n random samples from a binomial distribution with size trials and a probability prob of success on each trial.

7. Binomial probabilities using the dbinom() function in R
dbinom is the R function for the probability mass function of the binomial distribution. It gives the exact probability P(X = x).
The syntax is
dbinom(x, size, prob)
where
- x : the value(s) of the variable (number of successes),
- size : the number of trials, and
- prob : the probability of success on each trial.

8. Example: dbinom
A coin is tossed 5 times. What is the probability of getting exactly one head? What is the probability of getting exactly three heads?
To solve this problem in R, we can use the dbinom() function, which calculates the binomial probability mass function.
For getting one head:
    > dbinom(1, size=5, prob=0.5)
    [1] 0.15625
For getting three heads:
    > dbinom(3, size=5, prob=0.5)
    [1] 0.3125
Manual verification
The probability of getting a head on a single coin toss is 1/2, and the probability of getting a tail is also 1/2. To find the probability of getting a certain number of heads in 5 coin tosses, we can use the binomial probability formula:
$P(X = x) = \binom{n}{x} p^x q^{n-x}$
Given n = 5, p = 0.5, q = 0.5.
So for getting one head, we have:
P(X=1) = (5C1) * (1/2)^1 * (1/2)^4 = 5/32 = 0.15625
For getting three heads, we have:
P(X=3) = (5C3) * (1/2)^3 * (1/2)^2 = 10/32 = 5/16 = 0.3125

9. Binomial cumulative probability using the pbinom() function in R
The syntax to compute the cumulative distribution function (CDF) of the binomial distribution in R is
pbinom(q, size, prob)
where
- q : the value(s) of the variable,
- size : the number of trials, and
- prob : the probability of success on each trial.
The function returns the cumulative binomial probability P(X ≤ q) for the given value(s) of q, size and prob.

10. Example: pbinom
In a university, 45% of the students are female. A random sample of 10 students is selected. What is the probability that 2 or fewer female students are selected?
Answer: using pbinom(q, size, prob) with q = 2, size = 10 and prob = 0.45:
    > pbinom(2, 10, 0.45)
    [1] 0.09955965
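Since the CDF is a running sum of the PMF, the pbinom() answer for this example can be cross-checked by adding up dbinom() values. The sketch below also computes the complementary probability with the `lower.tail = FALSE` argument; the variable names are chosen here for illustration.

```r
size <- 10      # students sampled
prob <- 0.45    # probability that a sampled student is female

at_most_2   <- pbinom(2, size, prob)                       # P(X <= 2)
summed_pmf  <- sum(dbinom(0:2, size, prob))                # P(0) + P(1) + P(2)
more_than_2 <- pbinom(2, size, prob, lower.tail = FALSE)   # P(X > 2)

print(c(at_most_2 = at_most_2, summed_pmf = summed_pmf))
print(c(more_than_2 = more_than_2, total = at_most_2 + more_than_2))
```

Using `lower.tail = FALSE` avoids writing `1 - pbinom(...)` by hand when an upper-tail probability is needed.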

11. Binomial Distribution Quantiles using qbinom() in R
qbinom is the R function that calculates the inverse CDF (or quantiles) of the binomial distribution. As we know, this function takes a probability value and returns the smallest number whose cumulative probability is at least that value.
The syntax to compute the inverse CDF or quantiles of the binomial distribution in R is
qbinom(p, size, prob)
where
- p : the value(s) of the probabilities,
- size : the number of trials, and
- prob : the probability of success on each trial.
Note: qbinom is the inverse of the pbinom function. pbinom calculates the cumulative distribution function (CDF) of a binomial random variable, while qbinom calculates the inverse CDF, or quantile function, of the binomial distribution.
Difference between the CDF and the inverse CDF: The CDF gives the probability that a random variable takes on a value less than or equal to a given value. The inverse CDF does the opposite: it takes a probability as input and returns the value of the random variable that corresponds to that probability. Because the binomial distribution is discrete, not every probability is attained exactly by the CDF, so qbinom returns the smallest value whose cumulative probability is at least p.

12. Example problem that demonstrates the relationship between pbinom and qbinom
Suppose we flip a fair coin 10 times. What is the probability of getting 3 or fewer heads?
To solve this problem using pbinom, we can set n = 10 and p = 0.5 (since the coin is fair) and use the following code:
    > pbinom(3, 10, 0.5)
    [1] 0.171875
This returns a probability of approximately 0.1719, meaning there is a 17.19% chance of getting 3 or fewer heads in 10 coin flips.
To go in the other direction using qbinom, we again set n = 10 and p = 0.5 and pass in the probability:
    > qbinom(0.171875, 10, 0.5)   # takes a probability and returns the smallest value whose cumulative probability reaches it
    [1] 3
This returns a value of 3, which confirms that the probability of getting 3 or fewer heads is 0.171875. Here, we used qbinom to find the smallest k such that P(X ≤ k) ≥ 0.171875.
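Because the binomial distribution is discrete, most probabilities do not coincide with a CDF value, and qbinom() then returns the smallest k whose CDF reaches the requested probability. The sketch below illustrates this for the same fair-coin setting; the probability 0.5 is an illustrative choice.

```r
size <- 10     # coin flips
prob <- 0.5    # fair coin
p    <- 0.5    # illustrative target probability (not an exact CDF value)

k <- qbinom(p, size, prob)                 # smallest k with P(X <= k) >= p
cdf_at_k    <- pbinom(k, size, prob)       # CDF at k: reaches p
cdf_below_k <- pbinom(k - 1, size, prob)   # CDF one step lower: still below p

print(k)
print(c(cdf_below_k = cdf_below_k, target = p, cdf_at_k = cdf_at_k))
```

Here P(X ≤ 4) = 386/1024 is below 0.5 while P(X ≤ 5) = 638/1024 is above it, so the quantile is 5.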

13. Simulating a binomial random variable using the rbinom() function in R
The general R function to generate random numbers from a binomial distribution is
rbinom(n, size, prob)
where n is the sample size (the number of random values to generate), size is the number of trials, and prob is the probability of success on each trial.
Example: Generate 8 random values from a binomial distribution with 150 trials and a success probability of 0.4.
    > x <- rbinom(8,150,.4)
    > x
    [1] 61 51 54 54 56 62 62 48
Because the values are random, the output differs from run to run.
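A larger simulation makes the link to the theoretical mean n*p and variance n*p*q visible. This is a rough sketch; the seed and the number of draws (10,000) are arbitrary choices made here so the run is reproducible.

```r
set.seed(42)                                   # arbitrary seed for reproducibility
draws <- rbinom(10000, size = 150, prob = 0.4) # 10,000 simulated success counts

sample_mean <- mean(draws)
sample_var  <- var(draws)

# Theoretical values: mean = 150 * 0.4 = 60, variance = 150 * 0.4 * 0.6 = 36.
print(c(sample_mean = sample_mean, theoretical_mean = 150 * 0.4))
print(c(sample_var = sample_var, theoretical_var = 150 * 0.4 * 0.6))
```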

14. Poisson Distribution
The Poisson distribution is a discrete probability distribution that describes the probability of a given number of events occurring within a fixed interval of time or space, given the average rate of occurrence (λ) of those events.
The binomial distribution models the number of successes in a fixed number of independent trials, while the Poisson distribution models the number of occurrences in a fixed interval of time or space.

15. In 1837 the French mathematician Siméon Denis Poisson derived the distribution as a limiting case of the binomial distribution. It is named the Poisson distribution after him.
Conditions:
(i) The number of trials n is indefinitely large, i.e., n → ∞
(ii) The probability of success p for each trial is very small, i.e., p → 0
(iii) np = λ is finite
(iv) Events are independent
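The limiting relationship can be observed numerically: holding n*p fixed while n grows, the binomial PMF moves closer to the Poisson PMF. The sketch below uses an illustrative rate of 4 and three illustrative values of n; none of these numbers come from the slides.

```r
lambda   <- 4                    # fixed value of n * p (illustrative choice)
x        <- 0:10                 # counts at which the two PMFs are compared
n_values <- c(10, 100, 10000)    # increasing numbers of trials

# For each n, set p = lambda / n and record the largest PMF difference.
max_gap <- sapply(n_values, function(n) {
  max(abs(dbinom(x, size = n, prob = lambda / n) - dpois(x, lambda)))
})

print(data.frame(n = n_values, p = lambda / n_values, max_gap = max_gap))
```

The `max_gap` column shrinks as n increases and p decreases, which is the behaviour the conditions above describe.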

16. The random variable X is said to follow the Poisson probability distribution if its pmf is given by
$P(X = x) = p(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \dots$
where
- p(x) = the probability of x successes over a given period of time or space, given λ
- λ = the expected number of successes per unit of time or space, λ > 0
- e = 2.71828 (the base of natural logarithms)
The mean of the distribution is λ.
The variance of the distribution is also λ.
The standard deviation of the distribution is √λ.
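That the mean and variance of a Poisson distribution both equal its rate can be checked numerically from the PMF. In this sketch the rate 4 is an illustrative value, and the sum is truncated at x = 100, beyond which the remaining probability is negligible for this rate.

```r
lambda <- 4              # illustrative rate
x      <- 0:100          # truncated support; the tail beyond 100 is negligible here
pmf    <- dpois(x, lambda)

mean_x <- sum(x * pmf)                # E(X) from the PMF
var_x  <- sum((x - mean_x)^2 * pmf)   # V(X) from the PMF

print(c(mean = mean_x, variance = var_x, sd = sqrt(var_x)))
```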

17. Probability Computations Related to Poisson Distributions in R
In R, you can use the dpois(), ppois(), qpois(), and rpois() functions to work with the Poisson distribution.
1. dpois(x, lambda) calculates the probability mass function (PMF) of the Poisson distribution at a specific value of x, given a Poisson parameter lambda.
2. ppois(q, lambda) calculates the cumulative distribution function (CDF) of the Poisson distribution at a specific value of q, given a Poisson parameter lambda.
3. qpois(p, lambda) calculates the inverse cumulative distribution function (quantile function) of the Poisson distribution at a specific probability value p, given a Poisson parameter lambda.
4. rpois(n, lambda) generates n random samples from a Poisson distribution with a Poisson parameter lambda.

18. dpois
The dpois function calculates the probability mass function for a Poisson distribution, given a particular value x and a parameter lambda.
dpois(x, lambda)
- x: number of successes
- lambda: average rate of success
Example: Suppose a call center receives an average of 10 customer calls per hour. What is the probability that the call center will receive exactly 7 calls in the next hour?
Answer:
    > lambda <- 10
    > x <- 7
    > prob <- dpois(x, lambda)
    > prob
    [1] 0.09007923
To solve this problem manually, we can use the formula
$P(X = x) = p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$
where λ is the average number of events per interval (in this case, 10 customer calls per hour), x is the number of events we are interested in (in this case, 7 customer calls in the next hour), and e is the mathematical constant approximately equal to 2.71828.
P(X = 7) = (e^(-10) * 10^7) / 7!
= (0.0000454 * 10,000,000) / (7 * 6 * 5 * 4 * 3 * 2 * 1)
= 0.09008
Therefore, the probability of receiving exactly 7 calls in the next hour is approximately 0.090 or 9.0%.
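The manual calculation can be reproduced in R by coding the Poisson formula directly and comparing it with dpois(). `poisson_pmf` is a hypothetical helper name introduced only for this sketch.

```r
# Hypothetical helper: Poisson PMF written out from the formula
# P(X = x) = exp(-lambda) * lambda^x / x!
poisson_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

lambda <- 10   # average calls per hour, as in the example
x      <- 7    # number of calls of interest

manual  <- poisson_pmf(x, lambda)
builtin <- dpois(x, lambda)

print(c(manual = manual, builtin = builtin))   # the two values should agree
```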

19. ppois
In R, you can use the ppois function to calculate the cumulative distribution function (CDF) of the Poisson distribution. The CDF gives the probability of getting k or fewer events in a certain interval of time, given the average rate of events per unit time.
ppois(q, lambda)
- q: number of successes
- lambda: average rate of success
Example: It is known that a certain hospital experiences 4 births per hour on average. In a given hour, what is the probability that 4 or fewer births occur?
Answer: Using the Poisson distribution with λ = 4 and x = 4, we find that P(X ≤ 4) = 0.62884.
    > ppois(4,4)
    [1] 0.6288369
So the probability of 4 or fewer births in an hour is approximately 0.6288 or 62.88%.
To solve this problem manually, we can use the formula
$P(X = x) = p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$
where λ is the average rate of events per hour and x is the number of events. To find the probability of 4 or fewer births in an hour, we need to calculate the probabilities for x = 0, 1, 2, 3, and 4, and add them up:
P(0) = (4^0 * e^(-4)) / 0! = 0.0183
P(1) = (4^1 * e^(-4)) / 1! = 0.0733
P(2) = (4^2 * e^(-4)) / 2! = 0.1465
P(3) = (4^3 * e^(-4)) / 3! = 0.1954
P(4) = (4^4 * e^(-4)) / 4! = 0.1954
Therefore, the probability of 4 or fewer births in an hour is:
P(0 or 1 or 2 or 3 or 4) = P(0) + P(1) + P(2) + P(3) + P(4) ≈ 0.6288
which matches the ppois result. (Adding the rounded values above gives 0.6289; the small difference is due to rounding.)
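The term-by-term addition in this example can be done in R with dpois() and cumsum(), which avoids intermediate rounding. The sketch below also shows the complementary probability of more than 4 births via `lower.tail = FALSE`; the variable names are chosen here for illustration.

```r
lambda <- 4                         # average births per hour
pmf    <- dpois(0:4, lambda)        # P(0), P(1), ..., P(4)

running_total <- cumsum(pmf)        # running sums; the last entry is P(X <= 4)
cdf_at_4      <- ppois(4, lambda)   # same quantity from the built-in CDF
more_than_4   <- ppois(4, lambda, lower.tail = FALSE)   # P(X > 4)

print(round(pmf, 4))
print(running_total)
print(c(cdf_at_4 = cdf_at_4, more_than_4 = more_than_4))
```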

20. qpois
qpois is the R function that calculates the inverse cumulative distribution function (also known as the quantile function) of the Poisson distribution.
qpois(p, lambda)
- p: the probability value for which you want to find the inverse CDF
- lambda: the average rate of events per unit time
Example: Suppose that the number of people who visit a website in a day follows a Poisson distribution with a mean of 500 people. What is the smallest number k such that, with a probability of at least 95%, no more than k people visit the website in a day?
To solve this problem, we find the 0.95 quantile of the Poisson distribution with mean 500 using qpois:
    > qpois(0.95, 500)
    [1] 537
This means that P(X ≤ 537) ≥ 0.95: with a probability of at least 95%, at most 537 people visit the website in a day.
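What qpois() returns can be verified with ppois(): the result is the smallest k whose cumulative probability is at least p, so the CDF at k reaches p while the CDF at k - 1 does not. A minimal sketch using the website example, with variable names chosen here:

```r
lambda <- 500    # mean daily visitors, as in the example
p      <- 0.95   # requested cumulative probability

k <- qpois(p, lambda)                 # 0.95 quantile of the visitor count
cdf_at_k    <- ppois(k, lambda)       # P(X <= k), at least 0.95
cdf_below_k <- ppois(k - 1, lambda)   # P(X <= k - 1), below 0.95

print(k)
print(c(cdf_below_k = cdf_below_k, target = p, cdf_at_k = cdf_at_k))
```

Note that this is an upper bound on the count: it says how many visitors are not exceeded with probability 0.95, not a minimum number of visitors.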

21. rpois
rpois is a function in R that generates random numbers from a Poisson distribution with a specified mean. The function takes two arguments: the number of random numbers to generate (n) and the mean of the Poisson distribution (lambda).
rpois(n, lambda)
- n: number of random values to generate
- lambda: mean of the Poisson distribution
Example: Suppose we want to generate 10 random numbers from a Poisson distribution with a mean of 5.
    > rpois(10, 5)
    [1] 5 3 8 2 6 6 4 2 3 8

22. The Normal Distribution
In probability theory and statistics, the normal distribution, also called the Gaussian distribution, is the most significant continuous probability distribution.
The normal distribution is a bell-shaped, symmetrical distribution (the values to the left of the mean are a mirror image of the values to the right of the mean) in which the mean, median and mode are all equal. If the mean, median and mode are unequal, the distribution will be either positively or negatively skewed.
A continuous random variable X having the bell-shaped distribution is called a normal random variable.

23. A random variable X is said to have a normal distribution with mean μ and variance σ² if its probability density function is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
It is denoted by X ~ N(μ, σ²), where
- f(x) = value of the density at x
- π = 3.14159; e = 2.71828
- σ = population standard deviation (σ > 0)
- x = value of the random variable, -∞ < x < ∞
- µ = population mean, -∞ < μ < ∞

24. Properties of the Normal Distribution
- It is a continuous distribution.
- The normal distribution curve is bell-shaped.
- The mean, median, and mode are equal (Mean = Median = Mode = μ) and located at the center of the distribution.
- The normal distribution curve is unimodal (single mode).
- The curve is symmetrical about the mean, i.e., each half of the distribution is a mirror image of the other half.
- It is asymptotic to the horizontal axis. That is, it does not touch the x-axis and it goes on forever in each direction.
- The random variable x can take any value from -∞ to ∞.
- The total area under the normal distribution curve is equal to 1 or 100%, i.e., $\int_{-\infty}^{\infty} f(x)\,dx = 1$
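Two of these properties, total area 1 and symmetry about the mean, can be checked numerically. The sketch below uses the standard normal density via dnorm() and base R's integrate(); the test points in `x` are arbitrary values picked for illustration.

```r
# Total area under the standard normal curve, by numerical integration.
area <- integrate(dnorm, lower = -Inf, upper = Inf)
print(area$value)                    # should be very close to 1

# Symmetry about the mean (0 here): the density at -x equals the density at x.
x <- c(0.5, 1, 2)                    # arbitrary test points
print(cbind(x, left = dnorm(-x), right = dnorm(x)))

# The density peaks at the mean, which is also the mode.
peak_is_at_mean <- all(dnorm(0) > dnorm(x))
print(peak_is_at_mean)
```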

25. Standard Normal Distribution
The simplest case of a normal distribution is known as the standard normal distribution. This is the special case when µ = 0 and σ = 1, and it is described by the probability density function
$\varphi(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$
If X ~ N(µ, σ²), let Z = (X - µ) / σ [Z-transformation]; then E(Z) = 0 and V(Z) = 1, i.e., Z ~ N(0, 1). Z is said to have a standard normal distribution.
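The practical use of the Z-transformation is that any normal probability can be read off the standard normal CDF. The sketch below checks that P(X ≤ x) equals P(Z ≤ z) for z = (x - µ)/σ; the values µ = 50, σ = 10 and x = 65 are assumptions chosen only for illustration.

```r
mu    <- 50    # illustrative mean
sigma <- 10    # illustrative standard deviation
x     <- 65    # illustrative value of X

z <- (x - mu) / sigma                          # Z-transformation, here 1.5

p_direct <- pnorm(x, mean = mu, sd = sigma)    # P(X <= x) on the original scale
p_via_z  <- pnorm(z)                           # P(Z <= z) on the standard scale

print(z)
print(c(p_direct = p_direct, p_via_z = p_via_z))   # the two should agree
```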

26. Probability Computations Related to Normal Distributions in R
- dnorm: density function of the normal distribution
- pnorm: cumulative distribution function of the normal distribution
- qnorm: quantile function of the normal distribution
- rnorm: random sampling from the normal distribution

27. Normal probabilities using the dnorm() function in R
The function dnorm returns the value of the probability density function (pdf) of the normal distribution at a given value x of the random variable, for a population mean μ and population standard deviation σ. The syntax for using dnorm is as follows:
dnorm(x, mean, sd)
Example: The GRE (Graduate Record Examinations) is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103. Find the value of the density function at x = 550.
    > dnorm(550,544,103)
    [1] 0.00386666
Manual verification
The value of the density function at x = 550 is
$f(550) = \frac{1}{103\sqrt{2\pi}} e^{-\frac{(550-544)^2}{2 \cdot 103^2}} \approx 0.003867$
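The manual verification can also be carried out in R by coding the normal density formula and comparing it with dnorm(). `normal_pdf` is a hypothetical helper name used only in this sketch.

```r
# Hypothetical helper: normal PDF written out from the formula
# f(x) = exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
normal_pdf <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

manual  <- normal_pdf(550, mu = 544, sigma = 103)   # GRE example values
builtin <- dnorm(550, mean = 544, sd = 103)

print(c(manual = manual, builtin = builtin))        # the two should agree
```

Keep in mind that a density value is not a probability; for a continuous variable, probabilities come from areas under the curve, which is what pnorm() computes.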

28. Normal cumulative distribution function using the pnorm() function in R
The function pnorm returns the value of the cumulative distribution function (CDF) of the normal distribution at a given value q, for a population mean μ and population standard deviation σ. The syntax for using pnorm is as follows:
pnorm(q, mean, sd)
By definition, pnorm(x) = P(X ≤ x).
Example: The GRE (Graduate Record Examinations) is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103. Find the probability that a student in the psychology department has a score less than 480.
We need to find the probability P(X ≤ 480):
    > pnorm(480,544,103)
    [1] 0.2671816
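The same function handles upper-tail and interval probabilities. The sketch below extends the GRE example; the upper cut-off of 600 is a hypothetical value added here for illustration and does not appear in the original example.

```r
mu    <- 544   # mean GRE score in the example
sigma <- 103   # standard deviation in the example

below_480 <- pnorm(480, mu, sigma)                       # P(X <= 480)
above_600 <- pnorm(600, mu, sigma, lower.tail = FALSE)   # P(X > 600), hypothetical cut-off
between   <- pnorm(600, mu, sigma) - pnorm(480, mu, sigma)   # P(480 < X <= 600)

print(c(below_480 = below_480, between = between, above_600 = above_600))
print(below_480 + between + above_600)   # the three pieces cover everything
```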

29. Normal Distribution Quantiles using qnorm() in R
The function qnorm returns the value of the inverse cumulative distribution function (inverse CDF) of the normal distribution for a given probability p, a population mean μ and population standard deviation σ. The syntax for using qnorm is as follows:
qnorm(p, mean, sd)
qnorm is the inverse function of pnorm.
Example: Suppose that the heights of a certain population follow a normal distribution with a mean of 170 cm and a standard deviation of 5 cm. What is the height below which 90% of the population lies?
    > qnorm(0.9,170,5)
    [1] 176.4078
So the height below which 90% of the population lies is approximately 176.41 cm.
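The inverse relationship between qnorm() and pnorm() can be confirmed with a round trip, and the quantile can also be obtained by rescaling the standard normal quantile. A minimal sketch using the height example, with variable names chosen here:

```r
mu    <- 170   # mean height in cm
sigma <- 5     # standard deviation in cm

h90 <- qnorm(0.9, mean = mu, sd = sigma)         # 90th percentile of height
round_trip <- pnorm(h90, mean = mu, sd = sigma)  # feeding it back gives 0.9

# Equivalent calculation from the standard normal quantile: mu + sigma * z.
h90_via_z <- mu + sigma * qnorm(0.9)

print(c(h90 = h90, h90_via_z = h90_via_z, round_trip = round_trip))
```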

30. Simulating a normal random variable using the rnorm() function in R
rnorm is a function in R that generates random numbers from a normal distribution.
rnorm(n, mean, sd)
This function generates n random numbers from a normal distribution with the given mean and sd. If mean and sd are omitted, rnorm generates values from the standard normal distribution (mean 0, sd 1); the only required argument is n, the number of normal variates to produce.
Example: Generate 10 random numbers from a normal distribution with a mean of 5 and a standard deviation of 2:
    > rnorm( 10, 5, 2)
    [1] 7.448164 5.719628 5.801543 5.221365 3.888318 8.573826 5.995701 1.066766
    [9] 6.402712 4.054417
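With a larger sample, the simulated values should reflect the requested mean and standard deviation. The sketch below is one way to check this; the seed and the sample size of 10,000 are arbitrary choices made here so the run is reproducible.

```r
set.seed(123)                               # arbitrary seed for reproducibility
draws <- rnorm(10000, mean = 5, sd = 2)     # 10,000 simulated values

sample_mean <- mean(draws)
sample_sd   <- sd(draws)

print(c(sample_mean = sample_mean, target_mean = 5))
print(c(sample_sd = sample_sd, target_sd = 2))
```

Calling set.seed() before rnorm() makes the same values come back on every run, which is useful when results need to be reproduced.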