For those struggling to understand discrete probability distributions, this section will summarize the definition, clarify difficult jargon, introduce useful formulas, and find important parameters.
What is a Discrete Probability Distribution?
A discrete probability distribution is a spread of theoretical outcomes where each countable value of a random variable is assigned a specific probability.
It sounds confusing because it is confusing -- but, let's break it down by term:
• Discrete: The outcomes are countable values, typically whole numbers with nothing in between -- no fractions or decimals.
(1) The number of people (note: you cannot have half a person--hence countable).
• Probability: The theoretical chance of an outcome occurring. This means NO data collection is needed.
(1) A six-sided die (each side has a 1/6 chance of being the outcome).
(2) A coin -- heads or tails (each side has a 1/2 chance of being the outcome).
• Distribution: How the total probability is spread across the possible values of a random variable.
Other Related Jargon:
• Random Variable: A numerical value that represents the outcome of an experiment.
For example, if you're counting how many people in a room use social media, X (the random variable) could range anywhere from 0 to the total number of people in the room.
**Each value of the random variable is distinct and appears only once in the distribution (such as 1, 2, or 3 people in a room).
• Parameter: A numerical measurement used to describe a characteristic of a population. For example, the population mean, standard deviation, or variance.
**Population: The entire group being studied.
Requirements for Discrete Probability Distribution:
• Before using the formulas, these two requirements must be met:
1) Each probability must be between 0 and 1.
0 ≤ P(x) ≤ 1 for every value x.
2) The sum of all probabilities must equal 1.
∑P(x) = 1
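Both requirements are easy to check by hand, but a short script makes them concrete. The following is a minimal sketch, not a standard routine: the function name `is_valid_distribution` and the three example tables are made up for illustration, and probabilities are stored as exact fractions so the sum can be compared to 1 without rounding trouble.

```python
from fractions import Fraction

def is_valid_distribution(dist):
    """Check both requirements for a mapping of outcome -> probability."""
    probs = list(dist.values())
    each_in_range = all(0 <= p <= 1 for p in probs)  # requirement 1
    sums_to_one = sum(probs) == 1                    # requirement 2
    return each_in_range and sums_to_one

# Hypothetical example tables (the die and coin mirror the earlier examples).
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
coin = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
broken = {0: Fraction(1, 2), 1: Fraction(3, 4)}  # probabilities add up to 5/4

print(is_valid_distribution(fair_die))  # True
print(is_valid_distribution(coin))      # True
print(is_valid_distribution(broken))    # False
```

If the probabilities were stored as decimals instead, comparing the sum with `math.isclose` would be the safer choice, since decimal fractions rarely add up to exactly 1 in floating point.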
What are the formulas?
1) Mean: μ = ∑xP(x)
• Mean: The measure of the center of a data set.
-Another word for mean is expected value. The expected value is what we expect the random variable to be on average.
Understanding the formula:
x: The value of the random variable -- this represents the possible outcomes (e.g. dice rolls (faces: 1, 2, 3, 4, 5, 6), number of people (0, 1, 2, 3...)). NOTE: x is the number attached to an outcome, not the outcome itself, which is why it can be multiplied by a probability.
P(x): The probability of the outcome occurring.
► THE GOAL: Each value of the random variable is multiplied by its assigned probability, then all products are added up. The probability acts as a "weight" for the value, pulling the mean closer to the outcomes that are more likely.
Example: Let's say we rolled a six-sided die 100 times, and found that rolling a 1 happened 60% of the time. If we take that as the probability of rolling a 1, then 1 has the highest probability and will occur more often than any other number. This results in a mean that is pulled closer to 1.
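To see the weighting at work, here is a small sketch that applies μ = ∑xP(x) to a fair die and to a lopsided one. The lopsided table is an assumption made for illustration: the example only says that 1 comes up 60% of the time, so the remaining 40% is split evenly across faces 2 through 6. The helper name `mean` is also just a label chosen here.

```python
from fractions import Fraction

def mean(dist):
    """Expected value: add up x * P(x) for every value x."""
    return sum(x * p for x, p in dist.items())

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Assumption: P(1) = 3/5, and the other 2/5 is shared equally by faces 2..6.
loaded_die = {1: Fraction(3, 5)}
loaded_die.update({face: Fraction(2, 25) for face in range(2, 7)})

print(mean(fair_die))    # 7/2
print(mean(loaded_die))  # 11/5
```

The fair die has a mean of 3.5, while the lopsided die has a mean of 2.2, noticeably closer to 1.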
2) Variance: σ² = ∑(x - μ)² P(x)
• Variance: The measure of variety in a data set, and how spread out the data is. It is a hint to how far the data varies from the mean, but because it is in squared units, it is not a direct measure of that distance.
*A small variance means the data has less variety, and the values are more tightly packed around the mean (e.g. 1, 2, 3, 4 -- less extreme differences in the values). A large variance means the data is spread out and more extreme (e.g. 2, 5, 10, 12 -- bigger variety in numbers).
*INSIGHT: A small variance = the mean represents the values well; a larger variance = the mean is a less reliable summary of any single value.
Understanding the formula:
x - μ: This stands for the difference between a data point and the data's mean, also known as the deviation. By subtracting the mean, we can find out how far the data point lies from the data's center (i.e. the "distance" from the center).
∑(x - μ)²: The inner part, x - μ, is explained above, but now, why square it?
In short, squaring captures the variety in the data.
**In longer terms, the mean already accounts for the differences in the data values -- remember, we had to add all data values to get the mean. So, if we do not square, subtracting the mean from each data point and adding up the differences will always give zero, because the positive and negative differences cancel.
Squaring the differences avoids this cancellation by making every term non-negative. That is what lets the sum measure how spread out the data is.
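The cancellation can be written out in one line. This short derivation uses only the two facts already stated, μ = ∑xP(x) and ∑P(x) = 1, and shows that the unsquared deviations always add up to zero:

```latex
\sum (x - \mu)\,P(x) \;=\; \sum x\,P(x) \;-\; \mu \sum P(x) \;=\; \mu - \mu \cdot 1 \;=\; 0
```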
Example: There are three values: 3, 4, 8.
1) First, we must calculate the mean: (3 + 4 + 8) / 3 = 5.
2) Next, we compare with and without squaring:
►Without squaring: (3 - 5) + (4 - 5) + (8 - 5) = 0.
A result of 0 would wrongly suggest that there is no variability, even though the points clearly differ from the mean.
►With squaring: (3 - 5)² + (4 - 5)² + (8 - 5)² = 14.
14 is the sum of the squared differences. Weighting each term by P(x) = 1/3 (the three values being equally likely) gives a variance of 14/3 ≈ 4.67.
NOTE: The variance gives a rough sense of how spread out the data is around the mean. In the example, however, the value 14 doesn't mean the data points are 14 units away from the mean, because it is built from squared distances. Variance is more like a hint used to gauge the variety. Whether a variance counts as large depends on the scale of the data: the closer it is to 0, the more tightly the values sit around the mean, and the larger it is, the more spread out they are.
P(x): In this equation, P(x) represents the probability that the outcome will occur. Probability acts as a weight -- the more likely an outcome is, the more its squared difference counts toward the variance.
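The pieces of the variance formula can be checked on the three values from the example. This is a sketch under one assumption that the example leaves implicit: each of the values 3, 4, and 8 is treated as equally likely, so P(x) = 1/3. The variable names are chosen here for readability.

```python
from fractions import Fraction

values = [3, 4, 8]
# Assumption: the three values are equally likely, so each has P(x) = 1/3.
dist = {x: Fraction(1, 3) for x in values}

mu = sum(x * p for x, p in dist.items())                     # mean
plain_sum = sum(x - mu for x in dist)                        # differences cancel
squared_sum = sum((x - mu) ** 2 for x in dist)               # no cancellation
variance = sum((x - mu) ** 2 * p for x, p in dist.items())   # weighted by P(x)

print(mu)           # 5
print(plain_sum)    # 0
print(squared_sum)  # 14
print(variance)     # 14/3
```

The last two lines differ only by the weight P(x): the unweighted sum is 14, and multiplying each squared difference by 1/3 turns it into the variance, 14/3.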
3) Standard Deviation: σ = √(∑(x - μ)² P(x))
• Standard deviation: Like variance, standard deviation is a measure of variety in a data set, and the spread of that data. It is expressed in the same units as the data and tells how far values typically lie from the mean.
Understanding the formula:
The standard deviation formula is explained in much the same way as the variance formula, so if you want the breakdown, scroll up to variance.
The square root puts the standard deviation back into the original units of the data; e.g. if the data is measured in people, so is the standard deviation.
1) Variance provides a rough, squared-unit sense of the data's variability, whereas standard deviation shows how far values typically lie from the mean, using the same units as the data.
2) Standard deviation is usually the one reported in real-world scenarios, because it is in the data's own units and easy to interpret, while variance is more common in theoretical examples, where its squared form is easier to work with mathematically.
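As a quick illustration of the difference in units, the sketch below computes both measures for the values 3, 4, and 8. It assumes, as an illustration only, that the three values are equally likely; the variable names are arbitrary.

```python
import math

values = [3, 4, 8]
probabilities = [1 / 3, 1 / 3, 1 / 3]  # assumption: equally likely values

mu = sum(x * p for x, p in zip(values, probabilities))
variance = sum((x - mu) ** 2 * p for x, p in zip(values, probabilities))
std_dev = math.sqrt(variance)  # square root brings the units back

print(round(variance, 2))  # 4.67
print(round(std_dev, 2))   # 2.16
```

If the values were counts of people, the variance would be about 4.67 "people squared", which is hard to picture, while the standard deviation would be about 2.16 people.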