There are three important concepts to understand: probabilities, proportions, and distributions.
DEFINITIONS:
π Probability: The theoretical likelihood of an outcome. No data is given or collected; it is based on logic.
π Proportion: The observed relative frequency of an outcome. Data is collected and is considered categorical.
π Distribution: A distribution shows how the data is spread out and helps us understand patterns about the population; it can be shown graphically. Data collected is considered quantitative.
Longer explanation:
A distribution is a large collection of data values (of varying outcomes) organized to show how often each value occurs. These values can be used to calculate parameters like the mean, variance, or standard deviation. Overall, the data spread is used to find patterns, trends, and variability within a population.
Want to learn more about distributions? Link coming soon.
FOR COMPARISON:
Example 1: Max wants to understand the difference between probability and proportion. He knows that, in theory, each face of a fair die has an equal chance of being rolled: 1/6 (about 0.167, or 16.7%). But after several rolls, he found that the observed proportions from his trials did not exactly match the theoretical probabilities.
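Max's experiment can be sketched in a few lines of Python. This is only an illustration: the function name `observed_proportions`, the number of rolls (60), and the seed are assumptions chosen for the example, not part of the original scenario.

```python
import random
from collections import Counter

def observed_proportions(num_rolls, seed=None):
    """Roll a simulated fair die num_rolls times and return the
    observed proportion (count / num_rolls) for each face."""
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
    counts = Counter(rolls)
    return {face: counts[face] / num_rolls for face in range(1, 7)}

# Theoretical probability: based on logic, no data needed.
theoretical = 1 / 6

# Observed proportions: based on collected (here, simulated) data.
props = observed_proportions(60, seed=1)
for face, p_hat in props.items():
    print(f"face {face}: observed {p_hat:.3f} vs theoretical {theoretical:.3f}")
```

With a small number of rolls, the observed values usually differ from 0.167. Increasing `num_rolls` tends to bring them closer to the theoretical value.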
REQUIREMENTS AND WHEN TO USE EACH: PROBABILITY, PROPORTION, AND DISTRIBUTION
❗ Probability: Use when you need to find the theoretical outcome or calculate the likelihood of an event when no data has been collected.
Example: When you flip a coin, how likely is it that you will flip heads?
Requirements:
- 0 ≤ P(A) ≤ 1 (0 = the outcome cannot occur; 1 = the outcome is certain to occur. Think of it as a scale from 0% to 100%).
- The probabilities of all possible outcomes must add up to 1 (for a coin, P(heads) + P(tails) = 1).
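The two requirements above can be checked mechanically. The sketch below is a minimal illustration, assuming the outcomes are stored in a dictionary; the helper name `is_valid_probability_model` is hypothetical. It uses `fractions.Fraction` so that values like 1/6 add up to exactly 1 without rounding error.

```python
from fractions import Fraction

def is_valid_probability_model(probabilities):
    """Return True if every probability is between 0 and 1
    and all of them add up to exactly 1."""
    values = list(probabilities.values())
    in_range = all(0 <= p <= 1 for p in values)
    return in_range and sum(values) == 1

fair_coin = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
# Each value is in range, but the total is 6/5, so this is not valid.
broken_coin = {"heads": Fraction(3, 5), "tails": Fraction(3, 5)}

print(is_valid_probability_model(fair_coin))
print(is_valid_probability_model(fair_die))
print(is_valid_probability_model(broken_coin))
```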
❗ Proportion: Use when you need to find the frequency of an observed outcome based on collected data.
Example: After flipping a coin 100 times, heads appeared 43 times. What proportion of flips resulted in heads?
Requirements:
- 0 ≤ p ≤ 1 (0 means the outcome was never observed; 1 means the outcome was observed in every trial. Think of it as a scale from 0% to 100%).
- The proportions of all observed outcomes must add up to 1.
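The coin-flip example above (43 heads in 100 flips) works out as follows. This is a small sketch; the function name `sample_proportion` is an assumption, and the count of 57 tails is simply the remaining flips.

```python
def sample_proportion(x, n):
    """Return the proportion x / n, where x is the number of times
    the outcome was observed and n is the total number of trials."""
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    if not 0 <= x <= n:
        raise ValueError("x must be between 0 and n")
    return x / n

heads = sample_proportion(43, 100)   # 43 of 100 flips were heads
tails = sample_proportion(57, 100)   # the remaining 57 flips were tails

print(f"proportion of heads: {heads}")
print(f"proportion of tails: {tails}")
print(f"total: {heads + tails}")
```

The proportion of heads is 0.43, which is close to, but not the same as, the theoretical probability of 0.5 for a fair coin.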
❗ Distribution: Use when you want to understand the spread of a population, e.g., how variable the data is.
Example: After collecting an entire class's scores, how close together are they? Are they dispersed or clustered?
Distributions will be discussed in more depth in a different post: Link coming soon.
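As a rough preview, the class-score question can be explored with Python's built-in `statistics` module. The scores below are made up for illustration. Because the example treats the class as the entire population, the sketch uses the population versions of variance and standard deviation (`pvariance` and `pstdev`).

```python
import statistics
from collections import Counter

# Hypothetical scores for an entire class (the whole population).
scores = [72, 85, 85, 90, 64, 78, 85, 90, 95, 70]

mean = statistics.mean(scores)            # center of the data
variance = statistics.pvariance(scores)   # population variance
std_dev = statistics.pstdev(scores)       # population standard deviation
frequency = Counter(scores)               # how often each value occurs

print(f"mean: {mean}")
print(f"variance: {variance:.2f}")
print(f"standard deviation: {std_dev:.2f}")
for score in sorted(frequency):
    print(f"{score}: {'*' * frequency[score]}")
```

A small standard deviation means the scores are clustered near the mean; a large one means they are dispersed.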
NOTATION:
Probability (THEORETICAL):
- P(A): Probability of A occurring.
- P(A'): Probability of A not occurring.
- A: Represents the outcome.
- 0 ≤ P(A) ≤ 1: Probability Range.
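Since A either occurs or does not occur, the two probabilities in this notation are linked by the complement rule: P(A') = 1 - P(A). A brief sketch, using a fair die as an assumed example and a hypothetical helper named `complement`:

```python
from fractions import Fraction

def complement(p_a):
    """Return P(A'), the probability that A does not occur."""
    if not 0 <= p_a <= 1:
        raise ValueError("P(A) must be between 0 and 1")
    return 1 - p_a

p_six = Fraction(1, 6)           # P(A): rolling a 6 on a fair die
p_not_six = complement(p_six)    # P(A'): rolling anything but a 6

print(f"P(A) = {p_six}, P(A') = {p_not_six}")
```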
Proportion (OBSERVED):
- p: Population proportion of an observed outcome.
- x: Number of successful outcomes.
- n: Total number of trials.
- p̂: Sample proportion, calculated as x / n (an estimate of the population proportion, p).
Distribution:
Link coming soon.
