| Return to Statistics Topics | Return to masstutor.net Homepage |
When a sample space has to be processed in a certain way to create a more appropriate set of data, this process is called a random variable. A random variable maps the sample space onto another set, such as the real number line or an axis in the coordinate plane. This mapping process is also called a function. Random variables are usually denoted by capital letters such as X, Y or Z. A random variable X can be defined as a function that assigns a value X(s) to each outcome s in the sample space S.
An example would be a coin flipped three times. The outcomes would be the sample space S.
| S = { | TTT, | TTH, | THT, | HTT, | HTH, | HHT, | THH, | HHH | } |
| | s1 | s2 | s3 | s4 | s5 | s6 | s7 | s8 | |
Counting the number of heads that occur for each element would be a function of the sample space and therefore a random variable.
| 0 heads | 1 head | 2 heads | 3 heads |
| X(s1) = 0 | X(s2) = 1 | X(s5) = 2 | X(s8) = 3 |
| | X(s3) = 1 | X(s6) = 2 | |
| | X(s4) = 1 | X(s7) = 2 | |
As a result, the range of values that the random variable can have is:
X = 0, 1, 2, 3
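The mapping above can be sketched in a few lines of Python. This is only an illustration: the names `sample_space` and `count_heads` are chosen for this example and are not part of the original notes.

```python
from itertools import product

# Sample space for three coin flips: every length-3 string over {T, H}.
sample_space = ["".join(flips) for flips in product("TH", repeat=3)]

def count_heads(outcome):
    """The random variable X: maps an outcome such as 'THH' to its number of heads."""
    return outcome.count("H")

# Apply X to every outcome in S.
mapping = {s: count_heads(s) for s in sample_space}

# The range of X is the set of distinct values it takes.
range_of_x = sorted(set(mapping.values()))

print(mapping)
print(range_of_x)
```

Note that `product` lists the outcomes in a different order from the labels s1 to s8 used above, but the set of outcomes and the value X assigns to each one are the same.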
If the range of the random variable is a set of values that can be counted (a finite or countably infinite set), it is a discrete random variable.
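When probabilities are associated with the values of a random variable, the result is a probability density function, sometimes written as pdf. (For a discrete random variable, this function is often called a probability mass function, or pmf.)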
For example, the random variable X that counts the number of heads in each outcome has the following range of values: 0, 1, 2, 3. Since there are eight equally likely outcomes and only one of them has 0 heads, the probability density function gives 1/8 as the probability of getting an outcome with no heads. This would be written as:
P( X = 0 ) = 1/8
The probability for the other outcomes would be:
P( X = 1 ) = 3/8
P( X = 2 ) = 3/8
P( X = 3 ) = 1/8
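Plotted in the coordinate plane, with the values of X on the horizontal axis and the probabilities on the vertical axis, these are four points at heights 1/8, 3/8, 3/8 and 1/8.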
Notice all the probabilities add to 1.
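These probabilities can be reproduced by counting outcomes. The sketch below assumes a fair coin, so that all eight outcomes are equally likely. The variable names are illustrative, and exact fractions are used so that the sum comes out as exactly 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All eight outcomes of three coin flips.
sample_space = ["".join(flips) for flips in product("TH", repeat=3)]

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in sample_space)

# Assumption: a fair coin, so P(X = x) = (outcomes with x heads) / 8.
pdf = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

for x, p in pdf.items():
    print(f"P(X = {x}) = {p}")

total = sum(pdf.values())
print("total =", total)
```

The dictionary `pdf` holds 1/8, 3/8, 3/8 and 1/8 for x = 0, 1, 2, 3, and `total` is 1.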
If the range of a random variable consists of intervals containing an infinite number of values, then it is a continuous random variable. Continuous random variables also have pdfs. Note that the value of the pdf at a point is not itself a probability. It is the area under the pdf over an interval that gives a probability.
If Y is a continuous random variable with pdf $f_Y(y)$, where y is a value of the random variable, then the probability that Y falls in an interval is:
$P(a \le Y \le b) = \int_a^b f_Y(y)\,dy$
Where a and b are the endpoints of the interval.
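As a rough numerical illustration, the integral can be approximated with a midpoint rule. The pdf below is a hypothetical example that does not appear in the original notes: Y is taken to be uniform on [0, 0.5], so its density is 2 on that interval. The helper names `uniform_pdf` and `integrate` are likewise invented for this sketch.

```python
def uniform_pdf(y):
    """Hypothetical pdf: Y uniform on [0, 0.5], so the density is 2 there and 0 elsewhere."""
    return 2.0 if 0.0 <= y <= 0.5 else 0.0

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

prob = integrate(uniform_pdf, 0.1, 0.3)   # approximates P(0.1 <= Y <= 0.3)
total = integrate(uniform_pdf, 0.0, 0.5)  # area under the whole pdf

print(prob)
print(total)
```

Here the pdf takes the value 2, which cannot be a probability, while the area over [0.1, 0.3] is about 0.4 and the area over the whole range is about 1.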
If two random variables are defined on the same sample space, the pdf that describes both of them together is a joint pdf. For discrete random variables it is written as:
$P_{X,Y}(x, y)$ or $P(X = x, Y = y)$
Where X and Y are the random variables.
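A discrete joint pdf can be tabulated by counting outcomes, as in the sketch below. It reuses the three-flip sample space with X as the number of heads. The second random variable Y, which is 1 if the first flip is a head and 0 otherwise, is a hypothetical addition made only for this example, and a fair coin is assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

sample_space = ["".join(flips) for flips in product("TH", repeat=3)]

def X(s):
    """Number of heads in the outcome."""
    return s.count("H")

def Y(s):
    """Hypothetical second random variable: 1 if the first flip is a head, else 0."""
    return 1 if s[0] == "H" else 0

# Count the outcomes that produce each (x, y) pair.
pair_counts = Counter((X(s), Y(s)) for s in sample_space)

# Assumption: fair coin, so each outcome has probability 1/8.
joint = {pair: Fraction(n, len(sample_space)) for pair, n in sorted(pair_counts.items())}

for (x, y), p in joint.items():
    print(f"P(X = {x}, Y = {y}) = {p}")
```

Pairs that never occur, such as (3, 0), are simply absent from `joint` and have probability 0. The listed probabilities add to 1.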
For continuous random variables, a probability corresponds to the volume under the joint pdf over a region of the plane:
$P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f_{X,Y}(x, y)\,dy\,dx$
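The double integral can be approximated numerically in the same spirit. The joint pdf used here, f(x, y) = x + y on the unit square, is a hypothetical example chosen because it integrates to 1. The function names are invented for this sketch.

```python
def joint_pdf(x, y):
    """Hypothetical joint pdf: f(x, y) = x + y on the unit square, 0 elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def double_integral(f, a, b, c, d, steps=200):
    """Midpoint-rule approximation of the integral of f over [a, b] x [c, d]."""
    dx = (b - a) / steps
    dy = (d - c) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx
        for j in range(steps):
            y = c + (j + 0.5) * dy
            total += f(x, y)
    return total * dx * dy

prob = double_integral(joint_pdf, 0.0, 0.5, 0.0, 0.5)   # P(0 <= X <= 0.5, 0 <= Y <= 0.5)
whole = double_integral(joint_pdf, 0.0, 1.0, 0.0, 1.0)  # volume under the whole pdf

print(prob)
print(whole)
```

For this pdf the exact value of the first integral is 1/8, and the volume over the whole unit square is 1. The numerical results agree with both up to rounding error.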
If the random variables are related to independent events, they can be defined as independent random variables as follows:
X and Y are independent if and only if:
$f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$
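The factorization condition can be spot-checked numerically. The sketch below uses two hypothetical joint pdfs on the unit square, neither taken from the original notes: f(x, y) = 4xy, which factors as (2x)(2y), and f(x, y) = x + y, which does not factor. The marginals are obtained by integrating out the other variable with a midpoint rule. A check on a few grid points is an illustration, not a proof.

```python
def marginal_x(f, x, steps=1000):
    """f_X(x): integrate the joint pdf over y in [0, 1] (midpoint rule)."""
    dy = 1.0 / steps
    return sum(f(x, (j + 0.5) * dy) for j in range(steps)) * dy

def marginal_y(f, y, steps=1000):
    """f_Y(y): integrate the joint pdf over x in [0, 1] (midpoint rule)."""
    dx = 1.0 / steps
    return sum(f((i + 0.5) * dx, y) for i in range(steps)) * dx

def factorizes(f, grid, tol=1e-6):
    """Check f(x, y) == f_X(x) * f_Y(y) at every grid point."""
    return all(
        abs(f(x, y) - marginal_x(f, x) * marginal_y(f, y)) <= tol
        for x in grid
        for y in grid
    )

# Both pdfs are only evaluated inside the unit square, where they are defined.
def product_pdf(x, y):
    return 4.0 * x * y   # equals (2x) * (2y)

def sum_pdf(x, y):
    return x + y         # marginals are x + 0.5 and y + 0.5

grid = [0.1, 0.3, 0.5, 0.7, 0.9]
print(factorizes(product_pdf, grid))
print(factorizes(sum_pdf, grid))
```

The first check passes and the second fails: for example, at x = y = 0.1 the joint value of `sum_pdf` is 0.2 while the product of its marginals is 0.36.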