Basic definitions
Random events
Special events
The following operations on events are fundamental:
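$A \cap B$: $A$ and $B$
$A \cup B$: $A$ or $B$ (non-exclusive “or”)
$A \setminus B$: $A$ without $B$
$A^c$: the complementary event “not $A$”
$\bigcap_i A_i$: all $A_i$
$\bigcup_i A_i$: at least one $A_i$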
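The event operations listed above correspond directly to set operations. The following is a minimal sketch in Python, assuming a single roll of a six-sided die as the sample space; the names `omega`, `A`, `B` and the particular events are illustrative choices, not part of the text.

```python
# Sample space for one roll of a die (assumed example).
omega = frozenset(range(1, 7))

A = frozenset({2, 4, 6})   # "the roll is even"
B = frozenset({4, 5, 6})   # "the roll is at least 4"

a_and_b = A & B            # intersection: A and B
a_or_b = A | B             # union: A or B (non-exclusive)
a_without_b = A - B        # difference: A without B
not_a = omega - A          # complement: not A

events = [A, B, frozenset({6})]
all_of_them = frozenset.intersection(*events)   # all events occur
at_least_one = frozenset.union(*events)         # at least one event occurs

print(sorted(a_and_b), sorted(a_or_b), sorted(a_without_b), sorted(not_a))
print(sorted(all_of_them), sorted(at_least_one))
```

Using `frozenset` keeps events immutable, so they can themselves be collected in sets if a family of events is needed.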
Probability measure
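Consider a random experiment with sample space $\Omega$. Recall that we write $\mathcal{A}$ for the set of all possible events $A \subseteq \Omega$. A probability on $\Omega$ is a function $P \colon \mathcal{A} \to \mathbb{R}$ which assigns to every event $A$ a real number $P(A)$ and satisfies the following properties, known as the Axioms of Kolmogorov:
$$
\begin{aligned}
&P(A) \ge 0 \quad \text{for every event } A, \\
&P(\Omega) = 1, \\
&P(A_1 \cup A_2 \cup \dots) = P(A_1) + P(A_2) + \dots \quad \text{for pairwise disjoint events } A_1, A_2, \dots
\end{aligned}
$$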
We now gather some simple consequences that can be derived from the axioms of a probability measure.
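Standard consequences of the axioms include the complement rule $P(A^c) = 1 - P(A)$, $P(\emptyset) = 0$, monotonicity ($A \subseteq B$ implies $P(A) \le P(B)$), and the addition rule $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. The sketch below checks these four on a fair die. The uniform probability, the helper `prob`, and the chosen events are assumptions made for illustration only.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # fair die (assumed example)

def prob(event):
    """Uniform probability on omega: |event| / |omega|."""
    return Fraction(len(event & omega), len(omega))

A = frozenset({1, 2})
B = frozenset({1, 2, 3, 4})   # A is a subset of B
C = frozenset({2, 4, 6})

complement_rule = prob(omega - A) == 1 - prob(A)
empty_event = prob(frozenset()) == 0
monotone = prob(A) <= prob(B)
addition_rule = prob(B | C) == prob(B) + prob(C) - prob(B & C)

print(complement_rule, empty_event, monotone, addition_rule)
```

A numerical check on one example is of course not a proof; it only illustrates what the rules assert.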
Conditional probabilities
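In general, the conditional probability of an event $A$ given an event $B$ with $P(B) > 0$ is defined in the following manner:
$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)}.
$$
Rearranging this definition gives $P(A \cap B) = P(A \mid B)\,P(B)$. This rule is called the multiplication rule and generalizes to the intersection of three (and in fact more) events:
$$
P(A \cap B \cap C) = P(A \mid B \cap C)\,P(B \mid C)\,P(C).
$$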
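To illustrate the multiplication rule for three events, the following sketch enumerates all ordered draws of three balls from an urn and compares the chained product of conditional probabilities with the directly counted probability. The urn (3 red, 2 blue balls) and the helpers `prob`, `cond_prob`, and `red` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import permutations

# Assumed example: urn with 3 red and 2 blue balls, draw 3 without replacement.
balls = ["r1", "r2", "r3", "b1", "b2"]
omega = list(permutations(balls, 3))  # 60 equally likely ordered draws

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

def red(k):
    """Event: the k-th drawn ball (0-based) is red."""
    return lambda w: w[k].startswith("r")

first, second, third = red(0), red(1), red(2)

# Multiplication rule with A = third, B = second, C = first:
# P(A ∩ B ∩ C) = P(A | B ∩ C) * P(B | C) * P(C)
chain = (cond_prob(third, lambda w: second(w) and first(w))
         * cond_prob(second, first)
         * prob(first))
direct = prob(lambda w: first(w) and second(w) and third(w))

print(chain, direct)  # both 1/10
```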
The concept of conditional probabilities can be used to calculate probabilities in the following manner:
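Let $B_1, \dots, B_n$ be a finite partition of the sample space $\Omega$ (i.e., $\bigcup_{i=1}^{n} B_i = \Omega$, $B_i \cap B_j = \emptyset$ for $i \neq j$ and $P(B_i) > 0$ for every $i$).
If $A$ is an arbitrary event, we may write
$$
P(A) = P\Big(\bigcup_{i=1}^{n} (A \cap B_i)\Big) = \sum_{i=1}^{n} P(A \cap B_i).
$$
If we apply the multiplication rule $n$ times, we obtain the law of total probabilities:
$$
P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i).
$$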
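The law of total probabilities is a weighted sum and can be sketched in a few lines. The function name `total_probability` and the machine/defect numbers below are assumptions made for illustration; exact `Fraction` arithmetic is used so that the check that the partition probabilities sum to 1 is meaningful.

```python
from fractions import Fraction

def total_probability(priors, conditionals):
    """Return P(A) = sum_i P(A | B_i) * P(B_i) over a finite partition.

    priors[i] is P(B_i); conditionals[i] is P(A | B_i).
    """
    if len(priors) != len(conditionals):
        raise ValueError("need one conditional probability per partition set")
    if sum(priors) != 1:
        # Exact comparison; intended for Fraction inputs, not floats.
        raise ValueError("partition probabilities must sum to 1")
    return sum(c * p for p, c in zip(priors, conditionals))

# Assumed example: three machines produce 50%, 30%, 20% of all parts
# with defect rates 1%, 2%, 3%.
priors = [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)]
defect_rates = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

print(total_probability(priors, defect_rates))  # 17/1000
```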
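Another useful formula is named after Thomas Bayes. This formula is applied when we have to quantify the likelihood of a hypothesis $H$ given an event $A$ (i.e., instead of $P(A \mid H)$ we are interested in $P(H \mid A)$).
A simple application of the definition of conditional probabilities followed by a formal multiplication by one yields
$$
P(H \mid A) = \frac{P(H \cap A)}{P(A)} = \frac{P(H \cap A)}{P(H)} \cdot \frac{P(H)}{P(A)} = \frac{P(A \mid H)\,P(H)}{P(A)}.
$$
Now, if we partition $\Omega$ into the events $H$ and $H^c$ and calculate $P(A)$ using the law of total probabilities, we get Bayes’ rule:
$$
P(H \mid A) = \frac{P(A \mid H)\,P(H)}{P(A \mid H)\,P(H) + P(A \mid H^c)\,P(H^c)}.
$$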
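Bayes' rule for a hypothesis and its complement can be sketched as a small function. The name `bayes_two_hypotheses` and the diagnostic-test numbers are assumptions chosen for illustration and do not come from the text.

```python
from fractions import Fraction

def bayes_two_hypotheses(prior_h, p_a_given_h, p_a_given_not_h):
    """P(H | A) via Bayes' rule with the partition {H, not H}."""
    # Denominator P(A) from the law of total probabilities.
    evidence = p_a_given_h * prior_h + p_a_given_not_h * (1 - prior_h)
    if evidence == 0:
        raise ValueError("P(A) is zero; the conditional probability is undefined")
    return p_a_given_h * prior_h / evidence

# Assumed numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = bayes_two_hypotheses(
    Fraction(1, 100), Fraction(95, 100), Fraction(5, 100)
)
print(posterior)  # 19/118
```

With these assumed numbers, a positive test raises the probability of the hypothesis from 1% to about 16%, which shows how strongly a small prior can limit the posterior.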
Due to an Internet configuration error, packets sent from Zurich to Lausanne are routed through Bellinzona with probability 3/4. Given that a packet is routed through Bellinzona, suppose it has conditional probability 1/3 of being dropped. Given that a packet is not routed through Bellinzona, suppose it has conditional probability 1/4 of being dropped.
a) Find the probability that a packet is dropped. Answer: 5/16
b) Can you find the conditional probability that a packet is routed through Bellinzona given it is not dropped? Answer: 8/11
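Both answers can be checked with exact arithmetic. The sketch below applies the law of total probabilities for a) and Bayes' rule for b); the variable names are illustrative.

```python
from fractions import Fraction

p_bell = Fraction(3, 4)               # P(routed through Bellinzona)
p_drop_given_bell = Fraction(1, 3)    # P(dropped | Bellinzona)
p_drop_given_other = Fraction(1, 4)   # P(dropped | not Bellinzona)

# a) law of total probabilities
p_drop = p_drop_given_bell * p_bell + p_drop_given_other * (1 - p_bell)

# b) Bayes' rule applied to the event "not dropped"
p_ok_given_bell = 1 - p_drop_given_bell
p_bell_given_ok = p_ok_given_bell * p_bell / (1 - p_drop)

print(p_drop, p_bell_given_ok)  # 5/16 8/11
```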
In a game show, a contestant is told the rules as follows: There are three doors, labelled 1, 2, 3. A single prize is hidden behind one of them. You get to select one door. Initially your chosen door will not be opened. Instead, the gameshow host will open one of the two other doors, and he will do so in such a way as not to reveal the prize.
At this point, you will be given a fresh choice of door: you can either stick with your first choice, or you can switch to the other closed door. All the doors will then be opened and you will receive whatever is behind your final choice of door.
Imagine that the contestant chooses door 1 first; then the gameshow host opens door 3, revealing nothing behind the door, as promised. Should the contestant (a) stick with door 1, (b) switch to door 2, or (c) does it make no difference?
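Solution. Let $H_i$ denote the hypothesis that the prize is behind door $i$, $i = 1, 2, 3$. We make the following assumptions: The three hypotheses are equiprobable a priori, i.e.:
$$
P(H_1) = P(H_2) = P(H_3) = \frac{1}{3}.
$$
The datum we receive, after choosing door 1, is one of $D = 2$ and $D = 3$ (meaning door 2 or 3 is opened, respectively). If the prize is behind door 1 then the host has a free choice; in this case we assume that the host selects at random between $D = 2$ and $D = 3$. Otherwise the choice of the host is forced and the probabilities are 0 and 1: $$
\begin{aligned}
P(D = 2 \mid H_1) &= \tfrac{1}{2}, & P(D = 2 \mid H_2) &= 0, & P(D = 2 \mid H_3) &= 1, \\
P(D = 3 \mid H_1) &= \tfrac{1}{2}, & P(D = 3 \mid H_2) &= 1, & P(D = 3 \mid H_3) &= 0.
\end{aligned}
$$
Now, using Bayes’ rule, we evaluate the posterior probabilities of the hypotheses:
$$
P(H_i \mid D = 3) = \frac{P(D = 3 \mid H_i)\,P(H_i)}{P(D = 3)}
$$ {#eq-monty-hall}
The denominator is $P(D = 3) = \tfrac{1}{2} \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} = \tfrac{1}{2}$ by the law of total probabilities. Hence $$
P(H_1 \mid D = 3) = \frac{\tfrac{1}{2} \cdot \tfrac{1}{3}}{\tfrac{1}{2}} = \frac{1}{3}, \qquad
P(H_2 \mid D = 3) = \frac{1 \cdot \tfrac{1}{3}}{\tfrac{1}{2}} = \frac{2}{3}, \qquad
P(H_3 \mid D = 3) = \frac{0 \cdot \tfrac{1}{3}}{\tfrac{1}{2}} = 0.
$$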
So the contestant should switch to door 2 in order to have the biggest chance of getting the prize.
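The result can also be checked empirically. The following Monte Carlo sketch simulates the game under the stated assumptions (prize placed uniformly at random, host choosing at random when free to do so). The function names, the number of trials, and the seed are arbitrary choices. Note that it averages over both possible host actions rather than conditioning on door 3 being opened; by symmetry the winning frequencies are the same.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)
    first_choice = 1  # as in the text, the contestant picks door 1
    # The host opens a door that is neither the chosen one nor the prize door;
    # if two such doors exist, the host picks one of them at random.
    opened = rng.choice([d for d in doors if d != first_choice and d != prize])
    if switch:
        final = next(d for d in doors if d not in (first_choice, opened))
    else:
        final = first_choice
    return final == prize

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the winning probability of a strategy by simulation."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stick: ", win_rate(switch=False))   # close to 1/3
print("switch:", win_rate(switch=True))    # close to 2/3
```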
Bayes’ formula allows for a fruitful interpretation. Suppose that $D$ is observed data (e.g., measurements) and $\theta$ a specific set of parameters (e.g., measurement conditions). Then Bayes’ rule
$$
P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}
$$
tells us how probable the parameter set $\theta$ is in view of the measurements $D$: One has to multiply the likelihood $P(D \mid \theta)$ of the observed data given the parameter set by the a priori probability $P(\theta)$ and normalize the expression by $P(D)$. The likelihood $P(D \mid \theta)$ can be viewed as a function of $\theta$. It expresses how probable the observed data $D$ is for different parameter sets $\theta$.
In view of this interpretation, Bayes’ rule may be stated in words as follows:
$$
\text{posterior} \propto \text{likelihood} \times \text{prior}
$$
where all of these terms are viewed as functions of $\theta$.
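For a finite set of candidate parameter values, this statement becomes a short computation: multiply likelihood and prior pointwise, then normalize. The sketch below assumes a hypothetical coin whose head probability is one of three values and hypothetical data of two heads followed by one tail. The function name `posterior` and all numbers are illustrative.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Normalize likelihood * prior over a finite set of parameter values."""
    unnormalized = {theta: likelihood[theta] * prior[theta] for theta in prior}
    evidence = sum(unnormalized.values())   # P(D), by total probability
    return {theta: u / evidence for theta, u in unnormalized.items()}

# Assumed example: head probability theta is one of three values,
# observed data D = (heads, heads, tails).
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {theta: Fraction(1, 3) for theta in thetas}
likelihood = {theta: theta * theta * (1 - theta) for theta in thetas}

for theta, p in posterior(prior, likelihood).items():
    print(theta, p)
```

Under a uniform prior the posterior is simply the normalized likelihood, so the value of $\theta$ that makes the data most probable also receives the highest posterior probability.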