Frogs 2 : Likelihood

Objectives

In "Frogs 1 : binomial distribution" we simulated some data for detections of frogs at ponds where we knew frogs were present. Here we will use those data to explore the concepts of likelihood and maximum likelihood estimation.

Likelihood was developed by Ronald A. Fisher in the 1920s and it became a cornerstone of information theory in the 1960s. Quick-and-dirty methods to calculate maximum likelihood estimators are available for simple problems, but complex models had to wait until modern computers became available.

The table of probabilities and likelihoods was done in an Excel spreadsheet which you can download here.

Estimating detection probability

When we simulated data for the detection of frog calls at 10 ponds, participants 'detected' frogs at varying numbers of ponds, mostly between 5 and 9, but with one person detecting frogs at only 3 ponds. As a result, the estimates of detection probability, $\hat{p} = x/n$, also varied from person to person.

I 'detected' frogs at x = 6 ponds out of n = 10, and my estimate of p(detect) is $\hat{p} = 6/10 = 0.6$.

Likelihood

To answer that question, we need to look at the probability of detecting frogs at 6 ponds out of 10 with various values of p(detect). The table below shows the probabilities of the different outcomes (x = 0, 1, 2, ... 9, 10) for different values for p(detect). These were calculated using BINOMDIST in the Excel spreadsheet "Frogs2_likelihood.xls".
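
For readers without the spreadsheet, a table of this kind can be recomputed with a few lines of Python. This is only a sketch, not the author's spreadsheet: the function name `binom_pmf` is made up for illustration, and the choice of rows p = 0, 0.1, ..., 1 is an assumption about the table's layout.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Prob(X = x | p) for n independent trials (the non-cumulative binomial)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n = 10
# Hypothetical values of p for the rows (assumed spacing of 0.1).
p_values = [i / 10 for i in range(11)]

# table[i][x] holds Prob(X = x | p = p_values[i]).
table = [[binom_pmf(x, n, p) for x in range(n + 1)] for p in p_values]

# Print one row per value of p, one column per outcome x.
print("p    " + " ".join(f"x={x:<4d}" for x in range(n + 1)))
for p, row in zip(p_values, table):
    print(f"{p:.1f}  " + " ".join(f"{v:.3f} " for v in row))
```

Reading across a row gives a probability distribution over x for one fixed p; reading down a column gives the values for one fixed x across the different p.
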

Each row of the table gives the probability Prob(X = x | p) given a specific hypothetical value of p from 0 to 1 (the vertical bar, |, means "given"). Each row adds up to 1. In the previous lab ("Frogs 1...") we displayed a table with just the row for p = 0.7, i.e. Prob(X = x | p = 0.7); that's the row highlighted in yellow in the table above.

My result was x = 6. Look down the column corresponding to x = 6, which is highlighted in blue. Note that the values in the column do not add up to 1; this column adds up to 0.909. In the Excel spreadsheet, I put in extra rows for p = 0.55, 0.59, 0.61 and 0.65; with those extra rows the column total went up to 1.886. In fact, we can put in as many rows for hypothetical values of p as we like, and the column total will change each time.

The values in the highlighted column are the likelihoods of the different values of p for the observed result of x = 6. We use a curly L for likelihood:

$$\mathcal{L}(p \mid x = 6)$$

If both p and x are fixed, the likelihood and the probability are the same, as in the grey square in the table:

$$\mathcal{L}(p \mid x) = \text{Prob}(X = x \mid p)$$
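
The column totals quoted above can be checked with a short Python sketch. It assumes the original rows were p = 0, 0.1, ..., 1 (an assumption, though it reproduces both quoted totals), and the function name `likelihood` is illustrative only.

```python
from math import comb


def likelihood(p: float, x: int = 6, n: int = 10) -> float:
    """Likelihood of p given x detections at n ponds, taking C = 1."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


coarse = [i / 10 for i in range(11)]  # assumed original rows: p = 0, 0.1, ..., 1
extra = [0.55, 0.59, 0.61, 0.65]      # extra rows mentioned in the text

col_sum_coarse = sum(likelihood(p) for p in coarse)
col_sum_extended = sum(likelihood(p) for p in coarse + extra)

print(f"{col_sum_coarse:.3f}")    # 0.909
print(f"{col_sum_extended:.3f}")  # 1.886
```

Because the total depends on how many hypothetical values of p we choose to list, the likelihoods are not a probability distribution over p.
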

An analogy with streets and avenues

Think of a city such as New York where the roads are laid out in a grid. Avenues run north-south and Streets go east-west. Suppose you are at the intersection of 5th Avenue and 34th Street, right outside the Empire State Building. If you head west, you go along a Street; if you head north, it's an Avenue. In the same way, if you head west from the grey cell in the table, it's Probability; if you head north, it's Likelihood. But at the intersection, it's ... well, both.

Plots of probability and likelihood

The graphs below show the probability distribution of x for p = 0.6 (left) and the likelihood curve for p when x = 6 (right). The red line in each graph corresponds to the value when p = 0.6 and x = 6, which is the maximum likelihood, 0.251.

Maximum Likelihood Estimate

Now look down the x = 6 column (blue) and see which value of p has the maximum likelihood. You'll see it's p = 0.6. In the spreadsheet, I put in rows for p = 0.59 and 0.61, just to make sure that p = 0.6 really was the maximum. This value, $\hat{p} = 0.6$, is the maximum likelihood estimate (MLE) of p.

To put it the other way around, it's the value of p which maximizes the probability of observing x = 6, rather than other possible values of x. We can calculate the estimate just with

$$\hat{p} = \frac{x}{n} = \frac{6}{10} = 0.6$$
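
As a cross-check on reading the maximum off the table, the sketch below searches a fine grid of p values for the one with the highest likelihood and compares it with x / n. The grid spacing of 0.001 and the names `likelihood`, `p_hat_grid` and `p_hat_formula` are choices made for this illustration; a grid search is a swapped-in technique here, not the method used in the spreadsheet.

```python
from math import comb


def likelihood(p: float, x: int, n: int) -> float:
    """Likelihood of p given x detections at n ponds, taking C = 1."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


x, n = 6, 10

# Candidate values p = 0.000, 0.001, ..., 1.000 (spacing is an arbitrary choice).
grid = [i / 1000 for i in range(1001)]

# The grid value with the largest likelihood for the observed x.
p_hat_grid = max(grid, key=lambda p: likelihood(p, x, n))

# The simple estimate: proportion of ponds with detections.
p_hat_formula = x / n

print(p_hat_grid, p_hat_formula)
```

For this simple binomial case the two agree, so the search adds nothing over x / n; the point of the likelihood approach is that the same search idea still works when no simple formula exists.
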

Strictly speaking...

Strictly speaking, the likelihood for given x and p is proportional to the probability for those values, i.e.

$$\mathcal{L}(p \mid x) = C \times \text{Prob}(X = x \mid p)$$

where C is a constant for all values of p and x; we have simply taken C = 1. This works for discrete data such as we have in this example. In many cases, such as when using a probability density function, we can only calculate relative likelihoods, i.e. likelihoods up to the constant C.
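
One common way to work with relative likelihoods is to divide each likelihood by the maximum likelihood, so that the constant cancels. The sketch below illustrates this for the frog data; the function names and the particular values of C tried are arbitrary choices for the illustration, not part of the original spreadsheet.

```python
from math import comb


def prob(x: int, n: int, p: float) -> float:
    """Prob(X = x | p) for the binomial distribution."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def relative_likelihood(p: float, x: int, n: int, C: float = 1.0) -> float:
    """Likelihood of p divided by the likelihood at p_hat = x / n.

    C multiplies both numerator and denominator, so it cancels.
    """
    p_hat = x / n
    return (C * prob(x, n, p)) / (C * prob(x, n, p_hat))


# The ratio for p = 0.5 is the same whatever value of C is used.
for C in (1.0, 0.5, 42.0):
    print(C, round(relative_likelihood(0.5, 6, 10, C), 3))
```

Comparisons between values of p therefore do not depend on C, which is why taking C = 1 in the table does no harm.
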
Main points

What next?

For more on modeling occupancy, including using likelihood and AIC to select models, go here.

Text by Mike Meredith, updated 11 May 2009