"There are
three degrees of untruth:
lies, damned lies, and statistics."
Whatever its
origin, this sentiment has a grain of truth: dubious
statistics are frequently used to bamboozle the unwary (see
examples here). On the other hand, we fall prey to our own
unconscious biases if we do not look at figures
properly.
A more positive approach to the topic is:
"There are
three degrees of near-truth:
guesses, guestimates, and statistics."
A quick experiment
Consider the following questions:
- How many Iban are there in the Kuching Division of
Sarawak?
- How many registered voters are there in the Bukit
Selambau constituency?
- What is the expenditure per employee of the bank
HSBC Malaysia Berhad?
First, guess the answer to each question.
Second, gather some easily accessible information
and use it to make a better guess, a guestimate, of each
number. For example:
- The total population of Sarawak is about 2 million;
Kuching is the most populous of 9 divisions; the Iban are one
of the largest ethnic groups in Sarawak.
- The total population of Malaysia is about 27 million
and there are 219 members of parliament.
- Make a rough breakdown of employee costs such as
salary, pension and Social Security costs, training,
etc. and add them up.
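The arithmetic behind the second guestimate can be sketched in a few lines of Python. This is only an illustration: the population and number of seats are the rough figures quoted above, while `assumed_registered_share` is a hypothetical value chosen for the example, not a figure from any source.

```python
def guestimate_voters(population, seats, registered_share):
    """Rough number of registered voters in an average constituency."""
    people_per_seat = population / seats
    return people_per_seat * registered_share


# Rough figures quoted in the text.
malaysia_population = 27_000_000
members_of_parliament = 219

# ASSUMPTION (hypothetical): the share of residents who are registered voters.
assumed_registered_share = 0.4

people_per_seat = malaysia_population / members_of_parliament
voters = guestimate_voters(
    malaysia_population, members_of_parliament, assumed_registered_share
)

print(f"People per seat (average): {people_per_seat:,.0f}")
print(f"Guestimated registered voters: {voters:,.0f}")
```

Changing the assumed share shows how sensitive the guestimate is to the one number we know least about.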
When you are satisfied that you have the best possible
guestimates, click
here to see the true values.
Third, consider how you would design a study to provide
a better estimate of each quantity if the true answer were
not available. In most cases you will need to collect data
about a sample of the people involved and then make an
inference about the true value. This requires statistics!
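The idea of inferring a true value from a sample can be tried out on a made-up population. In the sketch below everything is hypothetical: the population of 10,000 people, the 3,000 who belong to the group of interest, and the sample size of 500. Because we built the population ourselves, we can compare the estimate with the true count, which is not possible in a real study.

```python
import random


def estimate_group_size(population, sample_size, rng):
    """Estimate how many members of `population` are True,
    using a simple random sample drawn without replacement."""
    sample = rng.sample(population, sample_size)
    sample_proportion = sum(sample) / sample_size
    # Scale the sample proportion up to the whole population.
    return sample_proportion * len(population)


# Fixed seed so the example gives the same result on every run.
rng = random.Random(1)

# HYPOTHETICAL population: True marks a member of the group of interest.
population = [True] * 3_000 + [False] * 7_000
true_count = sum(population)  # known here only because we invented the data

estimate = estimate_group_size(population, 500, rng)

print(f"True count:      {true_count}")
print(f"Sample estimate: {estimate:.0f}")
```

Running it with different seeds gives a different estimate each time, which is why an estimate on its own is not enough.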
Main points:
- Each true value comes with a brief explanation of
what is being measured, in particular the date it refers
to, and the source of the information.
- The true values are referred to as parameters;
they are not statistics. In ecology we rarely know the
true values, and must use estimates inferred from
samples.
- Because our inferences are not exact, they must be
accompanied by an indication of precision, such
as standard errors or confidence intervals.
A statistic is not a 'true' value, but is an inference
made in a mathematically sound manner from a sample, and
accompanied by a statement of how precise we think it is.
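As a minimal sketch of what such a statement of precision looks like, the code below computes a sample mean, its standard error, and an approximate 95% confidence interval. The measurements are hypothetical, and the interval uses a normal approximation; for a sample this small a multiplier from the t-distribution would give a somewhat wider interval.

```python
import math
import statistics


def mean_with_ci(sample, confidence=0.95):
    """Return the sample mean, its standard error, and an approximate
    confidence interval based on the normal distribution."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    # Normal multiplier, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)


# HYPOTHETICAL measurements, invented for illustration only.
sample = [12.1, 9.8, 11.4, 10.7, 13.0, 10.2, 11.9, 12.5, 9.5, 11.1]

mean, se, (lower, upper) = mean_with_ci(sample)
print(f"Mean: {mean:.2f}")
print(f"Standard error: {se:.2f}")
print(f"Approximate 95% CI: {lower:.2f} to {upper:.2f}")
```

The mean is the estimate; the standard error and the interval tell the reader how far it might plausibly be from the parameter.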
What next?
For more on statistical inference look at the
"squirrels" exercise.