Biodiversity indices

 R   Biodiversity scripts (zip, 22KB)

Making sense of biodiversity indices

Biodiversity indices purport to be a combined measure of species richness and species evenness. As we've seen, richness is often impossible to estimate, and evenness is difficult to define (more on that later). A large number of formulae have been proposed - see a partial list here.

An R script to try out a range of diversity indices, together with the functions and data sets you need, is here.

At one point I concluded that indices were just a mathematical carpet under which the problems with estimation of richness and evenness had been swept. However, Mark Hill has ideas which make sense of some of the indices.

Hill's diversity numbers

Hill (1973) noted that rare species - the ones that are so difficult to count - are ecologically less important than the common ones. (If that observation doesn't apply to the situation you have in mind, you need to (a) redefine the population so that it does apply, or (b) abandon the idea of counting species.)

Hill's 'diversity numbers' can be thought of as the number of common species in the population. But which count as 'common' and which are 'rare'?

Suppose we were brutal, and decided to look only at the most common species. The Berger-Parker index does that: it's just the proportion of individuals that belong to the commonest species. A patch of coniferous forest in Canada may be 90% white spruce, and the BP index would be 0.9. Hill's Ninf = 1/BP = 1/0.9 = 1.1. This forest is almost a monoculture, and the value just over 1 reflects this. It doesn't matter much what the remaining 10% are, nor how many rare species they include.

Would that work in our Bornean forests? An inventory of a 1 ha plot in Temburong (Brunei) identified 1011 trees; the most common, Fordia splendidissima, accounted for just 6%. Hill's Ninf = 1/0.06 = 16.7, which is rather low, considering that 276 species were actually recorded! Looking only at the most common species, ignoring the rest, is a bit too extreme.
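The two calculations above can be sketched in a few lines of Python. This is only an illustration, not part of the R scripts mentioned earlier; the function names and the spruce counts are made up for the example, and only the 90% and 6% proportions come from the text.

```python
def berger_parker(counts):
    """Proportion of individuals belonging to the commonest species."""
    return max(counts) / sum(counts)


def hill_n_inf(counts):
    """Hill's Ninf: the reciprocal of the Berger-Parker index."""
    return 1.0 / berger_parker(counts)


# Hypothetical plot: 90 white spruce plus 10 trees of four other species.
spruce_plot = [90, 4, 3, 2, 1]
print(round(hill_n_inf(spruce_plot), 2))  # 1.11

# For Temburong only the proportion of the commonest species is given (6%).
print(round(1 / 0.06, 1))  # 16.7
```

Changing how the remaining 10 trees are divided among species leaves the result untouched, which is the sense in which Ninf ignores everything but the commonest species.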

Let's try Hill's N2. This involves the proportional abundances of each species (p_i), i.e. for each species i, we take the number of individuals of that species in the sample (n_i) and divide by the total number of individuals in the sample (N = Σ n_i). We square the p_i values, add them up, and take the reciprocal:

$$N_2 = \frac{1}{\sum_i p_i^2}$$

Squaring the p_i values means that the common species have greater weight than rare ones: a species with 50% in the sample has p_i^2 = 0.25, but for a species with only 1% it's 0.0001.

Hill's N2 for the Temburong forest works out at 75.1. If we look at the data, we'll see that 75 species account for 68% of the trees in the plot, the other 201 species only 32%. That looks like a reasonable split between 'common' and 'rare' species.
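As a rough sketch of the N2 calculation in Python: the Temburong counts are not reproduced on this page, so the two samples below are invented purely to show how the index behaves.

```python
def hill_n2(counts):
    """Hill's N2: reciprocal of the sum of squared proportional abundances."""
    total = sum(counts)
    return 1.0 / sum((n / total) ** 2 for n in counts)


# Hypothetical samples, each with 4 species and 100 individuals.
even_sample = [25, 25, 25, 25]   # all species equally abundant
skewed_sample = [97, 1, 1, 1]    # one dominant species, three rare ones

print(hill_n2(even_sample))               # 4.0
print(round(hill_n2(skewed_sample), 2))   # about 1.06
```

With equal abundances N2 equals the number of species; when one species dominates, it drops towards 1, because the three rare species contribute almost nothing to the sum of squares.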

A whole range of Hill's diversity numbers exists: the general rule is:

$$N_a = \left( \sum_i p_i^a \right)^{1/(1-a)}$$

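So to calculate N3 we cube all the p_i values, add them up, then take the square root before inverting. The main ones of interest are:

  • N0: the number of species in the sample (species richness),
  • N1: the exponential of the Shannon index (the limit of the general rule as a approaches 1),
  • N2: the reciprocal of Simpson's index,
  • Ninf: the reciprocal of the Berger-Parker index.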

The numbers go down steadily as the order a increases: N0 is the biggest, Ninf the smallest. (They are all equal only when every species is equally abundant.)
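The whole family can be computed with one small function. The Python sketch below is an illustration under stated assumptions: the function name and the sample counts are hypothetical, a = 1 is handled as the limiting case (the exponential of the Shannon index), and a = infinity as the reciprocal of the largest proportion.

```python
import math


def hill_number(counts, a):
    """Hill's diversity number of order a for a list of species counts."""
    total = sum(counts)
    p = [n / total for n in counts if n > 0]
    if a == 1:
        # The general rule is undefined at a = 1; use its limit, exp(Shannon).
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    if math.isinf(a):
        # Limit as a grows: only the commonest species matters.
        return 1.0 / max(p)
    return sum(pi ** a for pi in p) ** (1.0 / (1.0 - a))


# Hypothetical sample: 7 species, 100 individuals.
counts = [50, 30, 10, 5, 3, 1, 1]
for a in (0, 1, 2, 3, math.inf):
    print(a, round(hill_number(counts, a), 2))
```

For this sample the values fall from 7 (N0, the species count) to 2 (Ninf), with N1, N2 and N3 in between.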

To sum up:

  • Hill's numbers are related to well-known indices.
  • Rare species have decreasing weight from N0 to Ninf.
  • As rare species lose weight, missing them from the sample has less effect on the index; this reflects the greater ecological importance of the common species.
  • N2, the reciprocal of Simpson's index, seems a reasonable compromise.

Evenness revisited

In principle, diversity is a combination of richness and evenness. In practice, however, evenness measures are defined in terms of diversity. If I is an index of diversity, the corresponding evenness index, E, is defined as:

E = I / Imax

where Imax is the value I would take if the abundances in the sample were all equal. Unfortunately, Imax is usually highly sensitive to the number of species in the sample. We have managed to devise a diversity index which is not overly sensitive to the number of rare species captured in our sample, but the trade-off is an evenness index which is more sensitive to missing rare species.

Hill points out that the ratio of any pair of Hill diversity numbers can be used as an evenness measure, but he prefers to avoid N0 because it reflects the sampling event rather than the actual value for the community. He suggests using N2 / N1, both of which are reasonably well insulated from the effects of sampling.
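To see why the choice of ratio matters, here is a small Python sketch comparing N2/N0 with N2/N1 on a made-up sample, with and without its two rarest species. For any Hill number, Imax is simply the number of species, so N2/N0 is the E = I / Imax evenness for N2. The counts and function names are hypothetical.

```python
import math


def hill_number(counts, a):
    """Hill's diversity number of order a (finite a only in this sketch)."""
    total = sum(counts)
    p = [n / total for n in counts if n > 0]
    if a == 1:
        # Limiting case: exponential of the Shannon index.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** a for pi in p) ** (1.0 / (1.0 - a))


def evenness(counts, a, b):
    """Ratio of two Hill numbers, N_a / N_b."""
    return hill_number(counts, a) / hill_number(counts, b)


full_sample = [50, 30, 10, 5, 3, 1, 1]   # hypothetical: all 7 species caught
thin_sample = [50, 30, 10, 5, 3]         # same plot, two singletons missed

for label, a, b in (("N2/N0", 2, 0), ("N2/N1", 2, 1)):
    print(label,
          round(evenness(full_sample, a, b), 2),
          round(evenness(thin_sample, a, b), 2))
```

In this example, missing the two singletons moves N2/N0 from about 0.40 to about 0.54, whereas N2/N1 shifts only from about 0.78 to about 0.82.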

Hill's 1973 paper is well worth reading: 'Diversity and evenness: a unifying notation and its consequences'. Ecology 54: 427-432.


Most biodiversity indices lack a way of quantifying the precision of our estimates with a standard error or confidence interval. This can be done with a bootstrap: see here for details.
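As a minimal sketch of the idea, the Python below resamples individuals with replacement and takes a percentile interval for N2. This is one common form of bootstrap and not necessarily the procedure described on the linked page; the function names, the sample counts, the number of resamples and the seed are all assumptions made for the example.

```python
import random
import statistics
from collections import Counter


def hill_n2(counts):
    """Hill's N2: reciprocal of the sum of squared proportional abundances."""
    total = sum(counts)
    return 1.0 / sum((n / total) ** 2 for n in counts)


def bootstrap_n2(counts, n_boot=1000, seed=1):
    """Bootstrap standard error and 95% percentile interval for N2."""
    rng = random.Random(seed)
    # One species label per individual in the original sample.
    individuals = [sp for sp, n in enumerate(counts) for _ in range(n)]
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(individuals, k=len(individuals))
        estimates.append(hill_n2(list(Counter(resample).values())))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return statistics.stdev(estimates), lower, upper


# Hypothetical sample: 7 species, 100 individuals.
se, lower, upper = bootstrap_n2([50, 30, 10, 5, 3, 1, 1])
print(round(se, 2), round(lower, 2), round(upper, 2))
```

Resamples tend to lose some of the rarest species, so this approach suits indices such as N2, which give rare species little weight, better than it suits N0.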

Text by Mike Meredith, updated 7 April 2010