
NERC Centre for Population Biology

The GPDD Data

 
 
 

As a source of animal and plant population data, The Global Population Dynamics Database is unrivalled. Nearly five thousand separate time series are available here. In addition to all the population counts, there are taxonomic details of over 1400 species. Read on to see how we have dealt with quality control, what sorts of data types were available to us and where we obtained the data from.

Quality control

So that the data are analytically useful, we have limited the database to time series which span a minimum of ten generations. Usually, this means ten years. Occasionally, where data sets are particularly interesting (they may be of a poorly studied species, for example), we have included time series which are at least ten years from the beginning to the end of sampling, i.e. they just fulfil our minimum series length criterion, but which also have one or more missing data points.
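The length criterion above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure used to build the database: the function names are hypothetical, sampling is assumed to be annual, and the span is counted inclusively from the first to the last sampled year.

```python
def series_summary(sampled_years):
    """Return (span, missing) for a list of years in which samples were taken.

    Assumption: the span is counted inclusively, so 1950..1959 is ten years.
    """
    if not sampled_years:
        raise ValueError("series has no sampled years")
    sampled = set(sampled_years)
    first, last = min(sampled), max(sampled)
    span = last - first + 1
    # Years inside the sampling period for which no value was recorded.
    missing = [year for year in range(first, last + 1) if year not in sampled]
    return span, missing


def meets_length_criterion(sampled_years, min_span=10):
    """True if the series covers at least min_span years from start to end."""
    span, _ = series_summary(sampled_years)
    return span >= min_span


# A hypothetical series sampled from 1950 to 1959 with one year missing.
years = [1950, 1951, 1952, 1954, 1955, 1956, 1957, 1958, 1959]
span, missing = series_summary(years)
print(span, missing, meets_length_criterion(years))
```

A series like this one just fulfils the ten-year criterion while still containing a gap, which is the borderline case described above.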

Most of the datasets are of natural, i.e. unmanaged, populations, or sometimes of the unmanipulated controls from experimental studies. Even apparently unmanaged populations may be subject to human influence, either indirect or direct. For example, some of the primate populations contained in the database have been supported by supplemental feeding in some years. In spite of this, we have included them because primate time series are comparatively unusual. Notes in the database record facts such as this, and the user is referred to the original source to determine whether, or to what extent, this is likely to prejudice any analysis. Population data from some laboratory experiments are also included, and are marked as such.

Because we have included population counts from a very wide range of sources, there is considerable variation in the quality of the data. Although it does not guarantee accuracy, the peer review process which is applied to much published work may filter out some of the more unreliable data. It is usually difficult to obtain an accurate, objective measure of data quality, and it is often necessary to fall back on subjective judgement. This we have done, based on criteria such as the type of environment or habitat sampled, the species in question, the area of the sampling site, and the method of sampling. Each dataset has been ranked, on a scale of 1 (low) to 5 (high), for apparent data quality.

For example, the database contains numerous very long datasets of fur trapping records from North America. As animal population data they are highly unreliable, because the numbers of skins exported, which is what they record, depend heavily on factors other than the numbers of animals available for trapping. Nevertheless, they are unique, and have been included to provide a context for other contemporaneous datasets, rather than as hard ecological data themselves. These we have given a rank of 1. At the other end of the scale, the database contains a number of UK estuarine datasets collected by automatic sieve sampling, which has been completely consistent over the entire sampling period (17 years). They exemplify the highest possible quality of sampled population data, and these we have ranked as 5. In all cases the ranking is provided as a guide only, and it will be for you to determine whether each dataset meets your specific requirements.
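As a rough illustration of how the 1 to 5 ranking might be used when selecting data for an analysis, the following Python sketch filters a list of datasets by a minimum rank. The `Dataset` class, its field names and the threshold are hypothetical and do not reflect the actual structure of the database; the two example entries simply reuse the ranks quoted above.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str          # hypothetical label, not a database field name
    quality_rank: int  # 1 (low) to 5 (high), as described in the text


def filter_by_rank(datasets, min_rank):
    """Keep only datasets whose quality rank is at least min_rank."""
    if not 1 <= min_rank <= 5:
        raise ValueError("min_rank must be between 1 and 5")
    return [d for d in datasets if d.quality_rank >= min_rank]


datasets = [
    Dataset("North American fur trapping records", 1),
    Dataset("UK estuarine sieve samples", 5),
]

# The choice of threshold is the analyst's; 3 is used here only as an example.
for dataset in filter_by_rank(datasets, min_rank=3):
    print(dataset.name, dataset.quality_rank)
```

Because the ranking is a subjective guide, a filter like this is best treated as a first pass, followed by a check of the notes and the original source for each dataset.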

The collecting of ecological data, especially over long periods of time, may be subject to all sorts of interruptions and variation as circumstances, funding or staff change over the years. Usually, where such changes are relevant, they are referred to in the published material, and we have endeavoured to mirror any warnings, caveats or similar points in the Notes field for each dataset. Some of the longer data sets collected during the twentieth century, including the one for North Sea plaice on our home page, have gaps due to the war years 1914-18 and 1939-45.

Data types

The most reliable time series are obtained when an entire population (say large mammals on a small island, or in a nature reserve) is counted at fixed intervals. The resulting numbers are unambiguous, and may include demographic details. Most organisms, however, do not yield quite so readily to sampling and, of necessity, many other measures frequently appear in the literature. Typically, the values entered into the database refer to samples from a larger population, or to an index which might represent the sum of counts or samples over more than one site. For example, the GPDD contains aggregated indices of abundance of about twenty wading bird species from numerous sites around the British coastline. Sometimes, if biases are suspected, a correction factor may be applied to data before they are published. In these cases it is the published values which we have entered into the database, as the raw data are usually not readily available.

Data sources

The time series in the database come from both published and unpublished sources. They have been located in a variety of ways:

  • By systematically searching back issues of the predominant ecological and science journals.
  • By following citation trails: when a suitable time series is located in a publication, there are usually one or more references to similar or comparable studies in the citations list: every paper tends to lead to another paper.
  • By searching the world wide web, where an increasing number of ecological datasets are being made available.
  • By searching promising book titles. The vintage literature often contains a wealth of tabulated data of varying types, and we have drawn extensively upon the resources of the Imperial College library and the British Library to locate many long-out-of-print volumes.
  • By negotiating access to unpublished data. Through our network of professional contacts we have endeavoured to locate unpublished data which collectors are prepared to contribute to the project. Quite often, and quite understandably, collectors prefer to retain data for their own use, at least until they have published. We have, however, been delighted at the selfless response of several collectors, who have donated unpublished data with no, or few, restrictions.

All datasets are fully referenced in the database and, before embarking on any analysis, we encourage you to locate and refer to the original source material wherever possible. In most cases we have been able to provide the complete citation to enable you to do this.