Baselines

A baseline is the average performance of a global set of publications with the same subject area, document type and year.

For example, a global set might consist of all articles in the field of chemistry published in 2006. Baselines and subject schemas create useful reference points for comparison, and they are the basis of normalization to overcome subject bias.

Baselines are calculated using a whole counting method: every paper in a subject area counts towards that area's baseline, regardless of whether the paper also belongs to other subject areas.

The table below shows sample publications A-D that belong to different subject areas and have different document types. To keep the demonstration simple, all papers are from the same year; in practice, baselines are also calculated separately for each year. The citation impact (average citations per paper) baseline for each combination of subject, year and document type is calculated as the arithmetic mean.

Baseline Calculation Example

ArticleID  Times Cited  Subject Areas                            Document Type  Year
A          0            Chemistry, Organic                       Article        2010
B          12           Chemistry, Organic & Chemistry, Physical Article        2010
C          5            Chemistry, Physical                      Article        2010
D          8            Chemistry, Organic                       Review         2010

Calculation for Citation Impact Baseline

\[ e_{ftd} = \frac{\sum c_{ftd}}{p_{ftd}} \]

Where: e = the expected citation rate or baseline, c = Times Cited, p = the number of papers, f = the field or subject area, t = year and d = document type. For articles in the field Chemistry, Organic published in 2010 (A & B) it would be:

\[ e_{ftd} = \frac{0 + 12}{2} = 6.0 \]

For articles in Chemistry, Physical published in 2010 (B&C) it would be:

\[ e_{ftd} = \frac{12 + 5}{2} = 8.5 \]

For reviews in Chemistry, Organic published in 2010 (D) it would be:

\[ e_{ftd} = \frac{8}{1} = 8.0 \]
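The whole counting calculation for the sample publications A-D can be sketched in a few lines of Python. This is a minimal illustration rather than the product's actual implementation; the `papers` list and the `citation_baselines` function are names made up for this example. Paper B is counted once in each of its two subject areas, which is what whole counting means here.

```python
from collections import defaultdict

# Sample publications A-D from the table:
# (id, times cited, subject areas, document type, year)
papers = [
    ("A", 0, ["Chemistry, Organic"], "Article", 2010),
    ("B", 12, ["Chemistry, Organic", "Chemistry, Physical"], "Article", 2010),
    ("C", 5, ["Chemistry, Physical"], "Article", 2010),
    ("D", 8, ["Chemistry, Organic"], "Review", 2010),
]


def citation_baselines(papers):
    """Mean citations per (subject, year, document type), whole counting."""
    totals = defaultdict(lambda: [0, 0])  # key -> [citation sum, paper count]
    for _paper_id, cited, subjects, doc_type, year in papers:
        # Whole counting: the paper contributes fully to every subject it is in.
        for subject in subjects:
            entry = totals[(subject, year, doc_type)]
            entry[0] += cited
            entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


baselines = citation_baselines(papers)
for key, value in sorted(baselines.items()):
    print(key, value)
```

Running the sketch gives a baseline of 6.0 for Chemistry, Organic articles, 8.0 for Chemistry, Organic reviews and 8.5 for Chemistry, Physical articles, all for 2010.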

The citation distribution for any set of publications is typically skewed: a small number of papers are highly cited, while a large number receive relatively few citations. Because baselines are the mean of a set of papers, and the mean is pulled upwards by highly cited papers, the mean is usually considerably higher than the median. As a result, more than half of the publications typically fall below the mean.
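The effect of skew on the mean can be seen with a small, made-up set of citation counts. The numbers below are hypothetical and chosen only to illustrate the point; they do not come from any real subject area.

```python
from statistics import mean, median

# Hypothetical citation counts for eight papers (illustrative only):
# seven papers with few citations and one highly cited paper.
citations = [0, 0, 1, 1, 2, 3, 4, 60]

avg = mean(citations)
mid = median(citations)
below_mean = sum(1 for c in citations if c < avg)

print(f"mean={avg}, median={mid}, below mean: {below_mean} of {len(citations)}")
# mean=8.875, median=1.5, below mean: 7 of 8
```

A single highly cited paper lifts the mean well above the median, so seven of the eight papers sit below the baseline-style average.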

The following chart shows how Citation Impact differs across subject categories. Mathematics has a lower Citation Impact than Biochemistry & Molecular Biology. Recent publications exhibit lower Citation Impact because older papers have had more time to accrue citations and therefore have a higher average citation count.

Citation Impact can vary significantly across disciplines and time periods, so it cannot be used effectively to compare entities that are in different subjects or years. In these cases, we suggest using some form of normalization to allow for the differences between fields and years (see Normalized Citation Impact, % Documents in Top 1%, % Documents in Top 10%, and Average Percentile).

Citation Baseline Impact Chart
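To illustrate why a baseline helps with comparisons across subjects, the sketch below divides a paper's citations by the baseline for its subject, year and document type. This is a simplified ratio shown under assumptions, not the full definition of any of the normalized indicators named above. The function name `normalized_impact` is hypothetical, and the baseline values are the means of Times Cited for each group in the sample table A-D.

```python
# Baselines derived from the sample table: mean Times Cited per
# (subject area, year, document type) group.
baselines = {
    ("Chemistry, Organic", 2010, "Article"): 6.0,
    ("Chemistry, Physical", 2010, "Article"): 8.5,
    ("Chemistry, Organic", 2010, "Review"): 8.0,
}


def normalized_impact(times_cited, subject, year, doc_type):
    """Ratio of actual citations to the expected (baseline) citations."""
    return times_cited / baselines[(subject, year, doc_type)]


# Paper C: 5 citations against a baseline of 8.5.
paper_c = normalized_impact(5, "Chemistry, Physical", 2010, "Article")
# Paper D: 8 citations against a baseline of 8.0.
paper_d = normalized_impact(8, "Chemistry, Organic", 2010, "Review")
print(round(paper_c, 2), round(paper_d, 2))
```

A ratio of 1.0 means the paper is cited exactly as expected for its group, while a value below 1.0 means it is cited less than the baseline.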

Five-Year Baseline

Five-year baselines are used in the five-year trend graph.

Each document will be assigned five new baselines: one for each five-year time period in which it appears.  
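One way to read "each five-year time period in which it appears" is as rolling, overlapping windows that advance one year at a time. That reading is an assumption made for this sketch, and the helper `five_year_windows` is a hypothetical name, not part of any product API.

```python
def five_year_windows(pub_year, span=5):
    """Return every (start, end) window of `span` consecutive years
    that contains pub_year, assuming windows advance one year at a time."""
    return [
        (start, start + span - 1)
        for start in range(pub_year - span + 1, pub_year + 1)
    ]


windows = five_year_windows(2010)
print(windows)
# [(2006, 2010), (2007, 2011), (2008, 2012), (2009, 2013), (2010, 2014)]
```

Under this assumption, a document published in 2010 falls into five windows and would therefore receive five baselines.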

Each five-year time period is treated like a single-year baseline as described on this page: it normalizes for document type and category/journal, and uses an average of baselines for documents that appear in multiple categories.
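The averaging step for documents in multiple categories can be sketched as follows. The function `expected_citations` is a hypothetical name, and the two baseline values are placeholders borrowed from the single-year sample table A-D; real five-year baselines would be computed over all papers in the window.

```python
from statistics import mean

# Placeholder baselines for one time period and document type, keyed by
# category. The values are taken from the single-year sample table.
category_baselines = {
    "Chemistry, Organic": 6.0,
    "Chemistry, Physical": 8.5,
}


def expected_citations(categories, baselines):
    """Average the baselines of every category a document appears in."""
    return mean(baselines[category] for category in categories)


# Paper B appears in both categories, so its expected value is their average.
paper_b = expected_citations(
    ["Chemistry, Organic", "Chemistry, Physical"], category_baselines
)
print(paper_b)  # 7.25
```

A document in a single category simply gets that category's baseline back unchanged.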