Citation Resolution

The amount of scientific information contained in online publications is staggering. PubMed, the primary source of biomedical research articles, holds over 33 million publications and adds approximately 4,600 new ones each day. Other digital archives such as arXiv and bioRxiv contain millions of versioned preprints as well.

Although access to such huge amounts of data can be seen as a boon to the promulgation of scientific knowledge, the sheer number of articles available presents its own challenges to researchers. Finding and downloading a specific paper based on its citation in another article is often a time-consuming and laborious task. Many websites that provide citation-based search force users to perform structured data entry, tediously filling in the fields mandated by a particular citation format. This error-prone process is not viable for large-scale citation analyses such as building citation networks or tracking research impact. Instead, these tasks require a fast, accurate, and easy-to-use method for automated citation resolution.

Lexical’s automated citation resolution service addresses this problem by resolving references in any format to a unique document identifier (PMID or DOI). It is capable of accurately resolving references with missing fields (such as publication year or journal volume), partial author lists, typos, and oddly abbreviated journal names. 
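To make the idea of tolerant matching concrete, here is a minimal sketch in Python. It is not Lexical's method, which is not described here. It only illustrates one simple way a free-form citation with typos, a missing year, and an abbreviated journal name could still be matched to a record. The index contents, the identifiers, the function names (`tokenize`, `similarity`, `resolve`), and the 0.7 threshold are all assumptions made up for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference index: identifier -> canonical citation text.
# The identifiers and records below are invented for illustration only.
INDEX = {
    "example-id-1": "Smith J, Doe A. Deep learning for protein folding. "
                    "Journal of Computational Biology. 2019;12:34-56.",
    "example-id-2": "Garcia M, Chen L. Graph methods for citation analysis. "
                    "Scientometrics Letters. 2021;5:1-10.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(citation, record):
    """Average, over the citation's tokens, of the best fuzzy match in the record.

    Scoring from the citation's side means fields that are missing from the
    citation (for example the year) do not lower the score.
    """
    cite_tokens = tokenize(citation)
    record_tokens = tokenize(record)
    if not cite_tokens or not record_tokens:
        return 0.0
    total = 0.0
    for token in cite_tokens:
        total += max(
            SequenceMatcher(None, token, other).ratio() for other in record_tokens
        )
    return total / len(cite_tokens)


def resolve(citation, index=INDEX, threshold=0.7):
    """Return the best-matching identifier, or None if no score reaches the threshold."""
    best_id, best_score = None, 0.0
    for doc_id, record in index.items():
        score = similarity(citation, record)
        if score > best_score:
            best_id, best_score = doc_id, score
    return best_id if best_score >= threshold else None


# A citation with typos, a partial author list, no year, and an abbreviated journal.
messy = "Smith J et al. Deep lerning for protien folding. J Comput Biol"
print(resolve(messy))
print(resolve("completely unrelated grocery list"))
```

In this toy index the messy citation should still resolve to the first record, while the unrelated string falls below the threshold and yields no match. A linear scan like this would not scale to millions of records; a production system would need an indexed candidate search before any fuzzy scoring.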

With speeds of thousands of resolutions per second, it is suitable for the most demanding of analytical tasks, such as:

  • constructing large citation networks for huge areas of science
  • resolving references in patents and other types of literature
  • tracking emerging trends
  • providing institution and author-level citation metrics
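
As a rough illustration of the first and last items above, the following Python sketch shows how resolved identifiers could feed a citation network and simple citation counts. It assumes the references have already been resolved into (citing, cited) identifier pairs, with `None` standing in for a reference that could not be resolved. The function names and the sample identifiers are hypothetical.

```python
from collections import Counter, defaultdict


def build_citation_network(resolved_pairs):
    """Build a directed graph mapping each citing document to the set it cites.

    Pairs whose cited identifier is None (unresolved references) are skipped,
    and duplicate references collapse into a single edge.
    """
    graph = defaultdict(set)
    for citing, cited in resolved_pairs:
        if cited is None:
            continue
        graph[citing].add(cited)
    return graph


def citation_counts(graph):
    """Count how many distinct documents cite each identifier."""
    counts = Counter()
    for cited_ids in graph.values():
        counts.update(cited_ids)
    return counts


# Hypothetical resolved pairs: (citing document, cited document).
pairs = [
    ("doc-A", "doc-B"),
    ("doc-A", "doc-C"),
    ("doc-D", "doc-B"),
    ("doc-D", None),      # a reference that could not be resolved
    ("doc-A", "doc-B"),   # the same reference listed twice
]

network = build_citation_network(pairs)
for doc_id, count in citation_counts(network).most_common():
    print(doc_id, count)
```

Aggregating these counts by author or institution would give the kind of metrics listed above, given a separate mapping from documents to authors or institutions.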

If you are interested in learning more about how citation resolution can be used for your analysis, contact us.

Citation resolution was used to map which fields of science were being cited by COVID-19 preprints early in the pandemic.