The Web of Science database consists of a number of citation indices. The Leiden Ranking uses data from the Science Citation Index Expanded, the Social Sciences Citation Index, and the Arts & Humanities Citation Index. The Leiden Ranking is based on Web of Science data because Web of Science offers good coverage of the international scientific literature and generally provides high-quality data.
The Leiden Ranking does not take into account conference proceedings publications and book publications. This is an important limitation in certain research fields, especially in computer science, engineering, and the social sciences and humanities.
CWTS enriches Web of Science data in a number of ways. First of all, CWTS performs its own citation matching (i.e., matching of cited references to the publications they refer to). Furthermore, in order to calculate the various indicators included in the Leiden Ranking, CWTS identifies publications by industrial organizations in Web of Science, geocodes the addresses listed in publications, assigns open access labels (gold, hybrid, bronze, green) to publications, and disambiguates authors and attempts to determine their gender. Most importantly, CWTS puts a lot of effort into assigning publications to universities in a consistent and accurate way. This is by no means a trivial issue: universities may be referred to by many different name variants, and the definition and delimitation of universities is not obvious at all. The methodology employed in the Leiden Ranking to assign publications to universities is discussed here.
More information on the citation matching that is performed by CWTS is provided in a paper by Olensky, Schmidt, and Van Eck (2016). For more information on the geocoding of addresses, we refer to a paper by Waltman, Tijssen, and Van Eck (2011). The author disambiguation algorithm used by CWTS is documented in a paper by Caron and Van Eck (2014).
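To illustrate why name variants make the assignment of publications to universities non-trivial, the following is a minimal sketch of variant lookup after string normalization. It is not the CWTS methodology: the variant table, the function names `normalize` and `assign_university`, and the example strings are all assumptions made purely for illustration.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical variant table (illustrative only, not CWTS data).
# Keys are normalized name variants; values are the unified university name.
VARIANTS = {
    "leiden university": "Leiden University",
    "leiden univ": "Leiden University",
    "universiteit leiden": "Leiden University",
    "univ leiden": "Leiden University",
}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower())
    return " ".join(cleaned.split())


def assign_university(address_org: str) -> Optional[str]:
    """Return the unified university name, or None if the variant is unknown."""
    return VARIANTS.get(normalize(address_org))
```

A lookup of this kind only handles spelling variation. It does not address the harder question raised above of how a university is defined and delimited, for example whether an affiliated hospital or institute counts as part of it.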