Identifying Bibliographic Families in Records on Scholarly Monographs
The aim of this study is to explore whether the notion ‘bibliographic family’ can be used in book metrics for scholarly books in the social sciences and humanities (SSH). Our focus is the most basic bibliometric indicator, the number of books, which is often used in research evaluation and funding allocation. We follow the Functional Requirements for Bibliographic Records (FRBR) and acknowledge that books belong to bibliographic families. This means that a scholarly monograph is conceptualised as a ‘work’ with multiple expressions (e.g. translations) and manifestations (paperback, hardback, e-book, etc.). In the context of research evaluation, the fact that a scholarly monograph has been translated into several languages or published in several editions can be treated as an indication of its scholarly value. Hence the notion ‘bibliographic family’ yields more detailed contextual information that is crucial for research evaluation and is not available from raw publication counts.
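The FRBR hierarchy described above can be sketched as a small data structure. This is only an illustration: the class and field names (`Work`, `Expression`, `Manifestation`, `isbn_count`) are assumptions made for this sketch and are not taken from FRBR tooling or from the databases used in the study, and the ISBN strings in the usage example are placeholders, not real identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manifestation:
    isbn: str   # identifier of one physical or digital embodiment
    form: str   # e.g. "paperback", "hardback", "e-book"


@dataclass
class Expression:
    label: str  # e.g. "original", "translation", "revised edition"
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def isbn_count(self) -> int:
        """Number of ISBNs in the bibliographic family of this work."""
        return sum(len(e.manifestations) for e in self.expressions)


# Hypothetical family: one work, two expressions, three manifestations.
example = Work(
    title="Example monograph",
    expressions=[
        Expression("original", [
            Manifestation("placeholder-isbn-1", "hardback"),
            Manifestation("placeholder-isbn-2", "e-book"),
        ]),
        Expression("translation", [
            Manifestation("placeholder-isbn-3", "paperback"),
        ]),
    ],
)
```

A raw publication count would record this family as a single book, whereas `example.isbn_count()` and the expression labels retain the information that a translation and several formats exist.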
This exploratory bibliometric study is based on metadata for scholarly monographs retrieved from two national bibliographic databases for research output: VABB-SHW in Belgium (Flanders) and CROSBI in Croatia (for more on the databases, see (1)). The set of metadata is limited to scholarly monographs in SSH published in 2016 (Flanders: n=101; Croatia: n=176). This is an ongoing pilot for a study that will cover a longer timeframe (2000-2017) and metadata records from multiple national bibliographic databases for research output. In addition, we consult WorldCat.org (OCLC) to identify additional ISBNs related to the monographs (bibliographic families) in the analysed datasets. This step was required because a preliminary exploration of the metadata for scholarly monographs in the two national databases showed that these metadata are insufficient to identify bibliographic families. In the further steps we follow the approach used by Zuccala and colleagues (2). First, we delineate a list of unique ISBNs. Second, we search WorldCat.org for each of these ISBNs and retrieve all related ISBNs. Finally, we explore the relationships between the related ISBNs.
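The first step, delineating a list of unique ISBNs, can be sketched as follows. This is a minimal illustration and not the procedure used in the study: it assumes that ISBNs arrive as free-text strings, that the same book may be recorded once as ISBN-10 and once as ISBN-13, and that converting everything to ISBN-13 is an acceptable way to detect such duplicates. The function names are hypothetical. The WorldCat.org lookup in the second step is not shown.

```python
from typing import Iterable, List, Optional


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-character ISBN to ISBN-13 using the 978 prefix."""
    core = "978" + isbn10[:9]
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def normalise_isbn(raw: str) -> Optional[str]:
    """Strip hyphens and spaces; return an ISBN-13 string or None."""
    cleaned = raw.replace("-", "").replace(" ", "").upper()
    if len(cleaned) == 13 and cleaned.isdigit():
        return cleaned
    if len(cleaned) == 10 and cleaned[:9].isdigit():
        return isbn10_to_isbn13(cleaned)
    return None  # not recognisable as an ISBN; check digits are not validated


def unique_isbns(raw_isbns: Iterable[str]) -> List[str]:
    """Return normalised ISBNs without duplicates, in first-seen order."""
    seen = {}
    for raw in raw_isbns:
        isbn = normalise_isbn(raw)
        if isbn is not None:
            seen.setdefault(isbn, None)
    return list(seen)
```

For example, `unique_isbns(["0-306-40615-2", "978-0-306-40615-7"])` yields a single entry, because both strings identify the same manifestation.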
Results and Discussion
In the first phase we searched WorldCat.org for related ISBNs using the list of ISBNs retrieved from the two national databases. Coverage in WorldCat.org turned out to be uneven across the two databases. While 95% (n=96) of the ISBNs from VABB-SHW could be identified in WorldCat.org, this was the case for only 63% (n=111) of the ISBNs from CROSBI. For the records that were found we identified, in total, 155 unique additional ISBNs for the VABB-SHW dataset and 35 unique additional ISBNs for the CROSBI dataset. On average, each scholarly work (identified by the original ISBN as recorded in the national databases) was represented by 2.6 (Md=2) ISBNs in the VABB-SHW set and by 1.3 (Md=1) ISBNs in the CROSBI set. These numbers include the original ISBN. The total number of ISBNs per work ranged from one to fourteen in the VABB-SHW set and from one to six in the CROSBI set. An example from the VABB-SHW set of a scholarly work represented by 14 ISBNs is ‘World city network’ by Peter J. Taylor. This work was first published in 2003 in four different manifestations: paperback, hardcover, and two e-book versions. Two further e-book versions were published in 2004, followed by a new edition in 2015 in multiple manifestations (e.g. e-book versions in 2015 and 2016). The record in VABB-SHW was created for one of the e-book versions. An example from the CROSBI set of a scholarly work represented by 6 ISBNs is ‘Green jobs for sustainable development’ by Ana-Maria Boromisa, Sanja Tišma and Anastasya Raditya Ležaić. All 6 ISBNs appear to be different manifestations of the same scholarly work. However, the metadata available for some of the ISBNs are not sufficient to establish what each ISBN stands for. This limits the level of detail with which the entities in the identified bibliographic families can be described.
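The coverage percentages and averages above can be checked with a few lines of arithmetic. The sketch assumes that the mean number of ISBNs per work is the number of works found in WorldCat.org plus the number of unique additional ISBNs, divided by the number of works found. That reading is an assumption, although it reproduces the reported figures. The medians cannot be derived from these totals. The function name is hypothetical.

```python
def coverage_summary(n_records: int, n_found: int, n_additional: int):
    """Return (coverage in whole percent, mean ISBNs per found work)."""
    coverage_pct = round(100 * n_found / n_records)
    # Each found work contributes its original ISBN plus its additional ones.
    mean_isbns = round((n_found + n_additional) / n_found, 1)
    return coverage_pct, mean_isbns


vabb = coverage_summary(n_records=101, n_found=96, n_additional=155)
crosbi = coverage_summary(n_records=176, n_found=111, n_additional=35)
print(vabb)    # (95, 2.6)
print(crosbi)  # (63, 1.3)
```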
Our analysis shows that the usefulness of the notion ‘bibliographic family’ in a book metrics context is highly dependent on the availability of rich metadata. Furthermore, the availability of metadata appears to vary by national context. While nearly all ISBNs from the VABB-SHW dataset could be identified in WorldCat.org, this is not the case for CROSBI. These limitations aside, it is evident that the notion ‘bibliographic family’ adds contextual information to book metrics. When data on additional ISBNs are available, they occasionally reveal that a single scholarly work has had different expressions and manifestations over time. For example, a scholarly monograph that in standard settings would be represented by a bare number ‘1’ can now be accompanied by the information that it is a revised version of a monograph published in year X and that Y translations are to be published in year Z. Overall, however, identifying bibliographic families turns out to be time-consuming and labour-intensive even for sets of bibliographic records as small as those explored here, which limits the usability of this approach for book metrics.
The main conclusion from this study is twofold. On the one hand, it is evident that for book metrics the notion ‘bibliographic family’ yields richer contextual information than basic publication counts. On the other hand, the differences in data availability between VABB-SHW and CROSBI show that the currently available bibliographic metadata for scholarly monographs are insufficient to incorporate the notion ‘bibliographic family’ in book metrics. A possible way to overcome this is to explore how records for scholarly monographs can be enriched using other bibliographic resources. These include national bibliographies, publisher records, and international services such as Google Books and Goodreads.