Who Needs Controlled Vocabularies When We Have Keywords & Free Text Searching?


The aim of this paper is to compare keyword searching in Hrčak (Portal of Croatian scientific and professional journals) with subject searching in library catalogues.


The research methods are content analysis and comparison, and the research will be carried out in two phases. The first phase analyses the guidelines for authors of 54 active journals in the field of Biomedicine and Healthcare included in Hrčak, in order to find out what instructions authors are given for creating keywords for a manuscript prepared for the selected journal. The second phase consists of four steps. The first step is to choose a sample of articles from journals whose keywords are created using the MeSH thesaurus; all data will be collected in a table containing journal title, article title, abstract and keywords. The second step is to identify all synonyms and all keywords used in articles on a similar subject (e.g. abortion, miscarriage). The third step is to run searches using all variations of the terms used by authors (e.g. synonyms and near-synonyms) and to compare the results, in order to see how they change when different keywords are used for the same subject. The fourth step is to extract an exhaustive list of Main Heading (Descriptor) terms and Entry Terms from the MeSH thesaurus, in order to compare the authors’ keywords extracted from the chosen articles with the subject headings assigned to the same articles in the library catalogue.

The chosen research topics are abortion and homosexuality (including all keywords closely connected to those terms). Extraction, analysis and keyword test searches will be done only in English, and only English keywords will be included in the study sample.
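
The comparisons planned in the third and fourth steps can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the thesaurus excerpt and the article identifiers are all hypothetical, and the excerpt is hand-written for illustration rather than extracted from MeSH. The sketch measures how much the result sets of two synonym searches overlap, and maps authors' keywords to a descriptor through its main heading or entry terms.

```python
from itertools import combinations


def normalize(term: str) -> str:
    """Lower-case a term and collapse whitespace so variants compare equal."""
    return " ".join(term.lower().split())


def jaccard(a: set, b: set) -> float:
    """Share of results two searches have in common (0.0 = none, 1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compare_searches(results_by_term: dict) -> dict:
    """Pairwise overlap between the result sets returned for each search term."""
    return {
        (t1, t2): jaccard(results_by_term[t1], results_by_term[t2])
        for t1, t2 in combinations(sorted(results_by_term), 2)
    }


def map_to_descriptors(keywords, thesaurus: dict) -> dict:
    """Map each author keyword to a descriptor via its main heading or entry terms."""
    lookup = {}
    for descriptor, entry_terms in thesaurus.items():
        for term in [descriptor, *entry_terms]:
            lookup[normalize(term)] = descriptor
    # Keywords with no match in the thesaurus are mapped to None.
    return {kw: lookup.get(normalize(kw)) for kw in keywords}


# Hypothetical data: an illustrative thesaurus excerpt and made-up article ids.
THESAURUS = {"Abortion, Spontaneous": ["Miscarriage", "Spontaneous Abortion"]}
RESULTS = {"abortion": {"a1", "a2", "a3"}, "miscarriage": {"a3", "a4"}}

overlap = compare_searches(RESULTS)
mapping = map_to_descriptors(["miscarriage", "pregnancy loss"], THESAURUS)
```

A low overlap value between two synonym searches would indicate that users retrieve different articles depending on which variant they type, while keywords mapped to `None` are those that the controlled vocabulary does not cover.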

Results and Discussion

Analysis of the authors’ guidelines of 54 journals will give an overview of what is expected from authors when they create keywords. The field of Biomedicine and Healthcare was chosen because it was noticed that almost all of its journals either suggest or ‘prescribe’ the use of the Medical Subject Headings (MeSH) of Index Medicus for keywords (1). Several approaches to keyword creation can be identified in the literature. In most cases, authors themselves choose the keywords, usually around six, that best describe the topics of their article (2). Sometimes authors have to choose keywords from an existing list (e.g. MeSH). Other approaches include automatic abstracting and keyword extraction, or giving the task of subject indexing and keyword assignment to a professional indexer, i.e. a librarian. Keywords are usually not controlled, but it would help if they were connected, at least enough to make it possible to collocate similar subject areas. Subject headings, on the other hand, are controlled but often not understandable to regular users. There is a need for controlled vocabularies in catalogues, since keyword searching retrieves only part of the results when no subject headings are included (3). Research has shown that keywords provided by authors have an important presence in database descriptors (4). It is a great challenge to find satisfactory solutions that will provide the best possible search results for users of both Hrčak and library catalogues.


In order to offer more reliable resources and attract more (happy and satisfied) users, the Editorial Board of the Hrčak Portal should rethink Hrčak’s indexing and search options. Free text searching is always an option, but it usually returns more results overall, useful and not useful alike. Journals, for their part, should consider whether to employ professionals (i.e. information scientists) to deal with keywords, or to introduce a controlled vocabulary (thesaurus, subject heading list) for indexing their articles, in order to improve the retrieval of relevant results through the existing search options. Since there are interesting and innovative approaches to representing an article’s content, journals and Hrčak should also consider creating word clouds for an article, or even for an issue or volume of a journal. Visualization of the content could encourage better use of the articles and journals included in Hrčak.
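
As a rough illustration of the word cloud idea, the term weights behind such a visualization could be derived from simple word frequencies. The following Python sketch is only one possible approach, under stated assumptions: the stop word list is a hypothetical minimal one, the function name is invented for this example, and no rendering is attempted.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop word list; a real one would be longer.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}


def term_weights(text: str, top_n: int = 10) -> list:
    """Return the top_n most frequent non-stop-words with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)


sample = "Keywords and subject headings: keywords in the catalogue"
weights = term_weights(sample, top_n=1)
```

The resulting (term, count) pairs could then be passed to any visualization tool, with the count determining the size of each word in the cloud.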

