Open Research Data in the Field of Phraseology


This paper investigates the concept of open research data in phraseology. The key factors of open science within the European Union are digital technology, a belief in the free circulation and criticism of ideas, and researchers' consideration of the role of data. Digital technology now enables fast exchange and new ways of sharing and accessing data. At present, research data in the field of phraseology are usually exchanged through publications (e.g. research articles, dictionaries). The complexity of the form, meaning, and usage of phrasemes makes them a challenge to describe in lexicographic sources. So far, open research data exist in the field of linguistics, but not in phraseology in particular.

The main research questions are: What are open research data in the field of phraseology? Which metadata elements are important for phraseology in the context of openness?


The analysis follows approaches in the field of phraseology (1), the requirements defined by the FAIR principles, and the generic user tasks of finding, identifying, selecting, and reusing data. The generic tasks derive from the FRBR concept.

In the digital environment, a phraseme is represented by a digital object, which is described by metadata elements. These elements are identified and analyzed on two levels: the first refers to the scientific content, and the second to its digital representation.

German and Croatian phrasemes from the language of fashion and the language of football serve as the corpus for this research.

Results and Discussion

Phrasemes are multiword combinations of various forms whose constituents create a new meaning (e.g. as cool as a cucumber, someone’s right hand, under one roof, to be under the weather). Their main features, besides polylexicality, are stability and idiomaticity, i.e. their form is fixed and their meaning is figurative. Moreover, they can be used in stylistically varied contexts and can create or contribute to the expressivity of texts (2). Phrasemes occur in various text types, e.g. journalistic texts, literature, and slogans, and are also common in spoken language. They are considered a challenge in foreign language teaching, and especially in translation and transcultural studies, because they are usually described as culturally specific. Some phrasemes from different languages share their origin (e.g. the Bible, folk tales, fables), but most of them are language specific.

Considering phrasemes as data means that they are, just like other lexemes of a language, the result of writing down what has been heard, read, or written. They are recorded in monolingual, bilingual, or multilingual dictionaries as well as in corpora, where they are described according to their meaning and stylistic markedness; in some cases the context of their use and its source are also given. Besides dictionaries, phrasemes as data are attested in literary works, magazines, newspapers, other types of texts, and everyday speech. Regarding their figurative meaning, some phrasemes can be linked to concepts such as space or time. They can also serve to entertain, to weaken or intensify a negative meaning, or to present something vividly. Phrasemes derive from the way of life in a certain period of human history, and from cultural specificity, beliefs, or customs.

Open research data are the results of scientific research that can be freely accessed in digital form, are published in a machine-readable format, and can be reused. According to Pampel and Dallmeier-Tiessen (3), open research data are available on the Internet, and users can access, copy, analyze, re-process, and use them for any purpose. An important aspect of open research data is compliance with the FAIR principles: the data should be findable, accessible, interoperable, and reusable (4). Sharing research data involves various users and requirements such as searchability, availability, and usage (5). The importance of metadata for open research data is stressed in the guidelines of various countries and research groups, as well as in scientific research. Metadata are used to present all data related to the content (e.g. what the object includes), the context (e.g. who made the object), and the structure (e.g. information about the object) (6). For phraseological data, or a group of such data, to be accessible in the digital environment, they need to be described with appropriate metadata.

The research identified and described the initial metadata elements that can help to exchange and search phraseological data in digital environments. The elements can be divided into two categories: research and digital representation. The research category consists of four groups of elements: basic, contextual, methodological, and specific. The basic elements comprise the persistent identifier, the author/organization, the source, the phraseme, its meaning, its structure, the phraseological class, the grade of idiomaticity, the grade of motivation, modification, semantic fields, stylistic markedness, and equivalents in different languages. Contextual elements refer to the type of text and the topic. Methodological elements refer to the descriptive and contrastive methods, as well as to the approach of Systemic Functional Linguistics (Appraisal Theory), all of which were used in the research on the phrasemes of fashion and football language. Specific elements, which depend on the investigated corpus, comprise the position in the text, the producer of the phraseme, the object described with the phraseme, the behaviour described with the phraseme, loanwords as components, and the emotions expressed with the phraseme. The second category, digital representation, refers to the datastream and to elements related to the version, the organization, legal information, and access rights.
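The two categories and their element groups could be modelled as a nested record. The following Python sketch is only an illustration under stated assumptions: the class names, field names, and JSON serialization are hypothetical and not part of the study, and all sample values (including the identifier) are placeholders rather than data from the investigated corpus.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List
import json


@dataclass
class BasicElements:
    persistent_identifier: str
    author_or_organization: str
    source: str
    phraseme: str
    meaning: str
    structure: str = ""
    phraseological_class: str = ""
    grade_of_idiomaticity: str = ""
    grade_of_motivation: str = ""
    modification: str = ""
    semantic_fields: List[str] = field(default_factory=list)
    stylistic_markedness: str = ""
    # Assumed convention: language code -> equivalent phraseme.
    equivalents: Dict[str, str] = field(default_factory=dict)


@dataclass
class ContextualElements:
    text_type: str = ""
    topic: str = ""


@dataclass
class MethodologicalElements:
    methods: List[str] = field(default_factory=list)


@dataclass
class SpecificElements:
    position_in_text: str = ""
    producer: str = ""
    object_described: str = ""
    behaviour_described: str = ""
    loanword_components: List[str] = field(default_factory=list)
    emotions: List[str] = field(default_factory=list)


@dataclass
class ResearchDescription:
    basic: BasicElements
    contextual: ContextualElements = field(default_factory=ContextualElements)
    methodological: MethodologicalElements = field(default_factory=MethodologicalElements)
    specific: SpecificElements = field(default_factory=SpecificElements)


@dataclass
class DigitalRepresentation:
    datastream: str = ""
    version: str = ""
    organization: str = ""
    legal_information: str = ""
    access_rights: str = ""


@dataclass
class PhrasemeRecord:
    research: ResearchDescription
    digital: DigitalRepresentation

    def to_json(self) -> str:
        # asdict() converts nested dataclasses into plain dicts recursively.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


# Illustrative placeholder values only; not taken from the study corpus.
record = PhrasemeRecord(
    research=ResearchDescription(
        basic=BasicElements(
            persistent_identifier="example:phraseme-0001",
            author_or_organization="Example Research Group",
            source="illustrative entry",
            phraseme="under one roof",
            meaning="in the same building or household",
            equivalents={"de": "unter einem Dach"},
        ),
        contextual=ContextualElements(text_type="newspaper article", topic="football"),
        methodological=MethodologicalElements(methods=["descriptive", "contrastive"]),
    ),
    digital=DigitalRepresentation(version="1.0", access_rights="open"),
)

print(record.to_json())
```

Keeping the research description separate from the digital representation mirrors the two levels named above, so that legal or versioning information can change without touching the linguistic description.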


This investigation shows that phrasemes can be analyzed as open research data. They have important characteristics and properties for exchange among researchers in the field of phraseology. Basic categories and groups of elements were identified. Further investigation will include the evaluation of results by other researchers and users.
