Terminology & Construction.

1. Entities

Research Products
There are four different types of research products in the OpenAIRE Graph:
  • Publications
  • Research data
  • Research software
  • Other research products.
We deduplicate (merge) different records of research products and keep the metadata of all instances.
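
As a loose illustration of merging records while keeping every instance, the sketch below groups records by a shared identifier. The matching rule (identical `pid` value) and the record fields are assumptions made for this example; the actual deduplication criteria are not described here.

```python
from collections import defaultdict


def merge_records(records):
    """Group records that share a PID; keep each original record as an instance."""
    groups = defaultdict(list)
    for rec in records:
        # Assumption: two records with the same "pid" describe the same product.
        groups[rec["pid"]].append(rec)
    return [{"pid": pid, "instances": recs} for pid, recs in groups.items()]


# Hypothetical records collected from two sources.
sample = [
    {"pid": "doi:10.1/x", "source": "A"},
    {"pid": "doi:10.1/x", "source": "B"},
    {"pid": "doi:10.2/y", "source": "A"},
]
merged = merge_records(sample)
```

Here `merged` holds two research products, the first of which retains both of its instances.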

Publication
Research products intended for human reading (published articles, pre-prints, conference papers, presentations, technical reports, etc.)

Research data
The sources from which the description of the research data has been collected reflect and support their own granularity; we do not define it.

Research software
Source code or software package developed and/or used in a research context

Other research product
Anything that does not fall into the previous categories (e.g. workflow, methods, protocols)

Projects
Projects refer to project identifiers/grant IDs used by funders.

Unidentified project
For some funders, we have agreed to include research outputs that cannot be linked to a specific project. In some cases, authors acknowledge funding by a funder but do not provide the project number. Currently, such mined links are associated in the research graph with a project entity we name the "unidentified" project. Some indicators are affected by this: for example, the 'number of projects granted' appears increased by one (+1) compared to the actual number of project identifiers provided by the funder. However, this solution yields more accurate numbers for the research output of the funders concerned, whose funded outputs would otherwise have been missed. In the extreme case, a couple of funders have not provided any grant IDs, as authors rarely acknowledge them (e.g., the Canadian funders in OpenAIRE). In those cases, Project indicators have been disabled, as only one project would show.
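
The +1 effect on the project count can be sketched as follows. The placeholder value `"unidentified"` and the function name are assumptions for illustration only.

```python
# Hypothetical identifier for the placeholder project entity.
UNIDENTIFIED = "unidentified"


def project_counts(project_ids):
    """Return (count as shown, count of real project identifiers)."""
    distinct = set(project_ids)
    shown = len(distinct)
    actual = len(distinct - {UNIDENTIFIED})
    return shown, actual


# Two real grants plus outputs linked to the placeholder project.
shown, actual = project_counts(["G-1", "G-2", "unidentified", "unidentified"])
```

With this input the indicator would show 3 projects although the funder provided only 2 identifiers.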

Organization

For research products: the organizations with which their authors are affiliated.

For projects: the organizations participating in the project (i.e. beneficiaries of the grant).

We are improving the organization database with the use of our OpenOrgs tool. It allows curators to disambiguate organizations (merge different names of the same organization) and identify parent-child relationships (schools, departments, etc.).


Country

The country of the organization.

Country code mapping: https://api.openaire.eu/vocabularies/dnet:countries


Funder

Funders that have joined OpenAIRE, i.e. their project data have gone through a validation process.

You can visit https://explore.openaire.eu/search/find if you would like to explore the research products and projects of all funders in OpenAIRE (the list of funders can be seen under the "Funder" Filter shown on the left side of the page).

For funders who want to join OpenAIRE: https://www.openaire.eu/funders-how-to-join-guide

2. Inherited and Inferred Attributes

We either inherit the attributes of entities via entries in the harvested metadata records or automatically generate them using our inference system (text and data mining algorithms).

Type

The sub-type of a research outcome (e.g., a publication can be a pre-print, conference proceeding, article, etc.)

Resource type mapping: https://api.openaire.eu/vocabularies/dnet:result_typologies (click on the code to see the specific types for each result type)


Access rights

The best available (across all instances) access rights of a research product

Types (by best available):

Open: Open Access

  • w/ Licence: Following the Budapest Open Access Initiative definition, accompanied by a specified licence that outlines how the material can be used, shared, and distributed.
  • w/o Licence: Without any specified licence, and thus without stated usage permissions or restrictions.

Embargo: Closed for a specific period of time, then open.

Restricted: The definition of restricted may vary by data source; it may refer to access rights being given to registered users, potentially behind a paywall.

Closed: Closed access

Not Available: Unspecified Access Rights
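
One way to read "best available across all instances" is sketched below. The ranking is assumed from the order of the list above, and the label strings and function name are illustrative rather than OpenAIRE Graph identifiers.

```python
# Ranking assumed from the order of the list above (most open first).
ACCESS_RANK = ["Open", "Embargo", "Restricted", "Closed", "Not Available"]


def best_access_right(instance_rights):
    """Return the most open access right found among a product's instances."""
    if not instance_rights:
        return "Not Available"
    # Assumption: labels outside the list are treated as "Not Available".
    known = [r if r in ACCESS_RANK else "Not Available" for r in instance_rights]
    return min(known, key=ACCESS_RANK.index)


# A product with three instances in different data sources.
best = best_access_right(["Closed", "Embargo", "Open"])
```

A product is therefore reported as Open as soon as one of its instances is open, regardless of how the other instances are flagged.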


Article Processing Charges (APC)

The fee charged by publishers in order to publish a research publication in an open access journal. These charges are meant to cover the costs of publication and ensure the work is freely accessible to all. The APC information is sourced from OpenAPC, which is fully integrated into the OpenAIRE Graph. For a comprehensive guide: https://www.openaire.eu/openapc-guide.


CC license

A Creative Commons copyright license (https://creativecommons.org/)


PID (persistent identifier)

A long-lasting reference to a resource

Types: http://api.openaire.eu/vocabularies/dnet:pid_types


Context

Related research community, initiative or infrastructure.


Journal

The scientific journal an article is published in.


Publisher

The publisher of the venue (journal, book, etc.) of a research product.


Data sources (content providers)

The different data sources ingested in the OpenAIRE Graph.

Data Source Types:
  • Repositories
  • Open Access Publishers & Journals
  • Aggregators
  • Entity Registries
  • Journal Aggregators
  • CRIS (Current Research Information Systems)

Repositories

Information systems where scientists upload the bibliographic metadata and payloads of their research products (e.g. PDFs of their scientific articles, CSVs of their data, archives of their software), because of obligations from their organizations or funders, or because of community practices (e.g. ArXiv, Europe PMC, Zenodo).


Open Access Publishers & Journals

Information systems of open access publishers or related journals, which offer bibliographic metadata and PDFs of their published articles.


Aggregators

Information systems that collect descriptive metadata about research products from multiple sources in order to enable cross-data source discovery of given research products (e.g. DataCite, BASE, DOAJ).


Entity Registries

Information systems created to maintain authoritative registries of given entities in scholarly communication, such as OpenDOAR for institutional repositories, re3data for data repositories, and CORDA and other funder databases for projects and funding information.


CRIS (Current Research Information System)

Information systems adopted by research and academic organizations to keep track of their research administration records and related results; examples of CRIS content are articles or research data funded by projects, their principal investigators, facilities acquired thanks to funding, etc.


FoS & SDG Classifications

Fields of Science (FoS) - beta

This inferred attribute uses a Fields of Science taxonomy to categorize research publications within the OpenAIRE Graph. The algorithm classifies research at several levels of detail, from broad categories at Level 1 to more nuanced classifications at Level 3. For more: https://explore.openaire.eu/fields-of-science#01%20natural%20sciences.


Sustainable Development Goals (SDG) - beta

This inferred attribute, determined through our own classification system, associates research publications in the OpenAIRE Graph with specific UN Sustainable Development Goals. By doing so, it emphasizes how individual research works align with and address global challenges such as climate change, biodiversity loss, pollution, and poverty reduction. For more information: https://www.openaire.eu/openaire-explore-introducing-sdgs-and-fos.

3. Constructed Attributes

All attributes in this tab are constructed by us, with the methodology presented below.

Attribute / Definition / How we build it

Journal Business Models

OA (Gold)

A journal that publishes only in open access.

We follow Unpaywall’s approach to defining fully Open Access journals and publishers, and we construct the lists of the latter using Unpaywall data.

In brief, a journal is fully Open Access if one or more of the following occur:

  1. It is in the Directory of Open Access Journals (DOAJ)
  2. It has a known fully OA Publisher (curated list).
  3. It only publishes OA articles.
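
The three conditions above combine with a logical OR, which can be sketched as below. The `Journal` fields and the example publisher list are hypothetical stand-ins, not the actual curated data.

```python
from dataclasses import dataclass


@dataclass
class Journal:
    publisher: str
    in_doaj: bool
    total_articles: int
    oa_articles: int


# Hypothetical stand-in for the curated list of fully OA publishers.
FULLY_OA_PUBLISHERS = {"Example OA Press"}


def is_fully_oa(journal: Journal) -> bool:
    """True if at least one of the three conditions listed above holds."""
    only_oa = journal.total_articles > 0 and journal.oa_articles == journal.total_articles
    return journal.in_doaj or journal.publisher in FULLY_OA_PUBLISHERS or only_oa


# Not in DOAJ and not a known OA publisher, but every article is OA.
example = Journal(publisher="Other Press", in_doaj=False, total_articles=40, oa_articles=40)
fully_oa = is_fully_oa(example)
```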

Diamond OA

A fully OA journal that does not charge article processing charges (APCs).

We obtain APC data from DOAJ’s Public Data Dump (an exportable version of the journal metadata) and use it to determine whether a particular fully OA journal charges APCs.


Subscription

A journal that charges for access to its articles.

Journals without any open access articles.


Hybrid

A subscription journal where some of its articles are open access.

Journals with open access articles that are not fully OA journals.
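
Taken together, the OA (Gold), Diamond OA, Subscription and Hybrid definitions above amount to a small decision rule. The sketch below is one possible reading; it assumes the fully OA status and APC information are already known, and it reports Diamond OA as a separate label even though Diamond journals are also fully OA.

```python
def journal_business_model(fully_oa: bool, charges_apc: bool, oa_articles: int) -> str:
    """Classify a journal using the definitions above (illustrative only)."""
    if fully_oa:
        # Diamond OA: a fully OA journal that does not charge APCs.
        return "OA (Gold)" if charges_apc else "Diamond OA"
    if oa_articles > 0:
        # Has open access articles but is not a fully OA journal.
        return "Hybrid"
    # No open access articles at all.
    return "Subscription"


labels = [
    journal_business_model(True, True, 120),
    journal_business_model(True, False, 120),
    journal_business_model(False, True, 15),
    journal_business_model(False, False, 0),
]
```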


Transformative

"A Transformative Journal is a subscription/hybrid journal that is actively committed to transitioning to a fully Open Access journal.

In addition, a Transformative Journal must:

  • gradually increase the share of Open Access content; and
  • offset subscription income from payments for publishing services (to avoid double payments)."

Source: Plan S initiative

We identify Transformative Journals by ISSN matching with the publicly available Transformative Journals data from the Plan S initiative.
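
ISSN matching could look roughly like the sketch below. The normalisation step and the ISSN values are assumptions for illustration; the format of the Plan S data is not described here.

```python
def normalise_issn(issn: str) -> str:
    """Drop hyphens and surrounding whitespace, and upper-case the check character."""
    return issn.replace("-", "").strip().upper()


def is_transformative(journal_issns, transformative_issns) -> bool:
    """True if any of the journal's ISSNs (print or electronic) is in the list."""
    wanted = {normalise_issn(i) for i in transformative_issns}
    return any(normalise_issn(i) in wanted for i in journal_issns)


# Made-up ISSNs standing in for the Transformative Journals list.
tj_list = ["1234-567X", "0000-0000"]
match = is_transformative(["1111-1111", " 1234-567x "], tj_list)
```

Normalising both sides first avoids missing a journal only because one source writes the ISSN without a hyphen or with a lower-case check character.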


Under transformative agreements

Transformative agreements are those contracts negotiated between institutions (libraries, national and regional consortia) and publishers that transform the business model underlying scholarly journal publishing, moving from one based on toll access (subscription) to one in which publishers are remunerated a fair price for their open access publishing services.

Source: Plan S initiative

We have identified and retrieved from OpenAPC the set of articles with metadata published under transformative agreements for Ireland.


Routes to Open Access (OA)

Green OA with licence

Green articles are published in toll-access journals, but archived in an OA archive, or "repository". These repositories may be discipline-specific (like ArXiv) or institutional repositories operated by universities or other institutions. Green articles may be published versions or preprints, and must be accompanied by a specified licence that outlines how the material can be used, shared, and distributed.


Hybrid OA

An open access scientific publication published in a hybrid journal. Hybrid articles are free to read at the time of publication, with an open license. These are usually published in exchange for an article processing charge (APC).

We define hybrid journals above.


Gold OA

Gold articles have all the same characteristics as Hybrid articles, but are published in all-Open Access journals, which are in turn called "Gold journals", or just "OA journals".

We define all-Open Access journals as OA (Gold) journals above.


Unrealised OA

Bronze

Bronze articles are free to read on the publisher's website, without a licence that grants any other rights. There may be a delay between publication and availability to read, and often articles can be removed unilaterally by the publisher.

As in definition


Green without licence

Green articles deposited in a repository without any particular licence specified.

As in definition
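
The article-level categories above (Gold, Hybrid, Green with licence, Bronze, Green without licence) can be read as a decision rule like the sketch below. The input flags and the order of precedence when several categories apply are assumptions made for this example, and the sketch does not check the journal type for the Green categories.

```python
def oa_route(journal_fully_oa, free_at_publisher, publisher_licence,
             in_repository, repository_licence):
    """Assign one OA category to an article (illustrative precedence)."""
    if free_at_publisher and publisher_licence:
        # Openly licensed at the publisher: Gold or Hybrid, depending on the journal.
        return "Gold OA" if journal_fully_oa else "Hybrid OA"
    if in_repository and repository_licence:
        return "Green OA with licence"
    if free_at_publisher:
        # Free to read at the publisher, but without a licence.
        return "Bronze"
    if in_repository:
        return "Green without licence"
    return "Not OA"


routes = [
    oa_route(True, True, True, False, False),
    oa_route(False, True, True, True, True),
    oa_route(False, False, False, True, True),
    oa_route(False, True, False, False, False),
    oa_route(False, False, False, True, False),
    oa_route(False, False, False, False, False),
]
```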


Miscellaneous

Downloads

The number of downloads of a publication’s full text in a specific time frame, from a given set of data sources.

We take the download usage data from OpenAIRE’s Usage Counts service, which harvests it from a set of data sources. The time range of available downloads varies for each data source.
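
A minimal sketch of counting downloads for a time frame and a given set of data sources follows. The record layout (`source`, `day`, `count`) and the source names are hypothetical and do not reflect the Usage Counts data format.

```python
from datetime import date


def downloads_in_period(records, sources, start, end):
    """Sum download counts from the chosen sources between start and end (inclusive)."""
    return sum(
        r["count"]
        for r in records
        if r["source"] in sources and start <= r["day"] <= end
    )


# Hypothetical usage records for one publication.
usage = [
    {"source": "repoA", "day": date(2023, 1, 5), "count": 3},
    {"source": "repoB", "day": date(2023, 2, 1), "count": 4},
    {"source": "repoA", "day": date(2024, 1, 1), "count": 10},
    {"source": "repoC", "day": date(2023, 3, 1), "count": 7},
]
total_2023 = downloads_in_period(usage, {"repoA", "repoB"}, date(2023, 1, 1), date(2023, 12, 31))
```

Because the available time range differs per data source, totals for the same period are only comparable when the same set of sources is used.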


Citations

The number of citations received by a publication. A citation is a reference to the source of information used in a publication.

We take the number of citations of a publication from the calculated impact indicators provided by BIP!. Specifically, we use the Citation Count (CC) impact indicator, which sums all citations received by each article. More information: https://graph.openaire.eu/docs/graph-production-workflow/indicators-ingestion/impact-indicators/
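
Summing the citations received by each article can be sketched as counting incoming edges in a citation list. The pair format and the removal of repeated pairs are assumptions for illustration, not a description of how BIP! computes the indicator.

```python
from collections import Counter


def citation_counts(citations):
    """Count citations received per article from (citing_id, cited_id) pairs."""
    # Assumption: a repeated (citing, cited) pair counts once.
    return Counter(cited for _citing, cited in set(citations))


# Hypothetical citation links between three articles.
links = [("a", "c"), ("b", "c"), ("a", "b"), ("a", "c")]
counts = citation_counts(links)
```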


Popularity

Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.


Influence

Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).


Impulse

Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
