OSDG is an open-source initiative that aims to integrate existing attempts to classify research according to the Sustainable Development Goals (SDGs), and to make this process open, transparent and user-friendly.
We integrate existing research into a comprehensive approach that avoids the shortcomings of previous individual approaches and the duplication of research efforts.
OSDG is a partnership between PPMI, UNDP SDG AI Lab, and a growing community of researchers led by Dr. Nuria Bautista Puig.
OSDG takes relevant features of the text (such as ontology items, features from machine-learning models or extracted keywords) from previous research, cleans them and merges them into a comprehensive, continuously expanding OSDG ontology.
Items from the ontology are then mapped to an ever-growing list of topics/Fields of Study drawn from an extensive ontology of over 2 million items, assembled from sources such as patents, publications, MeSH terms and Wikidata items.
In short, OSDG builds an integrated ontology from the feature sets identified in previous research.
By doing this, we:
You can learn more about the methodology in our article “OSDG – Open-Source Approach to Classify Text Data by UN Sustainable Development Goals (SDGs)”, available on arXiv.org.
OSDG processes user queries in the following steps:
Head to the Search page to put our methodology to practical use. If you see something that requires improvement or you would like to contact our data team, please submit your enquiry using our contact form.
Items from the OSDG ontology are mapped to an ever-growing list of keywords.
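The clean-and-merge step described above can be illustrated with a minimal Python sketch. It is not the OSDG implementation: the function names (`clean_term`, `merge_sources`), the source names, the input layout and the cleaning rules are all assumptions made for illustration. The actual processing scripts are kept in the project GitHub repository.

```python
import re
from collections import defaultdict


def clean_term(term: str) -> str:
    """Normalise a raw term: lower-case it, replace punctuation with spaces,
    and collapse repeated whitespace. (Hypothetical cleaning rules.)"""
    term = term.lower()
    term = re.sub(r"[^\w\s-]", " ", term)
    return re.sub(r"\s+", " ", term).strip()


def merge_sources(sources):
    """Merge per-source term lists into one combined ontology.

    `sources` maps a source name to a mapping of SDG label -> iterable of
    raw terms (an assumed input layout). The result maps each cleaned term
    to the SDGs it is linked to and, per SDG, the set of sources that
    contributed the term.
    """
    ontology = defaultdict(lambda: defaultdict(set))
    for source_name, sdg_terms in sources.items():
        for sdg, terms in sdg_terms.items():
            for raw in terms:
                term = clean_term(raw)
                if term:  # skip entries that are empty after cleaning
                    ontology[term][sdg].add(source_name)
    return ontology


# Hypothetical input: two earlier studies that each link terms to SDGs.
example_sources = {
    "study_a": {"SDG 6": ["Clean  Water!", "sanitation"]},
    "study_b": {"SDG 6": ["clean water"], "SDG 14": ["Marine life"]},
}
merged = merge_sources(example_sources)
```

Recording which sources contributed each term keeps the merged ontology traceable back to the research it was built from, which matters when several inputs overlap.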
We maintain an up-to-date list of data sources that are used to build the ontology as well as copies of the input datasets, processing scripts and the resulting combined ontology in the project GitHub repository.
The ontology of terms used in the OSDG tool is derived from the following data sources:
1 The threshold value is SDG-specific and can change between OSDG versions. It is computed by running a pre-selected panel of scientific publications through the OSDG tool and selecting the value that cuts off the top 30% of the most relevant publications for each SDG.
doi: 10.1145/2740908.2742839
Wang, K., Shen, Z., Huang, C., Wu, C., Eide, D., Dong, Y., Qian, J., Kanakia, A., Chen, A.C., & Rogahn, R. (2019). A Review of Microsoft Academic Services for Science of Science Studies. Frontiers in Big Data, 2. doi: 10.3389/FDATA.2019.00045
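The threshold rule in the footnote above can be sketched as follows. This is an illustration under stated assumptions, not the OSDG code: the text does not say how relevance is scored or how ties are handled, so the numeric scores, the function names (`top_share_threshold`, `thresholds_by_sdg`) and the rounding rule are hypothetical.

```python
def top_share_threshold(scores, top_percent=30):
    """Return the lowest score that still belongs to the top `top_percent`
    percent of `scores`. Items scoring at or above this value pass."""
    if not scores:
        raise ValueError("at least one score is required")
    ranked = sorted(scores, reverse=True)
    # Ceiling of top_percent * n / 100 in integer arithmetic, at least 1.
    k = max(1, (top_percent * len(ranked) + 99) // 100)
    return ranked[k - 1]


def thresholds_by_sdg(panel_scores, top_percent=30):
    """Compute one threshold per SDG from the panel's relevance scores.

    `panel_scores` maps an SDG label to the list of relevance scores that
    the panel publications received for that SDG (assumed layout).
    """
    return {
        sdg: top_share_threshold(scores, top_percent)
        for sdg, scores in panel_scores.items()
    }


# Hypothetical relevance scores for a small panel of publications.
panel = {
    "SDG 1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "SDG 2": [5, 1, 3, 2],
}
thresholds = thresholds_by_sdg(panel)
```

Because each SDG gets its own threshold from its own score distribution, an SDG with generally lower scores is not penalised relative to one with higher scores.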