The aim of the course is to train researchers to use the SMS (Semantically Mapping Science) platform for integrating, enriching and retrieving data for a specific research project. Participants will bring their own data and leave with an enlarged dataset, linked to other SMS datasets, as needed for their intended study.
The promise of linked data is larger and richer research data than is currently available. In this course, participants are trained to produce and use linked data and to exploit the possibilities these data provide. The SMS platform supports the integration and enrichment of data, but some manual work remains, and this course teaches how to carry out those tasks. In the two-day course, participants will
(i) get familiar with the platform and its functions;
(ii) learn how to convert their own data into the specific format required for linking them to other data in the platform;
(iii) enrich the data through entity recognition and geolocation services – the former applies to textual data, where, for example, organization names can be recognized in a CV, or technical terms can be recognized as referring to a specific disease;
(iv) link their own data to the data in the SMS data store;
(v) use the SMS browser to select from the SMS data store the data they need for their research project;
(vi) understand how this data browsing results in a query that can be used to retrieve the data from the data store in a usable format;
(vii) finally, use the retrieved data for analysis.
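To give an impression of what converting one's own data for linking (step ii) can involve, the following minimal sketch turns a small CSV table into RDF triples in N-Triples syntax. It is an illustration only and not the conversion tool of the SMS platform: the `http://example.org/` namespace, the column names and the helper functions are all hypothetical, and every value is simply treated as a plain string literal.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real project would use its own URIs.
BASE = "http://example.org/"


def escape_literal(value: str) -> str:
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str, id_column: str) -> list[str]:
    """Turn each non-empty CSV cell into one triple about the row's subject."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The identifier column becomes the subject URI of the row.
        subject = f"<{BASE}resource/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue
            # Each remaining column name becomes a (hypothetical) property URI.
            predicate = f"<{BASE}property/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample data, for illustration only.
SAMPLE = "id,name,city\norg1,Example University,Amsterdam\n"
NTRIPLES = csv_to_ntriples(SAMPLE, "id")

if __name__ == "__main__":
    print("\n".join(NTRIPLES))
```

In practice, the property URIs would normally come from an existing vocabulary rather than from the column names, so that the converted data can be linked to other datasets.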
Overview of the data in the SMS datastore
Overview of the data services of the SMS platform
Converting your data into RDF
Enriching textual data with the entity recognition service
Geolocating the data
Basic linking between your own data and the SMS datastore
Using the link tool and the lens service to improve and tailor the links with the data store
Applying the link/lens tool for linking
Checking the quality of the produced links
Selecting the datasets relevant for the research question
Browsing those datasets to identify the entities and the properties relevant for the research question
Final selection of the data needed using the browser; inspecting the query.
Running the query (done by the SMS staff)
Inspecting the retrieved dataset and computing some basic statistics on it.
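The browsing and selection steps above end in a query against the data store. As a rough sketch of how a selection of an entity type and some properties could translate into a SPARQL query, consider the example below. It does not reproduce the query that the SMS browser actually generates: the function `build_select_query`, the class URI and the property URIs are all hypothetical.

```python
def build_select_query(class_iri: str, properties: dict[str, str], limit: int = 100) -> str:
    """Build a SPARQL SELECT query for one entity type and its chosen properties."""
    # Restrict the results to entities of the selected type.
    patterns = [f"  ?entity a <{class_iri}> ."]
    for var, prop_iri in properties.items():
        # OPTIONAL keeps entities that lack a value for this property.
        patterns.append(f"  OPTIONAL {{ ?entity <{prop_iri}> ?{var} . }}")
    variables = " ".join(f"?{var}" for var in properties)
    body = "\n".join(patterns)
    return f"SELECT ?entity {variables}\nWHERE {{\n{body}\n}}\nLIMIT {limit}"


# Hypothetical selection: organizations with their name and city.
QUERY = build_select_query(
    "http://example.org/class/Organization",
    {
        "name": "http://example.org/property/name",
        "city": "http://example.org/property/city",
    },
    limit=10,
)

if __name__ == "__main__":
    print(QUERY)
```

Inspecting such a query makes explicit which entities and properties end up in the retrieved dataset before any statistics are computed on it.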
Researchers (early career and experienced) interested in data integration for science and innovation studies, from within the RISIS consortium and from outside.
Participants can be either active researchers in the field or data, information or computer scientists working together with STI studies researchers.
November 17th, 2017
Ali Khalili; Frank van Harmelen; Peter van den Besselaar