Improving Quality and Scalability in Semantic Data Management

This tutorial presents and discusses several complementary techniques that help address the challenges of improving the quality and scalability of Semantic Data Management, taking into account the distributed nature, dynamics, and evolution of the data. In its choice of techniques, it focuses on solutions to several important problems:
1. Refined temporal representation - presents developments regarding the temporal features required to cope with dataset dynamics and evolution.
2. Ontology learning and knowledge extraction - presents a methodology for extracting a consensual set of community requirements from a relevant corpus of professional documents, refining the ontology accordingly, and evaluating the quality of this refinement.
3. Distribution, autonomy, consistency and trust - presents an approach to implementing Read/Write Linked Open Data that copes with participant autonomy and trust at scale.
4. Entity resolution - discusses approaches that revisit traditional entity resolution workflows to cope with the new challenges stemming from the openness and heterogeneity of the Web, as well as data variety, complexity, and scale.
5. Large-scale reasoning - surveys existing platforms and techniques for parallel and scalable data processing, enabling large-scale reasoning based on rule and data partitioning over various logics.
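As a minimal illustration of the first problem, a refined temporal representation can annotate each statement with a validity interval, so that queries can be answered as of a given point in time. The sketch below is a hypothetical illustration in Python; the fact schema and names are assumptions for exposition, not the tutorial's actual model:

```python
from datetime import date

# Hypothetical temporally annotated facts:
# (subject, predicate, object, valid_from, valid_to); valid_to=None means "still valid".
facts = [
    ("alice", "worksFor", "AcmeCorp", date(2010, 1, 1), date(2013, 6, 30)),
    ("alice", "worksFor", "BetaLtd", date(2013, 7, 1), None),
]

def valid_at(facts, when):
    """Return the plain triples whose validity interval covers `when`."""
    return [
        (s, p, o)
        for s, p, o, start, end in facts
        if start <= when and (end is None or when <= end)
    ]
```

For example, `valid_at(facts, date(2012, 1, 1))` returns only the AcmeCorp triple, while a 2014 query returns the BetaLtd one, allowing the dataset's evolution to be reconstructed rather than overwritten.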
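A traditional entity resolution workflow of the kind item 4 revisits typically pairs a blocking step (to avoid comparing all record pairs) with a similarity measure over candidate pairs. The sketch below is a minimal, assumed illustration in Python using first-token blocking and Jaccard token similarity; the field names and threshold are illustrative choices, not part of the tutorial:

```python
from itertools import combinations

def tokens(s):
    return set(s.lower().split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    inter = len(a & b)
    return inter / len(a | b) if inter else 0.0

def block_key(record):
    # Illustrative blocking key: first token of the (assumed) "name" field.
    return record["name"].lower().split()[0]

def resolve(records, threshold=0.5):
    """Group records into blocks, then match pairs above the threshold."""
    blocks = {}
    for r in records:
        blocks.setdefault(block_key(r), []).append(r)
    matches = []
    for block in blocks.values():
        for a, b in combinations(block, 2):
            if jaccard(tokens(a["name"]), tokens(b["name"])) >= threshold:
                matches.append((a["id"], b["id"]))
    return matches

records = [
    {"id": 1, "name": "Tim Berners-Lee"},
    {"id": 2, "name": "Tim Berners-Lee W3C"},
    {"id": 3, "name": "Grace Hopper"},
]
```

Here `resolve(records)` matches records 1 and 2 (same block, similarity 2/3) and never compares record 3 against them. The Web-scale challenges discussed in the tutorial arise precisely where such simple blocking keys and string similarities break down under heterogeneity and volume.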
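The rule-based reasoning of item 5 can be illustrated by a naive forward-chaining loop that applies RDFS-style subclass rules until a fixpoint is reached. This is a toy single-machine sketch, not one of the surveyed large-scale platforms; those parallelize exactly this kind of rule application by partitioning the rules and the data:

```python
def infer(triples):
    """Compute the closure of RDFS-style subclass and type rules (naive fixpoint)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        sub = [(s, o) for s, p, o in facts if p == "subClassOf"]
        types = [(s, o) for s, p, o in facts if p == "type"]
        new = set()
        # Rule 1: subClassOf is transitive.
        for c1, c2 in sub:
            for c2b, c3 in sub:
                if c2 == c2b:
                    new.add((c1, "subClassOf", c3))
        # Rule 2: instances inherit superclass membership.
        for x, c in types:
            for c1, c2 in sub:
                if c == c1:
                    new.add((x, "type", c2))
        added = new - facts
        if added:
            facts |= added
            changed = True
    return facts

triples = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("rex", "type", "Dog"),
}
```

Running `infer(triples)` derives, among others, `("Dog", "subClassOf", "Animal")` and `("rex", "type", "Animal")`. The loop's repeated join over the fact set is what makes partitioning attractive at scale.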

What is SemData

SemData is a four-year project, started in October 2013 and funded under the International Research Staff Exchange Scheme (IRSES) of the EU Marie Curie Actions. As such, it is mainly focused on enabling exchanges of members among the participating institutions, bringing together research leaders from across the globe in the relevant communities: Linked Data, the Semantic Web, and database systems.