Damien Graux, PhD

~ Research Fellow at Trinity College Dublin ~

Short Bio.

Since October 2019, I have been a Research Fellow at Trinity College Dublin (Ireland), working in the ADAPT Centre under the lead of Prof. Declan O'Sullivan. In practice, I contribute to research efforts in Semantic Web technologies, focusing mainly on analyzing large distributed knowledge graphs and on designing complex transformation pipelines for heterogeneous Big Data.


From January 2018 to September 2019, I was a Senior Researcher at Fraunhofer IAIS in Sankt Augustin (Germany, close to Bonn), focusing on Semantic Web and Linked Data in the context of large-scale datasets. My research topics included ontology management, ontology engineering, Semantic Web, Linked Data, clustering, and machine learning methods. I also applied the results of my research in various European and industry-funded projects. In parallel, I was an associated postdoctoral researcher in the Smart Data Analytics group at the University of Bonn, under the lead of Prof. Jens Lehmann.


In 2017, as a postdoc still with the Tyrex group (at Inria, France), I extended the work of my PhD thesis by integrating SPARQL evaluators into larger systems involving various kinds of data structures, where several query results must be obtained and aggregated to build a complex answer. More specifically, I worked on designing efficient languages to facilitate the development of optimized ETL pipelines in a semantic context.


From 2013 to 2016, during my PhD thesis at Inria with the Tyrex group in Grenoble, I focused on Semantic Web standards, especially the Resource Description Framework (RDF) and its dedicated query language, SPARQL. My main goal was to design efficient tools to evaluate SPARQL queries on very large RDF datasets (i.e. ≥100GB). To that end, I first proposed a new reading grid to rank SPARQL evaluators, and then designed several efficient ones.

As a side project alongside my main PhD activities, I also designed a semantic pipeline for trip planning that aggregates heterogeneous datasets (e.g. GTFS, RDF, CSV) to offer users tourist alternatives during plane stopovers.


Before 2013, I worked on designing and implementing broadcast algorithms with special properties such as UTO (uniform and totally ordered). This work, mainly developed in C, is openly available on GitHub.



Publications

  1. Establishing a Strong Baseline for Privacy Policy Classification [PDF]
    Najmeh Mousavi Nejad, Pablo Jabat, Rostislav Nedelchev, Simon Scerri, Damien Graux
    IFIP-SEC, 2020
  2. MDE: Multiple Distance Embeddings for Link Prediction in Knowledge Graphs [PDF][arXiv]
    Afshin Sadeghi, Damien Graux, Hamed Shariat Yazdi, Jens Lehmann
    ECAI, 2020

  3. The Query Translation Landscape: a Survey [PDF]
    Mohamed Nadjib Mami, Damien Graux, Harsh Thakkar, Simon Scerri, Sören Auer, Jens Lehmann
    Pre-print version, 2019
  4. Uniform Access to Multiform Data Lakes using Semantic Technologies [PDF]
    Mohamed Nadjib Mami, Damien Graux, Simon Scerri, Hajira Jabeen, Sören Auer, Jens Lehmann
    iiWAS, 2019
  5. SemanGit: A Linked Dataset from git [PDF]
    Dennis Oliver Kubitza, Matthias Böckmann, Damien Graux
    ISWC, 2019
  6. Sparklify: A Scalable Software Component for Efficient evaluation of SPARQL queries over distributed RDF datasets [PDF]
    Claus Stadler, Gezim Sejdiu, Damien Graux, Jens Lehmann
    ISWC, 2019
  7. Squerall: Virtual Ontology-Based Access to Heterogeneous and Large Data Sources [PDF]
    Mohamed Nadjib Mami, Damien Graux, Simon Scerri, Hajira Jabeen, Sören Auer, Jens Lehmann
    ISWC, 2019
  8. Towards Semantically Structuring GitHub [PDF]
    Dennis Oliver Kubitza, Matthias Böckmann, Damien Graux
    ISWC (Posters and Demos), 2019
  9. Querying large-scale RDF datasets using the SANSA framework [PDF]
    Claus Stadler, Gezim Sejdiu, Damien Graux, Jens Lehmann
    ISWC (Posters and Demos), 2019
  10. How to feed the Squerall with RDF and other data nuts? [PDF]
    Mohamed Nadjib Mami, Damien Graux, Simon Scerri, Hajira Jabeen, Sören Auer, Jens Lehmann
    ISWC (Posters and Demos), 2019
  11. Interroger des Lacs de Données en utilisant Spark & Presto [PDF]
    Mohamed Nadjib Mami, Damien Graux, Simon Scerri, Hajira Jabeen, Sören Auer
    BDA (Demo Track), 2019
  12. Towards A Scalable Semantic-based Distributed Approach for SPARQL query evaluation [PDF]
    Gezim Sejdiu, Damien Graux, Imran Khan, Ioanna Lytra, Hajira Jabeen, Jens Lehmann
    SEMANTiCS, 2019
  13. The Hubs and Authorities Transaction Network Analysis using the SANSA framework [PDF]
    Danning Sui, Gezim Sejdiu, Damien Graux, Jens Lehmann
    SEMANTiCS (Poster Track), 2019
  14. COMET: A Contextualized Molecule-Based Matching Technique [PDF]
    Mayesha Tasnim, Diego Collarana, Damien Graux, Mikhail Galkin, Maria-Esther Vidal
    DEXA, 2019
  15. Towards Measuring Risk Factors in Privacy Policies [PDF]
    Najmeh Mousavi Nejad, Damien Graux, Diego Collarana
    Workshop on Artificial Intelligence and the Administrative State, co-located with ICAIL'19 (Position Paper)
  16. Clustering Pipelines of large RDF POI Data [PDF]
    Rajjat Dadwal, Damien Graux, Gezim Sejdiu, Hajira Jabeen, Jens Lehmann
    ESWC 2019 (Poster Track)
  17. Summarizing Entity Temporal Evolution in Knowledge Graphs [PDF]
    Mayesha Tasnim, Diego Collarana, Damien Graux, Fabrizio Orlandi, Maria-Esther Vidal
    MepDaw, WWW (Companion Volume) 2019: 961-965
  18. Querying Data Lakes using Spark and Presto [PDF]
    Mohamed Nadjib Mami, Damien Graux, Simon Scerri, Hajira Jabeen, Sören Auer
    WWW 2019: 3574-3578
  19. Big POI Data Integration with Linked Data Technologies [PDF]
    Spiros Athanasiou, Giorgos Giannopoulos, Damien Graux, Nikos Karagiannakis, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, Kostas Patroumpas, Mohamed Ahmed Sherif, Dimitrios Skoutas
    22nd International Conference on Extending Database Technology, Lisbon, Portugal. pp. 477–488 (EDBT 2019)

  20. A Multi-Criteria Experimental Ranking of Distributed SPARQL Evaluators [PDF]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    2018 IEEE International Conference on Big Data (Big Data). IEEE, 2018. p. 693-702
  21. Profiting from Kitties on Ethereum: Leveraging Blockchain RDF Data with SANSA [PDF]
    Damien Graux, Gezim Sejdiu, Hajira Jabeen, Jens Lehmann, Danning Sui, Dominik Muhs, Johannes Pfeffer
    Proceedings of the Posters and Demos Track of the 14th International Conference on Semantic Systems co-located with the 14th International Conference on Semantic Systems (SEMANTiCS 2018), Vienna, Austria, September 10-13, 2018.
  22. MINDS: a translator to embed mathematical expressions inside SPARQL queries [PDF]
    Damien Graux, Gezim Sejdiu, Claus Stadler, Giulio Napolitano, Jens Lehmann
    Technical Report, 2018.

  23. Une classification expérimentale multi-critère des évaluateurs SPARQL répartis [PDF]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    BDA 2017 - 33ème Conférence sur la Gestion de Données - Principes, Technologies et Applications, Nov 2017, Nancy, France
  24. SPARUB: SPARQL UPDATE Benchmark [PDF]
    Damien Graux, Pierre Genevès, Nabil Layaïda
    Technical report, 2017
  25. HAP: Building Pipelines with Heterogeneous Data and Hive [PDF]
    Damien Graux, Pierre Genevès, Nabil Layaïda
    Technical report, 2017

  26. On the Efficient Distributed Evaluation of SPARQL Queries [PDF]
    Damien Graux
    PhD Thesis, 2016
  27. SPARQLGX : Une Solution Distribuée pour RDF Traduisant SPARQL vers Spark [PDF]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    BDA 2016 - 32ème Conférence sur la Gestion de Données - Principes, Technologies et Applications, Nov 2016, Poitiers, France
  28. SPARQLGX in Action: Efficient Distributed Evaluation of SPARQL with Apache Spark [PDF]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    15th International Semantic Web Conference (ISWC 2016 demo paper), Oct 2016, Kobe, Japan
  29. Smart Trip Alternatives for the Curious [PDF]
    Damien Graux, Pierre Genevès, Nabil Layaïda
    15th International Semantic Web Conference (ISWC 2016 demo paper), Oct 2016, Kobe, Japan
  30. SPARQLGX: Efficient Distributed Evaluation of SPARQL with Apache Spark [PDF]
    Damien Graux, Louis Jachiet, Pierre Genevès, Nabil Layaïda
    The 15th International Semantic Web Conference, Oct 2016, Kobe, Japan. DOI: 10.1007/978-3-319-46547-0_9

  31. TRAINS : a Throughput-Efficient Uniform Total Order Broadcast Algorithm [PDF]
    Michel Simatic, Arthur Foltz, Damien Graux, Nicolas Hascoet, Stéphanie Ouillon, Nathan Reboud, Tiezhen Wang
    NTDS - ICPE 2015: International Conference on Protocol Engineering (ICPE) and International Conference on New Technologies of Distributed Systems (NTDS), Jul 2015, Paris, France. IEEE, pp. 1-8, 2015. DOI: 10.1109/NOTERE.2015.7293477

Funded Research Projects

  LAMBDA [Lecturer] (Since 2019)
    LAMBDA defines a scientific strategy for stepping up and stimulating scientific excellence and innovation capacity, increasing research capacities and unlocking the research potential of the biggest and the oldest R&D Institute in the ICT area in the whole West Balkan region, turning the Institute Mihajlo Pupin into a regional point of reference when it comes to multidisciplinary ICT competence related to Big Data analytics.
  SemanGit [Leader] (Since 2018)
    SemanGit provides a resource at the crossroads of both Semantic Web and git web-based version control systems. It is actually the first collection of linked data extracted from GitHub based on a git ontology we designed and extended to include specific GitHub features.
  QualiChain [Tasks Leader] (2019)
    QualiChain targets the creation, piloting and evaluation of a decentralised platform for storing, sharing and verifying education and employment qualifications and focuses on the assessment of the potential of blockchain technology, algorithmic techniques and computational intelligence for disrupting the domain of public education, as well as its interfaces with private education, the labour market, public sector administrative procedures and the wider socio-economic developments.
  Better [Task Leader] (2018-2019)
    BETTER is implementing a Big Data intermediate service layer focused on creating user-centric services and tools, while addressing the full data lifecycle associated with EO data, to bring more downstream users to the EO market and maximise exploitation of Copernicus data and information services.
  SLIPO [Work Package Leader] (2018-2019)
    SLIPO develops software, models and processes for: transforming conventional POI formats and schemas into RDF data; interlinking POI entities from different datasets; enriching POI entities with additional metadata, including temporal, thematic and semantic properties; fusing Linked POI data in order to produce more complete and accurate POI profiles; assessing the quality of the integrated POI data; offering value added services based on spatial aggregation, association extraction and spatiotemporal prediction.
  Clear [Contributor] (2017)
    Clear addresses one fundamental challenge of our time: the construction of effective programming models and compilation techniques for the correct, efficient and scalable exploitation of large amounts of data.
  Datalyse [Contributor] (2013-2016)
    Datalyse is a smart treatment demonstrator dedicated to Big Data focusing on collecting, certificating, integrating, categorizing, securing, enriching and sharing data.

Software Projects

Mentoring Activities

PhD Co-supervision


BSc & MSc Theses Supervision


Software Engineer Supervision

Research Community Services