Since December 2022, I have been working as a Principal Research Scientist at Huawei Technologies Ltd. in the United Kingdom. In particular, I work in the Knowledge Graph Lab, where we conduct cutting-edge research on knowledge computing challenges. My efforts sit at the crossroads of data structuring (usually as Knowledge Graphs) and Large Language Models. My vision is that there should be a way to bridge memorisation (i.e. knowledge bases) and generalisation (i.e. parametric knowledge). Together with my team, I therefore explore potential connections through different projects, such as using KGs to plan LLM actions, or building KGs with the sole use of LLMs.
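As a rough illustration of the LLM-only KG-construction direction, the sketch below prompts a language model to emit triples as JSON. The `complete` callable and the prompt wording are hypothetical placeholders, not the actual systems built in the lab.

```python
import json

# Hypothetical sketch: `complete(prompt) -> str` stands in for any LLM call;
# the prompt and the JSON output contract are illustrative assumptions only.
PROMPT = """Extract (subject, predicate, object) triples from the text below.
Answer with a JSON list of 3-element lists and nothing else.

Text: {text}
"""

def extract_triples(text, complete):
    # Parse the model's JSON answer into a list of triples.
    return [tuple(t) for t in json.loads(complete(PROMPT.format(text=text)))]

# e.g. extract_triples("Nice is a city in France.", complete)
# might return [("Nice", "isA", "city"), ("Nice", "locatedIn", "France")].
```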
From January 2021 to December 2022, I was a tenured researcher holding an Inria starting faculty position. In practice, I worked in the Wimmics group, based in Sophia Antipolis, near Nice (France). I mainly explored downstream use cases once Knowledge Graphs are built: for instance, developing novel visualisations to help Semantic Web lay users access graphs, or setting up analytics strategies on Knowledge Graphs through embeddings.
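To give an idea of what analytics through embeddings means here, the toy sketch below scores triples with the classic TransE intuition (h + r ≈ t). The vectors are random stand-ins rather than trained embeddings, and the entity names are invented.

```python
import numpy as np

# Toy TransE-style scoring: a triple (h, r, t) is plausible when the
# embedding of h, translated by r, lands close to the embedding of t.
rng = np.random.default_rng(42)
dim = 50
entities = {name: rng.normal(size=dim) for name in ("Nice", "France", "Dublin")}
relations = {"locatedIn": rng.normal(size=dim)}

def transe_score(h, r, t):
    # Lower L2 distance = more plausible triple (with trained vectors).
    return float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

print(transe_score("Nice", "locatedIn", "France"))
```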
From October 2019 to December 2020, I was a Research Fellow at Trinity College Dublin (Ireland), working in the ADAPT Centre under the lead of Prof. Declan O'Sullivan. In practice, I contributed to research efforts in Semantic Web technologies, mainly focusing on analysing large distributed knowledge graphs and on designing complex transformation pipelines for heterogeneous Big Data. In particular, from July 2020, I was able to focus on these research topics thanks to a Marie Skłodowska-Curie ELITE-S fellowship.
From January 2018 to September 2019, I was a Senior Researcher at Fraunhofer IAIS in Sankt Augustin (Germany, close to Bonn), focusing on the Semantic Web and Linked Data in the context of large-scale datasets. My research topics included ontology management, ontology engineering, the Semantic Web, Linked Data, clustering and machine learning methods. I also applied the results of my research in various European and industry-funded projects. In parallel, I was an associated postdoctoral researcher in the Smart Data Analytics group at the University of Bonn, under the lead of Prof. Jens Lehmann.
In 2017, as a postdoc still with the Tyrex group (at Inria, France), I pushed further what I had developed during my PhD thesis by integrating SPARQL evaluators into larger systems involving various kinds of data structures, where several query results need to be aggregated to build a complex answer. More specifically, I worked on designing efficient languages to facilitate the development of optimised ETL pipelines in a semantic context.
From 2013 to 2016, during my PhD thesis at Inria with the Tyrex group in Grenoble, I focused on Semantic Web standards, especially the Resource Description Framework (RDF) and its dedicated query language, SPARQL. My main goal was to design efficient tools to evaluate SPARQL queries on very large RDF datasets (i.e. ≥100GB). In particular, I provided a new reading grid to rank SPARQL evaluators before designing several efficient ones.
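For readers unfamiliar with SPARQL, here is a minimal in-memory example of evaluating a query over a tiny RDF graph with Python's rdflib; the data and vocabulary are invented, and at the ≥100GB scales discussed above a distributed evaluator is needed instead.

```python
from rdflib import Graph

# Tiny illustrative RDF graph (Turtle syntax) with an invented vocabulary.
g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:Nice ex:locatedIn ex:France .
ex:Grenoble ex:locatedIn ex:France .
""", format="turtle")

# A basic SPARQL SELECT query: which cities are located in France?
query = """
PREFIX ex: <http://example.org/>
SELECT ?city WHERE { ?city ex:locatedIn ex:France . }
"""
for row in g.query(query):
    print(row.city)
```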
As a side activity during my PhD, I also designed a semantic pipeline for trip planning that aggregates heterogeneous datasets (e.g. GTFS, RDF, CSV) in order to offer users touristic alternatives during plane stopovers.
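One step of such a pipeline is lifting tabular sources into RDF so they can be joined with other graphs. The hedged sketch below converts a GTFS-like stops.txt snippet into triples with rdflib; the ex: vocabulary is illustrative, not the pipeline's real ontology.

```python
import csv
import io
from rdflib import Graph, Literal, Namespace

# Illustrative namespace; the real pipeline's ontology differs.
EX = Namespace("http://example.org/")

# A GTFS-like stops.txt snippet (column names follow the GTFS spec).
stops_txt = "stop_id,stop_name\nS1,Airport\nS2,Central Station\n"

g = Graph()
for row in csv.DictReader(io.StringIO(stops_txt)):
    stop = EX[row["stop_id"]]                        # mint a URI per stop
    g.add((stop, EX["name"], Literal(row["stop_name"])))

print(g.serialize(format="turtle"))
```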
Previously, before 2013, I worked on designing and implementing broadcast algorithms with special properties, such as UTO (uniform and totally ordered) broadcast. This work, mainly developed in C, is openly available on GitHub.
Knowledge Graphs and Big Data Processing [Open Access]
Project [Role] | Abstract | Date |
---|---|---|
LAMBDA [Lecturer] | LAMBDA defines a scientific strategy for stepping up and stimulating scientific excellence and innovation capacity, increasing research capacities and unlocking the research potential of the Institute Mihajlo Pupin, the biggest and oldest ICT R&D institute in the West Balkan region, and turning it into a regional point of reference for multidisciplinary ICT competence related to Big Data analytics. | Since 2019 |
SemanGit [Leader] | SemanGit provides a resource at the crossroads of the Semantic Web and git-based version control systems. It is the first collection of Linked Data extracted from GitHub, based on a git ontology we designed and extended to cover GitHub-specific features. | Since 2018 |
QualiChain [Task Leader] | QualiChain targets the creation, piloting and evaluation of a decentralised platform for storing, sharing and verifying education and employment qualifications. It assesses the potential of blockchain technology, algorithmic techniques and computational intelligence for disrupting public education, as well as its interfaces with private education, the labour market, public-sector administrative procedures and wider socio-economic developments. | 2019 |
Better [Task Leader] | BETTER implements a Big Data intermediate service layer focused on creating user-centric services and tools, while addressing the full data lifecycle associated with Earth Observation (EO) data, in order to bring more downstream users to the EO market and to maximise the exploitation of Copernicus data and information services. | 2018-2019 |
SLIPO [Work Package Leader] | SLIPO develops software, models and processes for transforming conventional POI formats and schemas into RDF data; interlinking POI entities from different datasets; enriching POI entities with additional metadata, including temporal, thematic and semantic properties; fusing Linked POI data to produce more complete and accurate POI profiles; assessing the quality of the integrated POI data; and offering value-added services based on spatial aggregation, association extraction and spatio-temporal prediction. | 2018-2019 |
Clear [Contributor] | Clear addresses one fundamental challenge of our time: the construction of effective programming models and compilation techniques for the correct, efficient and scalable exploitation of large amounts of data. | 2017 |
Datalyse [Contributor] | Datalyse is a smart data-processing demonstrator dedicated to Big Data, focusing on collecting, certifying, integrating, categorising, securing, enriching and sharing data. | 2013-2016 |
PhD Co-supervision
BSc & MSc Theses Supervision
Software Engineer Supervision