Optimizing Parallel Recursive Datalog Evaluation on Multicore Machines

The Overall Architecture

Abstract

In recent years, there has been a resurgence of interest in Datalog due to its ability to express applications that require recursive computation. However, beyond expressive power, supporting analytical tasks over ever-increasing volumes of data requires high performance and scalability. In this paper, we present DCDatalog, an in-memory Datalog engine specifically designed for modern shared-memory multicore architectures. Our key contribution is a novel system architecture that supports a wide range of Datalog applications with a lightweight coordination scheme during parallel evaluation. To this end, we propose a dynamic scheduling strategy that generates the parallel execution plan on the fly while reducing concurrent accesses to shared memory. Experimental results on several large datasets show that our system significantly outperforms existing parallel Datalog engines and scales well with increasing amounts of data.
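For readers unfamiliar with the kind of recursive query the abstract refers to, the following is a minimal single-threaded sketch of semi-naive evaluation for the textbook transitive-closure program. It is a generic illustration only and not DCDatalog's architecture or scheduling strategy; the function name `transitive_closure` and the relation names `edge` and `tc` are assumptions made for this example.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the recursive Datalog program

        tc(X, Y) :- edge(X, Y).
        tc(X, Y) :- tc(X, Z), edge(Z, Y).

    Illustrative only: sequential, in-memory, no parallelism.
    """
    # Index edge(Z, Y) by Z so the join in the recursive rule is a lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    tc = set(edges)      # all facts derived so far (base rule)
    delta = set(edges)   # facts that are new in the latest iteration

    # Each round joins only the new facts with edge, instead of
    # re-joining the whole tc relation; stop at the fixpoint.
    while delta:
        new_facts = set()
        for x, z in delta:
            for y in successors.get(z, ()):
                if (x, y) not in tc:
                    new_facts.add((x, y))
        tc |= new_facts
        delta = new_facts
    return tc


if __name__ == "__main__":
    print(sorted(transitive_closure([(1, 2), (2, 3), (3, 4)])))
```

A parallel engine would typically partition the `delta` facts across worker threads, which is where coordination over shared state becomes the main cost; how DCDatalog handles that is described in the paper itself, not in this sketch.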

Publication
In ACM Special Interest Group on Management of Data (SIGMOD) 2022
Jiacheng Wu
Researcher

My current research interests lie broadly in systems research.