Abstract

Datalog is a powerful yet elegant language that allows expressing recursive computation on relational data. Although Datalog evaluation has been extensively studied in the literature, which includes more recent study of evaluation of Datalog over different semirings, so far only loose upper bounds are known on how fast a Datalog program can be evaluated on semirings. In this talk, we will discuss some results on the following questions: given a Datalog program over a semiring S, what is the tightest possible runtime? How can we compactly and efficiently store the provenance from this evaluation using provenance circuits?  How do the runtime and size of the circuit vary when we vary the semirings and programs?  We will also discuss open questions.

(Joint work with Austen Fan, Hangdong Zhao, Shaleen Deep, and Val Tannen from collaborations started during the Fall 2023 Simons Institute program on Logic and Algorithms in Database Theory and AI.)