My research focuses primarily on randomized data structures and their applications to memory and storage systems. My research projects often start with a new theoretical idea, which can be leveraged to build a surprising new system, which in turn can be hardened for real-world use.
SplinterDB exemplifies this theory-systems-practice pipeline. It leverages a new theoretically optimal data structure, the Mapped $B^\varepsilon$-tree, to build a general-purpose key-value store that has world-beating performance on modern systems. Research papers have appeared on the data structure (ICALP 2018), the core system (ATC 2020), and a data-structural extension (SIGMOD 2023). SplinterDB is available as open source on GitHub and is deployed in VMware products, such as vSAN 8.0.
Mosaic Pages and Iceberg Hash Tables are related research projects that also build systems using new ideas in theory. Iceberg Hash Tables (SIGMOD 2023) use new ideas from the theory of load-balancing to construct the fastest space-efficient hash tables to date. Mosaic Pages (ASPLOS 2023, distinguished paper) is a system that uses Tiny Pointers (SODA 2023, invited to special issue) based on Iceberg Hash Tables to redesign virtual memory to achieve the benefits of huge pages without the drawbacks. Iceberg Hash Tables are available as open source on GitHub.
I have published work on the list-labeling problem (FOCS 2022, invited to special issue and HALG), file-system aging (FAST 2017), filters (SIGMOD 2021), and more.