Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.

Bauplan and Nessie enable reproducible data science by providing time-travel and branching semantics on data lakes and by decoupling compute from data management.

arxiv.org

GitHub - rebremer/expose-deltatable-via-restapi

Bill Franks, Taming The Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics (Wiley and SAS Business Series)

Dune: The Data Must Flow | The Generalist

Mario Gabriele, readthegeneralist.com

How DoorDash Designed a Successful Write-Heavy Scalable and Reliable Inventory Platform

Chuanpin Zhu, doordash.engineering