Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.

Bauplan and Nessie enable reproducible data science by providing time-travel and branching semantics on data lakes and by decoupling compute from data management.

arxiv.org

Data composability: what it is + why it matters

Danny Zuckerman, dazuck.substack.com

Data Engineering Data Orchestration Trends: The Shift From Data Pipelines to Data Products

Embark: Dynamic documents for making plans

inkandswitch.com

ByteByteGo-Big-Archive-System-Design-2023

Covering a wide array of technical topics from API testing to cloud services, this document shares insights on various programming, networking, and system design concepts for tech enthusiasts and professionals alike.
