Powering your Copilot for Data – with Artem Keydunov of Cube.dev
So what abstractions do we have as of today? For example, let’s take the resource abstraction (Dagster, Prefect, referred to as an operator in Airflow). You abstract complex environments and connections away with a simple construct like that. You have the immediate benefits of defining that once and using it in every task or pipeline with context.r... See more
Data Engineering • Data Orchestration Trends: The Shift From Data Pipelines to Data Products
Backends Should be Designed for Product Developers
youtu.be(1) The separation between storage and compute , as encouraged by data lake architectures (e.g. the implementation of P would look different in a traditional database like PostgreSQL, or a cloud warehouse like Snowflake). This architecture is the focus of the current system, and it is prevalent in most mid-to-large enterprises (its benefits that be... See more
Jacopo Tagliabue • Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.
You’ve got a vector database that has all the right database fundamentals you require, has the right incremental indexing strategy for your use case, has a good story around your metadata filtering needs, and will keep its index up-to-date with latencies you can tolerate. Awesome.
Your ML team (or maybe OpenAI) comes out with a new version of their... See more
Your ML team (or maybe OpenAI) comes out with a new version of their... See more