Powering your Copilot for Data – with Artem Keydunov of Cube.dev

Related Highlights

So what abstractions do we have as of today? For example, let’s take the resource abstraction (Dagster, Prefect, referred to as an operator in Airflow). You abstract complex environments and connections away with a simple construct like that. You have the immediate benefits of defining that once and using it in every task or pipeline with context.r...

Data Engineering • Data Orchestration Trends: The Shift From Data Pipelines to Data Products

Backends Should be Designed for Product Developers

youtu.be

(1) The separation between storage and compute, as encouraged by data lake architectures (e.g. the implementation of P would look different in a traditional database like PostgreSQL, or a cloud warehouse like Snowflake). This architecture is the focus of the current system, and it is prevalent in most mid-to-large enterprises (its benefits that be...

Jacopo Tagliabue • Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.

You’ve got a vector database that has all the right database fundamentals you require, has the right incremental indexing strategy for your use case, has a good story around your metadata filtering needs, and will keep its index up-to-date with latencies you can tolerate. Awesome.

Your ML team (or maybe OpenAI) comes out with a new version of their...

6 Hard Problems Scaling Vector Search