GitHub - airtai/faststream: FastStream is a powerful and easy-to-use Python framework for building asynchronous services that interact with event streams such as Apache Kafka and RabbitMQ.
Getting Started | Modular Diffusion
cabralpinto.github.io
Airbyte enables you to build data pipelines and replicate data from a source to a destination. You can configure how frequently the data is synced, what data is replicated, and how the data is written to the destination.
This page describes the concepts you need to know to use Airbyte.
Source
A source is an API, file, database, or data warehouse t...
Core Concepts | Airbyte Documentation
DuckDB Doesn’t Need Data To Be a Database
nikolasgoebel.com
WebDataset
WebDataset is a library for writing I/O pipelines for large datasets. Its sequential I/O and sharding features make it especially useful for streaming large-scale datasets to a DataLoader.
The WebDataset format
A WebDataset file is a TAR archive containing a series of data files. All successive data files with the same prefix are consider...
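The layout described in this excerpt can be sketched with Python's standard library alone. This is a minimal illustration, not the WebDataset library's API: `write_shard` and `read_samples` are hypothetical helper names, the prefix is taken as everything before the first dot in the member name (a simplification), and treating each group of same-prefix files as one sample is an assumption about how the grouping is meant to be used.

```python
import io
import tarfile
from itertools import groupby


def write_shard(samples):
    """Build an in-memory TAR shard from (key, {extension: bytes}) pairs.

    Hypothetical helper: files of one sample are written back to back,
    each named "<key>.<extension>".
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, files in samples:
            for ext, payload in files.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_samples(shard_bytes):
    """Group successive TAR members that share a prefix.

    Assumption: the prefix is the member name up to the first dot.
    Returns a list of (prefix, {extension: bytes}) pairs.
    """
    with tarfile.open(fileobj=io.BytesIO(shard_bytes), mode="r") as tar:
        members = [(m.name, tar.extractfile(m).read()) for m in tar if m.isfile()]

    def prefix(item):
        return item[0].split(".", 1)[0]

    samples = []
    # groupby only merges consecutive items, which matches "successive" files.
    for key, group in groupby(members, key=prefix):
        files = {name.split(".", 1)[1]: data for name, data in group}
        samples.append((key, files))
    return samples


shard = write_shard([
    ("sample000", {"txt": b"a cat", "cls": b"0"}),
    ("sample001", {"txt": b"a dog", "cls": b"1"}),
])
for key, files in read_samples(shard):
    print(key, sorted(files))
```

Because the grouping relies on file order inside the archive, a reader can process the shard in a single sequential pass, which fits the sequential I/O emphasis in the excerpt.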