
Data Machina #222

Amershi et al. [3] state that software teams “flight” changes or updates to ML models, often by testing them on a few cases prior to live deployment. Our work provides further context into the evaluation and deployment process for production ML pipelines: we found that several organizations, particularly those with many customers, employed a multi…
Shreya Shankar • "We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning.
Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. There are existing tools and frameworks that help you build these pipelines, but evaluating them and quantifying pipeline performance can be hard. This is wh…
explodinggradients • GitHub - explodinggradients/ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
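To make the idea of RAG evaluation concrete, here is a minimal, self-contained sketch of the kind of metric such a framework computes. This is a toy illustration, not the ragas API: it scores an answer by the fraction of its tokens that appear in the retrieved contexts, a crude proxy for what ragas calls "faithfulness" (ragas itself uses LLM-based judgments rather than token overlap).

```python
# Toy faithfulness proxy for a RAG pipeline: what fraction of the
# answer's tokens are grounded in the retrieved contexts?
# Illustrative only; NOT the ragas implementation.

def faithfulness_proxy(answer: str, contexts: list[str]) -> float:
    context_words = set(" ".join(contexts).lower().split())
    answer_words = answer.lower().split()
    if not answer_words:
        return 0.0
    grounded = sum(1 for w in answer_words if w in context_words)
    return grounded / len(answer_words)

sample = {
    "question": "What does Ragas evaluate?",
    "contexts": ["Ragas evaluates retrieval augmented generation pipelines."],
    "answer": "Ragas evaluates retrieval augmented generation pipelines.",
}
score = faithfulness_proxy(sample["answer"], sample["contexts"])
print(f"faithfulness proxy: {score:.2f}")  # 1.00: every answer token is grounded
```

A real framework replaces the token-overlap heuristic with model-based scoring and adds complementary metrics (e.g. answer relevancy, context precision), but the shape is the same: per-sample scores over (question, contexts, answer) triples, aggregated across a test set.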

Today Google released Gemini with a 60-page report in which they repeatedly say the training data is key ("We find that data quality is critical to a highly-performing model"), while providing almost no information about how it was made, how it was filtered, or its contents.
Creative AI Lab
creative-ai.org