Sublime

An inspiration engine for ideas


Deep-ML

GitHub - arthur-ai/bench: A tool for evaluating LLMs

DeepEval is a tool for easy and efficient LLM testing. It aims to make writing tests for LLM applications (such as RAG) as easy as writing Python unit tests.

Testing framework for LLM Part
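
To make the "as easy as writing Python unit tests" idea concrete, here is a minimal sketch that uses only the standard library's unittest module. It is not DeepEval's API: fake_llm and contains_all are hypothetical helpers invented for illustration, with fake_llm standing in for a real model call and contains_all standing in for a real evaluation metric.

```python
import unittest


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns a canned answer."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
    }
    return canned.get(prompt, "I don't know.")


def contains_all(answer: str, required: list[str]) -> bool:
    """Crude keyword check (an assumption, not a real metric):
    True when every required term appears in the answer."""
    lowered = answer.lower()
    return all(term.lower() in lowered for term in required)


class TestCapitalQuestion(unittest.TestCase):
    def test_answer_mentions_paris(self):
        # The model output is treated like any other function result.
        answer = fake_llm("What is the capital of France?")
        self.assertTrue(contains_all(answer, ["Paris"]))

    def test_unknown_prompt_falls_back(self):
        # An unseen prompt should produce the fallback, not a made-up fact.
        answer = fake_llm("Who won the 2050 World Cup?")
        self.assertFalse(contains_all(answer, ["Paris"]))
```

Saved as a test module, this could be run with python -m unittest. A dedicated framework would presumably replace the keyword check with richer metrics, but the test shape stays the same: call the application, then assert on its output.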

dstack is an open-source toolkit and orchestration engine for running GPU workloads. It's designed for development, training, and deployment of gen AI models on any cloud.

Supported providers: AWS, GCP, Azure, Lambda, TensorDock, Vast.ai, and DataCrunch.

Latest news ✨

[2024/01] dstack 0.14.0: OpenAI-compatible endpoints preview (Release)

[2023/12] dst

dstackai • GitHub - dstackai/dstack: dstack is an open-source toolkit for running GPU workloads on any cloud. It works seamlessly with any cloud GPU provider. Discord: https://discord.gg/u8SmfwPpMd

Mem0: The Memory Layer for Personalized AI

Mem0 provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI experiences across applications.

Note: The Mem0 repository now also includes the Embedchain project. We continue to maintain and support Embedchain ❤️. You can find the Embedchain codebase in the embedchai

GitHub - mem0ai/mem0: The memory layer for Personalized AI

GitHub - deepseek-ai/DeepSeek-R1

github.com

HoneyHive is a collaboration platform to test, evaluate, monitor, and debug your LLM apps, from prototype to production. It enables you to continuously improve LLM apps in production with human feedback, quantitative rigour, and safety best practices.

Carlos • Data Machina #222

Dify is an LLM application development platform that has helped build over 100,000 applications. It integrates BaaS and LLMOps, covering the essential tech stack for building generative AI-native applications, including a built-in RAG engine. Dify allows you to deploy your own version of Assistants API and GPTs, based on any LLMs.

Using our Cloud S...

langgenius • GitHub - langgenius/dify: An Open-Source Assistants API and GPTs alternative. Dify.AI is an LLM application development platform. It integrates the concepts of Backend as a Service and LLMOps, covering the...

AgentBench: Evaluating LLMs as Agents

Evaluating Large Language Models (LLMs) as agents in interactive environments, highlighting the performance gap between API-based and open-source models, and introducing the AgentBench benchmark.

arxiv.org