GitHub - sgl-project/sglang: SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
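To make "structured generation language" concrete, here is a minimal sketch in SGLang's frontend style: a Python function decorated with `@sgl.function` that interleaves prompt text with generation calls. The endpoint URL and parameter values are placeholders; check the repo for the current API.

```python
import sglang as sgl

@sgl.function
def quick_qa(s, question):
    # Prompt text and generation calls interleave inside one program.
    s += "Question: " + question + "\n"
    s += "Answer: " + sgl.gen("answer", max_tokens=64, stop="\n")

# Point the frontend at a running SGLang server (URL is a placeholder).
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = quick_qa.run(question="What does structured generation mean?")
print(state["answer"])
```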
LangChain is an open-source Python framework for building LLM-powered applications. It provides developers with modular, easy-to-use components for connecting language models with external data sources and services.
Ben Auffarth • Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs
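As a taste of those "modular, easy-to-use components", here is a minimal sketch in LangChain's expression-language style, piping a prompt template into a chat model. Package layout and model names shift between LangChain versions, so treat the imports and the model string as assumptions.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI  # requires the langchain-openai package

# A prompt template and a model are separate components; `|` composes them.
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o-mini")  # model name is a placeholder
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"text": "LangChain connects LLMs to external data and services."}))
```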
1. Introduction to DeepSeek Coder
DeepSeek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a project-level code corpus.
deepseek-ai • GitHub - deepseek-ai/DeepSeek-Coder: DeepSeek Coder: Let the Code Write Itself
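A minimal sketch of loading one of the published checkpoints with Hugging Face transformers; the 1.3B checkpoint name follows the repo's naming scheme but should be verified on the Hub.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# One of the released sizes (1B-33B); name assumed from the repo's naming scheme.
name = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    name, trust_remote_code=True, torch_dtype=torch.bfloat16
)

# Left-to-right code completion from a comment prompt.
inputs = tokenizer("# Write a quicksort function in Python\n", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```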
Clean & curate your data with LLMs
databonsai is a Python library that uses LLMs to perform data cleaning tasks.
Features
- Suite of tools for data processing using LLMs including categorization, transformation, and extraction
- Validation of LLM outputs
- Batch processing for token savings
- Retry logic with exponential backoff for handling rate limits and transient errors (a minimal sketch of the backoff pattern follows below)
databonsai • GitHub - databonsai/databonsai: clean & curate your data with LLMs.
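The retry feature is worth unpacking. Here is a minimal, library-agnostic sketch of retry with exponential backoff and jitter; the decorator name and the choice of exceptions to catch are illustrative assumptions, not databonsai's actual API.

```python
import random
import time
from functools import wraps

def retry_with_backoff(max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a flaky call (e.g. an LLM API hit by rate limits) with
    exponential backoff plus jitter. Illustrative only, not databonsai's API."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception:  # in practice, catch the provider's rate-limit error
                    if attempt == max_retries - 1:
                        raise
                    # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay,
                    # with jitter so parallel workers don't retry in lockstep.
                    delay = min(base_delay * 2 ** attempt, max_delay)
                    time.sleep(delay + random.uniform(0, delay / 2))
        return wrapper
    return decorator

@retry_with_backoff(max_retries=3)
def categorize(text: str) -> str:
    ...  # call the LLM provider here
```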
DeepSeek Coder comprises a series of code language models trained from scratch on 2T tokens of 87% code and 13% natural language in English and Chinese, with models available in sizes ranging from 1B to 33B. Each model is pre-trained on a repo-level code corpus using a 16K window size and an extra fill-in-the-blank task, supporting project-level code completion and infilling.
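Since the fill-in-the-blank pre-training is what enables infilling, here is a hedged sketch of fill-in-the-middle prompting with transformers: the model sees a prefix and a suffix and generates the missing middle. The sentinel token spellings below are placeholders, not the model's real special tokens; check the DeepSeek Coder tokenizer for the exact strings.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name follows DeepSeek Coder's published naming; verify on the Hub.
name = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)

# Fill-in-the-middle prompt: prefix, a hole to fill, then suffix.
# FIM_BEGIN / FIM_HOLE / FIM_END stand in for the model's actual sentinel tokens.
prompt = (
    "<FIM_BEGIN>def fib(n):\n"
    "<FIM_HOLE>\n"
    "    return fib(n - 1) + fib(n - 2)<FIM_END>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# Print only the newly generated middle, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```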