GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
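For context on what "multi-LoRA" means in practice, here is a minimal sketch of a request against a LoRAX-style `/generate` endpoint, where each request can select a different fine-tuned adapter. The URL, payload fields, and adapter name are assumptions for illustration, not taken from the repo's docs:

```python
from typing import Optional

import requests

# Hypothetical LoRAX-style deployment; host/port and adapter name are placeholders.
LORAX_URL = "http://localhost:8080/generate"

def generate(prompt: str, adapter_id: Optional[str] = None) -> str:
    """Send one generation request, optionally routed to a specific LoRA adapter."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": 64}}
    if adapter_id is not None:
        # Picking a different adapter per request is what lets a single server
        # multiplex thousands of fine-tuned variants of one base model.
        payload["parameters"]["adapter_id"] = adapter_id
    resp = requests.post(LORAX_URL, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["generated_text"]

if __name__ == "__main__":
    print(generate("Summarize LoRA in one sentence.", adapter_id="my-org/summarizer-lora"))
```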
Ollama
ollama.com
MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec.
- Multi-model serving, letting users run multiple models within the same process.
- Ability to run inference in parallel for vertical scaling across multiple models through a pool of inference workers.
GitHub - SeldonIO/MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
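The features above come together in MLServer's custom-runtime API: subclass `MLModel`, implement async `load` and `predict`, and the server exposes the model over REST and gRPC. A minimal sketch, assuming the standard `MLModel` plus `NumpyCodec` interface; `SumModel` and the output name `total` are placeholders:

```python
import numpy as np

from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse

class SumModel(MLModel):
    """Toy custom runtime: returns the per-row sum of the first input tensor."""

    async def load(self) -> bool:
        # Real runtimes would load weights or other artifacts here.
        self.ready = True
        return self.ready

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        x = NumpyCodec.decode_input(payload.inputs[0])
        total = np.sum(x, axis=-1, keepdims=True)
        return InferenceResponse(
            model_name=self.name,
            outputs=[NumpyCodec.encode_output(name="total", payload=total)],
        )
```

Pointing a `model-settings.json` at this class and running `mlserver start .` serves it over both protocols; registering several such models side by side is what the multi-model serving bullet refers to.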
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax. EasyLM can scale up LLM training to hundreds of TPU/GPU accelerators by leveraging JAX's pjit functionality.
Building on top of Hugging Face's transformers and datasets, this repo provides an easy to use and easy...
young-geng • GitHub - young-geng/EasyLM: Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
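The scaling claim hinges on JAX's pjit, which in recent JAX releases is folded into `jax.jit` plus explicit shardings: you declare how arrays are partitioned over a device mesh and the compiler handles the cross-device communication. A minimal, generic sketch (not EasyLM code; the mesh axis name and shapes are arbitrary):

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# 1-D mesh over all visible accelerators; the axis name "data" is arbitrary.
mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

@jax.jit
def matmul(x, w):
    return jnp.dot(x, w)

x = jnp.ones((8, 512))   # assumes the device count divides the batch size
w = jnp.ones((512, 512))

# Shard the batch dimension of x across the mesh; replicate w on every device.
x = jax.device_put(x, NamedSharding(mesh, P("data", None)))
w = jax.device_put(w, NamedSharding(mesh, P(None, None)))

y = matmul(x, w)          # XLA inserts any needed cross-device communication
print(y.shape, y.sharding)
```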
Welcome to RADAR
radardao.xyz