GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

GitHub - SeldonIO/MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

GitHub - Sairyss/system-design-patterns: Resources related to distributed systems, system design, microservices, scalability and performance, etc

GitHub - young-geng/EasyLM: Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.