GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs

turboderp, github.com

okuvshynov GitHub - okuvshynov/slowllama: Finetune llama2-70b and codellama on MacBook Air without quantization

jafioti GitHub - jafioti/luminal: Deep learning at the speed of light.

unslothai GitHub - unslothai/unsloth: 5X faster 50% less memory LLM finetuning