r/LocalLLaMA (Reddit)

turboderp/exllamav2 (GitHub): A fast inference library for running LLMs locally on modern consumer-class GPUs

Moyi: 10 Ways To Run LLMs Locally And Which One Works Best For You

okuvshynov/slowllama (GitHub): Finetune llama2-70b and codellama on MacBook Air without quantization

Ben Auffarth: Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs