🤗 Transformers
Hugging Face is an open-source platform and community for deep learning models across language, vision, audio, and multimodal tasks. It develops and maintains the transformers library, which simplifies downloading and training state-of-the-art deep learning models.
This is the best library if you have a background in m…
Moyi • 10 Ways To Run LLMs Locally And Which One Works Best For You
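As a minimal sketch of what "downloading models" from the Hub looks like in practice (assuming `pip install transformers`; the small gpt2 checkpoint is used here only as a lightweight example):

```python
# Pull a tokenizer from the Hugging Face Hub and round-trip a sentence.
# The first call downloads and caches the gpt2 vocabulary files.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Running LLMs locally is getting easier."
token_ids = tokenizer.encode(text)     # list of integer token ids
decoded = tokenizer.decode(token_ids)  # GPT-2's byte-level BPE is lossless

print(len(token_ids), decoded)
```

The same `from_pretrained` pattern works for models (`AutoModel`, `AutoModelForCausalLM`, and so on), which is the "simplifies downloading" part the snippet refers to.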
GPT4All: An ecosystem of open-source on-edge large language models.
Important
GPT4All v2.5.0 and newer only support models in GGUF format (.gguf). Models used with a previous version of GPT4All (.bin extension) will no longer work.
GPT4All is an ecosystem to run powerful and customized large language models that work locally on consumer-grade CPUs a…
nomic-ai • GitHub - nomic-ai/gpt4all: gpt4all: open-source LLM chatbots that you can run anywhere
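The format cutoff above comes down to a file-extension check; a hypothetical helper for filtering a local model folder down to files GPT4All v2.5.0+ can still load might look like this (the filenames are made up for illustration):

```python
from pathlib import PurePath

def gguf_compatible(filenames):
    """Keep only GGUF models (loadable by GPT4All v2.5.0+);
    legacy .bin checkpoints are rejected."""
    return [f for f in filenames if PurePath(f).suffix == ".gguf"]

models = [
    "mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical GGUF quantization
    "ggml-gpt4all-j-v1.3-groovy.bin",   # legacy format, no longer loads
]
compatible = gguf_compatible(models)
print(compatible)
```

Legacy .bin models are not converted automatically; they need to be re-downloaded in GGUF form.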
General-purpose models
- 1.1B: TinyDolphin 2.8 1.1B. Takes roughly 700MB of RAM; tested on my Pi 4 with 2GB of RAM. Hallucinates a lot, but works for basic conversation.
- 2.7B: Dolphin 2.6 Phi-2. Takes over 2GB of RAM; tested on my 3GB 32-bit phone via llama.cpp on Termux.
- 7B: Nous Hermes Mistral 7B DPO. Takes about 4-5GB of RAM depending on context
r/LocalLLaMA - Reddit
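The RAM figures in that list track a rough rule of thumb: quantized model size ≈ parameter count × bytes per weight, plus overhead for the KV cache and runtime buffers. A back-of-the-envelope sketch (0.5 bytes/weight approximates 4-bit quantization; the overhead multiplier is a guess, not a measured value):

```python
def approx_ram_gb(params_billions, bytes_per_weight=0.5, overhead=1.2):
    """Rough memory estimate for running a quantized model.

    bytes_per_weight ~0.5 approximates 4-bit (Q4) quantization;
    `overhead` is an assumed multiplier for KV cache and runtime buffers.
    """
    return params_billions * 1e9 * bytes_per_weight * overhead / 2**30

for size in (1.1, 2.7, 7.0):
    print(f"{size}B -> ~{approx_ram_gb(size):.1f} GB")
```

This lands in the same ballpark as the reported numbers (under 1GB for a 1.1B model, around 4GB for 7B); longer context windows push the KV-cache share higher than this flat multiplier assumes.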
Supported Models
Where possible, we try to match the Hugging Face implementation. We are open to adjusting the API, so please reach out with feedback regarding these details.

| Model | Context Length | Model Type |
| --- | --- | --- |
| codellama-34b-instruct | 16384 | Chat Completion |
| llama-2-70b-chat | 4096 | Chat Completion |
| mistral-7b-instruct | 4096 [1] | Chat Completion |
| pplx-7b-c… | | |
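The "Chat Completion" model type above implies an OpenAI-style messages schema. A hedged sketch that only builds such a request body for one model from the table (the exact schema, endpoint URL, and authentication are assumptions here; no network call is made, so check the provider's API reference before sending):

```python
import json

def build_chat_request(model, user_message, max_tokens=256):
    """Build an OpenAI-style chat-completions payload for a model
    from the table above. The field names are an assumption based
    on the common chat-completions schema."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": "Be precise and concise."},
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request("mistral-7b-instruct", "What is GGUF?")
print(json.dumps(payload, indent=2))
```

Note that a request whose prompt plus `max_tokens` exceeds the model's context length (4096 here) would be rejected or truncated, which is why the table lists context length per model.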