ehartford/dolphin · Datasets at Hugging Face

GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.

GitHub - lyuchenyang/Macaw-LLM: Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Dolma: 3 Trillion Token Open Corpus for Language Model Pretraining

Luca Soldaini · blog.allenai.org