GitHub - facebookresearch/multimodal at a33a8b888a542a4578b16972aecd072eff02c1a6

GitHub - comfyanonymous/ComfyUI: The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.

multimodal-maestro

👋 hello

Multimodal-Maestro gives you more control over large multimodal models to get the outputs you want. With more effective prompting tactics, you can get multimodal models to do tasks you didn't know (or think!) were possible. Curious how it works? Try our HF space!

roboflow • GitHub - roboflow/multimodal-maestro: Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration

Chenyang Lyu (1, 2), Minghao Wu (3), Longyue Wang (1, *), Xinting Huang (1), Bingshuai Liu (1), Zefeng Du (1), Shuming Shi (1), Zhaopeng Tu (1)

(1) Tencent AI Lab, (2) Dublin City University, (3) Monash University

* Longyue Wang is the corresponding author: vinnlywang@tencent.com

lyuchenyang • GitHub - lyuchenyang/Macaw-LLM: Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Flim

beta.flim.ai