GitHub - stanford-futuredata/ColBERT: Stanford ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22)

github.com

Related Highlights

Text embeddings are a critical piece of many pipelines, from search to RAG to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That’s only about two pages of text, but documents can be very long – books, legal cases, TV screenplays, code repositories, etc. can be tens...

Long-Context Retrieval Models with Monarch Mixer

Welcome to RAGatouille

Easily use and train state of the art retrieval methods in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

The main motivation of RAGatouille is simple: bridging the gap between state-of-the-art research and alchemical RAG pipeline practices. RAG is complex, and there are many moving parts. To g...

GitHub - bclavie/RAGatouille: Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Pipeline                                             RobustQA Avg. score   Avg. response time (secs)
Azure Cognitive Search Retriever + GPT4 + Ada        72.36                 >1.0s
Canopy (Pinecone)                                    59.61                 >1.0s
Langchain + Pinecone + OpenAI                        61.42                 <0.8s
Langchain + Pinecone + Cohere                        69.02                 <0.6s
LlamaIndex + Weaviate Vector Store - Hybrid Search   75.89                 <1.0s
RAG Google Cloud VertexAI...

arXiv:2405.02048v1 [cs.IR] 3 May 2024

🥤 Cola [NeurIPS 2023]

Large Language Models are Visual Reasoning Coordinators

Liangyu Chen*,†,♥ Bo Li*,♥ Sheng Shen♣ Jingkang Yang♥

Chunyuan Li♠ Kurt Keutzer♣ Trevor Darrell♣ Ziwei Liu✉,♥

♥S-Lab, Nanyang Technological University

♣University of California, Berkeley ♠Microsoft Research, Redmond

*Equal Contribution †Project Lead ✉Corresponding Author...

cliangyu • GitHub - cliangyu/Cola: [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"