arXiv:2405.02048v1 [cs.IR] 3 May 2024

arxiv.org

AgentBench: Evaluating LLMs as Agents

Evaluates Large Language Models (LLMs) as agents in interactive environments, highlights the performance gap between API-based and open-source models, and introduces the AgentBench benchmark.

arxiv.org

Task-driven Autonomous Agent Utilizing GPT-4, Pinecone, and LangChain for Diverse Applications – Yohei Nakajima

Yohei Nakajima (yoheinakajima.com)

Veris Insights - 2024.09 [ER] GenAI Predictions Whitepaper

The whitepaper outlines the evolving role of Generative AI in Talent Acquisition, providing predictions, actionable insights, and strategies for integrating AI into recruiting processes over the next five years.

When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale

arxiv.org