arXiv:2405.02048v1 [cs.IR] 3 May 2024
AgentBench: Evaluating LLMs as Agents
Evaluates Large Language Models (LLMs) as agents in interactive environments, highlights the performance gap between API-based and open-source models, and introduces the AgentBench benchmark.
arxiv.org
Task-driven Autonomous Agent Utilizing GPT-4, Pinecone, and LangChain for Diverse Applications – Yohei Nakajima
Yohei Nakajima
yoheinakajima.com
Veris Insights - 2024.09 [ER] GenAI Predictions Whitepaper
The whitepaper outlines the evolving role of Generative AI in Talent Acquisition, providing predictions, actionable insights, and strategies for integrating AI into recruiting processes over the next five years.
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
arxiv.org