Promisingly, he showed that Q-learning would always “converge,” namely, as long as the system had the opportunity to try every action, from every state, as many times as necessary, it would always, eventually develop the perfect value function:
Brian Christian • The Alignment Problem
- Query the RAG index anyway and let the LLM itself choose whether to use the retrieved context or its built-in knowledge
- Query the RAG index but only provide the result to the LLM if it is sufficiently relevant to the question (i.e., its embedding distance clears some threshold)
- Run the LLM both on its own and with the RAG response, then use a heuristic (or another LLM) to pick the best answer; all three options are sketched below
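A minimal sketch of the three routing options, assuming hypothetical `embed`, `rag_search`, and `llm` helpers as stand-ins for whatever embedding model, retriever, and LLM client are actually in use (none of these names come from a specific library):

```python
import numpy as np

# Hypothetical stand-ins: swap in your real embedding model, vector store,
# and LLM client. These signatures are assumptions, not a library's API.
def embed(text: str) -> np.ndarray: ...        # text -> embedding vector
def rag_search(question: str) -> str: ...      # question -> retrieved context
def llm(prompt: str) -> str: ...               # prompt -> completion

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Option 1: always retrieve; let the model decide whether the context helps.
def answer_llm_decides(question: str) -> str:
    context = rag_search(question)
    prompt = (
        "Context (use it only if it is relevant; otherwise rely on your own "
        f"knowledge):\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)

# Option 2: retrieve, but include the context only if it clears a relevance
# threshold (here, cosine similarity between question and context).
def answer_with_threshold(question: str, threshold: float = 0.75) -> str:
    context = rag_search(question)
    if cosine_similarity(embed(question), embed(context)) >= threshold:
        return llm(f"Context:\n{context}\n\nQuestion: {question}")
    return llm(question)  # context judged irrelevant; fall back to bare model

# Option 3: answer both ways, then have a judge (another LLM call here,
# but any heuristic works) pick whichever answer looks better.
def answer_best_of_both(question: str) -> str:
    bare = llm(question)
    augmented = llm(f"Context:\n{rag_search(question)}\n\nQuestion: {question}")
    verdict = llm(
        f"Question: {question}\n\nAnswer A: {bare}\n\nAnswer B: {augmented}\n\n"
        "Which answer is better? Reply with exactly 'A' or 'B'."
    )
    return augmented if verdict.strip().upper().startswith("B") else bare
```

The threshold in option 2 is a made-up placeholder; a usable value depends on the embedding model and would need tuning against real queries.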