Sublime

Mark Perry

DeepSeek_R1

DeepSeek-R1 introduces two reasoning-focused large language models. DeepSeek-R1-Zero is trained with reinforcement learning alone, without supervised fine-tuning as a preliminary step; DeepSeek-R1 adds cold-start data and multi-stage training before reinforcement learning. The work also distills their reasoning capabilities into smaller models.

Visakan Veerasamy

@visakanv

author of FRIENDLY AMBITIOUS NERD and INTROSPECT

A P

@ashwinxp

Mark Smithivas

@msmithivas

artemis

@art3_m15

Partha Anbil

@parthaanbil

Abhishek Sivaraman

@abhisheks

Akash

@akash