Compare commits

2 Commits

Author:  MUELRASON1
SHA1:    8a57f65fb5
Message: Merge 152c1f3d7331c5dc60500089380bd1beff4dec42 into 95aaec702f4bf183e18da90545e26c094cedcf6d
Date:    2025-02-24 14:38:49 +08:00

Author:  DeepSeekDDM
SHA1:    95aaec702f
Message: Delete CITATION.cff
Checks:  Some checks failed: "Mark and close stale issues / stale (push)" has been cancelled
Date:    2025-02-24 11:49:25 +08:00

CITATION.cff

@@ -1,18 +0,0 @@
cff-version: 1.2.0
message: "If you use this work, please cite it using the following metadata."
title: "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"
authors:
  - name: "DeepSeek-AI"
year: 2025
identifiers:
  - type: doi
    value: 10.48550/arXiv.2501.12948
  - type: arXiv
    value: 2501.12948
url: "https://arxiv.org/abs/2501.12948"
categories:
  - "cs.CL"
repository-code: "https://github.com/deepseek-ai/DeepSeek-R1"
license: "MIT"
abstract: >
  We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama.
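For reference, the deleted CFF metadata above is machine-readable. The sketch below is a minimal illustration, not part of this diff or the repository, of how the file could be parsed with PyYAML and rendered as a BibTeX entry; the cite key deepseekr1 and the @misc layout are hypothetical choices, and off-the-shelf tools such as cffconvert perform the same conversion.

# Minimal sketch: parse the CITATION.cff shown above and emit a BibTeX
# @misc entry. Assumes PyYAML is installed (pip install pyyaml); the
# cite key "deepseekr1" is a hypothetical choice, not from the repo.
import yaml

with open("CITATION.cff") as f:
    cff = yaml.safe_load(f)

# Pull the DOI out of the identifiers list defined in the file.
doi = next(i["value"] for i in cff["identifiers"] if i["type"] == "doi")

print("@misc{deepseekr1,")
print(f"  title  = {{{cff['title']}}},")
print(f"  author = {{{cff['authors'][0]['name']}}},")
print(f"  year   = {{{cff['year']}}},")
print(f"  doi    = {{{doi}}},")
print(f"  url    = {{{cff['url']}}},")
print("}")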