Compare commits

...

3 Commits

Author SHA1 Message Date
Garvit Singh Rathore
796d6ae6e7
Merge 1c10c9f6771fadcd5c8fe52cbac7075527184661 into 45b89c6cb13cf6b01da05ef9a7379f13f8d3baf2 2025-02-18 09:17:36 -06:00
Garvit Singh Rathore
1c10c9f677
Update README.md
Behaviors are exhibited rather than emerged.
2025-01-30 23:17:27 +05:30
Garvit Singh Rathore
bb10d07b27
Update README.md
Used more accurate words.
2025-01-30 23:12:58 +05:30

View File

@ -32,7 +32,7 @@
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. With RL, DeepSeek-R1-Zero naturally exhibited numerous powerful and interesting reasoning behaviors.
However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance,
we introduce DeepSeek-R1, which incorporates cold-start data before RL. we introduce DeepSeek-R1, which incorporates cold-start data before RL.
DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
@ -187,7 +187,7 @@ python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
**We recommend adhering to the following configurations when utilizing the DeepSeek-R1 series models, including benchmarking, to achieve the expected performance:** **We recommend adhering to the following configurations when utilizing the DeepSeek-R1 series models, including benchmarking, to achieve the expected performance:**
1. Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to prevent endless repetitions or incoherent outputs. 1. Set the temperature between 0.5 and 0.7 (with 0.6 recommended) to prevent endless repetition or incoherent outputs.
2. **Avoid adding a system prompt; all instructions should be contained within the user prompt.** 2. **Avoid adding a system prompt; all instructions should be contained within the user prompt.**
3. For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}." 3. For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
4. When evaluating model performance, it is recommended to conduct multiple tests and average the results. 4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.