
DeepSeek V3


DeepSeek-V3 outperforms other open-source models, such as Qwen2.5-72B and Llama-3.1-405B, across multiple evaluations, and performs on par with the world's leading proprietary models, GPT-4o and Claude-3.5-Sonnet.

On December 26, 2024, DeepSeek released the first model of its new series, DeepSeek-V3, and open-sourced it at the same time.

Users can chat with the new V3 model at the official website, chat.deepseek.com. The API services have also been updated, so no changes to users' interface configurations are required. Note that the current version of DeepSeek-V3 does not yet support multi-modal input or output.
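Since the updated API keeps the existing interface, a client that already speaks the OpenAI-style chat-completions protocol only needs the DeepSeek base URL and an API key. The sketch below builds such a request with the standard library; the endpoint path and the model name `deepseek-chat` are common-usage assumptions, not taken from this announcement.

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint and model name (verify against the
# official API docs before use).
API_URL = "https://api.deepseek.com/chat/completions"

def build_chat_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request for DeepSeek-V3."""
    payload = {
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request with `urllib.request.urlopen` (or pointing an existing OpenAI SDK client at the same base URL) would return the usual chat-completion JSON.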

Performance Alignment with Leading Closed-Source Models Overseas

DeepSeek-V3 is a self-developed Mixture of Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token, trained on 14.8 trillion tokens. According to evaluations, DeepSeek-V3 outperforms other open-source models like Qwen2.5-72B and Llama-3.1-405B across multiple metrics and achieves performance comparable to top closed-source models, including GPT-4o and Claude-3.5-Sonnet.

  • Knowledge: In knowledge-based tasks (MMLU, MMLU-Pro, GPQA, SimpleQA), DeepSeek-V3 shows a significant improvement over its predecessor, DeepSeek-V2.5, nearing the performance of Claude-3.5-Sonnet-1022.
  • Long Text: In long text assessments, including DROP, FRAMES, and LongBench v2, DeepSeek-V3 performs better than other models.
  • Code: In algorithmic code scenarios (Codeforces), DeepSeek-V3 demonstrates a leading advantage; in engineering code scenarios (SWE-Bench Verified), its performance is close to Claude-3.5-Sonnet-1022.
  • Mathematics: On the U.S. competition-level benchmarks AIME 2024 and MATH, and on the Chinese National High School Mathematics Olympiad (CNMO 2024), DeepSeek-V3's performance surpasses all open-source and closed-source models.
  • Chinese Language Proficiency: In educational assessments such as C-Eval and in pronoun-disambiguation benchmarks, DeepSeek-V3 performs similarly to Qwen2.5-72B, while showing an advantage on factual-knowledge tests such as C-SimpleQA.

Tripled Generation Speed

Thanks to algorithmic and engineering innovations, DeepSeek-V3's text generation speed has increased from 20 tokens per second (TPS) to 60 TPS, achieving a three-fold improvement over the V2.5 model and enhancing user experience.

API Service Pricing Adjustments

With the release of DeepSeek-V3, the pricing for model API services has been adjusted. The new price is 0.5 yuan per million input tokens on a cache hit, 2 yuan per million input tokens on a cache miss, and 8 yuan per million output tokens. DeepSeek has also set a 45-day promotional period, running until February 8, 2025, during which the service remains at the previous price of 0.1 yuan per million input tokens (cache hit), 1 yuan per million input tokens (cache miss), and 2 yuan per million output tokens. Both existing registered users and new users who sign up during this period are eligible for the discount.
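The cost of a call is a straightforward per-million-token calculation. A minimal sketch, using the post-promotion prices quoted above (the function and its names are illustrative, not part of any official SDK):

```python
# Post-promotion DeepSeek-V3 API prices, in yuan per million tokens,
# as quoted in the announcement.
PRICE_INPUT_HIT = 0.5   # input tokens served from cache
PRICE_INPUT_MISS = 2.0  # input tokens not in cache
PRICE_OUTPUT = 8.0      # output tokens

def api_cost_yuan(hit_tokens: int, miss_tokens: int, output_tokens: int) -> float:
    """Total cost in yuan for one or more API calls."""
    return (hit_tokens * PRICE_INPUT_HIT
            + miss_tokens * PRICE_INPUT_MISS
            + output_tokens * PRICE_OUTPUT) / 1_000_000
```

For example, a request with 100,000 cached input tokens, 100,000 uncached input tokens, and 50,000 output tokens would cost 0.05 + 0.2 + 0.4 = 0.65 yuan at the new prices.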

Open-Source Weights and Local Deployment

DeepSeek-V3 was trained in FP8, and its native FP8 weights have been released as open source. Thanks to the open-source community, SGLang and LMDeploy quickly added support for FP8 inference with the V3 model, while TensorRT-LLM and MindIE implemented BF16 inference. To ease adaptation and broaden application scenarios in the community, a script for converting the FP8 weights to BF16 is also provided.
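The official conversion script operates on the released weight shards; purely to illustrate the numeric mapping it performs, here is a stdlib-only decoder for a single FP8 value in the E4M3 format (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits). Assuming E4M3 is the FP8 variant in use, each 8-bit code decodes exactly to a wider float, which is why the FP8-to-BF16 conversion is lossless in value (up to BF16's own mantissa rounding):

```python
def fp8_e4m3_to_float(byte: int) -> float:
    """Decode one FP8 E4M3 byte to a Python float.

    E4M3 layout: sign (1 bit) | exponent (4 bits, bias 7) | mantissa (3 bits).
    Unlike IEEE formats, E4M3 has no infinities; the all-ones pattern is NaN.
    """
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0xF and mant == 0x7:      # reserved NaN encoding
        return float("nan")
    if exp == 0:                        # subnormal: no implicit leading 1
        return sign * (mant / 8) * 2.0 ** -6
    return sign * (1 + mant / 8) * 2.0 ** (exp - 7)
```

For instance, the byte `0x38` decodes to 1.0 and the largest normal code, `0x7E`, to 448.0. A real conversion script would apply this mapping tensor-wide (together with any per-block scaling factors stored alongside the weights) and write the results out as BF16.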

For model weight downloads and more information on local deployment, refer to the Hugging Face page.

DeepSeek adheres to the belief of "pursuing inclusive AGI through open-source spirit and long-termism." The company aims to share the latest progress in model pre-training with the community and looks forward to further narrowing the capability gap between open-source and closed-source models.

This marks a brand-new beginning, and in the future, DeepSeek will continue to develop richer functionalities based on DeepSeek-V3, including deep thinking and multi-modal capabilities, sharing the latest exploration results with the community.
