Rakuten announces Rakuten AI 3.0 model, with configuration files indicating the underlying architecture as DeepSeek V3

Gate News reports that on March 17, Rakuten Group announced the release of Rakuten AI 3.0, branded as "Japan's Largest High-Performance AI Model" and open-sourced for free under the Apache 2.0 license. The model uses a Mixture of Experts (MoE) architecture with 671 billion total parameters, 37 billion parameters active during inference, and a 128K context window. It is optimized for Japanese and outperforms GPT-4o on multiple Japanese benchmarks.

The model is a result of the GENIAC project, jointly promoted by Japan's Ministry of Economy, Trade and Industry and the New Energy and Industrial Technology Development Organization (NEDO), with part of the training compute funded by the Japanese government.

In the announcement, Rakuten described the base model as "leveraging the best results from the open-source community," without naming the exact model. Community members quickly examined the model files published on HuggingFace and found that the config.json explicitly states model_type: deepseek_v3 and architectures: DeepseekV3ForCausalLM. Together with the 671 billion total parameters, 37 billion active parameters, and 128K context window, which match DeepSeek V3 exactly, this indicates the model is DeepSeek V3 fine-tuned on Japanese data.
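The community check described above amounts to reading two fields out of the published config.json. A minimal sketch of that check is below; only the model_type and architectures values come from the report, and the inline JSON stands in for the real file on HuggingFace, which contains many more fields.

```python
import json

# Illustrative excerpt standing in for the published config.json;
# only these two fields are cited in the report, the real file is larger.
config_text = """
{
  "model_type": "deepseek_v3",
  "architectures": ["DeepseekV3ForCausalLM"]
}
"""

config = json.loads(config_text)

# The architecture identifiers point directly at DeepSeek V3.
print(config["model_type"])        # deepseek_v3
print(config["architectures"][0])  # DeepseekV3ForCausalLM
```

In practice the same two fields would be read from the config.json in the model repository; they are the standard way the Transformers library records which architecture a checkpoint was built on, which is why they reveal the base model even when an announcement does not.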
