I recently watched several in-depth interviews, and three core players in the new energy vehicle sector are all betting on the same direction: VLA (Vision-Language-Action), a large model for driving. This is not a coincidence but a technological inflection point that the entire industry faces in 2025.
**Is VLA really the iPhone 4 moment for intelligent driving?**
Li Xiang, founder of Li Auto, puts it bluntly: VLA is the "iPhone 4 moment" for automotive intelligent driving. His roadmap advances VLA through three stages, from information tool to reasoner to intelligent agent, trained with a combination of RLHF, world models, and collision feedback. He also observes that the gap between Chinese large models (such as DeepSeek and Qianwen) and their US counterparts has narrowed significantly.
Xpeng Motors Chairman He Xiaopeng is slightly more conservative, but his conclusion is similar: 2025-2027 will be the true inflection point for L3+ autonomous driving, and automakers that cannot keep up will be eliminated. He took more than two months to decide to bet on VLA, which shows how much weight the choice carries. He also stresses a point many overlook: autonomous driving must respond within milliseconds, which sets it apart from ordinary large models that are neither real-time nor highly reliable.
**What is the true moat?**
Horizon founder Yu Kai answers from a different angle. He says he has watched China's AI go from catching up to running neck and neck. His positioning of Horizon is clear: it is not a pure chip company but a system-level intelligent driving technology company built on "chip + software." Horizon has shipped more than 8 million chips so far.
His entrepreneurial philosophy is interesting: "The Jianghu (martial arts world) is not about fighting and killing, but about human relationships." He believes the long-term moat in intelligent driving comes from accumulating hard, dirty, tiring work: vehicle-grade quality, validation across a massive number of scenarios, and an organizational culture built up over time. None of these can be achieved overnight. He also shared a sobering figure: high-level intelligent driving requires a team of thousands and annual investment of more than 1 billion yuan.
**The price of survival**
Li Xiang emphasizes capital operations and equity structure, noting that Li Auto has the best governance and cash management among the new EV makers. He Xiaopeng is blunter: the industry has entered an elimination race, no one dares to lie flat, and Xpeng must learn to swim in the "sea of blood." They are doing the same thing: widening their price range from 200,000-500,000 yuan to 100,000-500,000 yuan, and delivering profitable, high-quality products at the critical 150,000 yuan price point. In other words, higher quality at lower cost is the logic of survival.
**Large models are just a means**
Li Xiang also offers an interesting thought: AI is responsible for "intelligence" (capability that can improve without limit), while humans are responsible for "wisdom" (one's relationship with all things). On safety, the targets are very high: VLA aims to cut major casualties by more than 90%. He Xiaopeng is more pragmatic: once 10% of new-car users in first-tier cities use high-level intelligent driving each month, a real inflection point in trust will follow.
The consensus across these three interviews is clear: large models, VLA, and chips all matter, but true competitiveness comes from systematically accumulated capability, the team, capital efficiency, and a grasp of the industry's rhythm. 2025 is a watershed not for technology but for business models and execution.
RugPullAlertBot
· 01-06 00:09
Aiming for gross profit at the 150,000 price point—that's the real logic for survival.
---
A team of a thousand people plus 1 billion in investment, just hearing about it is already despairing...
---
iPhone 4 moments? Haha, that phrase has been a bit overused lately.
---
Millisecond-level response vs. regular large models: the two are truly worlds apart.
---
Everyone talks about system accumulation, but how many can really settle down and do this...
---
Li Xiang's approach of "intelligence vs. wisdom" is interesting, but in the end, it still comes down to money.
---
Horizon has shipped 8 million units; this is what a moat looks like.
---
The elimination race has already begun; those who lie flat should be eliminated.
---
Can VLA really reduce 90% of accidents? That goal is a bit ambitious.
---
Xpeng's moves at the 150,000 price point are indeed the only way to survive.
LiquidityWhisperer
· 01-05 04:54
Here we go again talking about iPhone moments, this time it's VLA? It feels like there's a new "inflection point" every year, but the ones that truly fail are still the ones that didn't make money.
I believe in millisecond-level response times. Ordinary large models really can't handle autonomous driving, but after all this hype, why haven't we seen a qualitative leap yet?
Yu Kai's comment about the martial arts world still has some truth to it—hard work and tiring jobs are undervalued, and funding relies on stories.
The 150,000-tier is really the decisive game; whoever survives here is the true winner.
It's good that DeepSeek has caught up; at least we don't have to rely entirely on foreign technology.
Systematic accumulation is just burning money and people. With a 1 billion annual investment, small players simply can't afford to play.
Buying stocks of new powerhouses now is basically betting on who can survive until 2027.
LiquidationWatcher
· 01-04 01:46
VLA again, and the iPhone 4 moment again. Getting a bit tired of it.
Industry competition is just the way it is; investing 1 billion and ending up with nothing is quite common.
Actually, no one really admits when they can't make it anymore.
From this perspective, Xiaopeng is still clear-headed; the 150,000 price range is the realistic one.
BearMarketBarber
· 01-04 01:28
Damn, are you still hyping VLA? Wake up, can millisecond-level response truly guarantee no crashes?