X Square Robot Secures RMB 2B Series B Funding

X Square Robot closed a Series B round raising RMB 2 billion (USD 292.8 million) in a deal co-led by Xiaomi and HongShan. The capital validates the startup's end-to-end strategy for building embodied intelligence foundation models fully in-house. X Square Robot is developing a unified multimodal transformer called WALL-A that converts visual, language, tactile, and action signals into a single token sequence, enabling synchronized perception, planning, and control. Xiaomi's participation complements its broader robotics bets and recent open-source release Xiaomi-Robotics-0, and signals stronger ecosystem alignment between platform players and robotics model builders. The round positions X Square to accelerate real-world deployments and scale multimodal model training and data collection.
What happened
X Square Robot completed a Series B financing of RMB 2 billion (USD 292.8 million), co-led by Xiaomi and HongShan. The raise reinforces X Square Robot's commitment to building embodied intelligence foundation models end-to-end, and it will fund near-term productization and data collection for real-world robotics use cases.
Technical details
X Square Robot is developing a natively multimodal foundation model called WALL-A. The company maps visual, language, tactile, and action signals into a continuous sequence of high-dimensional tokens, then feeds those tokens into a single transformer backbone to produce synchronized outputs across perception, decision, and execution. This contrasts with a fine-tune-on-top architecture and aims to reduce cross-modality information loss and latency. Separately, Xiaomi open-sourced its own vision-language-action model, Xiaomi-Robotics-0, earlier this year.
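To make the single-sequence idea concrete, here is a minimal sketch of packing several modalities into one ordered token stream. It is not X Square Robot's implementation, and WALL-A's actual tokenization has not been described in this detail. The `Token` type, the `project` and `build_sequence` helpers, the within-timestep ordering, and the pad-or-truncate projection are all assumptions made for illustration; a real system would use learned per-modality encoders.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# Modality names come from the article; the fixed ordering is an assumption.
MODALITIES = ("vision", "language", "tactile", "action")


@dataclass(frozen=True)
class Token:
    modality: str               # which input stream the token came from
    timestep: int               # control step the token belongs to
    vector: Tuple[float, ...]   # embedding in the shared d_model space


def project(features: Sequence[float], d_model: int) -> Tuple[float, ...]:
    """Toy stand-in for a learned projection: pad or truncate to d_model."""
    trimmed = list(features[:d_model])
    return tuple(trimmed + [0.0] * (d_model - len(trimmed)))


def build_sequence(
    steps: List[Dict[str, List[Sequence[float]]]], d_model: int = 4
) -> List[Token]:
    """Flatten per-timestep multimodal features into one token sequence."""
    sequence: List[Token] = []
    for t, step in enumerate(steps):
        for modality in MODALITIES:  # fixed order within each timestep
            for features in step.get(modality, []):
                sequence.append(Token(modality, t, project(features, d_model)))
    return sequence


# Hypothetical two-step episode; the feature values are made up.
steps = [
    {
        "vision": [[0.1, 0.2], [0.3, 0.4]],
        "language": [[1.0]],
        "tactile": [[0.5, 0.5, 0.5]],
        "action": [[0.0, 1.0]],
    },
    {
        "vision": [[0.2, 0.1]],
        "tactile": [[0.6, 0.4, 0.5]],
        "action": [[0.1, 0.9]],
    },
]
tokens = build_sequence(steps)
```

In this sketch every token shares one embedding width, so a single backbone could attend across modalities without separate per-modality models; whether WALL-A interleaves tokens by timestep in this way is not stated in the source.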
Xiaomi ecosystem moves
Xiaomi is actively backing companies across the robotics stack and is now an investor in X Square Robot. Recent Xiaomi investments include:
- ViTai Robotics, focused on tactile sensing
- Xynova, developing dexterous hands
- RoboParty, building robot bodies
- SynapX, focused on models
Xiaomi is also running live trials of its CyberOne robot in automotive manufacturing, signaling appetite for rapid, integrated deployment.
Context and significance
The funding is sizable for a Series B and signals investor confidence in companies pursuing unified, multimodal approaches to embodied intelligence rather than piecemeal integration of separate perception and control modules. For practitioners, WALL-A represents a practical instantiation of a single-transformer paradigm for robotics, aligning with a broader trend toward foundation models that natively support action and tactile data.
What to watch
Watch for technical benchmarks, open weights or APIs from X Square, details on training data scale and simulators, and Xiaomi's integration plans. These will determine whether the end-to-end approach yields robust, deployable robotic agents in manufacturing and logistics.
Scoring Rationale
A large Series B (nearly USD 293M) co-led by Xiaomi and HongShan materially advances a Chinese leader in embodied intelligence. The in-house foundation-model approach for robotics is strategically significant for practitioners, but it is not yet a paradigm-shifting global model release.



