Physical AI marks only the beginning of a new era. In the future, homes, offices, and factories will all be filled with embodied or humanoid robots.
Entrepreneurship is like wielding a hammer to find nails, but you must first forge the hammer itself.
Only China can accumulate these digital assets, and this marks a significant milestone. The United States lacks such a vast industrial system, and it is hard to find another country with comparable scale. The scale is so large that manual production cannot keep up, so production requires automated, unmanned factories.
Below is the full interview:
Huang Xiaohuang: As you can see here, we are simulating an earthquake. The goal is to train the robot to stay upright during an earthquake and keep working.
Dong Qian: Why train on such an extreme scenario?
Huang Xiaohuang: Because this scenario is hard to reproduce in the physical world. We need to make sure the robot can keep functioning during an earthquake, and that kind of training can only be done in a digital environment. Once the training is complete, the robot is ready.
Voiceover: In Hangzhou, Zhejiang, this seemingly ordinary office space conceals another world: a digital training ground for robots. After 14 years in business, Qunhe Technology has built up significant advantages in spatial intelligence, earning a place among Hangzhou’s “Six Little Dragons” and an influential position internationally.
Huang Xiaohuang: We believe this is just the beginning of a new era.
Dong Qian: What new era?
Huang Xiaohuang: The era of Physical AI. It has only just begun.
Voiceover: Physical AI can be understood as artificial intelligence that comprehends physical rules. Only by understanding these rules can autonomous machines, such as robots and self-driving cars, perceive, understand, and carry out complex operations in the real physical world.
Huang Xiaohuang: To put it simply, suppose you buy a robot in the future to work in your home, and it has to fall dozens of times before it can perform tasks reliably. Would you be worried? Absolutely. So the robot needs to make all of its mistakes and learn from them in a digital world before it goes to work in your home.
Dong Qian: What role can you play in this process?
Huang Xiaohuang: First, we help train the robots. We are preparing in advance for this future world, making sure that humans and machines perceive the same reality.
The image you see now is what humans perceive, while the one below is what the robot sees: darker areas are closer, and lighter areas are farther away.
Voiceover: As co-founder and chairman of Qunhe Technology, Huang Xiaohuang explains Physical AI, spatial intelligence, and robot training with relatable examples and simplified concepts. Before the rise of Hangzhou’s Six Little Dragons, he rarely appeared in the media; at heart, he is a technology enthusiast.
Huang Xiaohuang: In simple terms, in the world of ten or twenty years from now, all the devices around us will become intelligent. Your home, your office, and factories will be filled with embodied or humanoid robots, as well as devices like cameras. Eventually, everything will become intelligent.
Dong Qian: What does an intelligent camera mean?
Huang Xiaohuang: For example, today a colleague has to sit behind the camera to operate it, but soon it will have its own “brain,” working autonomously to capture footage and hold a conversation.
Dong Qian: Are you planning to put my team out of work?
Huang Xiaohuang: Not at all. Each of them might manage ten cameras and take on more advanced tasks. Imagine that all the devices around you that are currently passive come alive and serve you. Each robot needs training and must perceive its environment. They need to operate within a shared digital-physical world so they can simulate and collaborate effectively. Otherwise, ten robots acting chaotically around you would be unmanageable.
Dong Qian: How does this relate to the training you described earlier?
Huang Xiaohuang: The SpatialLM model we released is designed to help devices understand your space.
Voiceover: Compared with training robots, the entrepreneurial journey of Qunhe Technology is much simpler to understand.
Dong Qian: Why is that? These three words represent your company culture: simple, focused, open.
Huang Xiaohuang: They were refined over time. Simplicity is about reducing bureaucracy in workplace communication. I pushed for focus, and Chen Hang advocated openness. Everyone agreed, and these became the enduring DNA of our company.
Voiceover: In 2007, Huang Xiaohuang graduated from Zhejiang University’s Chu Kochen Honors College and went on to pursue a Ph.D. at the University of Illinois Urbana-Champaign, focusing on high-performance computing with GPUs. Before completing his studies, he joined NVIDIA, where he worked on parallel computing programming frameworks and ecosystems for GPU chips. However, he left NVIDIA after just one year.
Dong Qian: Were you interested in that work?
Huang Xiaohuang: Very much so.
Dong Qian: If you were so interested, why did you leave after only one year?
Huang Xiaohuang: Because at that time, NVIDIA’s entire system was still oriented toward desktop computers. Looking back a decade or so, all the media were describing the future as the era of mobile and cloud computing. Who would have thought that desktop computers, and even laptops, might become obsolete? I immediately asked my team manager why we weren’t pursuing cloud computing by putting GPUs into cloud infrastructure, instead of focusing our research on desktop computers.
Dong Qian: Why hadn’t NVIDIA considered such an idea?
Huang Xiaohuang: At first, I was quite puzzled. The company’s stock price was low at the time, and I wondered whether its approach was outdated.
Dong Qian: Did you feel NVIDIA was limiting your vision? What preparations did you make before deciding to leave?
Huang Xiaohuang: I was young and impulsive, so I didn’t prepare much. After I shared my ideas with my manager and learned that the company wouldn’t pursue cloud computing or mobile technologies, I doubted its future and decided to take matters into my own hands.
Voiceover: In 2011, NVIDIA was widely seen as a consumer electronics hardware company. Although Geoffrey Hinton was already using NVIDIA GPUs to train deep neural networks, few recognized that GPU parallel computing would become the computational foundation of the AI boom. Huang Xiaohuang foresaw the potential of combining GPU supercomputing with cloud deployment. He invited Chen Hang from Zhejiang University and Zhu Hao from Tsinghua University to co-found a startup focused on GPU-accelerated, cloud-based graphics rendering, that is, using algorithms to convert 3D models into 2D images or video.
Huang Xiaohuang: I sold my NVIDIA stock and pooled a few hundred thousand RMB with my partners. At that age, I didn’t really grasp the value of money.
Dong Qian: How old were you?
Huang Xiaohuang: Around 25. Back then, a few hundred thousand seemed like a lot, enough to buy a property in China more than a decade ago.
Voiceover: The young team quickly built a cost-effective, high-performance GPU cluster from affordable graphics cards, significantly cutting computing costs and increasing speed. However, investors were still focused on the mobile internet, and Huang was rejected across the board when fundraising in Silicon Valley. During the toughest period, an investment promotion effort by Zhejiang Province in Silicon Valley prompted Huang and his team to return to China.
Huang Xiaohuang: After returning, we started our venture in an attic.
Dong Qian: Why did you come back only to work from an attic?
Huang Xiaohuang: At first, we thought starting a business in a garage or attic was cool. Later, we realized this kind of setting simply doesn’t suit China’s environment. It would be strange to bring job applicants into a bedroom for an interview.
Voiceover: In 2012, Hinton and his students upended image recognition competitions with deep convolutional neural networks, opening a new chapter in the AI revolution. The breakthrough also established the GPU’s reputation. Through a collaboration with Amazon, NVIDIA entered the cloud services battlefield. Meanwhile, the young team at Qunhe Technology was exploring applications for its core technology: a GPU-powered rendering engine that achieved physical accuracy, meaning that rendered images matched real-world parameters.
Huang Xiaohuang: I cut the time required for a physically accurate rendering from 30 to 60 minutes down to just 10 seconds per image.
Dong Qian: Why insist on physical accuracy?
Huang Xiaohuang: Because it offers greater versatility and stability.
Dong Qian: As someone immersed in technical research, did you think about how this technology would connect to the real world?
Huang Xiaohuang: I didn’t think much about it at first. At NVIDIA, the methodology was to develop cutting-edge technology first and then invest resources in finding applications. That approach influenced me: build the hammer first, then go looking for nails.
Voiceover: This “hammer” could be used to render movie special effects, but the payback period was too long. It could also be applied to games, but mobile games at the time didn’t need high-quality graphics. Ultimately, the technology found its niche in the home renovation industry.
Huang Xiaohuang: We needed to assess whether the market had thousands of players or just one or two.
Dong Qian: What’s the difference?
Huang Xiaohuang: Serving one or two major clients would mean project-based work. We wanted to build a product company, not a project company.
Dong Qian: Why would home renovation companies need you, and how did they gain traction?
Huang Xiaohuang: It coincided with China’s real estate boom. Previously, producing a single rendering took a week, far too slow in a competitive market. Speed became their critical competitive advantage.
Dong Qian: So your arrival met their needs perfectly.
Huang Xiaohuang: I remember our first offline event, where we demoed the product. The audience was electrified. We processed so many POS payments that two machines overheated and broke down.
Dong Qian: Running cards through POS machines sounds appealing, since it suggests substantial daily earnings. Were you focused on the incoming revenue or on the emergence of the market?
Huang Xiaohuang: The emergence of the market. When people are genuinely willing to pay, it shows real demand rather than just talk.
Voiceover: However, as the user base expanded, the technical challenges facing Huang Xiaohuang and his team grew exponentially.
Huang Xiaohuang: The most embarrassing moment I recall was a live demonstration for a client when a server crashed midway and took the entire cluster down with it. I excused myself to the restroom and stayed there, sending messages to check whether the issue had been fixed before I went back.
Dong Qian: What underlying problem led to this incident?
Huang Xiaohuang: Ultimately, it comes down to the need for technical accumulation. A typical problem was temperatures frequently exceeding 100 degrees, which caused the graphics cards to malfunction. We had to devise all sorts of ways to cool them down. As pioneers, we ran into many unforeseen problems that were very difficult at the time.
Voiceover: In 2013, Qunhe Technology launched its flagship product, Kujiale, a spatial design software that quickly became popular thanks to its 10-second rendering. It attracted a large number of designers and became the preferred design tool in the home furnishings industry.
Dong Qian: By insisting on physical accuracy, could you also gather data from countless households?
Huang Xiaohuang: I would call it accumulation rather than collection. As users design with our product, their work continuously builds up here.
Dong Qian: What kind of data is accumulated? Can you give an example?
Huang Xiaohuang: It includes every component, down to details like where a nail or a hinge should go. All of that data is recorded.
Voiceover: The expansion of the home decoration industry’s supply chain and data scale naturally led Huang Xiaohuang and his team to extend their technological advantages into Industry 4.0.
Physically accurate data allows design blueprints to feed directly into factory production, which in turn generates still more data.
Huang Xiaohuang: I had a vague sense that this held undiscovered value, so we kept a research team to study it continuously and publish papers.
Dong Qian: So while accumulating data, you were also conducting your own research on it.
Huang Xiaohuang: We were researching how to use it. Before 2018, we used to say internally that we were sitting on a gold mine but couldn’t extract the gold. That was the feeling.
Voiceover: In 2018, drawing on the large volume of indoor spatial data accumulated through its business, Qunhe Technology worked with several universities in China and abroad to launch the InteriorNet dataset. Many well-known datasets already existed worldwide, but most consisted of static or non-interactive data. InteriorNet is one of the few datasets composed of interactive 3D data and stands as the world’s largest deep learning dataset for indoor scene recognition. Most importantly, it is free and open source.
Huang Xiaohuang: Before 2018, we had no idea how to train this technology. The solution was to open up the data.
Dong Qian: What did you gain from this?
Huang Xiaohuang: First, our company was cash-flow positive at the time and quite profitable, so we weren’t overly focused on immediate financial gains. Second, after two or three years of internal exploration without progress, we decided it was better to share the data than to let it go to waste. We hoped someone might have a breakthrough idea and uncover its potential.
Dong Qian: Why did you prefer sharing to hoarding?
Huang Xiaohuang: Honestly, a passion for advancing this field matters more to me than profit. We are confident that creating value will eventually lead to financial returns. If we hadn’t taken the initiative, no one else might have had the resources to make progress, and the field could have remained stagnant.
Dong Qian: So, since you couldn’t achieve it alone, you wanted your contribution to draw in the brightest minds worldwide.
Huang Xiaohuang: I believe these resources should be shared as assets for all of humanity. If we work together to overcome the challenges, anyone can potentially benefit financially, depending on circumstances. Without openness, no one has an opportunity. If we keep things hidden, neither we nor others can do effective research, and that could stall the entire field.
Dong Qian: Is there a risk that, once you share these resources, others might discover even greater value in them and render your efforts obsolete?
Huang Xiaohuang: That’s a possibility, and the risk exists. But by sharing, you at least retain a stake: if others succeed, collaboration becomes an option. Our company’s philosophy has always been openness, especially in technology, where progress isn’t a zero-sum game. Through cooperation, the potential rewards could grow exponentially; without it, everyone loses.
Voiceover: Shortly after opening the dataset, Qunhe Technology received an email from a Silicon Valley tech giant expressing interest in collaboration.
Huang Xiaohuang: Initially, I thought it was a scam.
Dong Qian: Why did you have that impression?
Huang Xiaohuang: Because it never occurred to me that such a large company would want to collaborate with us.
Dong Qian: If you likened them to an elephant, what would you be?
Huang Xiaohuang: Perhaps not even an ant.
Dong Qian: The impact of open source quickly became apparent. Once you contributed what you had, others began building steps that let you climb higher.
Huang Xiaohuang: Yes, certainly. Their research institute was far more advanced than ours. After we collaborated, they published papers, and we learned from them how things could be done. We judged that it would have been impossible to achieve this relying only on our own resources.
Voiceover: At the time, the tech giant was struggling with a shortage of physically accurate synthetic data for robot training. This collaboration marked the first use of Qunhe Technology’s dataset in spatial intelligence training.
Huang Xiaohuang: Gradually, everyone realized this approach was viable.
Dong Qian: Which approach is that?
Huang Xiaohuang: Spatial training and spatial understanding. The method works, and synthetic data proves useful. It means all future devices can develop intelligence in the physical world. For instance, a decade ago, robotic vacuum cleaners could only blindly follow zigzag patterns to clean an entire home. But academic papers showed they could understand environments the way humans do, cleaning a specific area on command, such as under a table, by recognizing what a table is. Once devices comprehend their surroundings, many new functions become possible. We must first foster a large, thriving ecosystem in order to secure our share of the benefits.
Dong Qian: So, can you make it large and thriving single-handedly?
Huang Xiaohuang: It’s not about individual effort; I make a modest contribution to the industry as a whole. Many are involved, not just us.
For example, autonomous driving is part of this machine intelligence, and that field has even greater momentum.
Voiceover: In the real world, training robots is costly and difficult to scale, while data-driven training is bottlenecked by the scarcity of high-quality 3D data. Synthetic data is thus a cost-effective and virtually unlimited source of training data. The dataset launched by Qunhe Technology has been adopted by multiple universities, including Imperial College London, the University of Southern California, and Zhejiang University, and has become a representative piece of infrastructure for indoor AI visual training.
Dong Qian: I have a question. Why were you able to accumulate this physically accurate data, while other companies, including those in the United States, have not?
Huang Xiaohuang: We achieved it when we implemented Industry 4.0. As a result, all the digital assets we accumulated in the digital world correspond one-to-one with the physical world and can be manufactured. This is a significant milestone. The United States lacks such a vast industrial system, so earlier companies in this field were not as effective. I believe no other country in the world has such a large scale. The scale is so large that manual production cannot keep up, so production requires automated, unmanned factories.
Dong Qian: So, the times make the person. Sometimes, if you have the hammer but don’t catch the right era, you might not find the nail to hit.
Huang Xiaohuang: Yes. On the one hand, we are quite fortunate. On the other hand, we have always believed that this technology holds value.
Voiceover: In March 2025, Qunhe Technology released and open-sourced its self-developed spatial understanding model, SpatialLM. Combined with the previously launched spatial intelligence platform SpatialVerse, it enables robots to complete a full closed training loop, from cognitive understanding to action and interaction. With the explosive growth of embodied intelligence, Qunhe Technology has new potential to become one of the giants in cloud infrastructure for spatial intelligence training.
Dong Qian: If we draw an analogy, what would SpatialLM and SpatialVerse together correspond to among today’s large language models?
Huang Xiaohuang: SpatialVerse is equivalent to the corpus of a large language model, while SpatialLM is equivalent to the model itself. Currently, it is still quite basic, roughly at the stage of GPT-2.5 to 3.0, I would say.
Dong Qian: But you are unique in doing this, right?
Huang Xiaohuang: So, we will continue to iterate and improve it.
Dong Qian: So, to some extent, you are like a company such as ChatGPT.
Huang Xiaohuang: Yes, but they are closed, while we are open.
Dong Qian: What difference will your openness and their closed approach make?
Huang Xiaohuang: I focus on our business prospects over the next 10 to 20 years. We will first establish the infrastructure, and then our true capabilities can be fully put to use. I believe that for this generation of Chinese entrepreneurs, embracing open source is likely to create greater value.
Dong Qian: That brings us back to your original motivation for starting the business. What drives you even now?
Huang Xiaohuang: We have always believed that if you know your technology has value and the industry is thriving, you will certainly secure a share of the market. Moreover, you must be passionate about what you do. Even if you fail, the journey itself will bring joy, excitement, and a sense of accomplishment. Ultimately, even without financial gain, you will feel it was worthwhile.


