As robots step away from factory floors and into homes, hospitals, and care environments, the hardest problem isn’t getting them to roll around or run a complex model. It’s teaching them how to use their hands.
Dexterity is what lets a person pick up a slick glass without crushing it, pull a T-shirt out of a drawer without snagging it, or handle a pill bottle one-handed while steadying themselves with the other. For robots, those “simple” actions turn into a maze of tiny decisions about grip, force, timing, and what to do when the object doesn’t behave the way you expected.
Henry Yu, Founding Director of Robotics Data and AI Systems at Sunday Robotics, is working on a direct answer to that problem: teach robots the way people learn physical tasks—by doing them—rather than hoping a simulated world will capture every real-world bump, slip, and surprise. His work centers on wearable sensing and augmented reality systems that record how humans actually manipulate objects, then turn those recordings into training data for robotic learning.
At the heart of the approach are several U.S. patent applications describing wearable robotic training systems, including gloves with piezoelectric sensing. Together, they treat human demonstration as the primary signal from which robots learn.
“Dexterous manipulation is hard to simulate,” Henry explains. “So much of it lives in precise position, force, timing… If we want robots to work in human spaces, we need training that reflects real human experience instead of a simplified version of it.”
We caught up with Yu to talk about the real problem behind home robotics—dexterity—and the human-centric pipeline he’s built to tackle it.
What problem originally drew you to human-centric robot training?
I kept seeing the same failure mode: robots can move around and recognize things, but they get stuck the moment the task becomes physical. You can have a robot identify a mug perfectly, then watch it fumble the pickup because the handle’s angle is odd or the mug is heavier than expected.
These are physical intuition problems. As I explored robot learning, it became clear that simulation-heavy pipelines were hitting a ceiling. Simulations are useful, but they struggle to capture contact dynamics, soft materials, and the variability of real homes. I became interested in how we could directly transfer human skill into robotic systems without losing that nuance.
Your patent work centers on wearable sensing. Why does wearable capture matter so much?
Wearable capture records the elements of skill that cameras often can’t.
Wearable sensing lets us capture dexterous behavior at the source of contact, where it matters most. In our case, that means gloves equipped with multimodal sensors that measure motion, force, pressure, and spatial context at millimeter-level precision.
Beyond motion, the gloves record what the hand feels: pressure, force changes, contact events. One patent application I’m a co-inventor on covers embedding piezoelectric sensors in the wearable device. Piezoelectric materials generate a signal when they’re stressed, which makes them a good fit for capturing subtle force interactions during manipulation, like squeezing and the onset of slip. Put together, you get a richer record: not just what happened, but how it felt to do it.
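To make that idea concrete, here is a minimal sketch of how slip onset might be picked out of a raw piezoelectric channel. This is an illustration only, not Sunday Robotics’ actual processing: the `detect_slip_onset` helper, the sampling rate, and the threshold are all assumptions. The intuition is that piezoelectric elements respond mainly to changes in stress, so a slip event appears as a high-frequency burst whose short-window energy can be thresholded.

```python
import numpy as np

def detect_slip_onset(piezo, fs, window_s=0.01, threshold=0.05):
    """Return the sample index where the short-window RMS energy of the
    high-frequency piezo residual first exceeds `threshold`, or None.

    Piezoelectric elements respond mainly to *changes* in stress, so
    slip shows up as a fast oscillation riding on the slow grip signal.
    """
    # Crude high-pass: differencing removes the slow squeeze component.
    hf = np.diff(piezo, prepend=piezo[0])
    win = max(1, int(window_s * fs))
    # Short-window RMS energy of the high-frequency residual.
    energy = np.sqrt(np.convolve(hf**2, np.ones(win) / win, mode="same"))
    above = np.flatnonzero(energy > threshold)
    return int(above[0]) if above.size else None

# Synthetic example: a steady squeeze with a 500 Hz slip burst at t = 0.6 s.
fs = 2000
t = np.arange(0, 1.0, 1 / fs)
signal = 0.2 * np.sin(2 * np.pi * 1.0 * t)           # slow grip modulation
signal[int(0.6 * fs):] += 0.3 * np.sin(2 * np.pi * 500 * t[int(0.6 * fs):])

onset = detect_slip_onset(signal, fs)
```

A real glove pipeline would use a proper band-pass filter and per-sensor calibration, but the shape of the problem is the same: separate the slow intentional force from the fast incidental vibration.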
Why is collecting data with wearable devices better than teleoperation?
When we collect data through teleoperation, the human operator cannot directly feel the object being manipulated. Instead, they are effectively pinching air and relying on simulated feedback rather than real physical sensation. Wearable data collection keeps the human physically engaged, allowing data to be captured through real touch, force, and interaction with real objects, which results in higher-fidelity data. At the same time, this approach reduces the cost of data collection by avoiding the need to deploy and operate robots solely for training purposes.
You lead the Sunday Robotics Data Engine. What is it, and why is it central to this approach?
Embodied AI needs high-quality, multimodal data—hand motion, force signals, scene geometry, object identity, all synchronized. You can’t scrape that from the public internet the way you can scrape text for large language models. You have to capture it, clean it, check it, and ship it in a form the learning team can trust.
I architected the Sunday Robotics Data Engine from scratch to manage the entire lifecycle of that data. It supports the manufacture of over 2,000 wearable Skill Capture Gloves and the collection of more than 10 million training episodes across 500-plus U.S. homes, totaling over 200 terabytes of data. The system spans embedded software, spatial tracking, data quality control, multimodal data visualization, asynchronous data processing pipelines, and delivery of clean datasets directly to the machine learning team training our ACT-1 model.
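One small but essential step in any pipeline like this is time-aligning streams that run at different rates. The sketch below aligns a low-rate camera-frame clock with a higher-rate glove force stream by nearest timestamp, flagging gaps as a crude quality gate. The function name, rates, and tolerance are hypothetical, not details of the actual Data Engine.

```python
import numpy as np

def align_streams(frame_ts, sensor_ts, sensor_vals, max_skew=0.02):
    """For each camera frame timestamp, pick the nearest glove sample.

    Returns sensor values aligned to frames, with NaN wherever no sample
    falls within `max_skew` seconds. Episodes with too many NaN gaps
    would be flagged by QC rather than shipped to training.
    """
    idx = np.searchsorted(sensor_ts, frame_ts)
    idx = np.clip(idx, 1, len(sensor_ts) - 1)
    # Choose between the sample just before and just after each frame.
    left, right = sensor_ts[idx - 1], sensor_ts[idx]
    pick = np.where(frame_ts - left <= right - frame_ts, idx - 1, idx)
    aligned = sensor_vals[pick].astype(float)
    skew = np.abs(sensor_ts[pick] - frame_ts)
    aligned[skew > max_skew] = np.nan
    return aligned

# 30 Hz camera frames against a 200 Hz force stream.
frame_ts = np.arange(0, 1.0, 1 / 30)
sensor_ts = np.arange(0, 1.0, 1 / 200)
force = np.sin(sensor_ts)          # stand-in for a real force channel
aligned = align_streams(frame_ts, sensor_ts, force)
```

At 200 Hz the nearest sample is never more than 2.5 ms from a frame, so this toy example produces no gaps; real glove data, with dropped packets and clock drift, is where the skew gate earns its keep.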
What does this mean for home robotics and care robotics?
In home and care settings, the bar for generalization and safety is higher. The robot has to be safe near people, predictable, and able to adapt when conditions change—because they always do. A care environment might involve mobility aids or clutter or medical devices, all of which can vary from person to person. The robot can’t require an engineer every time it sees a new cabinet handle or a different style of pill organizer.
Human-centric training lowers the friction. Instead of writing code for every new task, you can demonstrate it. Over time, you can imagine a world where teaching a robot looks more like onboarding a helpful assistant: show it how you like something done, then let it practice with guardrails.
You were invited to judge the Tuya AI Innovators Hackathon. What does that say about where the field is headed?
It says a lot of the energy is shifting to AI-enabled homes. Hackathons are a great snapshot of what developers think is possible right now, especially where hardware meets AI. Innovation is multidisciplinary work, and I’m excited to see it expand into new spaces and mediums.
Looking ahead, what do you think will define the next phase of robotics?
Besides making models more generalizable and robust, we also need to think about the interface between people and machines as we start deploying in home environments.
With language models, we've become used to a chat-based interface. With robots, the 'interface' suddenly has to include bodies and objects. You show the robot how to organize things in your home the way you want, using language combined with body gestures. That also changes how quickly the system can be taught, without someone having to train a specialized model for you.
The end state looks like a teachable robot platform where non-experts can add skills the same way people train each other at work, demonstrating and offering corrections until it sticks. What matters is being able to solve the problem again and again in slightly different conditions, the way people do.