
Podcast: Finally! A Humanoid Robot You Can Trust


In this episode, we discuss research coming out of Columbia University to better connect humans with robots by having them detect, process, and respond to facial expressions!



This podcast is sponsored by Mouser Electronics


EPISODE NOTES

(3:55) - Finally! A Humanoid Robot You Can Trust

This episode was brought to you by Mouser, our favorite place to get electronics parts for any project, whether it be a hobby at home or a prototype for work. Click HERE to learn more about the uncanny valley and what it will take for humans to finally trust robots.

Become a founding reader of our newsletter: read.thenextbyte.com


Transcript

What's going on folks? Welcome back to the NextByte Podcast. And let me just ask you a question. Have you ever thought, man, I really cannot trust my Roomba because it doesn't act like a human being? You know, it's just not human-like enough. Or, on the other side, have you ever thought that a robot could never fool you, because you can just tell it's not a human being? Well, good news and bad news, depending on which camp you're in: we are talking about the humanoid robot that you can finally trust, or not trust, because it's really, really human-like. And if that's got you excited or scared, buckle up and let's get into it.

I'm Daniel, and I'm Farbod. And this is the NextByte Podcast. Every week, we explore interesting and impactful tech and engineering content from Wevolver.com and deliver it to you in bite sized episodes that are easy to understand, regardless of your background. 

Farbod: All right, people, like you heard, we're talking about robots you can trust, which is funny coming from me, because one of my favorite movie series as a child was The Terminator. But before we uncover my childhood trauma from Arnold Schwarzenegger, we're gonna talk about our sponsor for today, which is Mouser Electronics. Now, Mouser is one of the world's biggest electronics suppliers, and the cool thing about them is that because of who they work with, the different manufacturers and academic partners, they have a lot of insight into what's going on in the world across a lot of interesting tech topics, whether it's additive manufacturing, artificial intelligence, autonomous vehicles, et cetera, et cetera. Well, they tend to write some nice tidbits of information about what they know, and they wrote this really interesting article. It's titled, A Funny Thing Happened on the Road to Becoming Human. And it's incredibly relevant to what we're talking about today, because it traces the historical progression of robots and how people have related to them. And it brings up the uncanny valley. Do you know what that is?

Daniel: No.

Farbod: I think the term was coined by a roboticist, Masahiro Mori, but it has roots in Sigmund Freud's idea of the uncanny. Generally, it's that humans are very comfortable with a robot that's very robotic, and you can think of that as the robots used on manufacturing floors to build cars. And humans are also very comfortable with robots that are nearly indistinguishable from humans. Where they're uncomfortable is the space in between, where you can kind of tell that it's a robot, but it's behaving like a human being, and it makes you feel like there's something very unnatural and inhuman about it.

Daniel: Yeah, that makes sense, right? If something is not at all like a human, it doesn't feel threatening. And if it's almost like a human but feels a little bit off, it's unsettling. You're like, oh man, I don't trust that thing. But I'm looking at the graph in the article, and you guys should just go click on the link in the show notes and check it out, because it does a really good job of illustrating exactly what you just described: as human likeness goes up, there's this weird valley, the uncanny valley, especially for things that move, like a robot. If something is close to human but not 100% human-like, it's a little bit unsettling. I think it's pretty interesting that they even put an ill person on the chart; someone who's sick doesn't read as fully human-like either, the same way that being constantly around people who are really, really sick can feel a little bit unsettling. So, I think it's super interesting. And obviously, it ties very well into what we're talking about today, which is trying to build robots that humans can trust and overcoming this uncanny valley, right?

Farbod: Absolutely.

Daniel: How do we get out of the pit of this valley, where I feel like humanoid robots are right now? They're close enough to being human that it's like, dude, why are you doing this? It's freaking me out.

Farbod: Let's talk about that before we fully go into today's episode because something cool that you and I were both freaking out about was what is it, the number one robot?

Daniel: Figure 01.

Farbod: Figure 01 from Figure Robotics. And it's like a five-minute demo of this very robotic-looking robot, but it speaks like a human being and it's doing stuff. And you pointed it out really well; I was like, I just feel uncomfortable, and you're like, yeah, did you catch it stuttering and stuff? And it's that, I don't know, human-ness combined with its obviously robotic nature that makes you feel so uneasy about the product.

Daniel: Yeah, we can link that video in the show notes as well, because it shows exactly what it means for something to be in this uncanny valley: it's almost completely humanoid, but it's not, and that's very unsettling. Very similarly, the movements of this robot were very, very human-like, and the dialogue this robot had was very, very human-like, but the appearance was off. And I think the dissonance between how human it sounded and felt while listening to it versus what it looks like, which doesn't look like a human at all, is what made it really unsettling. That's a perfect description of the uncanny valley. And it's also a perfect description of the problem that this research team is trying to solve, which is: how can we give robots a face that you can trust? And I truly feel like, especially in the humanoid robot space, the face is almost everything in terms of building affinity and building trust with robots.

Farbod: And this is coming out of Columbia University. And yeah, you got the gist of it pretty much spot on: they're trying to give a robot a good face that you feel comfortable interacting with. But the fixation on the face is actually kind of interesting. They noted that a lot of human communication is not just verbal. There are facial cues we make that signal a certain thought or emotion, which the person you're speaking to can then reciprocate and make you feel, I don't know, more involved in the conversation. It's easier to make a connection when someone can reciprocate those things. Imagine you're very animated, like I am, shout out to being Iranian, we talk with our hands and our faces and everything, and then you're talking to someone that's completely stoic; they're just spitting out answers and their face isn't changing. That might make you feel a little bit uneasy. So that's where, I guess, the genesis of this idea came from: how do we make it easier for humans to connect with robots? And they were like, oh, facial cues could be great. And this is a very fascinating problem, because it's challenging on multiple fronts. On one side, you have to have a robot that can understand human facial expressions, pick up on the little things, and understand what those mean. And on the other side, you have to have a robot that can mimic and recreate those, which is not an easy task when you don't have all these little muscle groups connected to your skin. And what this team at Columbia has done is both a hardware project and a software project at the same time to achieve their goal.

Daniel: No, I agree, right? I think you hit the nail on the head there. The important part is, even if you have this incredible software that does an amazing job of determining what feedback the human the robot is interacting with needs to see on its face, if you don't have a face that can actually, physically mimic that expression, then you're still not gonna be able to build affinity and trust with it.

Farbod: Back in the uncanny valley you go.

Daniel: Yeah, right back into the uncanny valley. But I really appreciate the work that they're doing here, especially, like you said, in the mechanical realm. Obviously the software feels like an almost insurmountable challenge and deserves a lot of credit, and we can talk about that. But I wanna start with the mechanical realm. The robot that they built is called Emo, and not because it's emo in the moody, music-genre sense. It's meant to convey the fact that it will accurately understand and then replicate human emotion on its face. And Emo has a face with 26 different moving parts to make different facial expressions. And just to double-check the comparison, the human face has a few dozen muscles driving expression, so they're in the right ballpark. They've done a really good job at replicating the moving parts in the face, and they have soft skin that goes over all these moving parts.

Farbod: Like, made of silicone?

Daniel: Yeah, to try and replicate skin on the human face. And then they've also put cameras inside the eyes, which the robot uses for visual tracking and for sight. That placement matters: it's the reason a mascot at an amusement park sometimes looks freaky, because the person's eyes are actually in the mouth, and you're like, oh, that's not a human. So, similarly here, they haven't placed the cameras somewhere like the nostrils, where you'd see the humanoid robot head tracking you with its nostrils. Instead, the eyes are the actual cameras, which again all stems from a bunch of biomimicry. But I'm impressed looking at the photos and videos in the article, which is linked in our show notes. I wouldn't say it's completely out of the uncanny valley yet, at least to my eye, but it does feel like a big step forward versus some of the other humanoid robots I've seen, which basically look like either a wax model or a robot with only one or two moving parts and this weird stretchy face stretched over it. This is definitely a step in the right direction if you're trying to build a robot that people can trust.

Farbod: What was that robot with the AI brain that got its citizenship? You know what I mean? Yeah, that one; its facial movements made me feel weird. That was scary. This one is definitely a step up from that. But yeah, the hardware side is amazing in and of itself, but now let's talk about some of the software magic that's happening here. So, going back to the problem we have at hand, it's a two-fold problem: one, being able to detect the kind of facial expressions that someone's making and matching that to some sort of emotion; and two, having the robot be able to recreate those expressions as well. For recreating the facial expressions, they did something pretty funny, in my opinion at least, called self-learning, where they put the robot head, with the skin on, in front of a mirror and then told it to use its eyeballs to watch itself as it kept trying different combinations of movements from the 26 actuators in its face. So, it kept making weird faces until it could eventually map what moving actuators 1, 2, and 3 resulted in, tying each combination to some sort of outcome. So that's how it learned to control its face, which is just a crazy thing to say.
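To make that self-learning stage a little more concrete, here is a minimal sketch of how such a loop could work, written in Python. It is not the team's actual code: the actuator and landmark counts, the simulated observe_own_face function, and the simple linear inverse model are all illustrative assumptions standing in for the real hardware, mirror, and learned model.

```python
import numpy as np

N_ACTUATORS = 26            # Emo reportedly has 26 moving parts in its face
N_LANDMARKS = 2 * 68        # e.g. 68 (x, y) landmarks from a face tracker (assumed)
N_SAMPLES = 5000            # number of random "babbling" trials (assumed)

rng = np.random.default_rng(0)

# Stand-in for the physical robot + mirror + camera: a fixed, unknown mapping
# from actuator commands to observed landmarks, plus a little sensor noise.
TRUE_FORWARD = rng.normal(size=(N_ACTUATORS, N_LANDMARKS))

def observe_own_face(commands):
    """Placeholder for: drive the actuators, look in the mirror, detect landmarks."""
    return commands @ TRUE_FORWARD + 0.01 * rng.normal(size=N_LANDMARKS)

# 1) Motor babbling: try random actuator combinations, record what the face does.
commands = rng.uniform(0.0, 1.0, size=(N_SAMPLES, N_ACTUATORS))
landmarks = np.array([observe_own_face(c) for c in commands])

# 2) Fit a simple linear inverse model: landmarks -> commands.
#    (The real system presumably learns something nonlinear; least squares is
#    just the simplest stand-in.)
W, *_ = np.linalg.lstsq(landmarks, commands, rcond=None)

def imitate(target_landmarks):
    """Given landmarks seen on a face, return actuator commands to match them."""
    return np.clip(target_landmarks @ W, 0.0, 1.0)

# Quick sanity check: can the inverse model recover a known expression?
test_cmd = rng.uniform(0.0, 1.0, size=N_ACTUATORS)
recovered = imitate(observe_own_face(test_cmd))
print("mean command error:", np.abs(recovered - test_cmd).mean())
```

On real hardware, the simulated observation would be replaced by actually driving the actuators and detecting the robot's own facial landmarks through its eye cameras.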

Daniel: And then the second part of that is like, all right, now I know exactly how to move all these 26 different muscles in my face.

Farbod: Like, now I know how to smile, now frown, or whatever.

Daniel: How can I learn now what's normal? What do real humans look like when they make these facial expressions? So first, it watched itself in a mirror just to understand its own face; I think of it like practicing a smile in the mirror before I go get a headshot taken, like, oh, do I want to look like this? Do I want to look like this? It practiced its different facial expressions in the mirror. And then it went to the internet and watched a bunch of videos to understand how people use their faces to express emotion, and used that to learn and reinforce, oh, this is what actual human facial expressions look like, to teach the robot how to do it the right way.

Farbod: Yeah. So it did what any responsible human would do on a night in. It just kind of binged videos on the internet.

Daniel: Yeah, doom scroll through YouTube.

Farbod: Doom scroll through YouTube.

Daniel: And learned how human faces work.

Farbod: Literally, it just kept looking at people making faces and changing their facial expressions. And a really important takeaway from that is that it started picking up on facial cues that are precursors to an expression. So, for example, before a smile, maybe the corners of your mouth start moving a certain way, and it's able to detect that you're about to smile, which is pretty handy if it's trying to match your emotions. Like, you just said something funny and you start laughing; it doesn't understand what you said, but it knows you're about to laugh, so it can mimic your laugh and make you feel like it's more involved in what's going on.
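To illustrate that precursor idea, here is a toy sketch rather than the team's learned model: it tracks a human's mouth-corner landmarks over the last few frames and triggers the robot's smile as soon as they start rising, before the smile fully forms. The frame window, the 68-point landmark indices, and the velocity threshold are all assumed values.

```python
import numpy as np
from collections import deque

WINDOW = 5               # look at the last 5 frames (~0.17 s at 30 fps, assumed)
SMILE_VELOCITY = 0.8     # upward mouth-corner motion in px/frame (assumed threshold)
recent_mouth_y = deque(maxlen=WINDOW)

def trigger_expression(name):
    """Placeholder: would map an expression name to actuator commands on the robot."""
    print(f"co-expressing: {name}")

def on_new_frame(landmarks):
    """landmarks: (68, 2) array for one video frame of the human's face."""
    # Points 48 and 54 are the mouth corners in the common 68-point scheme.
    recent_mouth_y.append(landmarks[[48, 54], 1].mean())
    if len(recent_mouth_y) == WINDOW:
        # Image y grows downward, so rising mouth corners mean y is decreasing.
        velocity = (recent_mouth_y[0] - recent_mouth_y[-1]) / (WINDOW - 1)
        if velocity > SMILE_VELOCITY:
            trigger_expression("smile")   # react before the smile reaches its peak

# Tiny simulated demo: a neutral face whose mouth corners drift upward each frame.
face = np.full((68, 2), 100.0)
for t in range(10):
    face[[48, 54], 1] -= 1.5 * t          # corners rise faster as the smile builds
    on_new_frame(face.copy())
```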

Daniel: And I just think about this as if we were replicating a human exchange between you and I.

Farbod: Yeah.

Daniel: And I thought something was funny and I started to laugh and smile. And if, for whatever reason, it took you a while to pick up on the fact that I was happy and laughing and smiling, and you waited until I was in complete grin-and-laugh-and-smile mode before you even moved a muscle in your face, I'd be freaked out. I'd be like, dude, what's going on?

Farbod: You'd just be like, oh, did that really not hit? Like, was that not a good enough joke for you?

Daniel: It would set the mood off in the conversation, because as part of the human experience, we reciprocate smiles with one another. If you're sad and you start to frown, I start to frown back at you. The term that they use, which I think is really interesting and captures the human experience really well, is this: everyone says facial expression, but when two humans are talking to each other face to face, it's really co-expression. Both of our faces together are expressing emotions in a way that reflects the conversation and the relationship that we have. If you've got a robot that isn't able to do its part in the co-expression until two, three seconds later, until your face has broken out into a full smile, and only then goes, oh, shoot, I'm supposed to be smiling, and smiles after, that would be really freaky. The same way that if you and I were sitting here and you thought something was funny and I waited three seconds to smile back at you, you'd be like, dude, what's wrong?

Farbod: I agree. And all of that comes together in this robot that they've successfully created. And now, the so what, right? In isolation, all these different things work, but there's a video attached to this article, which I think everyone should watch.

Daniel: Yeah, the video will speak volumes better than we can. And I like the way we speak.

Farbod: I agree. And I just think it's funny, because I'm pretty sure it's at least one of the researchers, or maybe the lead researcher, demonstrating how this robot works. And it's like a full 30 seconds of them just making facial expressions while it's in mimic mode, and then watching the robot mimic that in real time.

Daniel: My favorite clip, and I honestly think it could be a viral clip, is the researcher sitting there, looking intently into the robot's eyes and just raising his eyebrows up and down, and the robot raising its eyebrows up and down right back. I'm like, man.

Farbod: It could be a meme.

Daniel: Like, dude. They just had a moment right there. Yeah. But yeah, we've sent you guys to the show notes multiple times during this episode, and this is definitely something worth checking out. The video does a great job of demonstrating the level of fidelity, let's say, in co-expression that this team from Columbia has been able to achieve with this hybrid hardware-software solution: understanding how a human face signals the expression the robot is supposed to make, and then building a robot that's capable of actually mimicking those expressions, or creating those expressions on its own face. To me, it definitely feels like a huge step forward. Personally, still in the uncanny valley, but.

Farbod: I'm there with you.

Daniel: Definitely closer toward.

Farbod: Starting to claw my way out.

Daniel: Yeah, definitely, I would say, past the inflection point. So, it's on the slope back up toward high affinity. Although I still don't know that I trust this. And honestly, they mention it in the article, but I think it's worth discussing: there are some ethical complications around this, too. If you create a robot that is so good at facial co-expression that humans everywhere start to trust it blindly, do they believe it's a human? Do they know it's a robot? Is that ethical, knowing that a bad actor could hack into this robot and do a bunch of bad things? There's a lot of ethical complexity associated with making robots as humanoid as possible. Maybe there's some benefit to the Figure 01 approach, where we can clearly tell it's not human, and maybe you don't trust it as much until you interact with it more. But maybe that healthy level of cynicism is needed with robots, I don't know.

Farbod: Maybe there's a healthy compromise between the Figure 01 and wiggling its eyebrows so that I know it knows that I'm having a great time with it.

Daniel: At worst, it'll make a bunch of cool memes of a robot making good faces. I think it's also worth mentioning their next steps in the research. Like you mentioned, right now the robot has little to no context as to what the conversation actually is.

Farbod: Right.

Daniel: All it's doing is using the cameras in its eyes to track your face and then using those cues to understand when it should make a face back at you. They're trying to integrate this robot with ChatGPT or other large language models, and I think they call out ChatGPT specifically, as a mechanism for allowing this robot to start to understand the context of the conversation, which obviously, I think you and I would both agree, is a critical part of understanding how and when to create a facial reaction in a situation. Though I can say, I'm a little bit hard of hearing, so if we're in a restaurant and there's a lot of chatter going on and music playing, I may not hear 100% of the words you're saying, and I can still mostly gauge what face I should be making based off what face you're making. But again, I think they'll get much closer to, let's say, 100% fidelity in doing the correct facial co-expression if the robot can also get a full understanding of what the actual conversation is. And maybe they'll be able to do something like Figure 01 did with its ChatGPT integration and allow the robot to speak back.
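Since the article only mentions the LLM idea without details, here is a purely speculative sketch of how dialogue context might be combined with the vision-based cue: the query_llm stub, the expression label set, and the rule preferring the faster vision signal are all invented here for illustration.

```python
EXPRESSIONS = ["neutral", "smile", "frown", "surprise", "concern"]

def query_llm(prompt: str) -> str:
    """Placeholder for a call to ChatGPT or another LLM; returns one label."""
    raise NotImplementedError

def choose_expression(transcript: str, vision_prediction: str) -> str:
    """Prefer the fast vision-based cue; fall back to dialogue context otherwise."""
    if vision_prediction != "neutral":
        return vision_prediction
    guess = query_llm(
        "Given this conversation, which facial expression should a listener show? "
        f"Answer with one of {EXPRESSIONS}.\n\n{transcript}"
    )
    return guess if guess in EXPRESSIONS else "neutral"
```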

Farbod: That would be pretty interesting. And yeah, I think that's a perfect matchup given the relatively mature state of these LLMs. They're good at conversing; we've seen people go viral where, just like Figure 01, they add a human-like voice and it seems like you're really talking to a human being. So that'd be pretty interesting to see. But just to recap what we've been talking about this episode: when it comes to robotics, humans have generally been comfortable with robots that are very robotic, and they can trust robots that are very human-like, because they resemble human beings. But then there's this uncertain area in between, where you can tell it's a robot but it's acting like a human being, and that makes people very uncomfortable; it's called the uncanny valley. So, researchers have been trying to figure out how we can start clawing our way out of the uncanny valley and make these robots more relatable, and these Columbia researchers pointed out that being able to mimic and reciprocate facial expressions during conversation helps humans trust robots much more. So, what did they do? They literally constructed a human-like face for a robot, using a couple dozen actuators to stand in for the muscles in our face. They put it on a robotic head with skin over it, and they trained it to understand how movements in its face create human expressions. And then they had it literally binge YouTube videos of people making expressions, so it would learn the precursors of those expressions and could create and reciprocate them. And that's why this robot is able to converse and connect with a human being so much more naturally.

Daniel: No, I think you nailed it, man. And the only other pop culture reference I wanna bring up is Westworld. I know you mentioned Terminator at the beginning, which is also a very uncanny, unsettling version of this. But if you ever watch the HBO show Westworld, it's really interesting, because they basically create this fantasy world where you can go to a theme park and interact with a bunch of robots, and it's really, really hard to tell the difference between a robot and a human. People actually build real sympathetic connections with the robots in this theme park. And I think there's a lot of ethical complexity associated with this, but this team is getting pretty close, at least in the facial expression realm, to creating a humanoid robot where, if you were just looking at the face, you might have a conversation with it, build trust with it, and build a connection with it, as opposed to most robots we see now, where if they're trying to be human, they're not doing a good enough job of it, and like you said, we lose trust with them instead of gaining trust because of that.

Farbod: Yeah, and Westworld is just a phenomenal series, by the way. Unfortunately, HBO took it off, but that's beef for a different episode, a different type of content. But everyone should watch that. Everyone should go to the show notes and watch the video from this linked article as well. It's incredible. The future of robotics is looking more and more interesting by the day. Yeah.

Daniel: Do you know what else is looking more and more interesting by the day?

Farbod: What Daniel?

Daniel: Our newsletter. We recently launched the NextByte newsletter, our weekly newsletter. We're gonna be repurposing the same content that we're telling you here, packaging the same awesome secret-sauce technology, and delivering it to you in your email inbox.

Farbod: In case you don't wanna listen to us, you know?

Daniel: Or if you love us so much that after you've listened to us, you wanna read about us. That being said, we've done a good job, I think, of learning as much as we can around how to communicate technology effectively and to turn interesting and impactful pieces of technology into a few distilled bullet points that are easy to understand and easy to take with you. I think the next natural extension of what we've been doing here on the podcast and on social media is to do it in a newsletter. So, we're going to put the link to that in the show notes, or you can go to read.thenextbyte.com and sign up. Depending on how fast you type, it'll take you anywhere from five to 10 seconds to enter your email. We hope you can trust us with that. And we would love your feedback as one of our founding readers on how we're doing in the newsletter. We obviously wanna make this the best newsletter you've ever read. And that's a true goal that we're striving for. So, we would appreciate if you can join us on that journey, if you've already been joining us in this audio journey.

Farbod: Well said. And I'm pretty sure that's it. Yeah. Everyone, thank you so much for listening as always. We'll catch you in the next one.

Daniel: Peace.


As always, you can find these and other interesting & impactful engineering articles on Wevolver.com.

To learn more about this show, please visit our shows page. By following the page, you will get automatic updates by email when a new show is published. Be sure to give us a follow and review on Apple Podcasts, Spotify, and your other favorite podcast platforms!

--

The Next Byte: We're two engineers on a mission to simplify complex science & technology, making it easy to understand. In each episode of our show, we dive into world-changing tech (such as AI, robotics, 3D printing, IoT, & much more), all while keeping it entertaining & engaging along the way.


The Next Byte Newsletter

Fuel your tech-savvy curiosity with "byte"-sized digests of tech breakthroughs.
