Egocentric. It’s a weird word. To Facebook, regarding AR and VR, it means a world where computers start processing the world as you perceive it. It’s a cornerstone of a strange future we seem to already be heading towards, where assistants and notifications and social media meet us where we’re already looking. Or where we seem to be looking.
Virtual reality is already stellar at making us feel like we’re somewhere else, and Facebook has refined that ability. But augmented reality is a stranger beast. Facebook is working on it, but that could take years more. According to Michael Abrash, chief scientist of the company’s AR/VR division, the future interface needed hasn’t been cracked yet.
At Facebook’s virtual Connect conference, a VR and AR event normally held in a convention center, the company looked ahead to new neural interfaces (armbands developed by CTRL-Labs) and eyewear that will build 3D world maps and explore how AI can be developed to learn from our attention.
The idea of combining smart glasses with an assistant made me think of Tim Maughan’s fiction: It sounds weird, it sounds scary, it sounds wild too. I spoke with Abrash virtually (over video chat, not VR) to discuss what could come next. This transcription has been lightly edited for clarity.
True AR glasses sound like they’re further off, but Facebook smart glasses are coming soon. What do you see as the difference between smart glasses and AR glasses, and what features do you think might be included or added over time?
Conversations about AR glasses are often kind of split personality. Everybody sees that AR glasses are the things that come after phones. There’s this progression that goes: desktop, laptop, smartphone, AR glasses. And in each case, when those things appeared, they did exactly the same things the predecessor did, and actually did them worse. They just made them more available … really, you think about that first iPhone and it did what a phone could do: It did internet badly and it did music. All those things you could already do.
That is going to be an important part of why people start to put smart glasses on their face and what true AR glasses will do in the long run, for sure. How do you do messages? How do you get navigation? Definitely valuable.
Then there’s the analogy to the first computer when it first came out. The first personal computer didn’t implement anything that you used to do. It actually was a qualitative change in how you interacted with the world. I mean, you could say a spreadsheet is like using a calculator, but it’s not like using a calculator. And even a word processor is not like using a typewriter.
There are two things that really are unique about AR. One is you have shared virtual persistent objects. The fact that you have those is a radical change. It makes the world basically an index for all sorts of things that makes it a sharing environment. That’s the obvious one. The less obvious one is an assistant that really becomes an extension of you.
I have no idea, 40 years from now, what people are really going to be doing in AR and VR. For collaboration in VR, people say, how close can it be to real-world collaboration? I think the answer is really, how much better can it get?
Those two things — shareable virtual persistent objects, which becomes an index of the world, and this personalized assistant — if you look back the day you retire, and you’ve covered this whole revolution, and you think what really matters, what changed the world, it won’t be the things people think today … It’s like how social media has changed the world. Online, retailing. We can’t see what it will be, but it will be.
Which leads me to a question: I know this vision for Facebook with 3D mapping is also a vision for a couple of other companies, to map space. It makes me think about where we’re at with different OS versions and different apps. How do you see that resolving in AR? Is that a competition where you have different operating systems or apps? Is it channels? Do you see interoperability?
I personally think of it more like the internet, except that it’s going to be internet times a few orders of magnitude in terms of the amount of data. So it has to be something that’s OS agnostic, right? You would be crazy to say, well, you can only use the internet using Windows. I view it as something that is not platform dependent and can’t be platform dependent. And you know how these things always go: Reaching standards takes a long time, settling on where you want this to be. Ultimately, I think that that’s what will happen.
Well, I think about the internet. The way it was built versus all the companies pursuing world-mapping now. Right now, in VR, Oculus doesn’t interconnect with phone apps on iOS and Android. Do you see that we’ll start having a flow between them?
That’s really more what I would call the product side consideration. You know, I really am the person who thinks about what the future is going to be like. So really, the way I look at that one is, the big trick is get it to work, get it started, get it bootstrapped, and that’s what we’re trying to do. Then where it goes from there is kind of out of my hands. So I would say stay tuned, we’ll see. This is going to take years.
I’m curious how you see the gap bridging for AR and VR over the next few years? Obviously, there are going to be multiple devices, and smart glasses are coming in some form next year. Do you see VR being a way to bridge a lot of those AR tools? Will smart glasses kind of meet and handshake over time?
I think it will be a bit before there’s real bridging there. Because I look at VR and I say you have infrastructure, you have thermals, you have power. I mean, you have much more capability there. It doesn’t mean you couldn’t potentially do some of those things in AR. But for example, you want to sit in a meeting in VR, you’ve got a field of view of, say, 100 degrees, that means you can actually see people sitting around a virtual table. You do that in AR, and you can see the person you’re looking at, but you have no peripheral awareness. And those little details add up to so much difference in the experience.
VR can draw black because it controls every pixel. AR can’t actually draw black; it’s additive blending. You don’t get as much crispness out of things. So what I think you’ll see is this ubiquity thing with AR where those glasses are basically offering less rich experiences, but in a way that can spread across much more of your life and many more people, while VR is delivering what I’ll call rich heavyweight experiences that have high value, but are limited in terms of who will use them and where they can use them.
You can have a VR headset with mixed reality that you could just wear all the time, hypothetically. And you could have very good control over the experience. But not socially acceptable, not light enough to be wearable all the time. I mean, there are these other problems you run into. I’ve always felt like AR and VR are like a water balloon … when you try to get all the axes exactly where you want them, you can’t squeeze every one of them in there. And so the big divide really is, are you location-based with infrastructure and power and thermals, or are you part of every moment in your life potentially? Those two things are not yet capable of being joined.
Could the Oculus Quest 2 and its improved chip and cameras do some of that world scanning and AR work, like hand tracking now?
I’ll be honest with you, I actually don’t really know what the plans are there, what the possibilities are. My job is really to think five years from now what lands in a product that is, like, an integer multiplier on that. I don’t mean to downplay that work. I mean, the polish that has been done really going all the way back to [Oculus] DK2: think about where we were with DK2 and where we are now and it’s pretty astonishing how much better it is at doing those things. But I’m thinking, well, OK, now how do we change that experience? What happens if you get depth of focus? What happens if you get haptics, because I’ve talked about haptic gloves in the past. What happens if your audio is perfectly spatialized? Just saying that [regarding Quest 2], not that I don’t value it, it’s just not where my mind tends to be.
The CTRL-Labs work with neural inputs is really fascinating. I think about things even like health sensors or other biometrics and how they can be part of the equation. What role do you see with that?
There’s certainly potential for that — there’s a separate team that looks at that — that has been discussed. I have this specific vision about building a platform. And then something like that is one of the things that platform enables. So how do we make it so that you’re wearing those glasses? You woke up this morning, you put your glasses on. They stay on your face until you go to sleep tonight, exactly like me, right? And the question is, how do we get the glasses to be there? And once they’re there, all these other things will come along with that, but you won’t get people to wear the glasses for those reasons you’re talking about.
I’m also thinking about that egocentric idea: eye tracking, and spatial audio. Attention seems part of the picture. It’s something that now seems to meet you as much as you’re going to it. That’s definitely like a different dance than VR where it feels more like I’m kind of moving to make things happen. While AR is reading your information in order to help you navigate what’s a much wider open field?
I love it because it means that my talk actually got the message across. You just got right to the heart of it. The way I think of it, AR glasses, in particular, can basically just be an extension of you because they’re going to be tightly coupled to your perceptions and your actions. The hearing thing says, this is how you would hear if you could engineer yourself to hear. You’re not telling it, “do this for me, do that for me.” It just automatically brings the information to you. You can also imagine contrast enhancement for people who don’t see well in low light, like my family tends not to. And it can also help you remember things because it understands your context.
Think of it as, it’s nothing that you wouldn’t do yourself, if only you worked better. If your memory was better, if your eyes were better, if your ears were better. So it really is an extension of you and an enhancement of you, which is very different than saying it’s a device that you manipulate to do things that you want to do. Which is how it works today.
I read a book about AR by Helen Papagiannis a couple of years ago, Augmented Human, that changed my mind thinking about AR away from visuals and towards other senses. Like spatial audio. It’s almost like an ambient thing where your sensory awareness could take many forms. It’s an all pervasive thing?
People always think of the visuals because we’re visual creatures, right? It’s the sizzle. Audio, people very much underestimate how powerful it is. And one of the things I really regretted about what we did was that, because of the coronavirus, we couldn’t do demos. There is one specific demo where they record binaural audio in your ears as actions happen around you, and then they play it back perfectly. And not only can’t you tell the difference, but there’s a point at which the person comes close to you, using scissors around your head. And you can actually feel the heat of their body being there, even though they’re not there. And people do not understand how powerful truly great audio is because they’ve never experienced it. It’s much more tractable than the visual part. Building a new display system is hugely difficult; building a new display system that can sit on your head within a very tight weight and thermal budget is just insanely difficult. Audio, there’s nothing about it where a miracle needs to occur.
But then there’s the other part you were talking about, which is just pure sensory input. Really what you want is you want more valuable information coming to you. And that doesn’t just mean sensory input. It means your context and awareness of other things. Audio can be text, like the assistant is trying to make your world a place that serves your needs better, and only part of that is perceptual.
Audio seems like a possible starting place for AR glasses because of the achievability. Like audio is first cue and then the visuals come after. Companies have already been looking at audio and smart glasses a little bit, could that be [Facebook’s] starting point?
You could build glasses that were audio based. I think, though, that everyone has in their head this picture of what glasses are going to do for you. The question is, when do we get that true AR imagery? Sure, there could always be intermediate things that pop up. Did you ever hear of smart typewriters? In the ’70s, as microprocessors were developed, they started to make typewriters where you could have a little LCD window that would let you edit the last few lines. When you made a mistake, you could actually go back and fix it. And that was a big business. The reason I bring it up is that smart typewriters were successful, they were a big business category, and no one has ever even heard of them these days. Those intermediate things, people will do them, and there has already been a good example of someone doing what I would call an intermediate product of limited capability. But I want to get to that thing where you see it, and you go off and you write the articles that say I’ve seen the future. You’ll know when it happens. It’s kind of like the first time you put on a VR headset, you said, this is not like anything I’ve ever done. It wasn’t like, “oh, this is interesting,” or this is a better version. It was like, no, this is a new thing. And that’s what I’m trying to build here.
With everything this year, the pandemic and the way people have changed, has it changed your philosophy? A lot of things this year seem to have reinforced these ideas, like working at home. But some things about VR work, some things don’t. Things have failures.
The thing that made the personal computer was VisiCalc. If you look at Apple II sales, the knee of the curve was VisiCalc. And if you look at PC sales, the knee of the curve was the IBM PC. It was really about personal computers making it so that businesses could be more productive. Now, virtual collaboration seems even more powerful than that to me. This year, for all these horrible aspects — I’m looking out my window at smoke and an air quality index of 200, for the ninth straight day, it has been a pretty bad year — the silver lining for me is that suddenly the entire world understands that we need better ways of working remotely. And if we had the remote collaboration environment and VR that I have been talking about for five years, it would be the most valuable thing on the face of the Earth today except the vaccine. I mean really, you’d be using it. We’d be doing this in it, everybody would be more productive. And so now we’re never going to stop thinking that, right? Because you never know if there will be another pandemic, you never know what might happen. And companies are moving towards working more remotely. Mark said that for us, but I mean, companies now see it can work, but it can work so much better. So, to me, it was just like this was the year of validation of that. My only regret was that we didn’t have five more years, because by five years from now, maybe we could have been there.
I do believe that there is no reason why, over time, this can’t happen. No miracles need to occur, just a huge amount of research and work. But I’m very confident about that.
I’d love to know what you’ve been reading, what’s obsessing you lately.
For me, it’s been fiction. Not very much of it, but when I have, it triggers me to say, “I see how that could work, I see how that could be real.” Ready Player One, when I read it, I came out of it and I’m like, easily 80% of that seemed feasible, maybe all of it in the long run. For AR, Vernor Vinge’s Rainbows End is really the only thing that I’ve ever read where I say, I still don’t think [AR] contacts are going to be a thing for a very long time, but even the twitching in your clothing for the interface, it’s like he has actually tried to do how you would interact, rather than just saying we’re going to do exactly what we’ve been doing for years. Those two really are the most seminal. Most science fiction tends to be more like, “oh, it’s VR, we can do whatever we want.” That’s kind of the attitude: Hey, you know, it’s just a complete blank slate. We can do anything.