What if you could type with your brain?
Mark Zuckerberg proudly put this question to the crowd gathered for this year’s F8 conference.
The question sparked intrigue across the faces of the attendees. It is a prospect that seems straight out of a science fiction novel. Facebook is widely known to work on mysterious technologies in its nerdy, secretive labs and hangars, which it usually unveils at keynote sessions like this one.
This year was not going to be different.
Just as in all the keynotes Zuckerberg had given before, the huge screen behind him flashed radical ideas that made the audience feel like they were on a trip to the future.
Facebook recently held its annual F8 conference, where billionaire CEO Mark Zuckerberg gave a glimpse of his dreams for the future. The whole conference revolved around three major buzzwords that are currently generating countless hashtags across the tech and investor communities worldwide. Three powerful technologies with the potential to revolutionize the entire world.
Machine learning, augmented reality, and computational neuroscience.
These powerful technologies are capable of changing everything the modern world is built upon. They could transform healthcare, the financial system, telecommunications, neuroscience, astronomy, and a lot more. But the most notable announcement of all was Facebook’s official entry into the field of augmented reality.
Augmented reality (AR) is a technology that enhances the user’s view of the world by overlaying digital information onto the camera feed. You can bring animated characters and other virtual objects into the camera view, creating a seamless experience that few other platforms can provide. Unlike its cousin virtual reality (VR), AR doesn’t block the user’s field of view, which gives it a significant advantage. The recent runaway hit Pokémon Go can be seen as a loose form of AR, but there is more to it than meets the eye. AR could change the way digital marketing and advertising work.
One could say that Facebook is bent on an all-out offensive against Snapchat, which recently rolled out its AR feature named “World Lenses”. Facebook has introduced “Camera Effects”, its own version of “World Lenses”. Compared with Snapchat, the social networking behemoth has the upper hand here: an army of dedicated developers. It seems that Facebook is reaching for the low-hanging fruit, the kind well within the reach of ordinary developers. Facebook wants to dominate this market by empowering a dedicated creator community to build their own filters and effects using its newly introduced “AR Studio”.
But don’t be fooled, even for a second, into thinking that Facebook is doing this to create yet another fancy Instagram-like platform for selfie addicts.
As the billionaire CEO walked to and fro across the stage, he had only scratched the surface of what is yet to come. The keynote was just the tip of the iceberg.
And oh boy, there is a lot to unpack.
As we all know, Facebook’s billions of dollars in revenue stem from its unmatched, highly efficient ad delivery network. But it seems to have hit a ceiling. There are only so many ads it can cram into a user’s news feed. At some point, there will be a tradeoff between ad load and user satisfaction. By venturing into AR, Facebook is opening up yet another space for its advertising.
Facebook has made real advances in convolutional neural networks (CNNs, or ConvNets), a class of machine learning models that specialize in processing visual data. Back in 2016, Facebook took an existing deep learning framework called Caffe and optimized and adapted it for mobile. This year’s conference saw the announcement of Caffe2, built with mobile deployment in mind. This enables real-time style transfer, where the aesthetic style of one picture is applied to the live camera feed. Prisma has already demonstrated that the idea resonates with people. If you are a tech-savvy reader who knows a bit about digital image processing, you know that even though “style transfer” sounds easy, it involves complex convolution operations that draw on modern approaches combining applied mathematics and computation.
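To make the phrase "convolution operations" concrete, here is a minimal sketch in plain Python of the basic operation a convolutional network repeats many times per frame. It is not Facebook's or Caffe2's implementation; the function name `conv2d`, the sample patch, and the hand-picked edge kernel are all assumptions made for illustration. A real style-transfer network stacks many such layers and learns its kernels from data.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped before it is applied.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            acc = 0.0
            # Weighted sum of the pixels currently under the kernel window.
            for i in range(k_rows):
                for j in range(k_cols):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical 4x4 grayscale patch: dark on the left, bright on the right.
patch = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked vertical-edge kernel (an assumption; trained networks learn theirs).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(patch, vertical_edge)
print(feature_map)
```

The four nested loops hint at why running this on every camera frame is demanding on a phone: the cost grows with image size, kernel size, and the number of kernels per layer.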
Depth estimation from 2D images.
ConvNets have shown considerable promise in estimating 3D depth from 2D images. Our eyes do this so effortlessly that we take the mechanism for granted. Even with one eye closed, we can judge, to a fair degree of precision, the scale and distance of an object in the scene. That is because our brain is pretty good at knowing what the sizes of those objects are most likely to be. Given enough training data, ConvNets could soon infer a 3D depth map from a single image. For Facebook, this makes it possible to add 3D effects to a scene that has already been captured, without any additional sensor data such as accelerometer or gyroscope readings. It could even change the focus of photos that have already been taken.
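The one-eye intuition can be written down with the textbook pinhole camera model: if you know roughly how big an object really is, its size in the image tells you how far away it is. The sketch below is only that geometric intuition, not how a trained network (or Facebook) computes depth. The function name and the numbers (a 1000-pixel focal length, a 0.25 m tall object) are assumptions for illustration.

```python
def depth_from_known_size(focal_length_px, real_height_m, pixel_height):
    """Estimate distance to an object with the pinhole camera model.

    Similar triangles give:
        pixel_height / focal_length_px = real_height_m / depth
    so depth = focal_length_px * real_height_m / pixel_height.
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_length_px * real_height_m / pixel_height


# Assumed values: the same 0.25 m object seen at two apparent sizes.
near = depth_from_known_size(1000, 0.25, 200)  # appears 200 px tall
far = depth_from_known_size(1000, 0.25, 100)   # appears 100 px tall
print(near, far)
```

Halving the apparent height doubles the estimated distance. A depth-estimating network has to learn the "how big is this object usually" part from training data, which is why it needs so many examples.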
Sounds like a light year leap in the field of digital photography.
In addition to style transfer, ConvNets have shown exceptional ability in recognizing objects in the camera view. A properly trained and deployed ConvNet can identify many real-world objects that the camera feed picks up. There have been forerunners in this field, but they generally follow a cloud-based approach in which the stream is processed on remote servers. However, with Facebook’s optimization of ConvNets for mobile, it is conceivable that Facebook could, in the near future, perform robust semantic recognition on local devices. For end users, that means a significant gain in processing speed.
SLAM stands for simultaneous localization and mapping.
“Simultaneous localization and mapping (SLAM) is a chicken and egg problem. It is the computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent’s location within it. Simultaneous localization and mapping (SLAM) is therefore defined as the problem of building a map while at the same time localizing the robot within that map. In practice, these two problems cannot be solved independently of each other. Before a robot can answer the question of what the environment looks like given a set of observations, it needs to know from which locations these observations have been made. At the same time, it is hard to estimate the current position of a vehicle without a map. Therefore, SLAM is often referred to as a chicken and egg problem: A good map is needed for localization while an accurate pose estimate is needed to build a map.”
In an example showcased by Facebook, the device with the camera tracks its own position in a given space while recognizing objects (a cereal bowl and a glass of orange juice) sitting on the table. This recognition triggers the associated AR activity, which lets the user create virtual copies of the recognized objects within the scene.
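One common way out of the chicken-and-egg problem is to stop solving for the map and the position separately and estimate both at once. The toy sketch below does this on a one-dimensional line with a single landmark, using ordinary least squares. It is a deliberately tiny illustration, not the method used in Facebook's demo; the function names, the 1D setup, and the measurement values are all assumptions.

```python
def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def slam_1d(odometry, ranges):
    """Jointly estimate device poses and one landmark position on a line.

    odometry[t]: measured step from pose t to pose t+1 (len T).
    ranges[t]:   measured signed distance from pose t to the landmark (len T+1).
    Pose 0 is fixed at 0.0 to anchor the map.
    """
    T = len(odometry)
    n = T + 1  # unknowns: poses 1..T, then the landmark
    rows, rhs = [], []
    for t, u in enumerate(odometry):   # x[t+1] - x[t] = u
        row = [0.0] * n
        row[t] = 1.0
        if t > 0:
            row[t - 1] = -1.0
        rows.append(row)
        rhs.append(u)
    for t, z in enumerate(ranges):     # landmark - x[t] = z
        row = [0.0] * n
        row[n - 1] = 1.0
        if t > 0:
            row[t - 1] = -1.0
        rows.append(row)
        rhs.append(z)
    # Normal equations (A^T A) x = A^T b of the least-squares problem.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(n)]
    sol = solve_linear(AtA, Atb)
    return [0.0] + sol[:T], sol[T]


# Made-up, slightly inconsistent measurements: odometry and ranges disagree a little.
poses, landmark = slam_1d([1.1, 0.9], [5.0, 4.0, 3.0])
print(poses, landmark)
```

Because the poses and the landmark share one system of equations, every range reading corrects the trajectory and every odometry reading corrects the map. Real SLAM systems work in 3D, with many landmarks and nonlinear camera models, but the joint-estimation idea is the same.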
Currently we have numerous non-invasive methods of measuring brain activity, using electrodes that record the brain’s responses to a particular stimulus. The main hurdle is not collecting data from the brain. The problem is that researchers still struggle to make sense of that data. The human brain is a miracle. Even though we are unravelling its secrets bit by bit, we are looking at decades when it comes to mapping an entire brain and learning its stimulus-response pairs. But the question remains: how do we make sense of the enormous amount of data that is generated?
This is where AR bridges the gap between technology and neuroscience. Even though Facebook says the technology could enable ALS patients to type words straight from their brains, letting them bypass part of their disability, the advertising giant’s eyes are fixed on the most probable commercial advantage of this vision.
More and more tailor-made advertising.
And how do they intend to do that?
AR headsets would come with non-invasive electrodes that touch the sides of your eyes, and you would also give the device access to your entire field of view. This data is powerful. The headset’s cameras don’t only look out into our field of view, but also look inwards to track the movement of our eyes. The geometry of our eyeballs lets them determine the focal plane our eyes are currently focused on, and suddenly Facebook has a neuro-engine at its disposal: a stimulus-response system that lets it determine the precise brain signals produced by each sight the user sees.
And millions of users will be using this headset, which gives Facebook enough data to train and deploy a neural net that outperforms even the most sophisticated machine learning algorithms that have ever been devised for human physiology.