Facebook’s new AI model SEER can teach itself to recognize images
Researchers at Facebook Inc. have created an artificial intelligence-based image recognition model called SEER that’s able to describe what it’s seeing without being trained first on a labeled dataset.
Facebook said that SEER, whose name is short for “SElf-supERvised,” is a breakthrough that could lead to a “revolution” in computer vision.
The SEER model, outlined in a paper released March 2, was fed 1 billion publicly available Instagram images that carried no annotations or labels. It then worked through the dataset, learning as it progressed, and was eventually able to achieve extremely high accuracy in tasks such as object detection.
Self-supervised AI learning is already established in the AI field. It refers to AI systems that can learn directly from whatever information they are given without being trained on carefully labeled datasets that can teach them how to perform a given task, such as recognizing an object in a photo or translating a piece of text.
The advantage of self-supervised learning is that these models can be trained much faster, because labeling data is a painstaking task that takes humans many hours to perform. Self-supervised learning models can also work with bigger and more diverse datasets.
Facebook’s AI team said in a blog post that SEER eventually outperformed existing AI models in the ImageNet object recognition test, which is an industry standard for gauging the effectiveness of such systems. SEER achieved a classification accuracy score of 84.2% in the test, which asks the model to identify what it’s seeing in thousands of different photos.
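For readers unfamiliar with the metric, a classification accuracy score is simply the share of test images for which the model's single best guess matches the true label. The sketch below illustrates that arithmetic on made-up data, assuming the reported figure is the usual top-1 accuracy; the function name and the labels are hypothetical and are not taken from Facebook's code or from ImageNet.

```python
def top1_accuracy(predicted, actual):
    """Return the fraction of predictions that exactly match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one example is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for five test images (illustrative only).
predicted = ["cat", "dog", "car", "dog", "bird"]
actual = ["cat", "dog", "car", "cat", "bird"]

accuracy = top1_accuracy(predicted, actual)
print(f"{accuracy:.1%}")  # prints "80.0%" (4 of 5 correct)
```

On this scale, a score of 84.2% means the model's top guess was right for roughly 842 of every 1,000 test images.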
SEER, Facebook explained, takes advantage of an earlier algorithm called SwAV, which uses online clustering to rapidly group images that share similar visual concepts and then leverages those similarities. “With SwAV, we were able to improve over the previous state of the art in self-supervised learning, and did so with 6x less training time,” Facebook said.
SEER also incorporates a new AI model architecture called RegNets, which are convolutional neural networks that can scale to billions, and potentially even trillions, of parameters and can be optimized to fit various runtime and memory limitations. The third and final component of SEER is an all-purpose library for self-supervised learning called VISSL.
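The general idea of online clustering can be shown with a toy example: each incoming feature vector is assigned to its most similar cluster center as soon as it arrives, and that center is nudged toward it, so no pass over the full dataset is needed. This is a deliberately simplified sketch and not SwAV itself, which learns the features and the cluster prototypes jointly with a neural network. The class name, the learning rate and the two-dimensional "features" are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


class OnlineClusterer:
    """Toy online clustering: assign each new vector, then update its cluster."""

    def __init__(self, prototypes, lr=0.1):
        self.prototypes = [normalize(p) for p in prototypes]
        self.lr = lr  # hypothetical step size for moving a prototype

    def assign(self, feature):
        """Return the index of the prototype most similar to the feature."""
        f = normalize(feature)
        scores = [sum(a * b for a, b in zip(f, p)) for p in self.prototypes]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, feature):
        """Assign the feature, then move the chosen prototype toward it."""
        f = normalize(feature)
        k = self.assign(feature)
        moved = [(1 - self.lr) * p + self.lr * x
                 for p, x in zip(self.prototypes[k], f)]
        self.prototypes[k] = normalize(moved)
        return k


# Two starting prototypes and a small stream of made-up image features.
clusterer = OnlineClusterer(prototypes=[[1.0, 0.0], [0.0, 1.0]])
stream = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.3], [0.1, 1.0]]
assignments = [clusterer.update(f) for f in stream]
print(assignments)  # prints [0, 1, 0, 1]
```

Because each vector is handled once, as it arrives, the cost grows with the size of the stream rather than requiring repeated passes over all the data, which is the property that makes this style of clustering attractive at the scale of a billion images.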
“The future of AI is in creating systems that can learn directly from whatever information they’re given — whether it’s text, images, or another type of data — without relying on carefully curated and labeled data sets to teach them how to recognize objects in a photo, interpret a block of text, or perform any of the countless other tasks that we ask it to,” Facebook’s researchers said. “This is a breakthrough that ultimately clears the path for more flexible, accurate, and adaptable computer vision models in the future.”
Facebook listed a number of potential use cases for SEER, such as automatically generating text to describe images to people with visual impairments, better categorization of items sold on the Facebook Marketplace and better systems for censoring “harmful” images on Facebook’s platform.
Facebook looks intent on keeping the secrets of SEER to itself, at least for now. The company didn’t mention any plans to open-source SEER, though it will release the VISSL library to the research community so that it can be used to train other self-supervised learning models.