Intro

Object detection in video has become routine. However, extending these models to detect an object of your choosing requires thousands, if not tens of thousands, of training examples. Few-shot learners seek to make this process cheaper and easier by learning to detect new objects from only a small handful of examples (typically 1 to 30).

In practice, few-shot machine learning is handy when training data is hard to find, expensive to collect, or when the cost of labeling outweighs the expected benefits of machine learning. If you consider the sheer number of visual cues a machine vision model can learn from, you can quickly see the value of deploying few-shot learners to discover patterns in data and make useful predictions.

In the following screenshot, Mosaic’s data scientists trained a few-shot machine learning model to identify labels on refrigerated products. If you would like to skip this blog post & just watch the video we produced, it is up on our YouTube channel.

[Image: few-shot learning of bounding boxes on products and labels]

Few Shot Vision Learning

Computer vision & deep learning applications are changing how businesses operate. As a field, computer vision has received a lot of publicity and a decent amount of investment. According to KDnuggets, the North American market for computer vision software has had a total investment of $120 million USD, while the Chinese market surged to $3.9 billion USD.

Deep learning algorithms, which power computer vision applications, learn a task through a network of neurons that represents it as a hierarchy of concepts. Each complex concept is defined in terms of simpler ones, and the algorithms learn this hierarchy on their own. In machine vision, this means identifying light and dark areas first, then lines, then shapes, before moving toward full-picture recognition.

One application of deep learning in computer vision is object detection. Object detection in video has become standard fare: a whole host of open-source, freely available models can detect everyday objects such as people, cars, bicycles, and appliances. Unfortunately, these models require many hand-labeled and annotated images to train.

This process can be expensive and difficult. If you want to detect a non-standard object, such as a product label or your own specific product, gathering all those training examples is often a barrier. Recently, progress has been made on few-shot learners, which allow models to learn new classes from a small handful of examples, where "small" typically means anywhere from one to thirty.

When trying to decide if your computer vision use case requires a few shot deep learner, consider these questions:

  • Is there a scarcity of supervised data? 
  • How much effort & cost is associated with labeling the data? 
  • Are at least a few labeled samples available?
  • Is the content of your video fairly uniform? Few-shot learners shine when the video is visually uniform and contains similar content, which makes them a good fit for industrial applications.

If you answered yes to these questions, it might be time to investigate few-shot machine learning.

How do few shot learners work?

This is an evolving field, and significant innovation seems likely in the coming years as more and more use cases call for few-shot machine learning. At a high level, there are two main approaches: a data-level approach, and an approach that uses transfer & meta learning.

Data-level approach 

This approach rests on a simple idea: when there is too little data to fit the parameters of the algorithm without underfitting or overfitting, more data should be added.

One technique for achieving this is to draw on an extensive collection of external data sources. For example, if the goal is to build a classifier that identifies defects in a new product’s packaging, and there are not enough labeled examples for each category, it may be necessary to look to external data sources that contain images of similar packaging. In this case, even unlabeled images can be useful.

In addition to using external data sources, another technique for data-level few-shot machine learning is to produce new data. For example, data augmentation techniques can be employed to add random noise to the product packaging images. Alternatively, new visual data can be produced using Generative Adversarial Networks (GANs), which can generate new images of the packaging from different perspectives.
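
As a rough illustration of the noise-based augmentation mentioned above, here is a minimal Python sketch using NumPy. The function name `augment_with_noise`, the noise level `sigma`, and the synthetic gray image are all assumptions made for the example; they are not taken from Mosaic's pipeline, and a real project would load actual packaging photos instead.

```python
import numpy as np

def augment_with_noise(image, n_copies=5, sigma=10.0, seed=0):
    """Return n_copies noisy variants of an 8-bit image array."""
    rng = np.random.default_rng(seed)
    image = np.asarray(image, dtype=np.float32)
    # Draw independent Gaussian noise for every copy and every pixel.
    noise = rng.normal(0.0, sigma, size=(n_copies, *image.shape))
    noisy = image[None, ...] + noise
    # Clip back into the valid 8-bit range before converting.
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Hypothetical 32x32 RGB "packaging" image filled with mid-gray pixels.
original = np.full((32, 32, 3), 128, dtype=np.uint8)
augmented = augment_with_noise(original, n_copies=5, sigma=10.0)
print(augmented.shape)  # (5, 32, 32, 3)
```

Each noisy copy keeps the label of the original image, so one labeled example becomes several. Noise alone adds limited variety, which is why it is usually combined with other transformations or with generated images.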

Transfer & Meta Learning 

The basic idea behind few-shot learners is that we can learn how object detection models learn, and then use this knowledge to build a general and robust model from a few examples. Learning how to learn in this way is generally referred to as meta learning, while reusing what a model has already learned on one task for a new task is referred to as transfer learning. By making use of general lessons learned in detecting a variety of objects/classes, we can reduce the number of examples needed to detect a new object/class.
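
One common way to reuse general lessons is to freeze a feature extractor that was trained elsewhere and fit only a small new "head" on the few examples of the new class. The Python sketch below illustrates that pattern with NumPy only. It is a toy under stated assumptions: `BACKBONE_WEIGHTS` is a hypothetical stand-in for a genuinely pretrained network, the 2-D inputs stand in for images, and the head is a plain logistic regression rather than a real detection head.

```python
import numpy as np

# Hypothetical frozen "backbone" weights. In practice these would come from a
# model pretrained on many classes; here they are fixed numbers for illustration.
BACKBONE_WEIGHTS = np.array([[1.0, -0.5, 0.3],
                             [0.5, 1.0, -0.2]])

def extract_features(x):
    """Frozen feature extractor: its weights are never updated."""
    return np.tanh(x @ BACKBONE_WEIGHTS)

def train_head(features, labels, lr=0.5, steps=500):
    """Fit a small logistic-regression head on the frozen features."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        error = probs - labels  # gradient of the log loss w.r.t. the logits
        w -= lr * features.T @ error / len(labels)
        b -= lr * error.mean()
    return w, b

def predict(x, w, b):
    """Label inputs 1 (new class) or 0 (background) using the trained head."""
    return (extract_features(x) @ w + b > 0).astype(int)

# Three examples of the new class (1) and three of background (0).
support_x = np.array([[2.0, 2.0], [2.5, 1.5], [1.5, 2.5],
                      [-2.0, -2.0], [-2.5, -1.5], [-1.5, -2.5]])
support_y = np.array([1, 1, 1, 0, 0, 0])

w, b = train_head(extract_features(support_x), support_y)
predictions = predict(np.array([[2.0, 2.2], [-2.0, -1.8]]), w, b)
print(predictions)
```

Only the four head parameters are learned from the six examples. Everything else is inherited from the frozen extractor, which is why so little new data is needed.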

According to the research firm Borealis AI, approaches to meta and transfer learning are diverse, and there is no consensus on the best approach. However, there are three distinct families, each of which exploits a different type of prior knowledge:

  • Prior knowledge about similarity: We learn embeddings in training tasks that tend to separate different classes even when they are unseen.
  • Prior knowledge about learning: We use prior knowledge to constrain the learning algorithm to choose parameters that generalize well from few examples.
  • Prior knowledge of data: We exploit prior knowledge about the structure and variability of the data and this allows us to learn viable models from a few examples.
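
To make the first family (prior knowledge about similarity) more concrete, the Python sketch below classifies new examples by comparing their embeddings to per-class averages, often called prototypes. It assumes an embedding model already exists and skips that step entirely: the 2-D vectors, the class names, and the helper functions `class_prototypes` and `classify` are made up for illustration.

```python
import numpy as np

def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    classes = np.unique(support_labels)
    prototypes = np.stack(
        [support_embeddings[support_labels == c].mean(axis=0) for c in classes]
    )
    return classes, prototypes

def classify(query_embeddings, classes, prototypes):
    """Assign each query to the class with the nearest prototype."""
    # Pairwise Euclidean distances, shape (n_queries, n_classes).
    dists = np.linalg.norm(
        query_embeddings[:, None, :] - prototypes[None, :, :], axis=-1
    )
    return classes[dists.argmin(axis=1)]

# Toy 2-D embeddings: three labeled examples ("shots") per class.
support = np.array([[0.9, 0.1], [1.0, 0.0], [0.8, 0.2],
                    [0.1, 0.9], [0.0, 1.0], [0.2, 0.8]])
labels = np.array(["label", "label", "label",
                   "container", "container", "container"])

classes, prototypes = class_prototypes(support, labels)
queries = np.array([[0.7, 0.3], [0.2, 0.6]])
result = classify(queries, classes, prototypes)
print(result)  # ['label' 'container']
```

Adding a new class here only requires averaging a few new embeddings. No retraining is involved, provided the embedding space separates the new class well.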

Prefer a visual explanation of transfer learning? We like this image produced by Towards Data Science.

[Image: chart comparing transfer learning with traditional machine learning]

When to use few shot deep learners? 

Now that we have discussed the why & how of these deep learning techniques, let’s examine a few potential use cases. 

  1. Assembly Line Defect Detection & Quality Control -> Businesses need to constantly innovate to meet evolving customer needs. This is extremely important for consumer & business product manufacturers. Most firms have invested in beefing up their quality decisions with AI, but what do you do when there are new products coming off the line? Instead of waiting six months to a year to compile enough manufacturing data, few-shot learners can be trained to support decisions now, not months down the road.
  2. Real-Time Adverse Event Detection -> If 2020 has taught us anything, it is to expect the unexpected. Businesses constantly face new threats to their operations & infrastructure, everything from nefarious actors to adverse weather events. Not only can AI be trained to identify these events, but data-driven responses can be recommended based on current operating conditions. If an electric utility experiences a downed line, wouldn’t it be great to have a few-shot learner diagnose the problem before a crew reaches the area? Power would be restored more quickly, and the crew wouldn’t have to spend valuable time on diagnosis.
  3. New Drug Discovery -> Deep neural networks in particular have been shown to provide significant boosts in predictive power when inferring the properties and activities of small-molecule compounds. However, the applicability of these techniques has been limited by the need for large training datasets. Few-shot learning can lower the amount of data needed to make accurate predictions in drug discovery.
  4. Automated Machine Inspection -> Computer vision few-shot learners can detect specific parts on any mechanical asset and compare the image of each part to what it should look like, in real time. This would save countless hours of human review for any inspection team and cut down on costly mistakes.

The only limit on these use cases is the range of problems we can imagine them solving. Mosaic is poised to partner with your business to find few-shot learning applications that make a difference.

Rather than just write about it, our data scientists have built a few-shot learner to demonstrate the power of these models in a computer vision application. In the following video, Mosaic uses few-shot learning to identify refrigerated product labels and containers.