We decided to approach this problem as a similarity learning modeling effort. We used convolutional neural networks to train a model that takes an image or video as input and outputs a vector representation of the input, such that similar inputs will be close to each other in the vector space. The vector learning is driven by a triplet loss function.
Object detection in video has become a matter of routine, however, expanding these models to detect an object of your choosing requires many thousands, if not tens of thousands, of training examples. Few shot learners seek to make this process cheaper and easier by learning to detect new objects with only a small handful of examples (i.e. 1-30).
Deep learning, specifically computer vision and natural language processing, can be designed to identify defects during the product packaging process. These deep learning models can verify that a label on a package is present, correct, straight, and readable.
Mosaic designed and deployed custom computer vision models to automate asset recognition & inform inspection decisions.
Designing and deploying computer vision is a powerful technology that humans can employ to improve their decision making. The only limits to these technologies lie within our ability to think of problems for them to solve.