Computer & Machine Vision

Machine vision is a field of artificial intelligence that trains computers to see, identify, and process images in much the same way that human vision does. Data scientists train deep learning models on example images so that the models produce an appropriate output for each input. The goal of these models is not only to see, but also to process what they see and provide useful results based on the observation. On well-defined, repetitive tasks, a trained model can identify and classify objects more consistently than a person, react to what it sees, and automate a traditionally time-consuming process.

Some of the first neural networks were developed in the 1950s, and steady progress has been made since then in training computers for object recognition and image classification. Accuracy rates for computer vision models have risen from 50 percent to 99 percent in less than a decade, thanks in large part to several developments: the popularity of mobile technology, greater computing power, hardware designed to handle complex deep learning models, and new algorithms available in open source libraries.

How Deep Learning Enables Machine Vision

Deep learning methods use neural network architectures to ingest data streams and output insights to a user. Seems easy, right? Just implement a convolutional neural network, and you are good to go. But without understanding the mechanics behind these models, you run the risk of not understanding the outputs, not being able to tune the model, or not being able to translate analytics into business terms.

Neural Network Intro

An artificial neural network (ANN) is a system patterned after the operation of neurons in the human brain. An ANN usually involves a large number of processors operating in parallel and arranged in tiers. The first tier receives the raw information, much like the optic nerve in the human eye. Each successive tier receives the output from the tier preceding it, in the same way that neurons deeper in the brain receive information passed along from the optic nerve. The last tier of the system produces the output.

Each processing node has its own small sphere of knowledge, including what it has seen and any rules it was programmed with or developed for itself. The tiers are highly interconnected, which means each node receives input from many nodes in the preceding tier and, in some cases, from every one of them.
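To make the tier-by-tier flow concrete, here is a minimal sketch of a forward pass in plain Python. The network shape, the weights, and the function names (`tier_forward`, `network_forward`) are made up for illustration, and the sigmoid activation is just one common choice. A production model would use a deep learning library rather than hand-written loops.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tier_forward(inputs, weights, biases):
    """Compute one tier's outputs from the previous tier's outputs.

    weights holds one list per node; each node weighs every input it receives.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def network_forward(raw_input, tiers):
    """Pass the raw input through each tier in order and return the last output."""
    signal = raw_input
    for weights, biases in tiers:
        signal = tier_forward(signal, weights, biases)
    return signal

# Hypothetical network: 3 raw inputs -> 2 hidden nodes -> 1 output node.
tiers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = network_forward([0.9, 0.1, 0.4], tiers)
print(output)  # a single value between 0 and 1
```

Every node in a tier here reads every output of the tier before it, which is the fully connected case. Convolutional networks used for images connect each node to only a small patch of the previous tier.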

Image of a neural network architecture

ANNs are powerful because they are adaptive: the model continues to learn with each subsequent training run. Typically, one of these models is trained on a large training dataset, and whoever is tuning the model needs to provide it with the correct answers so the model can adjust its internal weights and learn to do its job better.

In defining the rules and making determinations (that is, each node's decision about what to send to the next tier based on inputs from the previous one), ANNs use several mathematical principles, including gradient-based searches, fuzzy logic, genetic algorithms, and Bayesian statistics.
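The idea of supplying answers so the model can adjust its weights can be sketched with a single node trained by a gradient-based search. This is a toy illustration, not how a full vision model is trained: the dataset, the learning rate, the epoch count, and the names `train_node` and `predict` are all assumptions chosen to keep the example small.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, inputs):
    """Return the node's output (between 0 and 1) for one example."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_node(examples, epochs=2000, learning_rate=0.5):
    """Fit one node to (inputs, answer) pairs with stochastic gradient descent."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, answer in examples:
            # How far the current prediction is from the supplied answer.
            error = predict(weights, bias, inputs) - answer
            # Nudge each weight against the gradient of the cross-entropy loss.
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical labeled data: answer is 1 if either of two pixels is lit.
labeled = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_node(labeled)

for inputs, answer in labeled:
    print(inputs, answer, round(predict(weights, bias, inputs), 3))
```

Each pass over the labeled examples moves the weights a little in the direction that reduces the error, which is what "learning with each run" means in practice.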

Computer Vision is Gaining Popularity

Deep learning is the engine that machine vision runs on. The ability to interpret raw photos and videos has been applied to projects in retail, manufacturing, transportation, and energy, to name a few. If you haven’t embraced the power of classifying images, your competition probably has. ANNs are used in applications such as facial recognition, driverless vehicles, reverse image searches, surgical quality assurance, and many more.

Computer Vision Pitfalls – Why Experienced Data Scientists Matter

AI-integrated platforms have made it easy to plug an out-of-the-box computer vision model into any set of images. These models are generally trained on natural image datasets like ImageNet or COCO. Although this is useful in some applications, industries like healthcare and energy often benefit little from these pre-built models and need experts in the field to train neural networks suited to their applications. Understanding how the weights in a neural network are learned helps in tuning them to reach the accuracy that is needed. Overfitting is a major problem when standard models are tuned with no domain knowledge. You may have noticed your model giving great results on only a particular set of images. That is because the model has learned only a few of the dominant features and cannot cope with minor changes. The use of customized layers and well-chosen hyperparameters by an experienced data scientist can improve your model significantly.
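One simple way to spot the overfitting described above is to compare accuracy on the images the model was trained on with accuracy on held-out images it has never seen. The sketch below assumes you already have predictions and labels for both sets. The sample data and the function names are hypothetical, and the size of gap that counts as a problem depends on the application.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

def overfitting_gap(train_preds, train_labels, val_preds, val_labels):
    """Training accuracy minus validation accuracy; a large gap suggests overfitting."""
    return accuracy(train_preds, train_labels) - accuracy(val_preds, val_labels)

# Hypothetical results: perfect on training images, much weaker on held-out ones.
train_labels = ["valve", "pipe", "valve", "tank", "pipe"]
train_preds = ["valve", "pipe", "valve", "tank", "pipe"]
val_labels = ["valve", "pipe", "tank", "tank", "pipe"]
val_preds = ["valve", "valve", "pipe", "tank", "pipe"]

gap = overfitting_gap(train_preds, train_labels, val_preds, val_labels)
print(f"train/validation accuracy gap: {gap:.2f}")
```

A model that scores far better on its training images than on held-out images has memorized those images rather than learned features that generalize.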

Surgical Object Identification | Healthcare Use Case

Surgical theaters are high-pressure environments, most of the time with human lives on the line. Quick and accurate decisions are as important here as anywhere else. With the rise of robotics and IoT, humans are turning to AI and machine learning to support more and more of these processes. Computer vision is another tool healthcare providers and medical device manufacturers can add to their toolbox to support their medical staff.

In these high-pressure situations, human surgeons and machines alike have left surgical instruments in the bodies of their patients. In a typical surgery, 250 to 300 surgical tools are used. The number increases to 600 when a laser surgery is performed. The consequences of leaving a tool behind range from harmless to life-threatening. In many cases, the patient has to come back in for another surgery to remove the item left behind.

A computer vision model could be trained to identify anomalies in the human body. The patient could have a body scan before undergoing surgery, and before the surgeon finishes the procedure, a computer program could run a quick scan to identify any anomalous objects. Data scientists could further train the model to identify common surgical items. Not only would this increase the quality of care, but it could also save the healthcare system significant amounts in litigation costs.
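As a rough illustration of the before-and-after comparison, the sketch below flags pixels whose intensity changed sharply between two scans. This is plain image differencing, a much simpler stand-in for the trained model the scenario calls for. It assumes the two scans are already aligned and the same size, and the tiny grids, the `threshold` value, and the name `find_anomalies` are all hypothetical.

```python
def find_anomalies(before, after, threshold=50):
    """Return (row, col) positions where intensity changed by more than threshold."""
    flagged = []
    for r, (row_before, row_after) in enumerate(zip(before, after)):
        for c, (b, a) in enumerate(zip(row_before, row_after)):
            if abs(a - b) > threshold:
                flagged.append((r, c))
    return flagged

# Hypothetical 4x4 grayscale scans (0-255); one dense object appears afterward.
scan_before = [
    [10, 12, 11, 10],
    [11, 13, 12, 11],
    [10, 12, 11, 10],
    [12, 11, 10, 12],
]
scan_after = [
    [11, 12, 10, 10],
    [11, 14, 12, 11],
    [10, 240, 11, 10],
    [12, 11, 10, 13],
]

print(find_anomalies(scan_before, scan_after))
```

Real scans differ in patient position, tissue changes from the procedure, and noise, which is why a trained model that recognizes surgical items is the more robust approach.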

Computer vision is becoming more popular in the radiology department. Newer scanners come equipped with AI that can recognize tumors or other anomalous foreign bodies present in a scan. The objective is not to replace the radiologist but to assist them in making better judgments. Computer vision has proven able to catch minute details that are easily missed when a human is performing a repetitive job.

Oil & Gas Use Case | Inspection

Oil & gas firms have many opportunities to apply machine vision and AI. Inspection is one such area, applicable to pipelines from upstream to downstream. With the rise of drones, optical cameras mounted on them can be used for visual inspection to identify equipment flaws and defects, structural failures, welding flaws, corrosion, and cracks.

Thermal inspection is another non-destructive testing (NDT) technique, used as a preventive maintenance tool to spot leaks in pipelines, tanks, and other facilities, which helps improve safety and monitor emissions.
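A first pass at spotting leaks in a thermal frame can be as simple as flagging readings that sit far above the rest of the frame. The sketch below is a statistical outlier check, not a trained vision model. The temperature grid, the cutoff of three standard deviations (`k=3.0`), and the name `hot_spots` are assumptions for illustration.

```python
from statistics import mean, pstdev

def hot_spots(thermal_frame, k=3.0):
    """Return (row, col) positions reading more than k standard deviations above the mean."""
    readings = [t for row in thermal_frame for t in row]
    mu = mean(readings)
    sigma = pstdev(readings)
    if sigma == 0:
        return []  # a perfectly uniform frame has no outliers
    cutoff = mu + k * sigma
    return [
        (r, c)
        for r, row in enumerate(thermal_frame)
        for c, t in enumerate(row)
        if t > cutoff
    ]

# Hypothetical 5x5 frame in degrees Celsius with one hot reading.
frame = [[20.0] * 5 for _ in range(5)]
frame[1][3] = 80.0

print(hot_spots(frame))
```

Some leaks show up as cold spots instead, for example where escaping gas expands and cools, so a real inspection would check both directions and account for ambient conditions.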

Inspection of pipelines that extend hundreds of kilometers, once a labor-intensive process, has become much more efficient and accurate with the application of video analytics and drones.

Another application, which requires no additional equipment set-up, is surveillance. By adding a layer of analytics to the existing network of image and video feeds already used to monitor remote sites, it is possible to identify unauthorized personnel entering the site, recognize suspicious activity, and alert the security systems. This could also be extended to authorizing entry with facial recognition technology, monitoring the use of protective equipment at operational facilities, and even helping to monitor the operations themselves.

The oil and gas industry is volatile, and it is essential to find ways to be prepared for the next crash in oil prices. The inclusion of such advanced technology not only helps to reduce labor-intensive tasks and increase profits but also significantly promotes safety.

Deep Ranking Architecture Mosaic deployed for one of our Oil & Gas clients

Put Computer Vision to Work for Your Business

Machine vision is a powerful technology that people can design and deploy to improve their decision making. The only limits to these technologies lie in our ability to think of problems for them to solve. These deep learning and AI techniques are not easily developed, and trained data scientists need to be involved in translating analytics into business insights. With the proper collaboration plan in place, perhaps working with an AI company like Mosaic Data Science, businesses will be able to influence decision making for the next generation.