Everything About Computer Vision

What is Computer Vision?

A branch of AI that has been changing the way our world works in recent years, computer vision is the learned ability of computers to interpret and understand visual content. Computer vision means that computers, just like humans, are able to “see” photos and videos, identify and classify objects and respond to what is “seen”. To replicate this aspect of human ability, computer vision utilises deep learning; with an extensive amount of image and video data available online, computer vision becomes more and more powerful as the years go by. You will likely have encountered computer vision in a number of situations, but you may not realise the full extent of its impact.

Everything you need to know about Computer Vision

In this article, we’re going to outline everything you need to know about computer vision, from the deep learning and neural networks that allow it to function to the cases in which it can be used. Computer vision, like other aspects of artificial intelligence, has now enabled computers to surpass human ability in some tasks. The accuracy rates of computer vision with regard to object localization and classification have increased in the last ten years and they continue to improve. Understanding how computer vision works alongside human activity highlights areas which could be transformed even further in coming years.

How does computer vision work in AI?

Computer vision, when broken down, can be understood to work through three steps.

In the first of these steps, computer vision technology acquires an image or image set in real time. This can be photos, videos, 3D content or live footage such as that acquired from a CCTV camera. In the second step, this content is analyzed by deep learning models which have been trained to detect patterns and classify images. In the third step, computer vision in AI takes actions based on its understanding of the image.
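The three steps above can be sketched as a small pipeline. This is a minimal illustration only: the function names are hypothetical, the "camera frame" is a hard-coded 2x2 grayscale image, and a simple brightness threshold stands in for a trained deep learning model.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of 0-255 pixel values


def acquire() -> Image:
    """Step 1: acquire an image (a hard-coded stand-in for a camera frame)."""
    return [[200, 220], [210, 230]]


def analyze(image: Image) -> str:
    """Step 2: analyze the image (a threshold stands in for a trained model)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return "bright" if mean >= 128 else "dark"


def act(label: str) -> str:
    """Step 3: take an action based on what was understood."""
    return "lights off" if label == "bright" else "lights on"


def run_pipeline() -> str:
    return act(analyze(acquire()))
```

Calling run_pipeline() on the sample frame returns "lights off", because the mean brightness of the frame is above the threshold.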

Computer vision can therefore be used in a number of ways. Here are a few key examples:

Edge Detection

Edge Detection analyzes visual content in order to determine the outside edge of an object or landscape.
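One classic way to find edges is the Sobel operator, which measures how sharply brightness changes around each pixel. The sketch below is an illustration rather than a description of any particular product; it uses plain Python lists so that it runs without dependencies, whereas in practice an image library would normally be used.

```python
import math


def sobel_magnitude(image):
    """Return the gradient magnitude at each interior pixel of a grayscale image.

    Border pixels are left at 0.0 because the 3x3 kernels do not fit there.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal change
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical change
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * pixel
                    gy += gy_kernel[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# A 5x5 test image: dark on the left, bright on the right.
image = [[0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
```

In this example the response is strong next to the dark-to-bright boundary (columns 1 and 2) and zero in the uniformly bright area (column 3).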

Image Classification

Image classification takes visual content and recognizes that it belongs to a certain group or class. For example, it can recognize when an image contains a person or a vehicle. At Sightcorp we use image classification to estimate the age and gender of a person, as well as their emotions and attributes, such as whether they are wearing a face mask.
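The final step of a classifier can be sketched as follows, assuming a model has already produced one raw score per class. The scores and the "background" label are made up for illustration; softmax is a common way to turn such scores into probabilities, not necessarily what any given product uses.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most likely label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["person", "vehicle", "background"]  # hypothetical classes
scores = [2.0, 0.5, -1.0]                     # hypothetical model output
label, confidence = classify(scores, labels)  # label is "person"
```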

Object Detection

Object detection is a type of image classification in which a certain object is identified and its location within a piece of content is highlighted. One example of this is face detection, which can pick up on human faces within a set of visual content. At Sightcorp we do face, body and vehicle detection, as well as facial recognition.
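A detected location is usually expressed as a bounding box. One widely used way to check how well a predicted box matches the true location is intersection over union (IoU). The sketch below assumes boxes are given as (x_min, y_min, x_max, y_max) tuples, which is a convention chosen for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    inter_w = max(0, ix_max - ix_min)
    inter_h = max(0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

An IoU of 1.0 means the boxes coincide exactly, and 0.0 means they do not overlap at all.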

Object Tracking

Object tracking tracks the position of an object in a set of visual content or in video content, most often in real time. At Sightcorp we use face and body tracking in order to be able to count the number of people.
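Counting people by tracking can be illustrated with a very simple centroid tracker: each detection in a new frame is matched to the nearest known track, and unmatched detections start a new track. This is a simplified sketch under stated assumptions (greedy matching, a hypothetical max_distance threshold in pixels, tracks dropped as soon as they are not matched), not a description of Sightcorp's tracker.

```python
import math


class CentroidTracker:
    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # furthest a track may move per frame
        self.tracks = {}                  # track id -> last known (x, y)
        self.next_id = 0

    def update(self, detections):
        """Match this frame's centroids to existing tracks; return {id: (x, y)}."""
        unmatched = dict(self.tracks)
        current = {}
        for point in detections:
            best_id, best_dist = None, self.max_distance
            for track_id, last in unmatched.items():
                distance = math.dist(point, last)
                if distance <= best_dist:
                    best_id, best_dist = track_id, distance
            if best_id is None:
                best_id = self.next_id    # nothing close enough: new object
                self.next_id += 1
            else:
                del unmatched[best_id]    # each track is matched at most once
            current[best_id] = point
        self.tracks = current
        return current

    @property
    def total_seen(self):
        """Number of distinct objects seen so far."""
        return self.next_id


tracker = CentroidTracker()
frames = [
    [(10, 10), (100, 100)],
    [(12, 11), (103, 98)],
    [(14, 12), (105, 97), (300, 300)],  # a third person enters
]
for detections in frames:
    tracker.update(detections)
# tracker.total_seen is now 3
```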

Semantic Segmentation

Semantic segmentation divides entire images up into groups of pixels that can be analysed according to computer vision algorithms. This information is used in order to understand the role of each aspect of the visual content. For example, semantic segmentation could be used to break up aspects of the environment in which a self-driving vehicle was operating in order to keep it safe.
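The output of semantic segmentation can be thought of as one label per pixel. Assuming a model has already produced a score map for each class, the sketch below picks the highest-scoring class at every pixel. The class names and scores are invented for illustration.

```python
def segment(score_maps, class_names):
    """Label every pixel with its highest-scoring class.

    score_maps[c][y][x] is the score of class c at pixel (x, y).
    """
    height = len(score_maps[0])
    width = len(score_maps[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(range(len(score_maps)), key=lambda c: score_maps[c][y][x])
            row.append(class_names[best])
        labels.append(row)
    return labels


class_names = ["road", "pedestrian"]   # hypothetical classes
score_maps = [
    [[0.9, 0.8], [0.2, 0.7]],          # scores for "road"
    [[0.1, 0.2], [0.8, 0.3]],          # scores for "pedestrian"
]
mask = segment(score_maps, class_names)
# mask == [["road", "road"], ["pedestrian", "road"]]
```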

Content Based Image Retrieval

Content based image retrieval deploys computer vision in order to search for and retrieve images from large data stores based on the information that is picked up from the visual content. For example, it can be used in biodiversity libraries to find a specific kind of bird based on its color.
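A colour-based search like the bird example can be sketched by summarising each image as its mean colour and ranking the library by distance to a query colour. The file names and pixel values below are made up, and real systems typically use richer descriptors than a single mean colour.

```python
import math


def mean_color(pixels):
    """Average (R, G, B) of a list of pixels."""
    count = len(pixels)
    return tuple(sum(p[i] for p in pixels) / count for i in range(3))


def retrieve(query_color, library, top_k=1):
    """Return the names of the top_k images whose mean colour is closest to the query."""
    ranked = sorted(
        library,
        key=lambda name: math.dist(mean_color(library[name]), query_color),
    )
    return ranked[:top_k]


# Hypothetical library: image name -> a few sample pixels.
library = {
    "cardinal.jpg": [(200, 30, 40), (220, 20, 30)],
    "bluejay.jpg": [(40, 80, 200), (30, 90, 220)],
    "canary.jpg": [(240, 230, 40), (250, 220, 60)],
}
red_birds = retrieve((255, 0, 0), library)  # ["cardinal.jpg"]
```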

Image Reconstruction

Image reconstruction uses current data sets and computer vision technology in order to reconstruct an image that has been lost or eroded over time.
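As a very rough illustration of the idea, the sketch below fills in missing pixels (marked as None) with the average of their known neighbours. This simple interpolation is a stand-in chosen for brevity; reconstruction systems based on deep learning learn far richer patterns from data.

```python
def fill_missing(image):
    """Replace each None pixel with the mean of its known 4-neighbours (one pass)."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] is not None:
                continue
            neighbours = [
                image[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width and image[ny][nx] is not None
            ]
            if neighbours:
                result[y][x] = sum(neighbours) / len(neighbours)
    return result


damaged = [
    [10, 20, 30],
    [40, None, 60],
    [70, 80, 90],
]
restored = fill_missing(damaged)
# restored[1][1] == 50.0
```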

What is the technology behind Computer Vision?

Whilst computer vision can be explained with reference to highly complex human neural networks, a basic understanding can be gained by dissecting its processes, remembering that the field is concerned with allowing computers to “see” and interpret information in the way that human minds have learned to over millions of years.

Using large sets of data including images, videos and even 3D technology, computer vision runs analysis until it is able to discern patterns. If you showed computer vision technology thousands of pictures of dogs, for example, it would begin to identify the shapes, colours and so on and to learn the context of the visual data. With deep learning, computer vision algorithms essentially teach the technology to see.

The other aspect of computer vision in AI is the influence of a convolutional neural network (CNN). A CNN helps this deep learning model to “see” by deconstructing images into pixels that are given labels or tags. Using these labels, computer vision technology is able to make predictions about what it is seeing; the accuracy of these predictions is continually tested and the technology becomes increasingly accurate. In the example in which the visual content shows dogs, a CNN would eventually be able to predict which images included them. Similarly, a recurrent neural network (RNN) is used within video applications to allow computers to understand how the images in a series are related to one another.
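The core building blocks of a CNN can be sketched in a few lines: a small kernel slides over the image (convolution), negative responses are discarded (ReLU), and the result is downsampled (max pooling). In a real network the kernel values are learned during training; the kernel here is hand-picked for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image with no padding (cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[y + i][x + j] for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Downsample by keeping the largest value in each 2x2 block."""
    return [
        [
            max(feature_map[y][x], feature_map[y][x + 1],
                feature_map[y + 1][x], feature_map[y + 1][x + 1])
            for x in range(0, len(feature_map[0]) - 1, 2)
        ]
        for y in range(0, len(feature_map) - 1, 2)
    ]


image = [[0, 0, 1, 1, 1] for _ in range(5)]   # dark left, bright right
kernel = [[-1, 1], [-1, 1]]                   # responds to dark-to-bright edges
features = relu(conv2d(image, kernel))        # 4x4 feature map
pooled = max_pool2x2(features)                # [[2, 0], [2, 0]]
```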

While there is a range of ways computer vision can be deployed, the four main strategies of computer vision technology are: image classification, object detection, object tracking and semantic segmentation. These strategies can be coupled with machine learning in order to solve real life problems in a number of contexts.

How are Computer Vision and machine learning related?

While computer vision and machine learning are two distinct entities, there is some crossover between the two. Just like computer vision, machine learning is a branch of artificial intelligence that aims to mimic human thought processes.

The difference is that machine learning is a broader field. Its algorithms can be applied to other fields as well, for example speech recognition or natural language processing. Computer vision, on the other hand, primarily deals with digital images and video.

Machine learning, at its core, is about creating intelligent machines that can operate in ways that are on par with or more effective than human behaviors. It works by picking up on patterns in data and utilizing algorithms in order to become more accurate over time.
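The idea of becoming more accurate over time can be shown with a tiny example of gradient descent, one common training technique. The model has a single weight w and tries to learn the pattern y = 2x from four data points; the data, learning rate and step count are arbitrary choices for this sketch.

```python
def train(xs, ys, steps=50, learning_rate=0.05):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    losses = []
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient
        losses.append(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
    return w, losses


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]          # the hidden pattern is y = 2x
w, losses = train(xs, ys)  # w ends up very close to 2.0
```

The recorded losses shrink with every step, which is the sense in which the model improves as it sees the data repeatedly.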

Some aspects of computer vision benefit from machine learning algorithms, and equally aspects of computer vision have improved the performance of machine learning. For example, machine learning offers more effective methods for recognition and tracking, because of its optimized algorithms. These algorithms have then been used in computer vision.

What is deep learning for Computer Vision?

Deep learning, on the other hand, which aims to enable computers to “think” more like humans rather than simply following instructions like machines, has a more significant role to play within the field of computer vision.

Deep learning is based on an artificial neural network that can learn and make decisions on its own. Aside from the basic connection outlined earlier, deep learning for computer vision can be understood to advance use cases in several ways.

Within image classification, for example, deep learning allows computer vision software to become more intelligent and to make complex decisions about image class independently. For example, deep learning can implement multiclass classification in computer vision technology – we use that at Sightcorp to find the age and gender of a person simultaneously.
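Predicting two attributes at once can be sketched as one shared feature vector feeding two separate classification heads. Everything in this example is hypothetical (the features, weights, label names and two-head layout) and it is not a description of Sightcorp's model; it only illustrates the general idea.

```python
def linear(features, weights):
    """One score per class: the dot product of each weight row with the features."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def predict(features, heads):
    """Run every head on the same features and return one label per head."""
    return {
        name: labels[argmax(linear(features, weights))]
        for name, (weights, labels) in heads.items()
    }


# Hypothetical heads: name -> (weight matrix, class labels).
heads = {
    "age_group": ([[1.0, 0.0], [0.0, 1.0]], ["under 30", "30 or over"]),
    "gender": ([[0.5, -0.5], [-0.5, 0.5]], ["female", "male"]),
}
features = [0.2, 0.9]  # hypothetical features extracted from a face image
result = predict(features, heads)
# result == {"age_group": "30 or over", "gender": "male"}
```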

Deep learning allows us to exploit object detection in order to gather the most important features which define a particular face. This forms the basis of Sightcorp’s product DeepSight.

What are the applications of Computer Vision technology?

The applications of computer vision, both in the real world and in theory, are incredibly extensive. We will run through just a few examples of use cases here:

Computer Vision in Retail

There are a large number of ways in which computer vision can be used within retail. Five main applications are customer demographic analysis, footfall counting, customer mood analysis, inventory management and self-service checkouts. Computer vision, which powers face analysis technology, can be used to detect and understand customer mood, as well as to analyze customers’ demographic breakdown. This helps retailers to understand customer satisfaction levels, create more engaging and targeted in-store digital marketing and understand who their main customer is. Additionally, footfall data helps retailers to understand footfall trends, establish store peak times and compare traffic data with PoS data. Within inventory management, computer vision analyzes visual data to notify staff when products are out of stock. In recognizing objects, computer vision can also speed up the process of self-service checkouts. Facial recognition at a self-service checkout also offers a number of advertising opportunities based on age and gender recognition.
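Once a footfall counter has recorded when each visitor entered, finding the store's peak time is a small aggregation step. The sketch below assumes the entry times are already available as timestamps; the sample data is invented.

```python
from collections import Counter
from datetime import datetime


def peak_hour(entry_times):
    """Return (hour, visitor_count) for the busiest hour of the day."""
    counts = Counter(t.hour for t in entry_times)
    hour, visitors = counts.most_common(1)[0]
    return hour, visitors


# Hypothetical entry timestamps produced by a footfall counter.
entries = [
    datetime(2024, 1, 6, 11, 5),
    datetime(2024, 1, 6, 13, 10),
    datetime(2024, 1, 6, 13, 40),
    datetime(2024, 1, 6, 13, 55),
    datetime(2024, 1, 6, 17, 20),
]
busiest = peak_hour(entries)  # (13, 3)
```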

Computer Vision in Digital Signage

Computer vision is empowering the digital out-of-home (DOOH) landscape by providing audience analytics and screen analytics data. This includes age and gender insights, mood insights, dwell time, attention time and much more. The data that media owners and media buyers can collect through face analysis technologies powered by computer vision is invaluable. For example, media buyers can understand campaign performance and play targeted content according to the audience’s demographic breakdown in real-time. Media owners, on the other hand, can discover the true value of their networks using audience impression data.

Computer Vision in Healthcare

Healthcare organizations are already using computer vision to accelerate medical image processing. Computer vision technology can utilize deep learning to recognize patterns of certain illnesses and to diagnose them.

Computer Vision in Surveillance

In forming the basis of facial recognition software, computer vision has a number of potential use cases. Perhaps one of the most significant of these is in boosting surveillance and security systems. In crowded places where it is not practical to station enough security guards, facial recognition software can pick up on individuals who are on a registered criminal database.

