What is Computer Vision?

Computer vision is one of the core areas of artificial intelligence (AI). It focuses on creating solutions that enable AI-enabled applications to “see” the world and make sense of it.

Computers don’t have eyes that work the way ours do, but they can process images, either from a live camera feed or from digital photographs. This ability to process images is the key to creating software that can emulate human visual perception.


In this context, understanding means transforming visual images into descriptions of the world that make sense to thought processes and can prompt appropriate action.

This image understanding can be seen as the disentangling of information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory. The scientific discipline of computer vision is concerned with the theory behind artificial systems that extract information from images.


To an AI application, an image is just an array of pixel values. These numeric values can be used as features to train machine learning models that make predictions about the image and its contents.
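To make the idea of an image as an array of pixel values concrete, here is a minimal sketch in plain Python. The tiny 3x3 grayscale image and the helper name `flatten_pixels` are illustrative assumptions, not part of any particular library; real applications would typically load images with a dedicated imaging library and store them in an array type.

```python
# A tiny 3x3 grayscale "image": each value is a pixel intensity
# (0 = black, 255 = white). The values are made up for illustration.
image = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]


def flatten_pixels(img):
    """Turn a 2D grid of pixels into a flat feature vector scaled to [0, 1]."""
    return [pixel / 255 for row in img for pixel in row]


features = flatten_pixels(image)
height, width = len(image), len(image[0])
print(f"{height}x{width} image -> {len(features)} features")
```

A flat list of scaled values like `features` is the kind of input a machine learning model could be trained on.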

The image data can take many forms, such as video sequences, multiple camera views, or multi-dimensional data from a medical scanner. As a technological discipline, computer vision seeks to apply its theories and models to the construction of computer vision systems.


Traffic being monitored through computer vision
  • Image Classification:

Image classification involves training a machine learning model to classify images based on their contents. For example, in a traffic monitoring solution you might use an image classification model to classify images based on the type of vehicle they contain, such as taxis, buses, cyclists, and so on.
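As a rough illustration of training a model on pixel values, the sketch below uses a nearest-centroid classifier, chosen here only because it is simple enough to write in plain Python; it is not the method a production vision system would normally use. The 2x2 images, the labels "bright" and "dark", and the function names are all invented for the example.

```python
# Toy training data: flattened 2x2 grayscale images (pixel values 0-255)
# with labels. Both the pixel values and the labels are invented.
training_data = [
    ([250, 240, 245, 255], "bright"),
    ([230, 250, 235, 240], "bright"),
    ([10, 20, 5, 15], "dark"),
    ([30, 10, 25, 20], "dark"),
]


def train_centroids(samples):
    """Compute the mean pixel vector (centroid) for each label."""
    sums, counts = {}, {}
    for pixels, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(pixels, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((p - c) ** 2 for p, c in zip(pixels, centroid))

    return min(centroids, key=lambda label: distance(centroids[label]))


centroids = train_centroids(training_data)
print(classify([200, 220, 210, 230], centroids))  # a mostly light image
print(classify([40, 35, 20, 10], centroids))      # a mostly dark image
```

The same pattern, learning from labeled examples and then predicting a label for a new image, applies when the classes are vehicle types rather than brightness levels, although real models learn far richer features than raw pixel averages.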

  • Object Detection:
Detection of Different Objects

Object detection models are trained to classify individual objects within an image and to identify their location with a bounding box. For example, a traffic monitoring solution might use object detection to identify the location of different classes of vehicle.
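The output of an object detection model can be pictured as a list of labeled bounding boxes. The sketch below is one possible way to represent such results; the `Detection` class, its field names, the confidence scores, and the 0.5 threshold are assumptions made for illustration and do not correspond to any specific library's output format.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # predicted class, e.g. "bus"
    score: float  # model confidence between 0 and 1 (assumed field)
    x: int        # left edge of the bounding box, in pixels
    y: int        # top edge of the bounding box, in pixels
    width: int
    height: int

    def center(self):
        """Return the (x, y) center point of the bounding box."""
        return (self.x + self.width / 2, self.y + self.height / 2)


# Hypothetical detections for a single traffic image.
detections = [
    Detection("bus", 0.91, x=40, y=60, width=200, height=120),
    Detection("taxi", 0.78, x=300, y=150, width=90, height=60),
    Detection("cyclist", 0.42, x=420, y=170, width=30, height=70),
]


def confident(dets, threshold=0.5):
    """Keep only detections whose score meets the threshold."""
    return [d for d in dets if d.score >= threshold]


for d in confident(detections):
    print(d.label, d.center())
```

Unlike image classification, which yields one label per image, each entry here carries both a class and a location.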

  • Semantic Segmentation:

Semantic segmentation is an advanced machine learning technique in which individual pixels in the image are classified according to the object to which they belong. For example, a traffic monitoring solution might overlay traffic images with “mask” layers that highlight different vehicles using specific colors.
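The per-pixel nature of semantic segmentation can be sketched with a small grid of class ids and a color lookup. The class ids, the colors, and the helper names `colorize` and `pixel_counts` below are illustrative assumptions; a real model would produce a mask with the same dimensions as the full input image.

```python
# Per-pixel class labels for a tiny 3x4 image, as a segmentation model
# might output. 0 = background, 1 = car, 2 = bus (ids are invented).
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
]

# Hypothetical display color for each class, as (red, green, blue).
CLASS_COLORS = {
    0: (0, 0, 0),      # background: black
    1: (255, 0, 0),    # car: red
    2: (0, 0, 255),    # bus: blue
}


def colorize(mask, colors):
    """Replace each class id with its color to produce an RGB overlay."""
    return [[colors[class_id] for class_id in row] for row in mask]


def pixel_counts(mask):
    """Count how many pixels belong to each class."""
    counts = {}
    for row in mask:
        for class_id in row:
            counts[class_id] = counts.get(class_id, 0) + 1
    return counts


overlay = colorize(mask, CLASS_COLORS)
print(overlay[1][3])
print(pixel_counts(mask))
```

Blending an overlay like this with the original image is one way to produce the colored highlights described above.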

  • Image Analysis:

You can create solutions that combine machine learning models with advanced image analysis techniques to extract information from images. That includes “tags” that could help catalog the image or even descriptive captions that summarize the scene shown in the image.

  • Face detection, analysis, and recognition:
Multiple Face Detection

Face detection is a specialized form of object detection that locates faces in an image. It can be combined with classification and facial geometry analysis techniques to infer details such as age and emotional state, and even to recognize individuals based on their facial features.

  • Optical character recognition (OCR):

OCR is a technique for detecting and reading text in images. You can use OCR to read text in photographs or to extract information from scanned documents such as letters, invoices, or forms.