Image and Video Processing with OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source library for computer vision and machine learning. It provides a common infrastructure for computer vision applications and is designed to speed up the adoption of machine perception in commercial products.

Setting Up OpenCV

To use OpenCV with Python, you need to install the OpenCV package:

pip install opencv-python

Basic Image Processing

Reading, Displaying, and Writing Images

Here’s how you can read, display, and save an image:

import cv2

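# Read the image (cv2.imread returns None if the file is missing or unreadable)
image = cv2.imread('path_to_image.jpg')
if image is None:
    raise FileNotFoundError('Could not read path_to_image.jpg')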

# Display the image
cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Save the image
cv2.imwrite('output.jpg', image)

Basic Operations on Images

Operations such as cropping, resizing, and rotating can be performed using OpenCV functions:

import cv2

# Assume 'image' has already been loaded with cv2.imread

# Cropping (NumPy slicing: rows first, then columns)
x, y, crop_w, crop_h = 100, 50, 300, 200 # Example values
cropped_image = image[y:y+crop_h, x:x+crop_w]

# Resizing
new_width, new_height = 640, 480
resized_image = cv2.resize(image, (new_width, new_height))

# Rotating about the center (the output keeps the original size, so corners are clipped)
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
angle = 45 # Degrees
scale = 1.0
matrix = cv2.getRotationMatrix2D(center, angle, scale)
rotated_image = cv2.warpAffine(image, matrix, (w, h))

Basic Video Processing

Processing video frames using OpenCV involves capturing video and iterating over each frame:

import cv2

# Capture video from a file or camera
cap = cv2.VideoCapture('path_to_video.mp4') # Use 0 for webcam

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Process the frame
    # For example, display it
    cv2.imshow('Frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Processing Images and Videos Frame by Frame

Frame-by-frame processing is the basis of most video analysis. This section shows how to read, transform, and save video frames with OpenCV.

Reading and Processing Video Frames

To process a video, you need to capture it using OpenCV’s VideoCapture, and then iterate over each frame. Here’s an example:

import cv2

# Capture the video
cap = cv2.VideoCapture('path_to_video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Process each frame
    # For example, convert to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('Frame', gray_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

This script reads a video file, converts each frame to grayscale, and displays it. The loop breaks when the video ends or the user presses ‘q’.

Saving Processed Frames

You can also save the processed frames to create a new video. Here’s how to do it:

import cv2

# Capture the video
cap = cv2.VideoCapture('path_to_video.mp4')

# Get frame dimensions
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

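# Match the source frame rate; fall back to 20 FPS if the file does not report one
fps = cap.get(cv2.CAP_PROP_FPS) or 20.0

# Define the codec and create the VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID') # You can use other codecs as well
# isColor=False because single-channel (grayscale) frames are written
out = cv2.VideoWriter('output.avi', fourcc, fps, (frame_width, frame_height), isColor=False)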

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Process the frame (convert to grayscale)
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Write the frame to the output video
    out.write(gray_frame)
    cv2.imshow('Frame', gray_frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()

This code captures video from a file, processes each frame by converting it to grayscale, and saves the processed frames into a new video file called 'output.avi'.

Additional Image Processing Techniques

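OpenCV also provides functions for drawing on images and for analyzing them. The examples below assume that 'image' already holds an image loaded with cv2.imread.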

Drawing Shapes

import cv2

# Draw a rectangle
start_point = (50, 50)
end_point = (200, 200)
color = (0, 255, 0) # Green color in BGR
thickness = 2
image = cv2.rectangle(image, start_point, end_point, color, thickness)

# Draw a circle
center_coordinates = (120, 120)
radius = 50
color = (255, 0, 0) # Blue color in BGR
thickness = -1 # Fill the circle
image = cv2.circle(image, center_coordinates, radius, color, thickness)

Adding Text

import cv2

# Add text to the image
font = cv2.FONT_HERSHEY_SIMPLEX
org = (50, 50) # Bottom-left corner of the text, as (x, y)
font_scale = 1
color = (0, 0, 255) # Red color in BGR
thickness = 2
image = cv2.putText(image, 'OpenCV', org, font, font_scale, color, thickness, cv2.LINE_AA)

Edge Detection

import cv2

# Convert to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

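# Apply Canny edge detection: gradients above threshold2 are edges, and gradients
# between the two thresholds are kept only if they connect to such an edge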
edges = cv2.Canny(gray_image, threshold1=100, threshold2=200)

# Display edges
cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

Remember to always release resources and close windows when you’re done processing:

# Release the video capture object
cap.release()

# Close all OpenCV windows
cv2.destroyAllWindows()