YOLO Image Detection

Object detection is a common task in computer vision, and many algorithms have been developed for it. One of the most popular is YOLO (You Only Look Once), a real-time object detection system originally developed by Joseph Redmon, Ali Farhadi, and their co-authors. In this blog post, we will explore how to run a pretrained YOLO model on an image in Python.

What is YOLO?

YOLO is an object detection system based on a convolutional neural network (CNN). It works by dividing the input image into a grid of cells; each cell predicts bounding boxes, confidence scores, and class probabilities for objects whose centers fall inside it. Because a single forward pass of one network produces the class and location of every object in the image, YOLO can process images in real time, making it suitable for applications where speed is a priority.
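
To make the grid idea concrete, here is a minimal sketch of how a box center maps to the cell responsible for it. The function name responsible_cell is made up for this post, and the 13 x 13 grid (a 416 x 416 input divided by a stride of 32) is an illustrative assumption rather than a fixed property of every YOLO model.

```python
def responsible_cell(center_x, center_y, image_w, image_h, grid_size=13):
    """Return (row, col) of the grid cell containing a box center.

    Hypothetical helper for illustration; coordinates are in pixels.
    """
    col = int(center_x / image_w * grid_size)
    row = int(center_y / image_h * grid_size)
    # Clamp so a center exactly on the right/bottom edge stays in the grid
    return min(row, grid_size - 1), min(col, grid_size - 1)


# A center at (208, 104) in a 416 x 416 image
print(responsible_cell(208, 104, 416, 416))  # prints (3, 6)
```

The cell in row 3, column 6 would be the one expected to predict that object.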

How to use YOLO for image detection in Python?

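To use YOLO for image detection in Python, we need to install the following libraries:

  • NumPy: a library for scientific computing with Python.
  • OpenCV: a library for computer vision tasks. Its dnn module can load networks stored in Darknet's format.
  • imutils: a library of convenience functions for image processing in Python (optional; the code below does not use it).

Darknet, the neural network framework written in C and CUDA in which YOLO was originally implemented, does not need to be installed: OpenCV reads Darknet's configuration and weights files directly, and Darknet itself is built from source rather than installed with pip.

We can install these libraries using the following command:

pip install numpy opencv-python imutils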
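
A quick way to confirm the installation worked is to check that the modules can be found. This is a small sketch using only the standard library; the helper name find_missing is made up for this post. Note that OpenCV is imported under the name cv2.

```python
import importlib.util


def find_missing(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing(["numpy", "cv2"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```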

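Once we have these libraries installed, we can start using YOLO for image detection in Python. The first step is to download the YOLOv3 weights file (yolov3.weights) and configuration file (yolov3.cfg) from the official website, along with the file listing the class names the model was trained on (coco.names for the standard COCO-trained model). We can then use these files to initialize the YOLO detector.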

Here is an example of how to do this in Python:

import cv2
import numpy as np

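# Load the class labels the model was trained on (assumed here to be
# in coco.names, one label per line) and pick a color for each class
with open("coco.names") as f:
    LABELS = f.read().strip().split("\n")
np.random.seed(42)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3), dtype="uint8")

# Load YOLO weights and configuration file
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")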

# Load image
image = cv2.imread("image.jpg")

# Get image dimensions
(H, W) = image.shape[:2]

# Determine only the *output* layer names that we need from YOLO
# (this call works whether OpenCV reports layer indices as a flat or
# a nested array, which differs between versions)
ln = net.getUnconnectedOutLayersNames()

# Construct a blob from the input image and then perform a forward
# pass of the YOLO object detector, giving us our bounding boxes and
# associated probabilities
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                             swapRB=True, crop=False)
net.setInput(blob)
layerOutputs = net.forward(ln)

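Once we have the output from YOLO, we can turn it into detections. Each row of each output layer describes one candidate box: the first four values are the box center, width, and height relative to the image size, the fifth is an objectness score, and the remaining values are per-class scores. We loop over these rows, keep the confident ones, remove overlapping duplicates, and then draw the surviving boxes on the image, labeled with their class names and confidence scores.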
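
The layout of a single detection row is easiest to see on a hand-made example. The sketch below uses plain Python lists and invented numbers (three classes instead of the real label set), and the helper name decode_detection is hypothetical. It mirrors the scaling and corner arithmetic used in the loop that follows.

```python
def decode_detection(detection, image_w, image_h):
    """Convert one YOLO output row into (x, y, w, h, class_id, confidence).

    The row is assumed to be [center_x, center_y, width, height,
    objectness, class scores...], with box values relative to the image.
    """
    scores = detection[5:]
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]

    # Scale the relative box values to pixels
    center_x = int(detection[0] * image_w)
    center_y = int(detection[1] * image_h)
    width = int(detection[2] * image_w)
    height = int(detection[3] * image_h)

    # Convert the center to the top-left corner
    x = int(center_x - width / 2)
    y = int(center_y - height / 2)
    return x, y, width, height, class_id, confidence


# Invented row: box centered in the image, half as wide and half as tall
row = [0.5, 0.5, 0.5, 0.5, 0.9, 0.1, 0.8, 0.05]
print(decode_detection(row, 640, 480))  # prints (160, 120, 320, 240, 1, 0.8)
```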

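# Initialize our lists of detected bounding boxes, confidences, and
# class IDs
boxes = []
confidences = []
classIDs = []

# Loop over each of the layer outputs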
for output in layerOutputs:
    # Loop over each of the detections
    for detection in output:
        # Extract the class ID and confidence (i.e., probability) of
        # the current object detection
        scores = detection[5:]
        classID = np.argmax(scores)
        confidence = scores[classID]

        # Filter out weak detections by ensuring the detected
        # probability is greater than the minimum probability
        if confidence > 0.5:
            # Scale the bounding box coordinates back relative to the
            # size of the image, keeping in mind that YOLO actually
            # returns the center (x, y)-coordinates of the bounding
            # box followed by the boxes' width and height
            box = detection[0:4] * np.array([W, H, W, H])
            (centerX, centerY, width, height) = box.astype("int")

            # Use the center (x, y)-coordinates to derive the top-left
            # corner of the bounding box
            x = int(centerX - (width / 2))
            y = int(centerY - (height / 2))

            # Update our list of bounding box coordinates, confidences,
            # and class IDs
            boxes.append([x, y, int(width), int(height)])
            confidences.append(float(confidence))
            classIDs.append(classID)

# Apply non-maxima suppression to suppress weak, overlapping bounding
# boxes (0.5 is the minimum confidence, 0.3 is the overlap threshold)
idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.3)

# Ensure at least one detection exists
if len(idxs) > 0:
    # Loop over the indexes we are keeping
    for i in idxs.flatten():
        # Extract the bounding box coordinates
        (x, y) = (boxes[i][0], boxes[i][1])
        (w, h) = (boxes[i][2], boxes[i][3])

        # Draw the bounding box on the image
        color = [int(c) for c in COLORS[classIDs[i]]]
        cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
        text = "{}: {:.4f}".format(LABELS[classIDs[i]], confidences[i])
        cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
            0.5, color, 2)

# Show the output image
cv2.imshow("Image", image)
cv2.waitKey(0)

This code runs YOLO on the input image, draws a bounding box around each detected object, and labels each box with its class name and confidence score.
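
For readers curious about what non-maxima suppression does, here is a simplified greedy version in plain Python. It is a sketch of the general idea, not OpenCV's implementation of cv2.dnn.NMSBoxes, and the function names iou and greedy_nms are made up for this post. Boxes use the same [x, y, width, height] format as the code above.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, confidences, score_threshold, nms_threshold):
    """Return indices of boxes kept after greedy non-maxima suppression."""
    # Consider only boxes above the score threshold, most confident first
    order = sorted(
        (i for i, c in enumerate(confidences) if c > score_threshold),
        key=lambda i: confidences[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


# Two nearly identical boxes and one box far away (invented values)
boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [300, 300, 50, 50]]
confidences = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, confidences, 0.5, 0.3))  # prints [0, 2]
```

The second box overlaps the first almost completely and has a lower confidence, so it is dropped, while the distant third box is kept.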