---
license: mit
language:
- en
base_model:
- jameslahm/yolov10n
---

# Document Layout Detection

This script demonstrates how to run the document layout detection model on an image. The steps and code are below.

---

## Step 1: Import Required Libraries
```python
import cv2
import matplotlib.pyplot as plt
import numpy as np
from ultralytics import YOLO
from google.colab.patches import cv2_imshow
```

- **cv2**: For image processing.
- **matplotlib.pyplot**: For plotting, if needed.
- **numpy**: For numerical operations.
- **YOLO**: The `ultralytics` class used to load and run the YOLOv10 model.
- **cv2_imshow**: For displaying images in Google Colab.

---

## Step 2: Load YOLOv10 Model
```python
model = YOLO('vprashant/doclayout_detector/weights')
```

- Load the YOLOv10 model by pointing `YOLO()` at the path of your trained weights (typically a `.pt` checkpoint).

---

## Step 3: Read and Prepare the Image
```python
img = cv2.imread('/content/yolov10/dataset/train/images/11.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
```

- Read the image from the specified path.
- Convert the image from BGR to RGB color order.
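
The BGR-to-RGB conversion simply reverses the channel axis, so a tiny NumPy sketch (using a synthetic 1x2 image, no OpenCV required) can illustrate what `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` does:

```python
import numpy as np

# Synthetic 1x2 "image" in BGR order: one blue pixel, one red pixel.
bgr = np.array([[[255, 0, 0],    # pure blue in BGR
                 [0, 0, 255]]],  # pure red in BGR
               dtype=np.uint8)

# BGR -> RGB is just a reversal of the last (channel) axis.
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # [  0   0 255] -> blue value moved to the last channel
print(rgb[0, 1])  # [255   0   0] -> red value moved to the first channel
```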

---

## Step 4: Perform Object Detection
```python
results = model(img)
```

- Run the YOLOv10 model on the image to get detection results.

---

## Step 5: Extract and Process Detection Results
```python
results = results[0]
boxes = results.boxes
data = boxes.data.cpu().numpy()
```

- Extract the first result (the result for this image).
- Access the detected bounding boxes.
- Move the detection tensor to the CPU and convert it to a NumPy array for processing.
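
Each row of the resulting array is one detection in the form `[x1, y1, x2, y2, conf, cls_id]`. A small sketch with a synthetic array (hypothetical values, no model required) shows how the rows unpack:

```python
import numpy as np

# Hypothetical detections: two boxes with confidence scores and class ids.
data = np.array([
    [34.0,  50.0, 410.0, 120.0, 0.91, 0.0],   # x1, y1, x2, y2, conf, cls_id
    [60.0, 300.0, 380.0, 560.0, 0.78, 2.0],
])

for x1, y1, x2, y2, conf, cls_id in data:
    print(f"class {int(cls_id)} at ({int(x1)}, {int(y1)})-({int(x2)}, {int(y2)}), conf {conf:.2f}")
```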

---

## Step 6: Visualize Results
```python
for detection in data:
    x1, y1, x2, y2, conf, cls_id = detection
    x1, y1, x2, y2 = map(int, [x1, y1, x2, y2])             # Convert coordinates to integers
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)  # Draw bounding box

    class_name = model.names[int(cls_id)]                   # Get class name
    label = f"{class_name}: {conf:.2f}"                     # Label with confidence score
    cv2.putText(img, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX,
                0.9, (0, 255, 0), 2)                        # Add text label
```

- Loop through all detections.
- Draw bounding boxes and class labels on the image.
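
Under the hood, `cv2.rectangle` just writes the box's border pixels into the image array. A pure-NumPy sketch of the same idea (a one-pixel green border on a blank canvas, with hypothetical box corners, no OpenCV needed):

```python
import numpy as np

img = np.zeros((100, 100, 3), dtype=np.uint8)  # blank black canvas
x1, y1, x2, y2 = 20, 30, 80, 70                # hypothetical box corners
green = (0, 255, 0)

# Paint the four edges of the box, like cv2.rectangle with thickness=1.
img[y1, x1:x2 + 1] = green  # top edge
img[y2, x1:x2 + 1] = green  # bottom edge
img[y1:y2 + 1, x1] = green  # left edge
img[y1:y2 + 1, x2] = green  # right edge

print(img[y1, x1])                       # [  0 255   0] -> border pixel is green
print(img[(y1 + y2) // 2, (x1 + x2) // 2])  # [0 0 0] -> interior untouched
```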

---

## Step 7: Display the Processed Image
```python
cv2_imshow(img)
cv2.waitKey(0)           # Only needed for local cv2.imshow windows
cv2.destroyAllWindows()  # No-op in Colab
```

- Display the image with detections in Google Colab using `cv2_imshow`.
- `cv2.waitKey` and `cv2.destroyAllWindows` matter only when displaying with `cv2.imshow` locally; they are harmless in Colab.
- Because the image was converted to RGB in Step 3 while OpenCV display functions expect BGR, red and blue will appear swapped; convert back with `cv2.cvtColor(img, cv2.COLOR_RGB2BGR)` first if the colors matter.

---

## Note
- Ensure the trained YOLOv10 weights and the dataset exist at the specified paths, and replace them with your own local or Colab paths.
- Install the required libraries if they are not already available: `pip install ultralytics opencv-python matplotlib`.