Agriculture Pest Detection
by Oscar Mendoza


Model Description:
The model is a computer vision model that uses object detection to identify common agricultural pests in crop images.
It was trained to detect and classify six insect classes: ants, aphids, beetles, caterpillars, grasshoppers, and weevils.
The model was trained with YOLOv11, chosen for its fast and accurate object detection.
It aims to support early pest detection in agriculture using images captured in field conditions, with natural backgrounds such as plants, soil, and foliage.


Training Data:
The dataset used is called Pest Detection Computer Vision Model:
https://universe.roboflow.com/sams-sgift/pest-detection-qbalv
The dataset originally contained 7.3k images, which I narrowed down to 2.5k images.
The classes are fairly balanced, with aphids being the only class outside the 400s for image count.
Below are the image counts per class:
| Class       | Image Count |
|-------------|-------------|
| Ant         | 497         |
| Aphid       | 283         |
| Beetle      | 413         |
| Caterpillar | 409         |
| Grasshopper | 483         |
| Weevil      | 480         |


Classes were reduced from 19 to 6 for better focus; classes were selected based on how commonly they are found in crops and their relevance to agriculture.

Annotation Process:
The dataset was reviewed and cleaned using Roboflow. I reviewed approximately 50 images for annotation quality, making corrections such as fixing misidentified labels and inaccurate bounding boxes and removing images that didn't match any of the classes.
The data was split as follows:
Train: 70%
Validation: 20%
Test: 10%


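The split above can be sketched in a few lines. This is a minimal illustration, not the actual mechanism used (Roboflow performed the real split); the function name and seed are invented for the example.

```python
import random

def split_dataset(image_ids, seed=0):
    """Shuffle image IDs and split them 70/20/10 into train/val/test."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train = int(n * 0.7)
    n_val = int(n * 0.2)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# With the 2.5k images used here, this yields 1750/500/250 images.
train, val, test = split_dataset(range(2500))
print(len(train), len(val), len(test))  # 1750 500 250
```
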
No data augmentation was applied.
Link to my annotated, refined dataset:
https://app.roboflow.com/oscars-space-teemy/pest-detection-qbalv-a6hpf/models/pest-detection-qbalv-a6hpf/3
Limitations in my dataset:
Some insects, such as aphids, are small and difficult to detect, and natural backgrounds can make insects harder to find because they blend into their environment. The dataset also may not represent all agricultural environments.


Training Procedure:
Framework: Ultralytics YOLOv11
Hardware: Google Colab with an A100 GPU runtime
Epochs: 50
Image size: 640x640


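A training run like this needs an Ultralytics dataset YAML describing the splits and class names. The sketch below is a hypothetical `data.yaml` for the six classes; the paths are placeholders, and the class index order must match whatever Roboflow exported.

```yaml
# data.yaml — illustrative paths; class order must match the exported annotations
path: pest-detection        # dataset root
train: train/images
val: valid/images
test: test/images
names:
  0: ant
  1: aphid
  2: beetle
  3: caterpillar
  4: grasshopper
  5: weevil
```

With this file in place, a 50-epoch run at 640x640 can be launched with the Ultralytics CLI, e.g. `yolo detect train data=data.yaml model=yolo11n.pt epochs=50 imgsz=640` (the `yolo11n.pt` starting weights are an assumption; any YOLOv11 variant works here).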
Evaluation Results:
| Metric    | Value |
|-----------|-------|
| mAP50     | 0.676 |
| mAP50-95  | 0.345 |
| Precision | 0.726 |
| Recall    | 0.659 |


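One number the table doesn't report is the F1 score, the harmonic mean of precision and recall, which summarizes the precision/recall trade-off in a single value:

```python
# F1 = harmonic mean of precision and recall, using the values from the table.
precision, recall = 0.726, 0.659
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.691
```
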
Per-Class Performance (Average Precision):
| Class       | Average Precision (AP) |
|-------------|------------------------|
| Ant         | 0.263                  |
| Aphid       | 0.176                  |
| Beetle      | 0.370                  |
| Caterpillar | 0.172                  |
| Grasshopper | 0.420                  |
| Weevil      | 0.675                  |


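As a sanity check, the mean of these per-class values lands at about 0.346, essentially the overall mAP50-95 (0.345) from the table above. This suggests (though I am inferring it, not stating it from the training logs) that these per-class numbers are AP averaged over IoU 0.50-0.95 rather than AP at IoU 0.50:

```python
# Mean of the per-class AP values from the table above.
ap = {"ant": 0.263, "aphid": 0.176, "beetle": 0.370,
      "caterpillar": 0.172, "grasshopper": 0.420, "weevil": 0.675}
mean_ap = sum(ap.values()) / len(ap)
print(round(mean_ap, 3))  # 0.346
```
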






In my results image, you can see that the training and validation loss curves decrease over time, meaning the model is successfully learning and improving its predictions across epochs. Precision and recall both increase steadily through training, meaning most predictions are correct and the model detects a majority, but not all, of the pests. Overall, the trends show the model is learning effectively, but there are limitations, especially when it comes to detecting smaller objects.


|  |
|
|
My confusion matrix shows a strong diagonal, meaning most objects are detected and classified correctly, but we also see that some of the pests are confused with the background.


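In a detection confusion matrix, the extra "background" column is where missed detections accumulate: a pest the model never boxed counts as true-class-vs-background. The toy sketch below (the counts are invented, not the model's actual numbers) shows how misses show up in that column and drag per-class recall down:

```python
# Toy detection confusion matrix: rows are true classes, columns are
# predictions, and "background" counts pests the detector missed entirely.
# Counts are illustrative only, not taken from the trained model.
cm = {
    "weevil": {"weevil": 90, "aphid": 2, "background": 8},
    "aphid":  {"weevil": 1, "aphid": 55, "background": 44},
}
for true_cls, row in cm.items():
    recall = row[true_cls] / sum(row.values())
    print(f"{true_cls}: recall={recall:.2f}, missed={row['background']}")
```
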
|  |
|
|
My F1-Confidence Curve shows how the model's F1 score changes with different confidence thresholds for each class. We see weevil performing the best, maintaining a high F1 score (around 0.9) across a wide range of thresholds, while aphid and caterpillar perform the worst, with lower F1 scores, meaning there is some difficulty detecting these two classes.


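The mechanics behind such a curve are simple: sweep the confidence threshold, keep only detections above it, and recompute precision and recall each time. The sketch below uses a handful of invented (confidence, is-true-positive) detections to show why F1 typically peaks at an intermediate threshold:

```python
# Toy detections as (confidence, is_true_positive) pairs — invented data,
# not the model's real predictions.
dets = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.3, False)]
n_ground_truth = 4  # total pests actually present

def f1_at(threshold):
    """F1 score when only detections at or above `threshold` are kept."""
    kept = [tp for conf, tp in dets if conf >= threshold]
    if not kept:
        return 0.0
    precision = sum(kept) / len(kept)       # fraction of kept boxes that are correct
    recall = sum(kept) / n_ground_truth     # fraction of real pests found
    return 2 * precision * recall / (precision + recall)

for t in (0.25, 0.5, 0.75):
    print(t, round(f1_at(t), 2))  # peaks at the middle threshold
```

A low threshold admits false positives (hurting precision); a high one discards real pests (hurting recall); the best operating point sits in between, which is exactly what the per-class curves visualize.
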
Performance Analysis:
Overall, the model performs reasonably across most classes, the best being weevil with 0.675 AP and the lowest being caterpillar at 0.172 AP. Some issues I saw when looking at the performance: small insects like aphids were frequently missed, some insects blend into natural backgrounds, and there is visual similarity between weevils and beetles. The recall score of 65.9% means pests are being detected, but a significant portion are still missed, which would not be acceptable for a standalone agricultural detection system.


Limitations and Biases:
The worst-performing classes are aphids and caterpillars. Since aphids are typically green, the model struggles to identify them against green backgrounds. Caterpillars vary widely in appearance depending on their environment, which makes them harder for the model to detect. As for dataset bias, the dataset is not as diverse as it could be: most photos have a green natural background rather than a clear top-down view of the insect against a plain background, which would be easier to detect against. The model should not be used as a fully automated pest detection system, since it cannot reliably identify every pest. It is also not suitable for any use without human supervision, since it is likely to misclassify or miss some insects.