
Brain Tumor Detection using Computer Vision - YOLOv8 Model

What is computer vision?

Computer vision is an interdisciplinary field of AI that enables computers to interpret and understand visual information from digital images or videos. It involves teaching machines to capture and process visuals, analyze and interpret the data, and make predictions or decisions based on it. Many industries are adopting computer vision because it can automate, enhance, and scale visual tasks that were traditionally done manually, with high speed, accuracy, and efficiency. Some of the most common computer vision tasks are image classification, object detection, image segmentation, pose estimation, Optical Character Recognition (OCR), and facial recognition.

Computer vision in Medical Imaging

In the field of medical diagnostics, applying computer vision to medical imaging enhances the accuracy, efficiency, and accessibility of diagnostic processes while reducing human error. Medical images such as X-rays, MRIs, CT scans, and ultrasounds contain vast amounts of visual data that can be time-consuming and complex for human experts to analyze. Computer vision algorithms can automatically detect patterns, anomalies, and structures such as tumors, fractures, or lesions with high precision and consistency.

Additionally, it enables early detection of diseases, assists in treatment planning, and facilitates remote diagnostics through telemedicine. By transforming raw medical images into actionable insights, computer vision is becoming a critical tool in modern healthcare.

Problem Statement

Despite its potential, applying computer vision in medical imaging presents several challenges.

  • Data availability and annotation: Medical datasets are often limited due to privacy concerns, and labeling them requires expert knowledge.
  • High accuracy requirements: Errors in medical diagnosis can have serious consequences, necessitating extremely high model reliability.
  • Regulatory compliance: Medical AI systems must adhere to strict regulatory standards (e.g., FDA approval), which can slow down deployment.
  • Interpretability: Clinicians require transparent and explainable AI models to trust and validate automated decisions.

It is therefore essential to use a high-performance, high-accuracy framework like YOLOv8, a real-time object detection and instance segmentation model that can overcome the above challenges.

YOLOv8 (You Only Look Once version 8) Model

YOLOv8 is a state-of-the-art deep learning model developed by Ultralytics for real-time object detection, instance segmentation, image classification, and pose estimation. It addresses several challenges in medical imaging applications by offering a unified, high-performance framework for object detection and segmentation. YOLOv8 is built on a modular and scalable architecture, supports transfer learning, and is optimized for deployment on various hardware platforms.

Its ability to train on relatively small datasets using transfer learning helps mitigate the issue of limited annotated medical data. YOLOv8's real-time detection and segmentation capabilities provide the high accuracy and speed that are critical in medical imaging applications, where precise localization of anomalies is crucial for accurate diagnosis. Additionally, its modular and open-source design allows for easier integration with explainability tools, helping clinicians interpret model outputs.

Functionality of the YOLOv8

YOLO is a single-stage object detection model: it performs object localization and classification in a single forward pass through a neural network. It divides the input image into an N×N grid, where each grid cell (also referred to as an anchor) acts as a classifier responsible for predicting K bounding boxes around potential objects whose ground-truth centers fall within that cell. Each anchor also predicts a confidence score indicating the likelihood that an object is present, along with class labels for the detected object. The model then combines the bounding box coordinates, confidence scores, and class probabilities to generate the final detections.

A bounding box in computer vision is a rectangular frame used to visually mark the location of an object within an image. It can be defined by the coordinates of the top-left and bottom-right corners, or by the center point along with width and height. YOLOv8 predicts the bounding box in the format [x_center, y_center, width, height], normalized between 0 and 1 relative to the image dimensions. To refine the results, Non-Maximum Suppression (NMS) is applied during post-processing to eliminate overlapping boxes and retain only the most confident predictions.
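The conversion between YOLO's normalized center format and pixel corner coordinates can be sketched in a few lines (the function name here is illustrative, not part of the project code):

```python
def yolo_to_corners(box, img_w, img_h):
    """Convert a normalized YOLO box [x_center, y_center, width, height]
    to pixel corner coordinates (x1, y1, x2, y2)."""
    xc, yc, w, h = box
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return x1, y1, x2, y2

# A box centered in a 640x640 image, spanning half of each dimension:
print(yolo_to_corners([0.5, 0.5, 0.5, 0.5], 640, 640))  # (160.0, 160.0, 480.0, 480.0)
```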

Intersection over Union (IoU)

IoU is a key metric used to evaluate the accuracy of bounding box predictions in object detection. It plays a crucial role in the training, inference, and evaluation phases of YOLOv8. It is calculated as:

$$ \text{IoU} = \frac{\text{Area of Overlap}}{\text{Area of Union}} $$
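The formula above translates directly into code; a minimal sketch for boxes in corner format:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # overlap 50 / union 150 ≈ 0.333
```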

During training, IoU measures the overlap between the predicted bounding box and the ground-truth box, and IoU thresholding decides whether a predicted box is a positive match. Boxes below the IoU threshold are considered negatives. During inference, YOLOv8 often generates multiple overlapping boxes for the same object. To reduce redundancy, it uses Non-Maximum Suppression (NMS), which applies IoU thresholding to suppress boxes with high overlap and lower confidence scores, retaining only the most confident prediction.
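The NMS step described above can be sketched as a greedy loop; this is an illustrative reimplementation, not YOLOv8's internal (vectorized) version:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy Non-Maximum Suppression: keep the highest-scoring box, drop
    any remaining box that overlaps it above iou_thresh, then repeat.
    Boxes are (x1, y1, x2, y2); returns the indices of the kept boxes."""
    def iou(a, b):
        ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

# Two near-duplicate detections of one object, plus one separate box:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, 0.5))  # [0, 2] — the duplicate box 1 is suppressed
```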

mean Average Precision (mAP)

Intersection over Union (IoU) directly influences the mean Average Precision (mAP) score, which is a key metric used to evaluate the performance of object detection models like YOLOv8. mAP is calculated by averaging the precision values across different recall levels for each class, and it depends on whether predicted bounding boxes are considered correct detections, which is determined using IoU thresholding.

If the IoU between a predicted box and the actual object is above a certain threshold (usually 0.5), it's counted as a correct detection (true positive); otherwise, it's a false positive. Therefore, higher IoU values generally lead to better precision and recall, which in turn improve the mAP score. Additionally, mAP can be computed at multiple IoU thresholds, providing a more comprehensive view of the model's detection quality across varying levels of strictness.
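The true-positive/false-positive decision above feeds directly into precision. A minimal sketch, assuming each prediction has already been matched to its best ground-truth box:

```python
def precision_at_iou(pred_ious, thresh=0.5):
    """Given the IoU of each prediction with its best-matching ground-truth
    box, count true/false positives at the threshold and return precision."""
    tp = sum(1 for v in pred_ious if v >= thresh)
    fp = len(pred_ious) - tp
    return tp / (tp + fp) if pred_ious else 0.0

ious = [0.9, 0.6, 0.45, 0.3]         # four predictions
print(precision_at_iou(ious, 0.5))   # 2 TP, 2 FP -> 0.5
```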

Project Goal

The main goal of this work is to build a reliable and accurate model for detecting brain tumors in MRI images using the YOLOv8 algorithm. By training the model on a custom dataset, it aims to support early diagnosis and treatment planning, ultimately contributing to better patient outcomes.

Data

The dataset used for this project can be found via this link. The brain tumor MRI dataset was curated using Roboflow Universe, a community-driven platform for sharing computer vision datasets and models. The dataset comprises 3,903 MRI images categorized into four distinct classes:

  • Glioma: A tumor originating from glial cells in the brain.
  • Meningioma: Tumors arising from the meninges, the protective layers surrounding the brain and spinal cord.
  • Pituitary Tumor: Tumors arising from the pituitary gland, affecting hormonal balance.
  • No Tumor: MRI scans that do not exhibit any tumor presence.

Each image in the dataset is annotated with bounding boxes using Roboflow Annotate to indicate tumor locations, facilitating object detection tasks, and is then exported in a YOLOv8-compatible format. The dataset is structured into training (70%), validation (20%), and test (10%) sets, ensuring a robust framework for model development and evaluation.
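A YOLOv8-format Roboflow export is typically accompanied by a small data.yaml manifest describing the splits and classes. The paths and exact class names/order below are illustrative assumptions based on the dataset description, not the project's actual file:

```yaml
# Dataset manifest expected by YOLOv8 (paths and class names are assumptions)
train: ../train/images
val: ../valid/images
test: ../test/images

nc: 3
names: [glioma, meningioma, pituitary]
```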

Model Training

The pretrained YOLOv8 model was loaded using the official Ultralytics library. For this project, the training was configured to run for 50 epochs with a batch size of 16 images, and all input images were resized to 640×640 pixels to match the model’s expected input format. These parameters were chosen to balance training efficiency and model performance. The training outputs—including model weights, metrics, and visualizations—were automatically saved in the "runs/train" directory within the Google Colab environment, allowing for easy access and further evaluation.
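The training setup above can be sketched with the official Ultralytics API. The checkpoint name yolov8n.pt is an assumption (the README does not state which model size was used), and running this requires the ultralytics package plus the dataset YAML:

```python
# Training configuration described in this project.
TRAIN_ARGS = {"data": "data.yaml", "epochs": 50, "batch": 16, "imgsz": 640}

def train():
    # Imported lazily so the sketch can be read without ultralytics installed.
    from ultralytics import YOLO
    model = YOLO("yolov8n.pt")  # pretrained checkpoint; size variant is an assumption
    # Weights, metrics, and plots are saved under the runs/ directory by default.
    return model.train(**TRAIN_ARGS)
```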

Model Evaluation on the Validation Set

YOLOv8 saves the model checkpoint that achieved the highest performance on the validation set during training as the best.pt file. While the model architecture remains unchanged throughout training, the weights are continuously updated after each epoch. YOLOv8 automatically monitors metrics such as mean Average Precision (mAP), precision, and recall, and saves the model weights whenever an epoch yields better validation results than previous ones. This ensures that best.pt contains the most effective version of the model for generalization, avoiding the deployment of a model that may have overfit the training data in later epochs.
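The checkpoint-selection logic described above reduces to tracking the best validation score seen so far; a minimal sketch (not YOLOv8's actual fitness formula, which combines several metrics):

```python
def track_best(epoch_maps):
    """Mimic best-checkpoint selection: after each epoch, remember the
    epoch whose validation mAP is the best seen so far."""
    best_epoch, best_map = None, float("-inf")
    for epoch, m in enumerate(epoch_maps):
        if m > best_map:
            best_epoch, best_map = epoch, m  # best.pt would be saved here
    return best_epoch, best_map

# Validation mAP peaks at epoch 2, then the model begins to overfit:
print(track_best([0.61, 0.68, 0.72, 0.70, 0.69]))  # (2, 0.72)
```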

Class        Precision   Recall   mAP@50   mAP@50-95
Glioma       0.865       0.743    0.831    0.564
Meningioma   0.963       0.924    0.981    0.840
Pituitary    0.935       0.932    0.956    0.750
Overall      0.921       0.867    0.923    0.718

mean Average Precision at 50 (mAP@50) uses a single IoU threshold of 0.5: a predicted box must overlap the ground truth by at least 50% to count as a true positive. Meanwhile, mAP@50-95 averages the mean Average Precision across multiple IoU thresholds, ranging from 0.50 to 0.95 in steps of 0.05. An overall mAP@50-95 of 0.718 shows the model is precise and consistent in detecting objects, even when stricter overlap is required.
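The mAP@50-95 averaging scheme can be made concrete in a few lines (the function names are illustrative):

```python
def iou_thresholds():
    """The IoU thresholds used for mAP@50-95: 0.50 to 0.95 in steps of 0.05."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]

def map_50_95(ap_per_threshold):
    """Average the per-threshold AP values into a single mAP@50-95 score."""
    return sum(ap_per_threshold) / len(ap_per_threshold)

print(iou_thresholds())  # ten thresholds from 0.5 through 0.95
```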

image4

Image 1: Confusion matrix for the classification model

The above confusion matrix illustrates how well the classification model identifies the three types of brain tumor (glioma, meningioma, and pituitary), as well as cases where no tumor is present, labeled as 'background'. The diagonal entries represent correct predictions, with the model accurately identifying 232 glioma cases, 135 meningioma cases, 178 pituitary cases, and 6 non-tumor cases.

However, the off-diagonal cells reveal notable misclassifications, particularly between glioma and the non-tumor class: 73 glioma cases were incorrectly labeled as non-tumor, and 50 non-tumor cases were misclassified as glioma. Pituitary tumors also showed some confusion, with 22 instances misidentified as non-tumor.

These errors suggest that the model struggles to distinguish tumor tissue from healthy regions, especially in cases with subtle or overlapping features. Overall, while the model performs well on the primary tumor classes, refining its ability to separate tumor from non-tumor regions could significantly improve diagnostic accuracy.

image6

Image 2: Precision-Recall curve

The above plot evaluates the performance of the classification model. Each curve represents the trade-off between precision (how many predicted positives are correct) and recall (how many actual positives are detected) for a specific class. The model demonstrates strong performance across all tumor types, with mean Average Precision (mAP) scores at a 0.5 threshold of 0.831 for glioma, 0.981 for meningioma, and 0.956 for pituitary. The combined mAP for all classes is 0.923, indicating high overall accuracy. The steep and sustained curves suggest that the model maintains high precision even as recall increases, which is especially valuable in medical diagnostics, where false positives and missed detections carry significant consequences.

Inference on Test Data

Inference was performed on the test image set using the trained YOLOv8 model to generate predictions for brain tumor classification. The process involved analyzing each image and saving two types of outputs: annotated images and structured text files. Annotated images, which include bounding boxes and class labels, were saved in the predictions folder only if the confidence score exceeded 0.25, effectively filtering out low-confidence detections. Alongside each image, a corresponding .txt file was created containing the class ID, confidence score, and bounding box coordinates for each detected object. This structured output supports both visual confirmation of tumor presence and precise data logging, making it valuable for further analysis, reporting, or integration into clinical decision-support systems.
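Parsing the saved .txt outputs and applying the 0.25 confidence filter can be sketched as below; the exact column order in the files is an assumption based on the description above (class ID, confidence, then the normalized box):

```python
def parse_predictions(lines, conf_thresh=0.25):
    """Parse prediction lines of the form 'class_id conf xc yc w h' and
    keep only detections above the confidence threshold. The column
    order is an assumption, not a confirmed file format."""
    detections = []
    for line in lines:
        parts = line.split()
        cls_id, conf = int(parts[0]), float(parts[1])
        box = [float(v) for v in parts[2:6]]
        if conf > conf_thresh:
            detections.append({"class": cls_id, "conf": conf, "box": box})
    return detections

sample = ["0 0.79 0.42 0.40 0.18 0.21",   # kept (conf > 0.25)
          "1 0.12 0.60 0.55 0.10 0.12"]   # filtered out
print(parse_predictions(sample))
```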

Prediction Outputs

A few of the annotated images generated during the inference process are shown below.

image7

Image 3: Predicted output for the first batch of 16 images

image1

Image 4: Presence of a pituitary tumor

image2

Image 5: Presence of a meningioma

image3

Image 6: Presence of both a glioma and a meningioma

This image shows two bounding boxes: one detecting a glioma with a confidence score of 0.79 and one detecting a meningioma with a confidence score of 0.46. This demonstrates the detection of multiple tumor instances in the same MRI scan.
