This page collects tips, lessons, and recommendations for AI-based image processing in aquatic science.

Contributors: Khadijeh Alibabaei, Marco Francescangeli

Introduction

This guide provides insights into various aspects of AI-powered image processing in aquatic science. It covers annotation tools, suitable AI models, experiment tracking tools, and challenges such as model drift. The information below is based on the experiences of users of the iMagine platform and on the references listed at the end of the document.

Annotation Tools

Labelbox

  • Overview: Founded in 2018, Labelbox is a leading platform for training data and supports the labeling of images, videos, geodata, natural language documents, audio, and HTML.

  • Advantages:
    • AI-powered labeling tools and automation.
    • AI-assisted labeling with the flexibility to bring in your own models.
    • Supports various labeling techniques: polygons, bounding boxes, and lines.
    • Integrated data labeling services.
    • QA/QC tools and workflows for label verification.
    • Robust performance analytics for labelers.
    • Customizable user interface for streamlined tasks.
  • Disadvantages:
    • Annotations are exported as a JSON file, which is not easy to read.
  • Pricing:
    • Free plan for up to 5000 images.
    • Customized Pro and Enterprise plans are available.
  • How to use:

CVAT

  • Overview: CVAT is a web-based, open-source platform for annotating images and videos to label data for computer vision applications. CVAT supports the following tasks:

    • Image classification

    • Object detection

    • Semantic and instance segmentation

    • Point clouds

    • Video annotation

  • Advantages:
    • Web-based and Collaborative: Facilitates accessibility and collaboration through a web-based interface.
    • Easy Deployment: Simple to deploy with Docker in a local network, although scaling the deployment requires maintenance.
    • Semi-automatic Annotation: Offers semi-automatic annotation features, enhancing efficiency in the labeling process.
    • Will be integrated as a tool into the iMagine dashboard.
  • Disadvantages:
    • Adding a new object requires navigating to the label tools and typing the label manually each time, which is inconvenient.
    • Videos must be uploaded one at a time.
    • Exporting annotations together with the images requires a monthly fee of $33.
  • Pricing:
    • Self-hosted CVAT is free to download and use.
    • Hosted Version (cvat.ai):
      • Limited plan for small projects.
      • Unlimited plan for teams and professional use.
    • Professional Plans: Start at $33 per month.
  • How to use:

Roboflow Annotate

  • Overview: Roboflow is a web-based annotation tool for labeling images. It supports tasks such as object detection, classification, and segmentation.

  • Advantages:
    • Enables users to annotate videos frame by frame, with the flexibility to choose the frequency of frames.
    • Lets users split the data into training, validation, and test sets.
    • Automatically annotates images in your dataset using either an earlier version of your model or one of the 50,000+ public models available on Roboflow Universe (none of which targets fish detection).
    • Annotations can be exported in various formats, such as JSON, XML, and the YOLO annotation format.
  • Disadvantages:
    • All images are public on the Public plan (the free plan).
    • Keeping images private requires an expensive paid plan.
  • Pricing:
    • Roboflow Annotate is free for all users of the Public plan.
    • For private images, you can upgrade to the Starter plan at $249/month.
  • How to use:
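
As a general note on export formats (not specific to Roboflow): the formats listed above differ mainly in how they encode a bounding box. As a rough illustration, the sketch below converts a COCO-style pixel box `[x_min, y_min, width, height]` into a YOLO label line, that is, a class index followed by the box centre and size normalised by the image dimensions. The function names and the example values are hypothetical and are not tied to any of the annotation tools described here.

```python
def coco_to_yolo(box, image_width, image_height):
    """Convert a COCO-style pixel box to normalised YOLO (x_center, y_center, w, h)."""
    x_min, y_min, width, height = box
    x_center = (x_min + width / 2) / image_width
    y_center = (y_min + height / 2) / image_height
    return (x_center, y_center, width / image_width, height / image_height)


def yolo_label_line(class_id, box, image_width, image_height):
    """Format one annotated object as a line of a YOLO .txt label file."""
    values = coco_to_yolo(box, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# Hypothetical example: one fish (class 0) in a 400x200 pixel frame.
line = yolo_label_line(0, [100, 50, 200, 100], 400, 200)
print(line)  # 0 0.500000 0.500000 0.500000 0.500000
```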

BIIGLE

  • Overview: BIIGLE is a web service for the efficient and fast annotation of images and videos. Originally developed for monitoring and exploring the marine environment, it is versatile enough for any image or video annotation task.
  • Advantages:
    • Free of charge.
    • User-friendly label tree on the right.
    • Capability to upload an entire video folder.
    • The dataset can be downloaded after tagging.
  • Disadvantages:
    • The video must be restarted to label each object, which is inconvenient.
    • Difficult to navigate and use.
    • Lacks a tracking option for objects.
    • Uploading files is a slow process.
  • Pricing:
    • Free of charge.
  • How to use:

Suitable AI models

Convolutional neural networks (CNNs) are well suited to object detection. State-of-the-art object detection algorithms fall into two categories: two-stage detectors such as Faster R-CNN [1], and one-stage detectors such as You Only Look Once (YOLO) [2].

When considering AI models for image processing, the choice between Faster R-CNN and YOLO depends on specific requirements, including computing resources, accuracy, and real-time processing needs. Here is a brief comparison to help you decide:

Faster R-CNN

  • Advantages:
    • Generally achieves high accuracy in object detection.
    • Well-established and widely used in various applications.
    • Uses a two-stage detection process that separates region proposal from object classification.
    • The model is available on the iMagine Marketplace.
  • Disadvantages:
    • Slower inference than some real-time detection models.
    • The more complex architecture may require more computational resources.
  • Suitability:
    • Well suited for tasks where high accuracy is required.
    • Suitable for scenarios where real-time processing is not a strict requirement.
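
For orientation, a two-stage detector can be tried with a few lines of code. The sketch below uses the COCO-pretrained Faster R-CNN that ships with torchvision. This is an assumption made for illustration and is not the model packaged on the iMagine Marketplace. The helper name `keep_confident`, the score threshold, and the image path are hypothetical.

```python
def keep_confident(boxes, labels, scores, min_score=0.5):
    """Keep only detections whose confidence is at least min_score."""
    return [
        (box, label, score)
        for box, label, score in zip(boxes, labels, scores)
        if score >= min_score
    ]


def detect_with_faster_rcnn(image_path, min_score=0.5):
    """Run torchvision's COCO-pretrained Faster R-CNN on a single image."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from torchvision.io import ImageReadMode, read_image
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    # The model expects float tensors of shape [C, H, W] scaled to [0, 1].
    image = read_image(image_path, mode=ImageReadMode.RGB).float() / 255.0
    with torch.no_grad():
        output = model([image])[0]  # dict with "boxes", "labels", "scores"

    return keep_confident(
        output["boxes"].tolist(),
        output["labels"].tolist(),
        output["scores"].tolist(),
        min_score,
    )
```

Calling `detect_with_faster_rcnn("frame_0001.jpg")` (a hypothetical file) downloads the pretrained weights on first use. The COCO classes do not include fish, so a detector for underwater imagery would still need to be fine-tuned on annotated data.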

YOLO 

  • Advantages: This guide focuses on YOLOv8, the latest version of the YOLO model at the time of writing, which is available in several sizes (Nano, Small, Medium, Large, Extra Large) [3].
    • YOLO (You Only Look Once) models are known for real-time object detection.
    • Achieves a good balance between speed and accuracy [2].
    • The trade-off between accuracy and speed depends on the model size, from Nano to Extra Large.
    • The one-stage detection process simplifies the architecture [2].
    • It can be used for various tasks, including classification, pose estimation, segmentation, and object detection [3].
    • The model is available on the iMagine Marketplace.
  • Disadvantages:
    • Some accuracy may be lost compared to two-stage detectors such as Faster R-CNN.
  • Suitability:
    • Ideal for applications that require real-time processing.
    • Suitable for tasks where slightly lower accuracy is acceptable in exchange for faster inference.
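
The choice of model size can be made explicit in code. The following is a minimal sketch that assumes the `ultralytics` Python package [3] is installed and uses its pretrained detection weights. Ultralytics also publishes a Medium size, which is included in the mapping. The helper names, the default confidence threshold, and the image path in the usage note are hypothetical.

```python
# Weight files published by Ultralytics for the YOLOv8 detection models.
YOLOV8_WEIGHTS = {
    "nano": "yolov8n.pt",
    "small": "yolov8s.pt",
    "medium": "yolov8m.pt",
    "large": "yolov8l.pt",
    "extra_large": "yolov8x.pt",
}


def weights_for(size):
    """Return the pretrained weight file name for a YOLOv8 model size."""
    try:
        return YOLOV8_WEIGHTS[size]
    except KeyError:
        raise ValueError(
            f"unknown size {size!r}; choose from {sorted(YOLOV8_WEIGHTS)}"
        ) from None


def detect_with_yolov8(image_path, size="nano", min_confidence=0.25):
    """Run a pretrained YOLOv8 detector; return (class name, confidence, box) tuples."""
    from ultralytics import YOLO  # lazy import: requires `pip install ultralytics`

    model = YOLO(weights_for(size))
    result = model.predict(source=image_path, conf=min_confidence)[0]
    return [
        (result.names[int(cls)], float(conf), box)
        for cls, conf, box in zip(
            result.boxes.cls.tolist(),
            result.boxes.conf.tolist(),
            result.boxes.xyxy.tolist(),
        )
    ]
```

For example, `detect_with_yolov8("frame_0001.jpg", size="small")` trades some speed for accuracy compared with the Nano model. Smaller sizes are the usual starting point when real-time processing matters.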

Experiment Tracking Tools

Experiment tracking tools are crucial in the field of machine learning as they allow researchers and data scientists to organize, monitor, and analyze their experiments. Here are detailed insights into some popular experiment tracking tools [4, 5, 6]:

TensorBoard

The visualization toolkit for TensorFlow.

  • Advantages:
    • It is simple and integrates with the TensorFlow package.
    • By adding a few lines of code, you can track and visualize the metrics of the model, such as its accuracy.
    • Large and active community.
    • It can also be used with PyTorch.
  • Disadvantages:
    • Difficult to use in team environments that require collaboration.
    • No versioning of data and models when tracking experiments.
    • Scales poorly to very large numbers of runs, which causes UI issues.
    • Limited support for logging and visualizing other data formats, such as video or custom HTML.
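
The "few lines of code" mentioned above might look like the following sketch. It assumes PyTorch and the `tensorboard` package are installed and uses the `SummaryWriter` from `torch.utils.tensorboard`. The toy loss curve, the tag name, and the log directory are placeholders standing in for a real training loop.

```python
def toy_loss_curve(n_steps):
    """Stand-in for a training loop: a loss that decays as 1 / (1 + step)."""
    return [1.0 / (1 + step) for step in range(n_steps)]


def log_to_tensorboard(losses, log_dir="runs/demo"):
    """Write one scalar per step so the curve shows up in TensorBoard."""
    # Lazy import: requires torch and tensorboard to be installed.
    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter(log_dir=log_dir)
    for step, loss in enumerate(losses):
        writer.add_scalar("Loss/train", loss, step)
    writer.close()
```

After calling `log_to_tensorboard(toy_loss_curve(100))`, running `tensorboard --logdir runs` should show the curve under the `Loss/train` tag.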

MLflow

MLflow, an open-source platform, streamlines the entire machine learning lifecycle that includes experimentation, model storage, reproducibility, and deployment.

  • Advantages:
    • Open interface that enables seamless integration with any machine learning library or language.
    • Focus on the entire lifecycle of the ML process.
    • Few lines of code are needed for tracking.
    • It supports TensorFlow, PyTorch, PyTorch Lightning, etc.
    • Model registry that covers the stages from staging to production.
    • It is going to be integrated into the iMagine Marketplace in the near future. The pre-deployed test instance for the use cases of the project can be found here.
  • Disadvantages:
    • Requires additional computing resources and storage.
    • No orchestration, that is, no automatic adaptation and allocation of resources, unless you use a cloud.
    • Only basic support for multi-user environments.
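
As a rough illustration of how little code tracking requires, the sketch below logs a set of parameters and a per-epoch metric with the MLflow tracking API. It assumes the `mlflow` package is installed. The experiment name, the configuration keys, and the helper `flatten_params` are hypothetical and only serve the example.

```python
def flatten_params(config, prefix=""):
    """Flatten a nested config dict into flat key/value pairs for logging."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def track_run(config, val_losses, experiment="fish-detection-demo"):
    """Log parameters and a per-epoch validation loss to MLflow."""
    import mlflow  # lazy import: requires `pip install mlflow`

    mlflow.set_experiment(experiment)
    with mlflow.start_run():
        mlflow.log_params(flatten_params(config))
        for epoch, loss in enumerate(val_losses):
            mlflow.log_metric("val_loss", loss, step=epoch)
```

A call such as `track_run({"model": {"size": "nano"}, "lr": 0.01}, [0.9, 0.7, 0.6])` records one run. With no tracking server configured, MLflow stores runs locally; `mlflow.set_tracking_uri` points it at a shared server instead.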

Weights & Biases (W&B)

A machine learning platform designed for experiment tracking, dataset versioning, and model management.

  • Advantages:
    • A user-friendly and interactive dashboard serves as a central hub for all experiments within the app.
    • Supports hyperparameter search and model optimization through W&B Sweeps.
    • Does not need external storage to save the experiments.
    • You can easily upload your TensorBoard logs to W&B.
  • Disadvantages:
    • Limited use: available for Python only.
    • Does not provide the ability to deploy models.
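
A hyperparameter search with sweeps could be set up roughly as follows. This is a sketch that assumes the `wandb` package is installed and that you are logged in to a W&B account. The project name, the metric name, and the parameter ranges are illustrative choices, not recommendations.

```python
def build_sweep_config(metric="val_loss"):
    """Describe a random search over two hyperparameters (illustrative ranges)."""
    return {
        "method": "random",
        "metric": {"name": metric, "goal": "minimize"},
        "parameters": {
            "learning_rate": {"min": 0.0001, "max": 0.01},
            "batch_size": {"values": [8, 16, 32]},
        },
    }


def run_sweep(train_fn, project="fish-detection-demo", trials=5):
    """Register the sweep and let a local agent run a few trials."""
    import wandb  # lazy import: requires `pip install wandb` and a W&B login

    sweep_id = wandb.sweep(build_sweep_config(), project=project)
    wandb.agent(sweep_id, function=train_fn, count=trials)
```

Here `train_fn` is a user-supplied function that is expected to call `wandb.init()`, read the sampled hyperparameters from `wandb.config`, and report the metric with `wandb.log({"val_loss": ...})`.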

Data Version Control (DVC)

Data versioning is crucial in data science workflows, but managing datasets can be a challenge. DVC simplifies versioning and helps data scientists track and manage datasets efficiently.

  • Advantages:

    • DVC simplifies the versioning of data and enables efficient tracking in data science workflows.

    • Data and models are kept in external storage, while their version information is kept in the Git repository.

    • Its functionality extends to tracking models and pipelines.

    • Easy to learn.

    • Facilitates model sharing via cloud storage, improving collaboration and resource utilization across teams.

  • Disadvantages:

    • DVC is lightweight, which means your team may need to manually develop additional features to make it user-friendly.

    • Checking for missing dependencies in DVC is quite difficult.
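
The idea of keeping large files in external storage while Git tracks only their version information can be illustrated without DVC itself. The sketch below is not DVC's implementation. It only shows the principle with a hypothetical `make_pointer` helper: a small record containing a content hash changes whenever the data changes, so committing that record to Git is enough to pin an exact dataset version. The file name and contents are made up for the demo.

```python
import hashlib
import json
import os
import tempfile


def make_pointer(data_path):
    """Build a small, Git-friendly record that identifies the exact file contents."""
    digest = hashlib.sha256()
    with open(data_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "path": os.path.basename(data_path),
        "sha256": digest.hexdigest(),
        "size": os.path.getsize(data_path),
    }


# Demo with a temporary stand-in for a large dataset file.
with tempfile.TemporaryDirectory() as workdir:
    data_file = os.path.join(workdir, "annotations.csv")
    with open(data_file, "w") as handle:
        handle.write("frame,species\n1,seabream\n")
    pointer_v1 = make_pointer(data_file)

    # Appending a row produces a new version with a different hash.
    with open(data_file, "a") as handle:
        handle.write("2,seabass\n")
    pointer_v2 = make_pointer(data_file)

print(json.dumps(pointer_v1, indent=2))
print("changed:", pointer_v1["sha256"] != pointer_v2["sha256"])
```

In DVC, the equivalent record is the small metadata file that is committed to Git, while the data itself is pushed to the configured remote storage.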

AI model drift

The performance of a trained AI model can degrade over time because of two factors:

  • Concept Drift:
    • Concept drift occurs when the relationship between the input features and the target variable changes over time, which reduces the accuracy of the model in dynamic environments. An example from aquatic science: suppose you train a model to detect and count fish in underwater images or videos. Environmental changes, such as changes in water quality and temperature, can lead to new fish species entering the ecosystem, which the model was not trained to recognize.
  • Data Drift:
    • Data drift is a change in the distribution of the input features that creates a discrepancy between the training data and new data. It requires continuous monitoring of the robustness of the model.
      An example from aquatic science, using images from underwater cameras: the image data changes because of algal growth.

Frouros is a Python library developed for drift detection in machine learning systems. It offers a mixture of classical and modern statistical algorithms that recognize both concept and data drift.
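
To make the idea of data drift detection concrete, the sketch below implements one classical statistical check, the two-sample Kolmogorov-Smirnov test, using only the Python standard library. It deliberately does not use the Frouros API and is only meant to show the kind of test such a library provides. The per-image feature (mean green-channel intensity) and the sample values are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Hypothetical per-image feature, e.g. mean green-channel intensity in [0, 1].
clear_water = [0.40 + 0.004 * i for i in range(50)]
algal_bloom = [value + 0.30 for value in clear_water]

print(drift_detected(clear_water, clear_water))  # False
print(drift_detected(clear_water, algal_bloom))  # True
```

In practice the reference sample would come from the training images and the current sample from recent camera footage, with the check repeated at regular intervals.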

References

[1] Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. 

[2] Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 779–788. doi: 10.1109/CVPR.2016.91

[3] Jocher, G.; Chaurasia, A.; Qiu, J. YOLO by Ultralytics (Version 8.0.0) [Computer software]. 2023. https://github.com/ultralytics/ultralytics

[4] https://neptune.ai/blog/best-ml-experiment-tracking-tools

[5] https://www.linkedin.com/pulse/mlflow-alternatives-data-version-control-dvc-vs-censius/

[6] https://www.thedatahunt.com/en-insight/mlops-comparison