AquaTriV Dataset

Aquatic Triple Vision: Active, Passive, Neuromorphic Vision

A Multi-Modal Underwater Dataset for Dense SLAM, Multi-sensor Fusion and Neural Reconstruction

AquaTriV: An Underwater Multi-Scene, Multi-Modal Dense SLAM Dataset with Active, Passive, Neuromorphic Vision for Localization and Mapping Evaluation

25 sequences · 255GB · 6.9h · 2.8km

Yaming Ou, Xiaoyan Liu, Junfeng Fan, Chao Zhou, Pengjv Zhang, Song Xia, Yang Yang, Bing Wang, Long Cheng, Fu Zhang

Overview

Multi-Modal Sensing

Passive, active, and neuromorphic vision with IMU, DVL, and pressure sensors for robust perception

Dense Point Cloud

High-fidelity dense point clouds acquired by a 3D laser scanner for mapping evaluation

Accurate Localization

Motion capture system (indoor) and INS-SfM fusion (outdoor) for localization evaluation

Robotic Platform and Sensor Suite

Water-Scanner: Custom-designed underwater robotic platform equipped with multi-modal sensors, including passive stereo cameras, active laser scanner, neuromorphic vision sensor, IMU, DVL, and pressure sensor. The system enables synchronized data acquisition for robust localization and high-fidelity dense mapping across diverse underwater environments.

Robot Platform

Model: FindROV H20

Weight: 23 kg

Max Depth: 300 m

Velocity: 2 knots

Cable: 100 m

AHRS

Model: 3DM-CV5-25

Freq: 100 Hz

R/P: ±0.5°

Yaw: ±1°

Mag: ±2.5 Gauss

DVL

Model: A50

Beam: 4-beam

Angle: 22.5°

Freq: 4–26 Hz

Res: 0.1 mm/s

Pressure Sensor

Model: B30

Range: 0–30 bar

Depth: 300 m

Acc: ±200 mbar

Res: 0.2 mbar
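As a worked example of how a gauge like the B30 maps pressure to depth, the sketch below applies the hydrostatic relation depth = (p − p_surface) / (ρ·g). The density and surface-pressure values are assumptions for illustration; the dataset's own depth topic may use a different calibration.

```python
# Illustrative hydrostatic depth conversion (assumed constants, not the
# dataset's calibration). Freshwater density is used; seawater is ~1025 kg/m^3.
RHO_WATER = 997.0   # kg/m^3, assumed freshwater density
G = 9.80665         # m/s^2, standard gravity

def pressure_to_depth(pressure_pa: float, surface_pa: float = 101325.0) -> float:
    """Depth in meters from absolute pressure in pascals (hydrostatic model)."""
    return max(0.0, (pressure_pa - surface_pa) / (RHO_WATER * G))

# Example: 2 bar absolute is roughly 1 bar gauge, i.e. about 10 m of freshwater.
depth = pressure_to_depth(2.0e5)
```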

Monocular Camera

View: Front

USB RGB

Res: 1920×1080

Freq: 30 Hz

FoV: 80°×64°

Stereo Camera

View: Down

Model: D435i

Res: 640×480

Freq: 30 Hz

FoV: 87°×58°

Event Camera

Model: DAVIS346

Res: 346×260

Temporal: 1 μs

Latency: < 1 ms

Dynamic Range: 120 dB

3D Laser Scanner

Freq: 70 Hz

Points: 512

λ: 450 nm

Power: 3 W

Angle: 35°

Scenes & Downloads

Pool: ~100GB
Cave: ~50GB
River: ~60GB
Sea: ~45GB

Scene-wise characteristics in the AquaTriV dataset.

Data Visualization & Format

We provide raw multi-modal sensory data (active, passive, and neuromorphic vision streams) together with synchronized navigation measurements. Representative visual samples and ROS topic formats are shown below.

Examples of visual raw images in the AquaTriV dataset.

Type | Topic Name | Message Type | Rate (Hz)
Active Vision | /scanner/left_image/compressed | sensor_msgs/CompressedImage | 100
 | /scanner/right_image/compressed | sensor_msgs/CompressedImage | 100
Passive Vision | /stereo/infra1/compressed | sensor_msgs/CompressedImage | 30
 | /stereo/infra2/compressed | sensor_msgs/CompressedImage | 30
 | /stereo/monocular/compressed | sensor_msgs/CompressedImage | 30
 | /stereo/imu | sensor_msgs/Imu | 200
Neuromorphic Vision | /dvs/events | dvs_msgs/EventArray | 30
 | /dvs/image_raw/compressed | sensor_msgs/CompressedImage | 30
 | /dvs/event_rendering/compressed | sensor_msgs/CompressedImage | 30
 | /dvs/imu | sensor_msgs/Imu | 200
AHRS | /imu/data | sensor_msgs/Imu | 200
 | /imu/mag | sensor_msgs/MagneticField | 50
DVL | /dvl/data | dvl_a50_2/DVL | 12
 | /dvl/velocity | sensor_msgs/Imu | 12
 | /dvl/position | sensor_msgs/Imu | 4
Pressure | /pressure_sensor/depth | sensor_msgs/FluidPressure | 60
Bonus | /robotview/monocular | sensor_msgs/CompressedImage | 30
 | /waterview/monocular | sensor_msgs/CompressedImage | 30
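To illustrate how an event stream such as /dvs/events can be turned into a viewable frame, in the spirit of the /dvs/event_rendering topic, here is a minimal accumulation sketch. The (x, y, polarity) tuple layout is an assumption for illustration, not the dvs_msgs/EventArray API, and this is not the dataset's actual rendering tool.

```python
import numpy as np

# Assumed DAVIS346 resolution from the sensor-suite table above.
W, H = 346, 260

def render_events(events, width=W, height=H):
    """Accumulate events into a polarity image.

    events: iterable of (x, y, polarity) with polarity in {+1, -1}.
    ON events are drawn white, OFF events black, on a gray background.
    """
    frame = np.full((height, width), 128, dtype=np.uint8)
    for x, y, p in events:
        frame[y, x] = 255 if p > 0 else 0
    return frame

# Example: one ON event and one OFF event.
img = render_events([(10, 20, +1), (30, 40, -1)])
```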

Benchmark Evaluation

Localization Evaluation

We evaluate classical and multi-modal visual-inertial SLAM systems, including VINS-Fusion, SVIn2, VD-Fusion, AEVINS, and AquaSLAM. Together these span state-of-the-art pipelines that combine stereo vision, IMU, DVL, and event cameras.

Five baseline method localization evaluation results in the AquaTriV dataset.

Metrics include absolute pose error (APE), absolute rotation error (ARE), relative pose error (RPE), and relative rotation error (RRE) for accuracy, along with CRS/TCR for robustness and tracking continuity.
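For reference, a minimal APE computation (translation RMSE after a rigid, scale-free alignment) might look like the following. This is a generic sketch in the spirit of common trajectory-evaluation tools such as evo, not the benchmark's exact evaluation code; it assumes the two trajectories are already time-associated Nx3 position arrays.

```python
import numpy as np

def ape_rmse(est: np.ndarray, gt: np.ndarray) -> float:
    """Translation APE (RMSE) after Umeyama-style rigid alignment (no scale).

    est, gt: time-associated Nx3 position arrays.
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    # Cross-covariance between centered target (gt) and source (est).
    U, _, Vt = np.linalg.svd((gt - mu_g).T @ (est - mu_e))
    # Reflection correction so the result is a proper rotation (det = +1).
    S = np.eye(3)
    S[2, 2] = np.sign(np.linalg.det(U @ Vt))
    R = U @ S @ Vt
    aligned = (R @ (est - mu_e).T).T + mu_g
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))
```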

Dense Mapping Evaluation

We compare representative dense mapping and neural reconstruction baselines, including DR-Scan, DR-Sweep, DROID-SLAM, and MASt3R-SLAM, covering both geometric and learning-based approaches.

Four baseline method dense mapping evaluation results in the AquaTriV dataset.

Evaluation metrics include nearest neighbor error (NNE), completeness (COM), accuracy (AC), Chamfer distance (CD), and mesh metric error (MME).
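For concreteness, a brute-force Chamfer distance between two point clouds can be sketched as below. Conventions vary (squared vs. Euclidean distances, summed vs. averaged directions), so this is illustrative rather than the benchmark's exact definition; real evaluations on dense scans would use a KD-tree instead of a full pairwise distance matrix.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between clouds a (Nx3) and b (Mx3).

    Sum of the mean nearest-neighbor Euclidean distances in both directions.
    O(N*M) memory: intended for small clouds only.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # N x M distances
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())
```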

Neural Reconstruction

Recent neural rendering approaches, particularly 3D Gaussian Splatting (3DGS), require stable and sufficiently dense geometric priors, which are often lacking in underwater datasets that rely on sparse SfM-based reconstruction. By deploying SeaSplat on AquaTriV and leveraging the dense laser-scanned point clouds for initialization, we achieve stable optimization and high-fidelity photo-realistic reconstruction across diverse underwater scenes.

Underwater photo-realistic mapping by SeaSplat.