MARII - Maritime Intelligent Interface

MARII is an intelligent interface that replaces the traditional mouse-and-keyboard computer setup. We researched and tested different kinds of gestures for communicating with an AI system more effectively. A second focus was making sure that the AI output does not disrupt the interaction.

Links

NextCloud folder: https://cloud.hs-augsburg.de/s/wYsejfXJHMkc9o8

Github Repo: https://github.com/FelixBae/autonomous-maritime-fleet-control

Miro Board: https://miro.com/app/board/uXjVGprIFMM=/

Current Project Overview

MARII: Maritime Intelligent Interface

MARII is designed to tackle a major headache in AI collaboration: "ping-pong prompting" and the complex problem of agent task delegation. To test our solution, we created a maritime simulation where multiple operators stand around a table, collaborating on a mission with the help of LLMs.

Here is how the project evolved from a hardware challenge into an intelligent software solution:

1. Pivot from Touch to Computer Vision

Initially, we planned to use a large touchscreen table for the operators. When we found out we couldn't get our hands on one, we had to think outside the box. Our workaround: a standard screen paired with a camera system that tracks hand gestures. Switching from a touchscreen to a camera-based system also gave us more room for different gestures.

2. Gesture Control & Calibration

To make the normal screen feel interactive, we built a gesture recognition system. A static index finger held for 0.3 seconds emulates a touch click, a circular hand motion selects an object, and a quick shake deletes it.
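
The dwell-based click described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name `DwellClickDetector`, the drift tolerance `max_drift_px`, and the assumption that a hand tracker delivers one fingertip position per frame are all hypothetical. Only the 0.3-second hold time comes from the text.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class DwellClickDetector:
    """Emits one click when a fingertip stays roughly still long enough."""

    hold_seconds: float = 0.3   # hold time stated in the text
    max_drift_px: float = 15.0  # assumed tolerance for hand jitter
    _anchor: Optional[Point] = field(default=None, init=False)
    _anchor_time: float = field(default=0.0, init=False)
    _fired: bool = field(default=False, init=False)

    def update(self, fingertip: Optional[Point], timestamp: float) -> Optional[Point]:
        """Feed one tracked frame; returns the click position or None."""
        if fingertip is None:
            # Hand lost: forget the current dwell.
            self._anchor = None
            self._fired = False
            return None
        moved_away = self._anchor is None or hypot(
            fingertip[0] - self._anchor[0], fingertip[1] - self._anchor[1]
        ) > self.max_drift_px
        if moved_away:
            # Start a new dwell at the current position.
            self._anchor = fingertip
            self._anchor_time = timestamp
            self._fired = False
            return None
        if not self._fired and timestamp - self._anchor_time >= self.hold_seconds:
            self._fired = True  # click only once per dwell
            return self._anchor
        return None
```

The detector is fed once per camera frame, for example `detector.update((x, y), time.monotonic())`. Firing only once per dwell keeps a resting finger from producing repeated clicks.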

To get the camera and screen on the same page, we borrowed an idea from robotics: ArUco markers. When the calibration key is pressed, the screen displays these markers, allowing the camera to map its position relative to the screen.
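
One common way to turn such marker detections into a camera-to-screen mapping is a homography fitted from four point correspondences. The sketch below assumes the four marker centers have already been detected in the camera image (the detection itself is not shown) and that their on-screen positions are known. The function names `fit_homography` and `camera_to_screen` are hypothetical, and the source does not say which method the project actually used. It is written in plain Python so the math is visible; a library such as OpenCV offers `cv2.findHomography` for the same job.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(camera_pts: Sequence[Point], screen_pts: Sequence[Point]) -> List[float]:
    """Return h0..h7 of the homography (h8 fixed to 1) from 4 point pairs."""
    if len(camera_pts) != 4 or len(screen_pts) != 4:
        raise ValueError("exactly four correspondences are required")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(camera_pts, screen_pts):
        # u * (h6 x + h7 y + 1) = h0 x + h1 y + h2, and likewise for v
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def camera_to_screen(h: Sequence[float], point: Point) -> Point:
    """Map a camera pixel (e.g. a fingertip) to screen coordinates."""
    x, y = point
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)
```

After calibration, every tracked fingertip position can be passed through `camera_to_screen` to find the screen pixel it points at.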

3. Voice & The Intelligent Orchestrator

With the hardware sorted, we integrated live voice transcription so operators can speak directly to the AI. However, since our system is built for multi-person collaboration, the last thing we wanted was an AI that constantly interrupts human conversation.

To solve this, we built an intelligent orchestrator architecture. The orchestrator listens and decides exactly when the AI needs to speak out loud. If a verbal reply isn't necessary, the AI stays quiet but still executes its tasks in the background, like updating a summary right on the screen.
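
The routing decision described above can be illustrated with a small rule-based stand-in. Everything here is an assumption made for illustration: the names `Channel`, `Utterance`, and `route`, the flags on each utterance, and the rules themselves. The source does not describe how the real orchestrator decides, and it may well delegate this decision to an LLM.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    SPEAK = "speak"              # answer out loud
    SCREEN_ONLY = "screen_only"  # act silently, e.g. update the on-screen summary
    IGNORE = "ignore"            # human-to-human talk, nothing to do


@dataclass
class Utterance:
    text: str
    addressed_to_ai: bool  # hypothetical flag from an upstream classifier
    needs_action: bool     # hypothetical: the utterance implies a task
    is_question: bool      # hypothetical: the speaker expects an answer


def route(utterance: Utterance, humans_talking: bool) -> Channel:
    """Decide how the AI should respond to one transcribed utterance."""
    if not utterance.addressed_to_ai and not utterance.needs_action:
        return Channel.IGNORE
    if utterance.addressed_to_ai and utterance.is_question and not humans_talking:
        return Channel.SPEAK
    # Default: do the work quietly so the conversation is not interrupted.
    return Channel.SCREEN_ONLY
```

Making the silent channel the default reflects the goal stated above: the AI speaks only when a verbal reply is clearly wanted and nobody would be interrupted.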

4. New Map & Scenario Creator

Because the OpenSeaMap approach was too complex, we decided to use a simplified map with our own information. We also implemented a scenario creator that works with simple drag-and-drop operations and saves the scenario to a YAML file.
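
As a rough idea of what saving such a scenario could look like, the sketch below writes a list of placed objects as YAML text. The schema (`name`, `objects`, `kind`, `x`, `y`) and the names `ScenarioObject` and `scenario_to_yaml` are hypothetical, since the source does not show the real file format. The YAML is emitted by hand to keep the example dependency-free; a real project would more likely use a YAML library.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ScenarioObject:
    kind: str  # e.g. "vessel" or "buoy" (illustrative values)
    x: float   # map position where the object was dropped
    y: float


def scenario_to_yaml(name: str, objects: List[ScenarioObject]) -> str:
    """Serialize a scenario to YAML text.

    Strings are written with json.dumps, because JSON double-quoted
    strings are also valid YAML scalars.
    """
    lines = [f"name: {json.dumps(name)}"]
    if not objects:
        lines.append("objects: []")
    else:
        lines.append("objects:")
        for obj in objects:
            lines.append(f"  - kind: {json.dumps(obj.kind)}")
            lines.append(f"    x: {obj.x}")
            lines.append(f"    y: {obj.y}")
    return "\n".join(lines) + "\n"
```

The returned string can be written to disk with `Path("scenario.yaml").write_text(...)`, where the file name is again only an example.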

PROJECT TIMELINE

Drafts for the UI

These are the first UI drafts. We sketched them by hand first, then quickly tested how such a UI could be implemented.

Meeting Notes First Meeting

Meeting Notes 2026-04-14

Meeting 2026-04-21

Which inputs do we need and what are the possible ways of giving these inputs to the system?

Our new ideas, adapted to the feedback from the last meeting with the profs

We drew 2 setup ideas and played through the scenario by hand with an iPad

The updated BOM

Here we thought about what hardware we need and how important it is for the success of our project

Sprint Review 30.04.2026

This is the feedback from the profs from the group presentation.

Internal Meeting 05.05.2026

21.08.2026

28.05.2026

10.06.2026

Internal Meeting 16.06.2026

Internal Meeting 20.06.2026

Internal Meeting 22.06.2026

We created a 3D scene in Blender and used it to render a poster

Instead of asking the user to put a finger on each corner of the map, we took inspiration from robotics and now use ArUco markers to calibrate the relative transform between the on-screen map and our system.

Internal Meeting 27.06.2026

We decided to drop the complex OpenStreetMap/OpenSeaMap approach and use a simpler one with self-drawn scenarios.

We also implemented a pathfinding algorithm that the AI can call to suggest routes. Later, we want to add the ability to edit single nodes of a route, either by hand or through the AI.
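
The source does not name the pathfinding algorithm, so the following is only a sketch of one plausible choice: A* search on a small grid map, where `#` marks blocked cells such as land. The function name `find_route` and the grid representation are assumptions. The returned list of cells corresponds to the route nodes that could later be edited one by one.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def find_route(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """A* on a 4-connected grid; returns the cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell: Cell) -> int:
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from: Dict[Cell, Cell] = {}
    best_cost: Dict[Cell, int] = {start: 0}

    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(
                        open_heap, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                    )
    return None
```

An AI tool call could wrap this function, passing the start and goal cells and returning the node list as a route suggestion for the operator to accept or adjust.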

We also implemented a drag-and-drop scenario creator that saves custom scenarios as YAML files.

05.07.2026

Because everything that could go wrong did go wrong in our last presentation, we had to adjust our project setup to make it more reliable. We decided to remove the ArUco markers. Instead of pointing the camera at the screen and using the screen like a touchscreen, we now point the camera towards the operator and focus more on the gestures.

We also decided to program a scripted path that we can trigger with simple keypresses, so we can fake some actions if they don't work during the presentation. Of course, our goal remains to make it work for real, but this fallback lets us show our project even if things go wrong.

We also changed the scenario so that the AI makes suggestions first, and the operator can accept or reject them and make changes. In the previous version, the operator started by giving commands to the AI. The new flow is also a better match for the topic of this course, which is about AI delegation and overcoming ping-pong prompting.

A project by

Felix Bäuml

Tomas Herrero Valero

Piotr Grzelak

Department

Creative Engineering

Project type

Not specified

Associated workspace

CE-C6 Teamprojekte SoSe2026

Period of creation

Summer semester 2026

Incom is the communication platform of Technische Hochschule Augsburg Gestaltung
