Institute of Cognitive Science

Research Group Computer Vision

iSeg - Interactive Annotation and Segmentation Tool

Demo of iSeg's semantic timeline and the typical workflow in the iSeg application (video below).
Fig 1: Overview of the interactive semi-automatic annotation and segmentation process


Knowledge extraction from video data is challenging due to its high complexity in both the spatial and temporal domains. Ground truth is crucial for evaluating algorithms and adapting them to new domains. Unfortunately, ground truth annotation is inconvenient and time-consuming. Common annotation tools mostly rely on simple geometric primitives such as rectangles or ellipses. Here we propose a novel, interactive and semi-automatic process that actively asks for user input if the result of the automatic annotation appears to be incorrect. iSeg has been tested on two visual stimulus datasets for eye tracking experiments and on two surveillance datasets.


Reflecting the architecture of video visual analytics, our interactive annotation and segmentation tool iSeg uses a semi-automatic architecture that puts the user in the loop. The main processing blocks are shown in Fig 1 and briefly explained below.

Fig 2: Activity diagram of the interactive semi-automatic AOI fitting

Manual annotation of specific keyframes

The user outlines the object on a few frames in which its pose is characteristic. The selection is a polygon drawn around the object.

Morphing polygon geometry

Subsequently, the user can trigger the morphing process, which interpolates the polygons between the given keyframes across all intermediate frames. To reduce the computational cost, we implemented our own preliminary algorithm, which satisfies our needs.
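The interpolation step can be illustrated with a minimal sketch. It is not the iSeg implementation: the function name `morph_polygons` is hypothetical, and the sketch assumes that all keyframe polygons have the same number of vertices, listed in corresponding order, so that each vertex can be interpolated linearly on its own.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def morph_polygons(keyframes: Dict[int, Polygon]) -> Dict[int, Polygon]:
    """Linearly interpolate a polygon for every frame between keyframes.

    Assumption (not from the iSeg description): all keyframe polygons have
    the same vertex count and their vertices are in corresponding order.
    """
    frames = sorted(keyframes)
    result: Dict[int, Polygon] = {}
    # Walk over each pair of consecutive keyframes.
    for start, end in zip(frames, frames[1:]):
        a, b = keyframes[start], keyframes[end]
        if len(a) != len(b):
            raise ValueError("keyframe polygons need matching vertex counts")
        for frame in range(start, end):
            # t runs from 0 at the first keyframe towards 1 at the second.
            t = (frame - start) / (end - start)
            result[frame] = [
                (ax + t * (bx - ax), ay + t * (by - ay))
                for (ax, ay), (bx, by) in zip(a, b)
            ]
    # The last keyframe is copied unchanged.
    if frames:
        result[frames[-1]] = list(keyframes[frames[-1]])
    return result


keys = {0: [(0, 0), (2, 0), (2, 2)], 4: [(4, 0), (6, 0), (6, 2)]}
morphed = morph_polygons(keys)
print(morphed[2])
```

In this example the polygon at frame 2 lies exactly halfway between the two keyframe polygons. Polygons with differing vertex counts would first need a vertex correspondence step, which this sketch leaves out.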

Automatic/Interactive keypoint-fitting

Since the morphing is purely linear, the resulting contours will most likely not be completely accurate if the object moves non-linearly between the keyframes. To compensate for this error, we compute keypoints on the image patches inside the contours and track them over the frames. A correction of the contour is then calculated from the resulting movement.
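How a contour correction might be derived from tracked keypoints can be sketched as follows. The text does not specify the correction model, so this sketch makes a strong simplifying assumption: the object motion between two frames is treated as a pure translation, estimated as the median displacement of the matched keypoints. The function name `correct_contour` and its parameters are hypothetical, and the keypoint positions are assumed to come from an external detector and tracker.

```python
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def correct_contour(
    polygon: List[Point],
    prev_pts: List[Point],
    curr_pts: List[Point],
) -> List[Point]:
    """Shift a contour by the median displacement of tracked keypoints.

    prev_pts[i] and curr_pts[i] are the positions of the same keypoint in
    the previous and the current frame. Translation-only motion is an
    assumption of this sketch, not a statement about iSeg.
    """
    if not prev_pts or len(prev_pts) != len(curr_pts):
        raise ValueError("need equally many, at least one, matched keypoints")
    # The median is less sensitive to a few mismatched keypoints than the mean.
    dx = median(c[0] - p[0] for p, c in zip(prev_pts, curr_pts))
    dy = median(c[1] - p[1] for p, c in zip(prev_pts, curr_pts))
    return [(x + dx, y + dy) for x, y in polygon]


contour = [(0, 0), (1, 0), (1, 1)]
previous = [(0, 0), (1, 1), (2, 2)]
current = [(3, 1), (4, 2), (50, 50)]  # the last match is an outlier
print(correct_contour(contour, previous, current))
```

The outlier in the example does not affect the result because the median ignores it. A richer motion model (for instance affine or per-vertex corrections) would be needed for rotating or deforming objects.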

In some cases the algorithm might not be able to match the keypoints appropriately. Typical causes are occlusion of the object or rapid movement. In these cases the system recognizes the error and actively queries the user for a manual correction of the problematic frames.
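One conceivable way to decide which frames to hand back to the user is sketched below. The criterion is an assumption made for illustration only: a frame is flagged when the share of successfully matched keypoints falls below a threshold. The function name `frames_needing_review` and the default `min_ratio` of 0.5 are hypothetical and do not describe how iSeg detects errors.

```python
from typing import Dict, List, Tuple


def frames_needing_review(
    match_counts: Dict[int, Tuple[int, int]],
    min_ratio: float = 0.5,
) -> List[int]:
    """Return the frames whose keypoint matching looks unreliable.

    match_counts maps a frame number to (matched, total) keypoint counts.
    A frame is flagged if it has no keypoints at all or if fewer than
    min_ratio of its keypoints were matched (hypothetical criterion).
    """
    flagged = []
    for frame in sorted(match_counts):
        matched, total = match_counts[frame]
        if total == 0 or matched / total < min_ratio:
            flagged.append(frame)
    return flagged


counts = {10: (40, 50), 11: (5, 50), 12: (0, 0)}
for frame in frames_needing_review(counts):
    print(f"Frame {frame}: please correct the contour manually")
```

Here frames 11 and 12 would be presented to the user, while frame 10 is accepted automatically. In practice the threshold would need tuning per dataset.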


Download

iSeg - the interactive annotation and segmentation tool.
(Version: 0.0.5)
(developed by J. Schöning and P. Faion, 2016)

Available builds: Windows x64*, Linux x64, MacOS x64

*If iSeg.exe does not start correctly, install the Visual C++ Redistributable 2013 (vcredist_x64.exe).

Demo video for a typical workflow in the iSeg application (Version 0.0.1)


[1] J. Schöning, P. Faion & G. Heidemann.
Pixel-wise Ground Truth Annotation in Videos - An Semi-automatic Approach for Pixel-wise and Semantic Object Annotation.
In Proceedings of the International Conference on Pattern Recognition Applications and Methods (ICPRAM-2016), pages: 690-697, ISBN: 978-989-758-173-1, 2016. SCITEPRESS.
[2] J. Schöning, P. Faion & G. Heidemann.
Semi-automatic Ground Truth Annotation in Videos: An Interactive Tool for Polygon-based Object Annotation and Segmentation.
In Proceedings of the 8th International Conference on Knowledge Capture, pages: 17:1-17:4, ISBN: 978-1-4503-3849-3, 2015. ACM, New York.