Graph-based Hierarchical Video Segmentation with filter-based motion features

>>> Download the code here <<<
(includes sample data)

This software implements a video segmentation algorithm and filter-based motion features, respectively proposed in the following papers:

Efficient Hierarchical Graph-Based Video Segmentation
Matthias Grundmann, Vivek Kwatra, Mei Han, Irfan Essa, CVPR, 2010.

Segmentation of Dynamic Scenes with Distributions of Spatiotemporally Oriented Energies
Damien Teney, Matthew Brown, BMVC, 2014.

The segmentation algorithm is flexible: as features, it can use the original color histograms, histograms of filter-based motion features (HoMEs), histograms of optical flow, or combinations of these. New types of features can easily be integrated as well. Feature extraction and segmentation are run as two separate steps; please see the provided examples.
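
As a rough sketch of this two-step workflow (extractFeatures and runSegmentation below are placeholder names, not the actual functions of the release; see the provided example scripts for the real API):

    % Hypothetical sketch; function names and data layout are placeholders
    videoDir = 'Data/fireOverWater';           % folder of input frames (assumed layout)
    features = extractFeatures(videoDir);      % step 1: extract image features
    segmentation = runSegmentation(features);  % step 2: hierarchical graph-based segmentation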

Two tools are provided to open and visualize the results of the segmentation:

  • visualizeSegmentation.m
    An interactive tool that allows flipping through frames and levels of the segmentation.

  • makeBoundaryMap.m
    A single-run function that generates an image with the boundaries of the segments of all levels of the segmentation, where the strength of each boundary is determined by the highest level of the segmentation in which it appears.
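
For example, a hedged usage sketch (the argument lists below are assumptions; only the function names come from the release):

    % Hypothetical usage; argument lists are assumptions
    visualizeSegmentation(segmentation);           % browse frames and levels interactively
    boundaryMap = makeBoundaryMap(segmentation);   % boundary-strength image across all levels
    imshow(boundaryMap);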

Some MEX code needs to be compiled to run the segmentation algorithm. This can be done with the following script: /Segmentation/private/make.m
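
For example, from MATLAB, with the root folder of the extracted archive as the working directory:

    run('Segmentation/private/make.m');   % compiles the required MEX files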

Examples

If videos do not appear below, you can access them on YouTube.

Fire over water: segmentation of dynamic textures

Segmentation of objects from parallax, due to camera motion in a short sequence

Segmentation of dynamic textures of very similar appearance (SynthDB composite dataset)

Limitations

The segmentation code was deliberately kept simple for easy experimentation (e.g. with new image features), rather than optimized for "production" use. In particular, it processes an entire video at once and requires a lot of memory to store the pixel-level graph at the first level of the segmentation. However, a parameter can be passed to most functions (those that extract image features, and the function that actually performs the segmentation) to limit processing to the first 'N' frames of a video, as sketched below.
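
A minimal sketch, reusing the placeholder names from above and assuming a trailing frame-limit argument (the actual parameter name and position may differ):

    % Hypothetical: process only the first 30 frames to reduce memory usage
    nFrames = 30;
    features = extractFeatures(videoDir, nFrames);
    segmentation = runSegmentation(features, nFrames);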

For optimized implementations of the segmentation algorithm (albeit without the filter-based motion features), you may want to look at:
http://www.videosegmentation.com
http://www.cse.buffalo.edu/~jcorso/r/supervoxels/