Mid-level Deep Pattern Mining
The University of Adelaide, NICTA, Australian Centre for Robotic Vision
Abstract: Mid-level visual element discovery aims to find clusters of image patches that are both representative and discriminative. In this work, we study this problem from the perspective of pattern mining while relying on the recently popularized Convolutional Neural Networks (CNNs). Specifically, we find that for an image patch, activations extracted from the fully-connected layer of a CNN have two appealing properties which enable their seamless integration with pattern mining. Patterns are then discovered from a large number of CNN activations of image patches through the well-known association rule mining. When we retrieve and visualize image patches with the same pattern (see Fig. 1), surprisingly, they are not only visually similar but also semantically consistent. We apply our approach to scene and object classification tasks, and demonstrate that it outperforms all previous works on mid-level visual element discovery by a sizeable margin with far fewer elements being used. Our approach also outperforms or matches recent works using CNNs for these tasks.
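The pipeline in the abstract can be sketched as follows: each patch's fully-connected-layer activation vector is turned into a "transaction" (the indices of its strongest activations), and frequent itemsets are then mined over many such transactions. This is a minimal, self-contained illustration, not the paper's implementation; the function names, the choice of k, and the support threshold are all illustrative assumptions, and a real system would use a full association rule miner (e.g. Apriori) on actual CNN features.

```python
from itertools import combinations

def to_transaction(activation, k=3):
    """Binarize an activation vector into a transaction: the indices
    of its k largest entries (an illustrative sparsification)."""
    top = sorted(range(len(activation)),
                 key=lambda i: activation[i], reverse=True)[:k]
    return frozenset(top)

def mine_patterns(transactions, min_support=0.3, max_size=2):
    """Count itemsets up to max_size across all transactions and keep
    those whose support (fraction of transactions containing them)
    meets min_support -- candidate mid-level patterns."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(t), size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {iset: c / n for iset, c in counts.items()
            if c / n >= min_support}

# Toy usage with made-up "activations" (4-dim instead of 4096-dim):
acts = [[0.9, 0.8, 0.1, 0.0],
        [0.7, 0.9, 0.2, 0.0],
        [0.0, 0.1, 0.9, 0.8]]
txns = [to_transaction(a, k=2) for a in acts]
patterns = mine_patterns(txns, min_support=0.5)
# The co-firing dimensions {0, 1} emerge as a frequent pattern here;
# in the paper, patches sharing such a pattern form a visual element.
```

Patches whose transactions contain the same frequent itemset would then be retrieved together and visualized as one mid-level element.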
- Yao Li, Lingqiao Liu, Chunhua Shen, Anton van den Hengel. Mid-level Deep Pattern Mining. Computer Vision and Pattern Recognition (CVPR), 2015. [PDF] [arXiv]
- Yao Li, Lingqiao Liu, Chunhua Shen, Anton van den Hengel. Mining Mid-level Visual Patterns with Deep CNN Activations. International Journal of Computer Vision (IJCV), 2016. [PDF] (extended version of the CVPR paper above)
Code is publicly available on GitHub.
Discovered mid-level visual elements from the PASCAL VOC 2007 dataset (two per category).
Discovered mid-level visual elements and their detections on test images on the MIT Indoor dataset.
Discovered mid-level visual elements and their detections on test images on the PASCAL VOC 2007 dataset.
This work was partly supported by Australian Research Council Grants FT120100969 and LP130100156.