In this paper, we first propose a theory for the linearisation of the triplet loss using class centroids, whose main conclusion is that our new linear loss is a tight upper bound on the triplet loss.
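For reference, a minimal sketch of the two objects being related, using standard notation (embedding f, margin m, class centroids c_k); this is only an illustration of the structure, not the exact surrogate or bound derived in the paper:

```latex
% Standard triplet loss over anchor/positive/negative triplets (a, p, n):
\mathcal{L}_{\mathrm{triplet}} =
  \sum_{(a,p,n)} \Big[ \, \| f(x_a) - f(x_p) \|_2^2
                        - \| f(x_a) - f(x_n) \|_2^2 + m \, \Big]_+

% A centroid-based linear loss: each sample is attracted to its own class
% centroid c_{y_i} and repelled from the other centroids, so the cost is
% linear in the number of samples rather than cubic in the number of triplets.
\mathcal{L}_{\mathrm{linear}} =
  \sum_{i} \Big[ \, \| f(x_i) - c_{y_i} \|_2^2
                   - \min_{k \neq y_i} \| f(x_i) - c_k \|_2^2 + m \, \Big]_+
```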
In this paper, we propose a Bayesian generative active deep learning approach that combines active learning with data augmentation. We provide theoretical and empirical evidence (on MNIST, CIFAR-10, CIFAR-100, and SVHN) that our approach trains more efficiently and achieves better classification results than data augmentation or active learning alone.
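A schematic sketch of how such a combination can be organised, with hypothetical `oracle` and `generator.sample_like` interfaces and Monte Carlo dropout scoring; this is an illustration of the idea, not the paper's implementation:

```python
import torch


def active_generative_round(model, generator, oracle, labelled, unlabelled,
                            budget, mc_samples=10):
    """Select the `budget` most informative unlabelled points, have the oracle
    label them, generate extra samples resembling them, and grow the pool."""
    # 1. Active selection: rank unlabelled samples by predictive entropy
    #    estimated with Monte Carlo dropout (dropout stays active via .train()).
    model.train()
    scores = []
    with torch.no_grad():
        for x in unlabelled:
            probs = torch.stack(
                [model(x.unsqueeze(0)).softmax(dim=-1) for _ in range(mc_samples)]
            ).mean(dim=0)
            scores.append(-(probs * probs.clamp_min(1e-8).log()).sum().item())
    picked = sorted(range(len(unlabelled)), key=lambda i: -scores[i])[:budget]

    # 2. Oracle labelling of the selected points (hypothetical callable).
    new_points = [(unlabelled[i], oracle(unlabelled[i])) for i in picked]

    # 3. Generative augmentation: synthesise labelled samples close to the
    #    newly acquired ones (hypothetical `generator.sample_like` method).
    synthetic = [(generator.sample_like(x), y) for x, y in new_points]

    return labelled + new_points + synthetic
```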
In this paper, we introduce a novel methodology for characterising the performance of deep learning networks (ResNets and DenseNets) in image classification, in terms of training convergence and generalisation, as a function of mini-batch size and learning rate.
This paper addresses the semantic instance segmentation task under open-set conditions, where input images can contain both known and unknown object classes.
In this paper, we propose the use of such a constraint, implemented as a new regularization for GAN training that forces the generated visual features to reconstruct their original semantic features. Once our model is trained with this multi-modal cycle-consistent semantic compatibility, we can synthesize more representative visual representations for the seen and, more importantly, for the unseen classes.
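One way to read the regulariser is as an extra reconstruction term on top of the usual adversarial objective. The sketch below uses hypothetical networks G (semantic-to-visual generator), R (visual-to-semantic regressor), and critic D, with a squared-error reconstruction term; it is not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F


def generator_loss(G, R, D, semantic, noise, cycle_weight=1.0):
    """Generator objective = adversarial term + cycle-consistency term that
    forces generated visual features to reconstruct their semantic features.

    G: semantic (+ noise) -> visual features (hypothetical generator)
    R: visual features -> semantic features (hypothetical regressor)
    D: critic over visual features (hypothetical, WGAN-style assumed)
    """
    visual_fake = G(torch.cat([semantic, noise], dim=1))
    adversarial = -D(visual_fake).mean()              # fool the critic
    cycle = F.mse_loss(R(visual_fake), semantic)      # reconstruct the original semantics
    return adversarial + cycle_weight * cycle
```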
In this work, we provide a novel Bayesian formulation of data augmentation, in which new annotated training points are treated as missing variables and generated from the distribution learned from the training set.
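Operationally, imputing the missing points amounts to drawing a label and then drawing an input for that label from the learned generative model. A minimal sketch with a hypothetical conditional generator interface `generator(z, y)`:

```python
import torch


def impute_training_points(generator, num_classes, n_samples, latent_dim=100):
    """Treat extra annotated points as missing variables: sample a label, then
    sample an input consistent with that label from the learned conditional
    generative model (hypothetical `generator(z, y)` interface)."""
    y = torch.randint(num_classes, (n_samples,))
    z = torch.randn(n_samples, latent_dim)
    x = generator(z, y)   # synthetic inputs for the sampled labels
    return x, y
```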
In this paper, we introduce a new module for deep ConvNets composed of several convolutional filters of multiple sizes that are joined by a maxout activation unit, which promotes competition amongst these filters.
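The module can be sketched directly; the kernel sizes below are placeholders:

```python
import torch
import torch.nn as nn


class MultiSizeMaxoutConv(nn.Module):
    """Several convolutional filters of different kernel sizes whose outputs
    are joined by a maxout unit (element-wise max), so the filters compete
    for every output activation."""

    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, x):
        # Stack the branch outputs and keep the element-wise maximum (maxout).
        return torch.stack([conv(x) for conv in self.branches], dim=0).max(dim=0).values
```

For example, `MultiSizeMaxoutConv(3, 64)(torch.randn(1, 3, 32, 32))` yields a (1, 64, 32, 32) output in which each activation comes from whichever filter size responded most strongly.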
In this work, we propose an unsupervised framework to learn a deep convolutional neural network for single-view depth prediction, without requiring a pre-training stage or annotated ground-truth depths.
We propose a new robust loss function for 2D human pose estimation that allows for significantly faster training.
We propose the introduction of batch normalisation units into deep feedforward neural networks with piecewise linear activations, which drives a more balanced use of these activation units.
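A minimal sketch of the construction, with ReLU standing in for a generic piecewise linear activation:

```python
import torch.nn as nn


def mlp_with_bn(in_dim, hidden_dims, out_dim):
    """Feedforward network with a batch normalisation unit inserted before
    each piecewise linear activation, keeping pre-activations centred so that
    both linear regions of the activation are exercised."""
    layers, d = [], in_dim
    for h in hidden_dims:
        layers += [nn.Linear(d, h), nn.BatchNorm1d(h), nn.ReLU()]
        d = h
    layers.append(nn.Linear(d, out_dim))   # plain linear output layer
    return nn.Sequential(*layers)
```

For instance, `mlp_with_bn(784, [512, 256], 10)` builds a classifier in which every hidden pre-activation is normalised before passing through its piecewise linear unit.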