In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022. * equal contribution.
This paper proposes modifications to our CorticalFlow method that improve its accuracy and interoperability with existing surface analysis tools without sacrificing its fast inference time and low GPU memory consumption. Using large-scale datasets, we demonstrate that the proposed changes improve geometric accuracy and surface regularity while keeping the reconstruction time and GPU memory requirements (during inference) almost unchanged.
In this paper, we introduce CorticalFlow, a new geometric deep-learning model that, given a 3-dimensional image, learns to deform a reference template towards a targeted object. To conserve the template mesh’s topological properties, we train our model over a set of diffeomorphic transformations. This framework can generate surfaces with several hundred thousand vertices in seconds requiring only a small GPU memory footprint. We evaluate its performance for the challenging task of brain cortical surface reconstruction from MRI.
Saliency methods are widely used to visually explain black-box deep learning model outputs to humans. In this paper, we compare the Gradient method, Grad-CAM, Extremal perturbation, and DEEPCOVER, and highlight the complexity in determining which method provides the best explanation of a CNN’s decision.
In this work, we investigate whether, and to what extent, the high-frequency (HF) detail in synthetic brain MR images (MRIs) impacts the performance of DL-based segmentation methods. To this end, we generate two synthetic datasets, one with and one without HF detail, train a segmentation model on each, and compare their performance.
This paper proposes a novel algorithm to sample point clouds from triangular meshes. We formulate this problem as an optimal transport problem between simplexes and discrete Dirac measures, and develop an algorithm to compute the optimal solution. Due to the computational challenge of this algorithm, we train a neural network, named MongeNet, to predict its solution efficiently. MongeNet can be adopted as a mesh sampler during training or testing of 3D deep learning models providing a better representation of the underlying surface with a very small computational overhead.
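For context, the conventional baseline that a learned mesh sampler would be compared against is uniform random sampling: pick a triangle with probability proportional to its area, then draw a point uniformly inside it. The sketch below shows that baseline only, not MongeNet or the optimal transport solver; the function names and the toy mesh are illustrative assumptions.

```python
import math
import random


def triangle_area(p, q, r):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ux, uy, uz = (q[k] - p[k] for k in range(3))
    vx, vy, vz = (r[k] - p[k] for k in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def sample_mesh_uniform(vertices, faces, n_points, rng=None):
    """Baseline sampler (hypothetical helper, not the paper's method)."""
    rng = rng or random.Random(0)
    # Larger triangles must be chosen more often for the density to be uniform.
    areas = [triangle_area(*(vertices[i] for i in face)) for face in faces]
    chosen = rng.choices(faces, weights=areas, k=n_points)
    points = []
    for face in chosen:
        p, q, r = (vertices[i] for i in face)
        # The square root makes the barycentric weights uniform over the triangle.
        s = math.sqrt(rng.random())
        t = rng.random()
        w0, w1, w2 = 1.0 - s, s * (1.0 - t), s * t
        points.append(tuple(w0 * p[k] + w1 * q[k] + w2 * r[k] for k in range(3)))
    return points


# Toy mesh: a unit square in the z = 0 plane split into two triangles.
square_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_faces = [(0, 1, 2), (0, 2, 3)]
cloud = sample_mesh_uniform(square_vertices, square_faces, 1000)
```

Independent uniform draws of this kind tend to cluster and leave gaps on the surface, which is the sort of irregularity a transport-based sampler is designed to reduce.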
In this paper, we propose a more accurate and efficient neural network model for brain morphometry named HerstonNet. More specifically, we develop a 3D ResNet-based neural network to learn rich features directly from MRI, design a multi-scale regression scheme by predicting morphometric measures at feature maps of different resolutions, and leverage a robust optimization method to avoid poor quality minima and reduce the prediction variance. As a result, HerstonNet improves the existing approach by 24.30% in terms of intraclass correlation coefficient (agreement measure) to FreeSurfer silver-standards while maintaining a competitive run-time.
Despite the pervasive growth of deep neural networks in medical image analysis, methods to monitor and assess network outputs, such as segmentation or regression, remain limited. In this paper, we introduce SMOCAM (SMOoth Conditional Attention Mask), an optimization method that reveals the specific regions of the input image taken into account by the prediction of a trained neural network. We developed SMOCAM explicitly to perform saliency analysis for complex regression tasks in 3D medical imagery like brain morphometry from MRI.
In this paper, we propose a 3D deep learning framework for cortical surface reconstruction from MR images named DeepCSR. More specifically, we first reformulate this problem as the prediction of an implicit surface representation for points in a continuous coordinate system. Then, the cortical surfaces are extracted using this implicit surface representation, a lightweight topological correction method, and an isosurface mesh extraction technique.
In this paper, we propose a GAN-based framework for synthesising 3D brain T1-weighted (T1-w) MR images from Partial Volume (PV) maps, with the aim of generating synthetic MRI volumes with more accurate tissue borders.
In this paper, we address the problem of recognizing complex compositional activities in videos. To this end, we describe activities unambiguously as regular expressions of simple primitive actions and derive a framework based on Probabilistic Automata to recognize instances of these regular expressions in videos.
This thesis describes methods that reduce the need for human supervision when training deep learning models by leveraging the structure in the visual world, targeting visual recognition in difficult scenarios where annotated data is scarce and the visual concepts are innumerable or ambiguous.
We present a principled approach to uncover the structure of visual data by solving a deep learning task coined visual permutation learning. To this end, we resort to a continuous approximation using doubly-stochastic matrices, formulate a novel bi-level optimization problem, and propose a computationally cheap scheme based on Sinkhorn iterations. The utility of these models is demonstrated on relative attributes learning, supervised learning-to-rank, and self-supervised representation learning.
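The Sinkhorn iterations mentioned above can be illustrated with a small sketch: alternately normalising the rows and columns of a strictly positive square matrix drives it towards a doubly-stochastic matrix, a continuous relaxation of a permutation matrix. This is a generic illustration, not the paper's network layer or its bi-level formulation; the function name, iteration count, and score matrix are assumptions.

```python
def sinkhorn(matrix, n_iters=100):
    """Alternate row and column normalisation of a strictly positive square matrix."""
    m = [row[:] for row in matrix]  # work on a copy
    n = len(m)
    for _ in range(n_iters):
        # Row step: make every row sum to one.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [value / row_sum for value in m[i]]
        # Column step: make every column sum to one.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m


# Hypothetical positive scores, e.g. exponentiated network outputs.
scores = [[4.0, 1.0, 0.5],
          [0.5, 3.0, 1.0],
          [1.0, 0.5, 5.0]]
doubly_stochastic = sinkhorn(scores)
row_sums = [sum(row) for row in doubly_stochastic]
col_sums = [sum(doubly_stochastic[i][j] for i in range(3)) for j in range(3)]
```

Each step is a division, so the whole procedure is differentiable and cheap, which is what makes it usable inside gradient-based training.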
We build on the compositionality principle and develop an “algebra” to compose classifiers for complex visual concepts. To this end, we learn neural network modules to perform boolean algebra operations on simple visual classifiers. Since these modules form a complete functional set, a classifier for any complex visual concept defined as a boolean expression of primitives can be obtained by recursively applying the learned modules, even if we do not have a single training sample.
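To make the recursive composition concrete, the sketch below evaluates a boolean expression tree over primitive classifier scores. It substitutes fixed probabilistic rules (complement for NOT, product for AND, De Morgan for OR, assuming independent primitives) for the learned neural modules described above, so it shows only the recursion, not the learned algebra. The expression encoding and the concept names are hypothetical.

```python
def compose(expr, scores):
    """Recursively score a boolean expression given per-primitive probabilities.

    expr is either a primitive name (str) or a tuple such as
    ("not", e), ("and", e1, e2), or ("or", e1, e2).
    """
    if isinstance(expr, str):
        return scores[expr]
    op, *args = expr
    if op == "not":
        return 1.0 - compose(args[0], scores)
    if op == "and":
        # Independence assumption: P(A and B) = P(A) * P(B).
        return compose(args[0], scores) * compose(args[1], scores)
    if op == "or":
        # De Morgan: A or B = not (not A and not B).
        left = 1.0 - compose(args[0], scores)
        right = 1.0 - compose(args[1], scores)
        return 1.0 - left * right
    raise ValueError(f"unknown operator: {op}")


# Hypothetical primitive classifier outputs for one image.
primitive_scores = {"striped": 0.8, "spotted": 0.25}
striped_not_spotted = compose(("and", "striped", ("not", "spotted")), primitive_scores)
striped_or_spotted = compose(("or", "striped", "spotted"), primitive_scores)
```

Because NOT and AND already form a complete set of boolean connectives, any expression over the primitives can be scored this way without training data for the composite concept.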
We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. Moreover, we propose DeepPermNet, an end-to-end CNN model for this task. The utility of our proposed approach is demonstrated on two challenging computer vision problems, namely, relative attributes learning and self-supervised representation learning.
In this technical report, we collect some results on differentiating argmin and argmax optimization problems with and without constraints and provide some insightful motivating applications. Such results are very useful for developing end-to-end gradient-based learning methods.
We add motion features to the Aggregated Channel Features (ACF) pedestrian detector. We demonstrate that motion features can provide more accurate results and reduce false alarms.
We demonstrate that it is possible to exactly evaluate Bayesian model averaging (BMA) over the exponentially-sized powerset of Naive Bayes (NB) feature models in time linear in the number of features; this yields an algorithm that is about as expensive to train as a single NB model with all features, yet provably converges to the globally optimal feature subset in the asymptotic limit of data.
We present an evolutionary algorithm to simultaneously tackle the regenerator placement and link capacity optimization problems in translucent optical networks. Our proposed method can assist a network designer in managing resources while balancing cost and performance.
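One way a sum over exponentially many feature subsets can collapse to linear time is the product-of-sums identity: if each subset's weight factorises into one term per included feature and one term per excluded feature, the sum over all 2^n subsets equals a product of n two-term sums. The sketch below checks that identity numerically. Whether this is exactly the construction used in the paper is an assumption, and the per-feature weights are made-up numbers.

```python
from itertools import product
from math import prod


def brute_force_subset_sum(include, exclude):
    """Sum over all 2^n subsets S of prod_{i in S} include[i] * prod_{i not in S} exclude[i]."""
    n = len(include)
    total = 0.0
    for mask in product((0, 1), repeat=n):
        total += prod(include[i] if mask[i] else exclude[i] for i in range(n))
    return total


def factorised_subset_sum(include, exclude):
    """Same quantity in O(n): the product over features of (include[i] + exclude[i])."""
    return prod(a + b for a, b in zip(include, exclude))


# Hypothetical per-feature weights (e.g. prior times likelihood contribution).
include_weights = [0.30, 1.20, 0.75, 0.05, 2.00]
exclude_weights = [0.70, 0.40, 0.25, 0.95, 0.10]
slow = brute_force_subset_sum(include_weights, exclude_weights)
fast = factorised_subset_sum(include_weights, exclude_weights)
```

The brute-force loop visits 32 subsets here and doubles with every added feature, while the factorised form needs one addition and one multiplication per feature.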