Disentangling Geometry and Appearance with Regularised Geometry-Aware Generative Adversarial Networks

Abstract

Deep generative models have significantly advanced image generation, enabling the generation of visually pleasing images with realistic texture. Apart from texture, it is the geometry of an object's shape that strongly dictates its appearance. However, the vast majority of available generative models do not incorporate geometric information into the image generation process, often yielding visual objects of degraded quality. In this paper, we introduce the Geometry-Aware Generative Adversarial Network (GAGAN), which disentangles the latent variables corresponding to appearance and shape. GAGAN thereby enables the generation of images with both realistic texture and realistic shape. Specifically, we condition the generator on a statistical shape prior. This prior is enforced by mapping the generated images onto a canonical coordinate frame through a differentiable geometric transformation. In addition to incorporating geometric information, this constrains the search space and increases the model's robustness. We show that our approach is versatile, generalising across domains (faces, sketches, hands and cats) and sample sizes (ranging from as few as ~200 to 30,000, up to more than 200,000). We demonstrate superior performance through extensive quantitative and qualitative experiments in a variety of tasks and settings. Finally, we leverage our model to automatically and accurately detect errors or drift in facial landmark detection and tracking in the wild.

Publication
In review at the International Journal of Computer Vision (IJCV)