Zero-Episode Few-Shot Contrastive Predictive Coding: Solving intelligence tests without prior training


Abstract:

Video prediction models often combine three components: an encoder from pixel space to a small latent space, a prediction model that operates in the latent space, and a generative model that maps back to pixel space. However, the large and unpredictable pixel space makes such models difficult to train and requires many training examples. We argue that finding a predictive latent variable and using it to evaluate the consistency of a future image enables data-efficient prediction, because it removes the need to train a generative model. To demonstrate this, we created sequence completion intelligence tests in which the task is to identify a predictably changing feature in a sequence of images and to use this prediction to select the subsequent image. We show that a one-dimensional Markov Contrastive Predictive Coding (M-CPC1D) model solves these tests efficiently, with only five examples. Finally, we demonstrate the usefulness of M-CPC1D in solving two tasks without prior training: anomaly detection and stochastic movement video prediction.
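The idea of selecting the next image by its consistency with a predicted latent value, rather than by generating pixels, can be illustrated with a toy sketch. This is not the M-CPC1D model from the paper: the encoder here is a fixed, hand-chosen feature (mean pixel intensity) instead of a learned one, the latent dynamics are assumed to be a constant step, and the names `encode`, `select_next`, and `flat_image` are hypothetical.

```python
from statistics import mean


def encode(image):
    # Hypothetical fixed encoder: the 1D latent is the mean pixel intensity.
    # In the paper the latent is learned; this stand-in is an assumption.
    return mean(pixel for row in image for pixel in row)


def select_next(sequence, candidates):
    # Encode the observed sequence into a 1D latent trajectory.
    z = [encode(img) for img in sequence]
    # Assumed Markov dynamics: the next latent is the last one plus the average step.
    step = mean(b - a for a, b in zip(z, z[1:]))
    z_pred = z[-1] + step
    # Score each candidate by how far its latent is from the prediction.
    errors = [abs(encode(c) - z_pred) for c in candidates]
    return errors.index(min(errors))


def flat_image(value, size=4):
    # Helper for the toy example: a uniform image with the given intensity.
    return [[value] * size for _ in range(size)]


# Five example frames whose brightness increases by 0.1 per frame.
sequence = [flat_image(v) for v in (0.1, 0.2, 0.3, 0.4, 0.5)]
candidates = [flat_image(v) for v in (0.2, 0.6, 0.9)]
choice = select_next(sequence, candidates)
print(choice)
```

No candidate image is ever generated here; each candidate is only encoded and compared against the predicted latent, which is the point the abstract makes about avoiding a generative model.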


Last updated on 01/25/2024