Neurocomputational modeling of human physical scene understanding

Abstract

Human scene understanding involves not just localizing objects, but also inferring latent attributes that affect how the scene might unfold, such as the masses of objects within the scene. These attributes can sometimes only be inferred from the dynamics of a scene, yet people flexibly integrate this information to update their inferences. Here we propose a neurally plausible Efficient Physical Inference model that can generate and update inferences from videos. This model makes inferences over the inputs to a generative model of physics and graphics, using an LSTM-based recognition network to efficiently approximate rational probabilistic conditioning. We find not only that this model rapidly and accurately recovers latent object information, but also that its inferences evolve as more information arrives in a way similar to human judgments. The model provides a testable hypothesis about the population-level activity in brain regions underlying physical reasoning.
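To make the recognition-network idea concrete, the sketch below shows an amortized-inference pattern of the kind the abstract describes: a recurrent network reads a sequence of observed object states and emits an evolving estimate of a latent attribute (here, mass). This is a minimal illustrative sketch, not the authors' implementation; the architecture sizes, the scalar readout, and all names (`LSTMRecognizer`, `infer`, `w_out`) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMRecognizer:
    """Toy LSTM recognition network: maps observed frames to a latent-mass
    estimate, updating the estimate as each new frame arrives (amortized
    approximate inference, rather than explicit probabilistic conditioning)."""

    def __init__(self, obs_dim, hidden_dim):
        k = obs_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: rng.normal(0, 0.1, (hidden_dim, k)) for g in "ifoc"}
        self.b = {g: np.zeros(hidden_dim) for g in "ifoc"}
        self.w_out = rng.normal(0, 0.1, hidden_dim)  # readout to a scalar estimate
        self.hidden_dim = hidden_dim

    def infer(self, observations):
        """Run over a (T, obs_dim) sequence; return one mass estimate per frame."""
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        estimates = []
        for x in observations:
            z = np.concatenate([x, h])
            i = sigmoid(self.W["i"] @ z + self.b["i"])  # input gate
            f = sigmoid(self.W["f"] @ z + self.b["f"])  # forget gate
            o = sigmoid(self.W["o"] @ z + self.b["o"])  # output gate
            g = np.tanh(self.W["c"] @ z + self.b["c"])  # cell candidate
            c = f * c + i * g
            h = o * np.tanh(c)
            estimates.append(self.w_out @ h)  # estimate refined frame by frame
        return np.array(estimates)

# 30 frames of toy object-state observations (e.g. position and velocity).
net = LSTMRecognizer(obs_dim=4, hidden_dim=16)
video = rng.normal(size=(30, 4))
masses = net.infer(video)
print(masses.shape)  # → (30,)
```

In a trained version of such a model, the per-frame estimates would sharpen as dynamics reveal more about the latent attribute, mirroring the human-like evolution of inferences described above.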

Publication
2018 Conference on Cognitive Computational Neuroscience