We propose a new way to answer probabilistic queries that span multiple datapoints. We formalize reasoning about the similarity of different datapoints as the evaluation of the Bayes Factor within a hierarchical deep generative model that enforces a separation between the latent variables used for representation learning and those used for reasoning. Under this model, we derive an intuitive estimator for the Bayes Factor that represents similarity as the amount of overlap in representation space shared by different points. The estimator we derive relies on a query-conditional latent reasoning network that parameterizes a distribution over the latent space of the deep generative model. The latent reasoning network is trained to amortize the posterior-predictive distribution under a hierarchical model using supervised data and a max-margin learning algorithm. We explore how the model may be used to focus the data variations captured in the latent space of the deep generative model and how this may be used to build new algorithms for few-shot learning.
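To make the "overlap in representation space" reading concrete, the following is a minimal sketch under simplifying assumptions, not the estimator derived in the paper. It assumes a single latent variable $z$ with prior $p(z)$ and two datapoints $x_1, x_2$ that are conditionally independent given $z$. The hypothesis labels $H_1$ (the two points share one latent) and $H_0$ (the two points have independent latents) are introduced here for illustration only, and the paper's hierarchical model, which separates representation and reasoning latents, is richer than this.

```latex
\begin{align}
\mathrm{BF}(x_1, x_2)
  &= \frac{p(x_1, x_2 \mid H_1)}{p(x_1, x_2 \mid H_0)}
   = \frac{\int p(x_1 \mid z)\, p(x_2 \mid z)\, p(z)\, dz}{p(x_1)\, p(x_2)} \\
  % Apply Bayes' rule to each likelihood: p(x_i | z) = p(z | x_i) p(x_i) / p(z).
  &= \int \frac{p(z \mid x_1)\, p(z \mid x_2)}{p(z)}\, dz .
\end{align}
```

Under these assumptions the Bayes Factor is large when the two posteriors $p(z \mid x_1)$ and $p(z \mid x_2)$ place mass on the same region of latent space, and it equals 1 when either posterior coincides with the prior, that is, when a datapoint carries no information about $z$.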