TY - GEN
T1 - Order-aware generative modeling using the 3D-craft dataset
AU - Chen, Zhuoyuan
AU - Srinet, Kavya
AU - Qi, Charles R.
AU - Fan, Haoqi
AU - Ma, Jerry
AU - Zitnick, Larry
AU - Guo, Demi
AU - Xiao, Tong
AU - Xie, Saining
AU - Chen, Xinlei
AU - Szlam, Arthur
AU - Tulsiani, Shubham
AU - Yu, Haonan
AU - Gray, Jonathan
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - In this paper, we study the problem of sequentially building houses in the game of Minecraft, and demonstrate that learning the ordering can make for more effective autoregressive models. Given a partially built house made by a human player, our system tries to place additional blocks in a human-like manner to complete the house. We introduce a new dataset, HouseCraft, for this new task. HouseCraft contains the sequential order in which 2,500 Minecraft houses were built from scratch by humans. The human action sequences enable us to learn an order-aware generative model called Voxel-CNN. In contrast to many generative models where the sequential generation ordering either does not matter (e.g. holistic generation with GANs), or is manually/arbitrarily set by simple rules (e.g. raster-scan order), our focus is on an ordered generation that imitates humans. To evaluate if a generative model can accurately predict human-like actions, we propose several novel quantitative metrics. We demonstrate that our Voxel-CNN model is simple and effective at this creative task, and can serve as a strong baseline for future research in this direction. The HouseCraft dataset and code with baseline models will be made publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85081899411&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081899411&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2019.00185
DO - 10.1109/ICCV.2019.00185
M3 - Conference contribution
AN - SCOPUS:85081899411
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 1764
EP - 1773
BT - Proceedings - 2019 International Conference on Computer Vision, ICCV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE/CVF International Conference on Computer Vision, ICCV 2019
Y2 - 27 October 2019 through 2 November 2019
ER -