The Microsoft Kinect captures both depth and color information and has been increasingly used for 3D modeling. However, prior facial modeling methods are either computationally intensive or generate rough results limited by the Kinect's low resolution and instability. In this paper, we propose a novel scheme for automatically and efficiently constructing a life-like, textured, high-resolution 3D model of the face of any user in front of a Kinect. The scheme comprises a sequence of steps: head region segmentation, depth and color image registration, resolution enhancement, and 3D model fairing. Compared to prior methods, our scheme offers a set of distinctive advantages: it remains robust even when the user is in a noisy environment; all processing is automatic, so users need not interactively select feature points; and its energy optimization step is efficient enough for fast processing of large-scale dynamic images.
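As a rough orientation to the four-stage pipeline named above, the following is a minimal sketch in Python/NumPy. Every function name, parameter, and body here is a simplified stand-in of our own invention (e.g. a fixed depth-range threshold for segmentation, nearest-neighbor upsampling for resolution enhancement, Laplacian smoothing for fairing), not the paper's actual algorithms.

```python
import numpy as np

def segment_head(depth, near_mm=400, far_mm=1200):
    """Placeholder head segmentation: keep pixels in a plausible depth band (mm)."""
    return (depth > near_mm) & (depth < far_mm)

def register_depth_to_color(depth, mask):
    """Placeholder depth-to-color registration: identity mapping, masked."""
    return np.where(mask, depth, 0)

def enhance_resolution(depth, factor=2):
    """Placeholder resolution enhancement: nearest-neighbor upsampling."""
    return np.repeat(np.repeat(depth, factor, axis=0), factor, axis=1)

def fair_surface(depth, iterations=3):
    """Placeholder model fairing: 4-neighbor Laplacian smoothing."""
    d = depth.astype(float)
    for _ in range(iterations):
        d = 0.25 * (np.roll(d, 1, axis=0) + np.roll(d, -1, axis=0)
                    + np.roll(d, 1, axis=1) + np.roll(d, -1, axis=1))
    return d

def build_face_model(depth):
    """Chain the four stages on a raw depth frame."""
    mask = segment_head(depth)
    registered = register_depth_to_color(depth, mask)
    upsampled = enhance_resolution(registered)
    return fair_surface(upsampled)
```

A call such as `build_face_model(depth_frame)` on a raw Kinect depth array would then yield an upsampled, smoothed depth map; the paper's actual method replaces each placeholder with its own segmentation, registration, super-resolution, and energy-based fairing steps.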