Using Kinect sensors to monitor patients and provide feedback while they perform intervention or rehabilitation exercises is an emerging trend in healthcare. However, the joint positions measured by the Kinect sensor are often unreliable, especially for joints that are occluded by other parts of the body. Motion capture (MOCAP) systems that use multiple cameras at different view angles can accurately track marker positions on the patient, but such systems are costly and inconvenient for patients. In this work, we simultaneously capture the joint positions with both a Kinect sensor and a MOCAP system during a training stage, and train a Gaussian Process regression model to map the noisy Kinect measurements to the more accurate MOCAP measurements. To deal with the inherent variations in limb lengths and body postures among different people, we further propose a joint standardization method, which translates the raw joint positions of different people into a standard coordinate system in which the distance between each pair of adjacent joints is kept at a reference distance. Our experiments show that the Kinect measurements denoised by the proposed method are more accurate than those produced by several benchmark methods.
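As an illustration of the joint standardization idea, the following is a minimal sketch under stated assumptions, not the method as implemented in this work. It assumes the skeleton is a tree given by a parent index per joint, that the root joint is placed at the origin of the standard coordinate system, and that each bone keeps its measured direction while its length is reset to a reference value. The names `standardize_joints`, `parents`, and `ref_lengths` are hypothetical and chosen only for this example.

```python
import numpy as np

def standardize_joints(joints, parents, ref_lengths):
    """Rescale a skeleton so every bone has a reference length.

    joints      : (J, 3) array of raw joint positions.
    parents     : length-J sequence; parents[j] is the index of the joint
                  adjacent to j on the path to the root, or -1 for the root.
                  Assumption: a parent always precedes its children.
    ref_lengths : length-J sequence; ref_lengths[j] is the reference distance
                  between joint j and its parent (ignored for the root).
    """
    joints = np.asarray(joints, dtype=float)
    out = np.zeros_like(joints)  # the root stays at the origin
    for j, p in enumerate(parents):
        if p < 0:
            continue  # root joint: nothing to rescale
        bone = joints[j] - joints[p]
        norm = np.linalg.norm(bone)
        # Keep the measured bone direction; guard against a zero-length bone.
        direction = bone / norm if norm > 0 else np.zeros(3)
        out[j] = out[p] + ref_lengths[j] * direction
    return out

# Hypothetical three-joint chain with bone lengths 2 and 3.
raw = np.array([[1.0, 1.0, 1.0], [1.0, 3.0, 1.0], [4.0, 3.0, 1.0]])
standardized = standardize_joints(raw, parents=[-1, 0, 1], ref_lengths=[0.0, 1.0, 1.0])
```

In this sketch, two people who hold the same pose but have different limb lengths map to the same standardized skeleton, which is the property that would let a single regression model be shared across subjects.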