In many practical situations, robots may encounter objects moving in their workspace, with undesirable consequences for either the robots or the moving objects. Such situations often call for sensing arrangements that produce planar images along with depth measurements, e.g., Kinect sensors, to estimate the position of the moving object in 3-D space. In this paper, we estimate the relative distance of a moving object along the axis orthogonal to the camera lens plane, thus relaxing the need to rely on depth measurements, which are often noisy when the object is too close to the sensor. Specifically, multiple images of an object at distinct orthogonal distances are first captured. In this step, the object's distance from the camera is measured and its normalized area, i.e., the normalized sum of pixels belonging to the object, is computed. Both the computed normalized area and the measured distance are filtered using a Gaussian smoothing filter (GSF). Next, a Bayesian statistical model is developed that maps the computed normalized area to the measured distance. The resulting Bayesian linear model predicts the distance between the camera sensor (or robot) and the object given the object's normalized area obtained from the 2-D images. To evaluate the performance of the relative distance estimation process, a test stand was built that consists of a robot equipped with a camera. During the learning process of the statistical model, an ultrasonic sensor was used to measure the distance corresponding to each captured image. After the model was learned, the ultrasonic sensor was removed, and excellent performance was achieved when the developed model was used to estimate the distance of an object, a human hand carrying a measuring tape, moving back and forth along the axis normal to the camera plane.
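The pipeline summarized above (smooth the normalized-area and distance signals with a GSF, then fit a Bayesian linear model that predicts distance from area) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the synthetic data, the inverse-square relation between area and distance, the choice of `1/sqrt(area)` as the regression feature, and the prior/noise precisions `alpha` and `beta` are all assumptions made here for the sake of a runnable example.

```python
import numpy as np

def gaussian_smooth(x, sigma=2.0):
    """1-D Gaussian smoothing filter (GSF) with edge-replicated padding."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    xp = np.pad(x, radius, mode="edge")          # avoid zero-padding bias
    return np.convolve(xp, kernel, mode="valid")

def bayes_linear_fit(phi, y, alpha=1e-2, beta=25.0):
    """Posterior mean/covariance of weights for y = phi @ w + noise,
    with prior w ~ N(0, alpha^{-1} I) and noise precision beta."""
    s_inv = alpha * np.eye(phi.shape[1]) + beta * phi.T @ phi
    s = np.linalg.inv(s_inv)
    m = beta * s @ phi.T @ y
    return m, s

def bayes_predict(phi, m, s, beta=25.0):
    """Predictive mean and variance of distance for new feature rows."""
    mean = phi @ m
    var = 1.0 / beta + np.sum((phi @ s) * phi, axis=1)
    return mean, var

# Synthetic stand-in for the learning data: measured distances (metres)
# and normalized areas following an assumed inverse-square law.
rng = np.random.default_rng(0)
d_true = np.linspace(0.3, 1.5, 60)
area = 0.05 / d_true**2 + rng.normal(0.0, 1e-3, d_true.size)
d_meas = d_true + rng.normal(0.0, 5e-3, d_true.size)   # noisy ultrasonic reading

area_s = gaussian_smooth(area)        # GSF on normalized area
d_s = gaussian_smooth(d_meas)         # GSF on measured distance

# Feature 1/sqrt(area) is linear in distance under the inverse-square assumption.
phi = np.column_stack([np.ones_like(area_s), area_s ** -0.5])
m, s = bayes_linear_fit(phi, d_s)

# After "learning", distance is predicted from the area alone.
d_hat, d_var = bayes_predict(phi, m, s)
```

The predictive variance `d_var` is what the Bayesian treatment buys over a plain least-squares fit: it widens where the training areas are sparse, giving the robot a confidence measure alongside each distance estimate.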