TY - GEN
T1 - Discovering the hidden structure of house prices with a non-parametric latent manifold model
AU - Chopra, Sumit
AU - Thampy, Trivikraman
AU - Leahy, John
AU - Caplin, Andrew
AU - LeCun, Yann
PY - 2007
Y1 - 2007
N2 - In many regression problems, the variable to be predicted depends not only on a sample-specific feature vector, but also on an unknown (latent) manifold that must satisfy known constraints. An example is house prices, which depend on the characteristics of the house, and on the desirability of the neighborhood, which is not directly measurable. The proposed method comprises two trainable components. The first one is a parametric model that predicts the "intrinsic" price of the house from its description. The second one is a smooth, non-parametric model of the latent "desirability" manifold. The predicted price of a house is the product of its intrinsic price and desirability. The two components are trained simultaneously using a deterministic form of the EM algorithm. The model was trained on a large dataset of houses from Los Angeles County. It produces better predictions than pure parametric and non-parametric models. It also produces useful estimates of the desirability surface at each location.
AB - In many regression problems, the variable to be predicted depends not only on a sample-specific feature vector, but also on an unknown (latent) manifold that must satisfy known constraints. An example is house prices, which depend on the characteristics of the house, and on the desirability of the neighborhood, which is not directly measurable. The proposed method comprises two trainable components. The first one is a parametric model that predicts the "intrinsic" price of the house from its description. The second one is a smooth, non-parametric model of the latent "desirability" manifold. The predicted price of a house is the product of its intrinsic price and desirability. The two components are trained simultaneously using a deterministic form of the EM algorithm. The model was trained on a large dataset of houses from Los Angeles County. It produces better predictions than pure parametric and non-parametric models. It also produces useful estimates of the desirability surface at each location.
KW - Energy-based models
KW - Expectation maximization
KW - Latent manifold models
KW - Structured prediction
UR - http://www.scopus.com/inward/record.url?scp=36849089102&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36849089102&partnerID=8YFLogxK
U2 - 10.1145/1281192.1281214
DO - 10.1145/1281192.1281214
M3 - Conference contribution
AN - SCOPUS:36849089102
SN - 1595936092
SN - 9781595936097
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 173
EP - 182
BT - KDD-2007
T2 - KDD-2007: 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Y2 - 12 August 2007 through 15 August 2007
ER -