Efficient large-scale distributed training of conditional maximum entropy models

Gideon Mann, Ryan McDonald, Mehryar Mohri, Nathan Silberman, Daniel D. Walker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Training conditional maximum entropy models on massive data sets requires significant computational resources. We examine three common distributed training methods for conditional maxent: a distributed gradient computation method, a majority vote method, and a mixture weight method. We analyze and compare the CPU and network time complexity of each of these methods and present a theoretical analysis of conditional maxent models, including a study of the convergence of the mixture weight method, the most resource-efficient technique. We also report the results of large-scale experiments comparing these three methods, which demonstrate the benefits of the mixture weight method: this method consumes fewer resources while achieving performance comparable to that of standard approaches.
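
The mixture weight method described in the abstract amounts to training a conditional maxent model independently on each data shard and then averaging the learned parameter vectors. The sketch below is an illustrative reconstruction, not the authors' code: it uses plain binary logistic regression with batch gradient descent, uniform mixture weights, and arbitrary shard count, step size, and regularization settings chosen only for demonstration.

```python
# Minimal sketch of the mixture weight strategy (assumed setup, not the paper's code):
# train an L2-regularized logistic regression model per shard, then average parameters.
import numpy as np

def train_maxent_shard(X, y, reg=1.0, lr=0.1, epochs=100):
    """Binary logistic regression on one shard via batch gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted probabilities
        grad = X.T @ (p - y) / len(y) + reg * w    # gradient of regularized log-loss
        w -= lr * grad
    return w

def mixture_weight_train(shards, reg=1.0):
    """Train each shard independently and return the uniform average of parameters."""
    weights = [train_maxent_shard(X, y, reg) for X, y in shards]
    return np.mean(weights, axis=0)

# Illustrative usage on synthetic data split into 4 shards.
rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 20))
y = (X @ rng.normal(size=20) > 0).astype(float)
shards = [(X[i::4], y[i::4]) for i in range(4)]
w_mix = mixture_weight_train(shards)
```

Because each shard is trained without communication and only the final parameter vectors are exchanged, this approach uses far less network time than per-iteration distributed gradient aggregation, which is the resource advantage the paper analyzes.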

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference
Publisher: Neural Information Processing Systems
Pages: 1231-1239
Number of pages: 9
ISBN (Print): 9781615679119
State: Published - 2009
Event: 23rd Annual Conference on Neural Information Processing Systems, NIPS 2009 - Vancouver, BC, Canada
Duration: Dec 7, 2009 – Dec 10, 2009

Publication series

Name: Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference

Other

Other: 23rd Annual Conference on Neural Information Processing Systems, NIPS 2009
Country/Territory: Canada
City: Vancouver, BC
Period: 12/7/09 – 12/10/09

ASJC Scopus subject areas

  • Information Systems
