TY - GEN
T1 - Aggregation Techniques in Crowdsourcing
T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
AU - Difallah, Djellel
AU - Checco, Alessandro
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/10/26
Y1 - 2021/10/26
N2 - Crowdsourcing has been leveraged in various tasks and applications, primarily to gather information from human annotators in exchange for a monetary reward. The main challenge associated with crowdsourcing is the low quality of the results, which can stem from multiple causes, including bias, error, and adversarial behavior. Researchers and practitioners can apply quality control methods to prevent and detect low-quality responses. For example, worker selection methods utilize qualifications and attention-check questions before assigning a task. Similarly, task routing uses recommender system techniques to identify the workers who can provide a more accurate response to a given task type. In practice, posterior quality control methods are the most common approach to dealing with noisy labels once they are obtained. Such methods require task repetition, i.e., assigning the task to multiple crowd workers, followed by an aggregation mechanism (a.k.a. truth inference) to select the most likely answer or request an additional label. A large number of techniques have been proposed for crowdsourcing aggregation, covering several task types. This tutorial aims to present common and recent label aggregation techniques for multiple-choice questions, multi-class labels, ratings, pairwise comparisons, and image/text annotation. We believe that the audience will benefit from the focus on this specific research area to learn about the best techniques to apply in their crowdsourcing projects.
AB - Crowdsourcing has been leveraged in various tasks and applications, primarily to gather information from human annotators in exchange for a monetary reward. The main challenge associated with crowdsourcing is the low quality of the results, which can stem from multiple causes, including bias, error, and adversarial behavior. Researchers and practitioners can apply quality control methods to prevent and detect low-quality responses. For example, worker selection methods utilize qualifications and attention-check questions before assigning a task. Similarly, task routing uses recommender system techniques to identify the workers who can provide a more accurate response to a given task type. In practice, posterior quality control methods are the most common approach to dealing with noisy labels once they are obtained. Such methods require task repetition, i.e., assigning the task to multiple crowd workers, followed by an aggregation mechanism (a.k.a. truth inference) to select the most likely answer or request an additional label. A large number of techniques have been proposed for crowdsourcing aggregation, covering several task types. This tutorial aims to present common and recent label aggregation techniques for multiple-choice questions, multi-class labels, ratings, pairwise comparisons, and image/text annotation. We believe that the audience will benefit from the focus on this specific research area to learn about the best techniques to apply in their crowdsourcing projects.
KW - crowdsourcing
KW - label aggregation
KW - pairwise comparison
KW - quality control
KW - rating aggregations
KW - truth inference
UR - http://www.scopus.com/inward/record.url?scp=85119173381&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119173381&partnerID=8YFLogxK
U2 - 10.1145/3459637.3482032
DO - 10.1145/3459637.3482032
M3 - Conference contribution
AN - SCOPUS:85119173381
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 4842
EP - 4844
BT - CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 1 November 2021 through 5 November 2021
ER -