TY - GEN
T1 - Characterizing discussions in the Spanish Wikipedia
AU - Torres, Johnny
AU - Ochoa, Alfonsina
AU - Jimenez, Alberto
AU - Garcia, Sixto
AU - Pelaez, Enrique
AU - Ochoa, Xavier
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2018/1/4
Y1 - 2018/1/4
N2 - Wikipedia, the largest online encyclopedia, is edited collaboratively by hundreds of users. The content of some articles can be disputed, giving rise to discussions that are recorded in the associated talk pages. In this paper, we propose an annotation schema for Spanish Wikipedia talk pages in order to determine the types of opinions expressed in them. We apply the annotation schema to a corpus of discussions about 148 topics drawn from 25 Spanish Wikipedia talk pages. We make the resulting dataset publicly available for download on GitHub. Furthermore, we train and evaluate supervised machine learning models to automatically identify the annotation labels. A linear support vector classifier (LinearSVC) outperforms the other baseline models, achieving F1 = 0.71 in our experiments.
KW - Collaborative Writing
KW - NLP
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=85045744307&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045744307&partnerID=8YFLogxK
U2 - 10.1109/ETCM.2017.8247544
DO - 10.1109/ETCM.2017.8247544
M3 - Conference contribution
AN - SCOPUS:85045744307
T3 - 2017 IEEE 2nd Ecuador Technical Chapters Meeting, ETCM 2017
SP - 1
EP - 6
BT - 2017 IEEE 2nd Ecuador Technical Chapters Meeting, ETCM 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE Ecuador Technical Chapters Meeting, ETCM 2017
Y2 - 16 October 2017 through 20 October 2017
ER -