TY - JOUR
T1 - Acronym expander at SDU@AAAI-21
T2 - 2021 Workshop on Scientific Document Understanding, SDU 2021
AU - Pereira, João L.M.
AU - Galhardas, Helena
AU - Shasha, Dennis
N1 - Funding Information:
Pereira’s work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia), under the PhD Scholarship SFRH/BD/135719/2018. Furthermore, Pereira and Galhardas’ work was supported by national funds through FCT under the project UIDB/50021/2020.
Funding Information:
The server virtual machine used to run the experiments was supported by BioData.pt – Infraestrutura Portuguesa de Dados Biológicos, project 22231/01/SAICT/2016, funded by Portugal 2020.
Funding Information:
Shasha’s work has been partly supported by (i) the New York University Abu Dhabi Center for Interacting Urban Networks (CITIES), funded by Tamkeen under the NYUAD Research Institute Award CG001 and by the Swiss Re Institute under the Quantum Cities initiative, (ii) NYU Wireless, and (iii) U.S. National Science Foundation grants 1934388, 1840761, and 1339362.
Publisher Copyright:
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2021
Y1 - 2021
AB - In order to properly determine which of several possible meanings an acronym A in sentence s has, any system that aims to find the correct meaning for A must understand the context of s. This paper describes the techniques we use for that problem for the SDU@AAAI benchmark in which context was provided in the form of sentences in which acronym A is present and defined. As a capsule summary of our results, Support Vector Machines with Doc2Vec techniques achieve a higher Macro F1-Measure score than Cosine similarity with Classic Context Vector techniques. Although these techniques usually work better with documents (i.e., many sentences rather than the one sentence offered in this benchmark), they achieved scores of Macro F1-Measure 86-89%. While these results were 5.65% worse than the best in the benchmark experiment, the high speed of our approach (max 0.6 seconds on average per sentence on a virtual machine allocated with 4 CPU cores and 32GB of RAM in a shared server) and the possibility that our methods are complementary to those of other groups may lead to high performance hybrid systems.
UR - http://www.scopus.com/inward/record.url?scp=85103101981&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103101981&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85103101981
SN - 1613-0073
VL - 2831
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 9 February 2021
ER -