TY - GEN
T1 - QLUE
T2 - 31st ACM World Wide Web Conference, WWW 2022
AU - Hashmi, Waleed
AU - Chaqfeh, Moumena
AU - Subramanian, Lakshminarayanan
AU - Zaki, Yasir
N1 - Funding Information:
This work was partially supported by the National Natural Science Foundation of China under Grant No. 61872369 and 61832017, Beijing Outstanding Young Scientist Program under Grant No. BJJWZYJH012019100020098, the Outstanding Innovative Talents Cultivation Funded Programs 2021 and Public Computing Cloud, Renmin University of China. This work is supported by Beijing Academy of Artificial Intelligence (BAAI). Xin Zhao is the corresponding author.
Publisher Copyright:
© 2022 ACM.
PY - 2022/4/25
Y1 - 2022/4/25
N2 - The increasing complexity of the web has attracted a number of solutions that offer optimized versions of web pages that are lighter to process and faster to load. These solutions have been quantitatively evaluated to show significant speed-ups in load times and/or considerable savings in bandwidth/memory consumption. However, while these solutions often produce optimized versions from existing pages, they rarely evaluate the impact of their optimizations on the original content and functionality. Additionally, because there is no unified metric for evaluating the similarity of the generated pages to the original pages, it is not yet possible to fairly compare the results obtained from different user-study campaigns without recruiting the exact same users, which is extremely challenging. In this paper, we demonstrate the lack of qualitative evaluation metrics and propose QLUE (QuaLitative Uniform Evaluation), a tool that uses computer vision to automate the qualitative evaluation of web pages generated by web complexity solutions with respect to their original versions. QLUE evaluates the content and the functionality of these pages separately using two metrics: QLUE's Structural Similarity, to assess the former, and QLUE's Functional Similarity, to assess the latter - a task that has proven challenging for humans given the complex functional dependencies in modern pages. Our results show that QLUE computes content and functional scores comparable to those provided by humans. Specifically, 90% of a set of 100 pages were given a similarity score between 90% and 100% by human evaluators, while QLUE gives similar scores for more than 75% of the same pages. In terms of running time, QLUE can evaluate an optimized web page in a few minutes.
AB - The increasing complexity of the web has attracted a number of solutions that offer optimized versions of web pages that are lighter to process and faster to load. These solutions have been quantitatively evaluated to show significant speed-ups in load times and/or considerable savings in bandwidth/memory consumption. However, while these solutions often produce optimized versions from existing pages, they rarely evaluate the impact of their optimizations on the original content and functionality. Additionally, because there is no unified metric for evaluating the similarity of the generated pages to the original pages, it is not yet possible to fairly compare the results obtained from different user-study campaigns without recruiting the exact same users, which is extremely challenging. In this paper, we demonstrate the lack of qualitative evaluation metrics and propose QLUE (QuaLitative Uniform Evaluation), a tool that uses computer vision to automate the qualitative evaluation of web pages generated by web complexity solutions with respect to their original versions. QLUE evaluates the content and the functionality of these pages separately using two metrics: QLUE's Structural Similarity, to assess the former, and QLUE's Functional Similarity, to assess the latter - a task that has proven challenging for humans given the complex functional dependencies in modern pages. Our results show that QLUE computes content and functional scores comparable to those provided by humans. Specifically, 90% of a set of 100 pages were given a similarity score between 90% and 100% by human evaluators, while QLUE gives similar scores for more than 75% of the same pages. In terms of running time, QLUE can evaluate an optimized web page in a few minutes.
KW - Functional Similarity
KW - QLUE
KW - Structural Similarity
KW - Uniform Qualitative Evaluation
KW - Web Pages
UR - http://www.scopus.com/inward/record.url?scp=85129818809&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129818809&partnerID=8YFLogxK
U2 - 10.1145/3485447.3512112
DO - 10.1145/3485447.3512112
M3 - Conference contribution
AN - SCOPUS:85129818809
T3 - WWW 2022 - Proceedings of the ACM Web Conference 2022
SP - 2400
EP - 2410
BT - WWW 2022 - Proceedings of the ACM Web Conference 2022
PB - Association for Computing Machinery, Inc
Y2 - 25 April 2022 through 29 April 2022
ER -