TY - GEN
T1 - Exploring the Knowledge Mismatch Hypothesis
T2 - 11th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024
AU - Wee, Phil
AU - Baghdadi, Riyadh
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Recently, there has been an explosion of large language models created through fine-tuning with data from larger models. These small models are able to produce outputs that appear qualitatively similar to those of significantly larger models. However, one of the key limitations observed with these models is their propensity to hallucinate significantly more often than larger models. In particular, they have been observed to generate coherent outputs that contain factually incorrect information and spread misinformation, toxicity, and stereotypes. There are many potential causes of hallucination; one hypothesis is that fine-tuning a model on data produced by a larger model leads to a knowledge mismatch that contributes to hallucination. In particular, it is hypothesized that there is a mismatch between the knowledge that is fed to the model to fine-tune it and the knowledge that is already present in the graph. Fine-tuning the model on data with such a mismatch could contribute to an increased propensity to hallucinate. We show that, on an unseen test set, a smaller model fine-tuned on data generated by a larger model produced more wrong answers than models fine-tuned on data created by the small model itself, which confirms the hypothesis.
AB - Recently, there has been an explosion of large language models created through fine-tuning with data from larger models. These small models are able to produce outputs that appear qualitatively similar to those of significantly larger models. However, one of the key limitations observed with these models is their propensity to hallucinate significantly more often than larger models. In particular, they have been observed to generate coherent outputs that contain factually incorrect information and spread misinformation, toxicity, and stereotypes. There are many potential causes of hallucination; one hypothesis is that fine-tuning a model on data produced by a larger model leads to a knowledge mismatch that contributes to hallucination. In particular, it is hypothesized that there is a mismatch between the knowledge that is fed to the model to fine-tune it and the knowledge that is already present in the graph. Fine-tuning the model on data with such a mismatch could contribute to an increased propensity to hallucinate. We show that, on an unseen test set, a smaller model fine-tuned on data generated by a larger model produced more wrong answers than models fine-tuned on data created by the small model itself, which confirms the hypothesis.
KW - evaluation
KW - fine-tuning
KW - hallucination
KW - large language models
UR - http://www.scopus.com/inward/record.url?scp=105003202381&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105003202381&partnerID=8YFLogxK
U2 - 10.1109/BDCAT63179.2024.00048
DO - 10.1109/BDCAT63179.2024.00048
M3 - Conference contribution
AN - SCOPUS:105003202381
T3 - Proceedings - 2024 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024
SP - 258
EP - 263
BT - Proceedings - 2024 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 December 2024 through 19 December 2024
ER -