| Kassner, Nora (2025): Consistency and completeness of knowledge acquired by language models. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik |
PDF: Kassner_Nora.pdf (6 MB)
Abstract
The introduction of Deep Learning-based pretrained Language Models (LMs) has brought large improvements throughout Natural Language Processing (NLP). Pretrained in an unsupervised fashion to predict missing words in incomplete pieces of text, they can subsequently be adapted to any task of interest, replacing the need to develop and train specialized architectures. Their ability to excel is rooted in the fact that pretrained LMs store a multitude of task-relevant knowledge parametrically. This knowledge is not limited to linguistic capabilities: pretrained LMs have also been shown to acquire significant amounts of world and commonsense knowledge.

We explore two aspects of the world and commonsense knowledge acquired by LMs: first, consistency, that is, we expect a model's behavior to be consistent across a set of implied queries; second, completeness with respect to the factual queries that may be put to the model.

We analyze knowledge consistency with respect to negation, adversarial distractors, multilinguality, paraphrasing and commonsense reasoning, and find that although the LMs under investigation contain significant amounts of world knowledge, they are prone to answering factual implications inconsistently and self-contradictorily. As a result, it can be hard to identify what the model actually "believes" about the world. Building on this, we develop a new architecture in which an LM is enhanced with a "symbolic executive": an evolving, symbolic memory of prior beliefs. For new incoming queries, the LM can reflect back on related beliefs, enabling it to improve over time.

To improve knowledge completeness, we explore different ways of integrating knowledge not acquired by the model: i) we enhance an LM with a retrieval component over external knowledge sources for improved Question Answering; ii) we explore LMs' reasoning capabilities during pretraining to deduce knowledge not explicitly seen; iii) we build a model that integrates novel entities into LM-based Entity Linking systems.

In analyzing and improving knowledge consistency and completeness, this thesis takes a significant step towards LM-based architectures with a systematic notion of belief, enabling them to construct a more coherent picture of the world and to improve over time without model retraining.
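The "symbolic executive" described in the abstract can be pictured as an evolving symbolic store of beliefs that is consulted before a new answer is committed, so that direct contradictions with prior beliefs are surfaced rather than silently emitted. The following is a minimal sketch of that idea only; all names (`Belief`, `BeliefBank`) and the triple-based representation are illustrative assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch of an evolving symbolic memory of beliefs.
# Assumption: beliefs are (subject, relation, object) triples with a truth
# value; this is NOT the representation used in the dissertation itself.
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    subject: str
    relation: str
    obj: str
    holds: bool  # True for an affirmed fact, False for its negation


class BeliefBank:
    """Evolving symbolic store of prior beliefs (illustrative only)."""

    def __init__(self):
        # Maps a fact triple to the truth value currently believed.
        self._beliefs: dict[tuple[str, str, str], bool] = {}

    def add(self, belief: Belief):
        """Record a belief; return the contradicting prior belief, if any."""
        key = (belief.subject, belief.relation, belief.obj)
        prior = self._beliefs.get(key)
        if prior is not None and prior != belief.holds:
            # Same fact, opposite truth value: flag instead of overwriting,
            # so the system can resolve the conflict explicitly.
            return Belief(*key, holds=prior)
        self._beliefs[key] = belief.holds
        return None


bank = BeliefBank()
bank.add(Belief("robin", "is_a", "bird", True))          # stored, no conflict
conflict = bank.add(Belief("robin", "is_a", "bird", False))  # contradicts prior
```

In this sketch, `conflict` carries the previously stored belief, giving the surrounding system a hook to reconcile the two answers instead of behaving self-contradictorily across related queries.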
| Document type: | Dissertations (Dissertation, LMU München) |
|---|---|
| Subject areas: | 000 Computer science, information and general works > 004 Computer science |
| Faculties: | Fakultät für Mathematik, Informatik und Statistik |
| Language of thesis: | English |
| Date of oral examination: | 14 August 2025 |
| First referee: | Schütze, Hinrich |
| MD5 checksum of the PDF file: | a3851a2cbfe5f7bafa5024665c521292 |
| Shelf mark of the printed edition: | 0001/UMC 31712 |
| ID Code: | 36004 |
| Deposited on: | 06 Feb 2026 15:00 |
| Last modified: | 06 Feb 2026 15:00 |