ENHANCING NLP CAPABILITIES: STRATEGIES FOR LANGUAGE MODEL ADAPTATION IN LOW-RESOURCE TEXT CLASSIFICATION TASK AND EVALUATIONS
Xu, Hanzi
Genre
Thesis/Dissertation
Date
2024-12
Department
Computer and Information Science
Files
Xu_temple_0225E_15994.pdf
Adobe PDF, 3.43 MB
DOI
https://doi.org/10.34944/59cx-ae55
Abstract
Today, there are two main approaches to solving classification tasks in Natural Language Processing (NLP). The traditional approach adapts smaller pre-trained language models (BERT, RoBERTa, etc.) to specific downstream tasks, which offers both remarkable opportunities and significant challenges. While these models have been pivotal in achieving state-of-the-art results across numerous NLP tasks, their dependence on extensive annotated datasets for fine-tuning poses a substantial barrier, particularly in resource-scarce scenarios. The other approach is enabled by the emergent capabilities of massive-scale LLMs in recent years, where classification tasks are solved by a one-for-all, general-purpose auto-regressive model (GPT, Llama, etc.). However, the strong performance of these models is often overstated, as they fail to exhibit the expected comprehension of the task.
To address these challenges, we propose three innovative methodologies. First, we introduce “OpenStance”, a novel stance detection system that operates effectively in a zero-shot setting. By leveraging a unique masking mechanism for weak supervision and utilizing existing textual entailment datasets for indirect supervision, OpenStance can handle open-domain topics and generalize across multiple domains without the need for extensive annotated data. Second, we present “X-shot”, a robust classification system that addresses the challenges of label variability in real-world applications. It handles frequent-shot, few-shot, and zero-shot classification problems simultaneously, employing a flexible framework that adapts to the frequency of label occurrences and manages labels across the full spectrum of availability. X-shot shows superior performance across diverse domains and label distributions. Third, we propose “KNOW-NO”, a new benchmark for evaluating the performance of generative LLMs on classification tasks, especially when gold labels are absent.
This benchmark, along with a new evaluation metric called “OMNIACCURACY”, reveals the limitations of LLMs when they are forced to select from available label candidates, even when none are correct. This approach provides a more accurate assessment of LLMs’ performance in classification tasks, both when gold labels are present and absent.
This dissertation proposes innovative methodologies that minimize the effort of adapting traditional LLMs to various classification tasks, and it also proposes a novel evaluation metric to accurately assess the human-level discrimination intelligence of the newest LLMs on classification tasks. These methodologies aim to enhance the utility and discrimination ability of different generations of LLMs in the NLP domain, setting a foundation for future advancements in text classification.
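To make the indirect-supervision idea behind OpenStance concrete, the sketch below shows the general entailment-based zero-shot classification pattern: each candidate label is verbalized into a hypothesis, and the label whose hypothesis is best entailed by the input text wins. This is an illustration of the technique, not the dissertation's actual implementation; a real system would score each (text, hypothesis) pair with an NLI model trained on textual entailment data, whereas `entailment_score` here is a toy word-overlap stand-in so the control flow is runnable.

```python
def entailment_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model's P(entailment | premise, hypothesis).

    A real system would replace this with a model fine-tuned on textual
    entailment datasets; word overlap is used here only for illustration.
    """
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    return len(premise_words & hypothesis_words) / max(len(hypothesis_words), 1)


def zero_shot_classify(text: str, labels: list[str],
                       template: str = "This text expresses a {} stance.") -> str:
    """Pick the label whose verbalized hypothesis the text best entails.

    Because labels are turned into natural-language hypotheses, no
    label-specific annotated training data is needed (zero-shot).
    """
    hypotheses = {label: template.format(label) for label in labels}
    return max(labels, key=lambda label: entailment_score(text, hypotheses[label]))
```

The key design point is that the label set is open: any new topic or stance label can be handled at inference time simply by verbalizing it into a hypothesis, which is what lets entailment-supervised systems generalize to open-domain topics.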
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
License
IN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available.
