In the realm of natural language processing (NLP), text classification constitutes a task of paramount significance for large language models (LLMs). Nevertheless, extant methodologies predominantly depend on the output generated by the final layer of LLMs, thereby neglecting the wealth of information encapsulated within neurons residing in intermediate layers. To surmount this shortcoming, we introduce LENS (Linear Exploration and Neuron Selection), an innovative technique designed to identify and sparsely integrate salient neurons from intermediate layers via a process of linear exploration. Subsequently, these neurons are transmitted to downstream modules dedicated to text classification. This strategy effectively mitigates noise originating from non-pertinent neurons, thereby enhancing both the accuracy and computational efficiency of the model. The detection of telecommunication fraud text represents a formidable challenge within NLP, primarily attributed to its increasingly covert nature and the inherent limitations of current detection algorithms. In an effort to tackle the challenges of data scarcity and suboptimal classification accuracy, we have developed the LENS-RMHR (Linear Exploration and Neuron Selection with RoBERTa, Multi-head Mechanism, and Residual Connections) model, which extends the LENS framework. By incorporating RoBERTa, a multi-head attention mechanism, and residual connections, the LENS-RMHR model augments the feature representation capabilities and improves training efficiency. Utilizing the CCL2023 telecommunications fraud dataset as a foundation, we have constructed an expanded dataset encompassing eight distinct categories that encapsulate a diverse array of fraud types. Furthermore, a dual-loss function has been employed to bolster the model’s performance in multiclass classification scenarios. Experimental results reveal that LENS-RMHR demonstrates superior performance across multiple benchmark datasets, underscoring its extensive potential for application in the domains of text classification and telecommunications fraud detection.
Loading....