Background: Biomedical named entity recognition (BioNER) is a fundamental task in biomedical literature mining, and its accuracy affects the performance of downstream tasks. Most BioNER models rely on domain-specific features or hand-crafted rules, but extracting features from massive data requires considerable time and human effort. To address this, neural network models are used to learn features automatically. Recently, multi-task learning has been applied successfully to neural network models for biomedical literature mining. For BioNER models, multi-task learning makes use of features from multiple datasets and improves model performance.
Results: In experiments, we compared our proposed model with other multi-task models and found that it outperformed them on datasets in the gene, protein, and disease categories. We also tested different dataset pairs to identify the best partner for each dataset. In addition, we explored and analyzed the influence of different entity types by using sub-datasets. When dataset size was reduced, our model still produced positive results.
Conclusion: We propose a novel multi-task model for BioNER with a cross-sharing structure to improve the performance of multi-task models. The cross-sharing structure in our model makes use of features from both datasets during training. Detailed analysis of the best dataset partners and of the influence between entity categories can provide guidance for choosing suitable dataset pairs for multi-task training.