Current Issue : October - December Volume : 2019 Issue Number : 4 Articles : 6 Articles
Background: Correcting a heterogeneous dataset that presents artefacts from several confounders is often an essential bioinformatics task. Attempting to remove these batch effects will result in the loss of some biologically meaningful signals. Thus, a central challenge is assessing whether the removal of unwanted technical variation harms the biological signal that is of interest to the researcher.
Results: We describe a novel framework, B-CeF, to evaluate the effectiveness of batch correction methods and their tendency toward over- or under-correction. The approach is based on comparing the co-expression of adjusted gene-gene pairs to a priori knowledge of highly confident gene-gene associations based on thousands of unrelated experiments derived from an external reference. Our framework includes three steps: (1) data adjustment with the desired methods; (2) calculation of gene-gene co-expression measurements for the adjusted datasets; and (3) evaluation of the performance of the co-expression measurements against a gold standard. Using the framework, we evaluated five batch correction methods applied to RNA-seq data of six representative tissue datasets derived from the GTEx project.
Conclusions: Our framework enables batch correction methods to be evaluated on how well they preserve the original biological signal. We show that using a multiple linear regression model to correct for known confounders outperforms factor analysis-based methods that estimate hidden confounders. The code is publicly available as an R package....
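The three steps described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' R package: the function names (`regress_out`, `coexpression_scores`, `auc`), the synthetic data, and the choice of absolute Pearson correlation and a rank-based AUC are all assumptions made for illustration. The sketch adjusts for one known confounder with ordinary least squares, scores every gene pair, and compares the scores against a toy "gold standard" of within-module pairs.

```python
import numpy as np

def regress_out(expr, confounders):
    """Step 1: remove known confounders from each gene by least squares.

    expr has shape (samples, genes); confounders has shape (samples, k).
    """
    design = np.column_stack([np.ones(len(confounders)), confounders])
    coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
    # Subtract only the confounder contribution; each gene keeps its intercept.
    return expr - design[:, 1:] @ coef[1:]

def coexpression_scores(expr):
    """Step 2: absolute Pearson correlation for every unordered gene pair."""
    corr = np.corrcoef(expr, rowvar=False)
    i, j = np.triu_indices(corr.shape[0], k=1)
    return i, j, np.abs(corr[i, j])

def auc(scores, labels):
    """Step 3: rank-based AUC of pair scores against gold-standard labels."""
    labels = np.asarray(labels, dtype=bool)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synthetic data (hypothetical): two gene modules plus a batch effect that
# hits genes 2, 3 and 4, creating spurious cross-module co-expression.
rng = np.random.default_rng(0)
n_samples = 300
module = np.array([0, 0, 0, 1, 1, 1])
batch = rng.integers(0, 2, size=n_samples).astype(float)
batch_coef = np.array([0.0, 0.0, 4.0, 4.0, 4.0, 0.0])
factors = rng.normal(size=(n_samples, 2))
expr = (factors[:, module]
        + np.outer(batch, batch_coef)
        + 0.5 * rng.normal(size=(n_samples, 6)))

# Toy gold standard: a pair is a true association if both genes share a module.
i, j, raw_scores = coexpression_scores(expr)
gold = module[i] == module[j]

corrected = regress_out(expr, batch.reshape(-1, 1))
_, _, corrected_scores = coexpression_scores(corrected)

auc_raw = auc(raw_scores, gold)
auc_corrected = auc(corrected_scores, gold)
print(f"AUC before correction: {auc_raw:.3f}")
print(f"AUC after correction:  {auc_corrected:.3f}")
```

In this toy setting, a higher AUC after adjustment suggests the correction removed technical co-expression without erasing the module structure. A drop in AUC would hint at over-correction.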
Background: The recent success of immunotherapy in treating tumors has attracted increasing interest in research related to the adaptive immune system in the tumor microenvironment. Recent advances in next-generation sequencing technology have enabled the sequencing of whole T-cell receptors (TCRs) and B-cell receptors (BCRs)/immunoglobulins (Igs) in the tumor microenvironment. Since BCRs/Igs in tumor tissues have high affinities for tumor-specific antigens, the patterns of their amino acid sequences and other sequence-independent features, such as the number of somatic hypermutations (SHMs), may differ between the normal and tumor microenvironments. However, given the high diversity of BCRs/Igs and the rarity of recurrent sequences among individuals, it is far more difficult to capture such differences in BCR/Ig sequences than in TCR sequences. The aim of this study was to explore the possibility of discriminating between BCRs/Igs in tumor and in normal tissues by capturing these differences using supervised machine learning methods applied to RNA sequences of BCRs/Igs.
Results: RNA sequences of BCRs/Igs were obtained from matched normal and tumor specimens from 90 gastric cancer patients. BCR/Ig features obtained by Rep-Seq were used to classify individual BCR/Ig sequences into normal or tumor classes. Different machine learning models using various features were constructed, as well as a gradient boosting machine (GBM) classifier combining these models. The results demonstrated that BCR/Ig sequences differ between normal and tumor microenvironments. Next, using a GBM trained to classify individual BCR/Ig sequences, we classified sets of BCR/Ig sequences into normal or tumor classes. This yielded an area under the curve (AUC) value of 0.826, suggesting that BCR/Ig repertoires have distinct sequence-level features in normal and tumor tissues.
Conclusions: To the best of our knowledge, this is the first study to show that BCR/Ig sequences derived from tumor and normal tissues have globally distinct patterns, and that these tissues can be effectively differentiated using BCR/Ig repertoires....
Background: Drug candidates often cause an unwanted blockage of the potassium ion channel of the human ether-a-go-go-related gene (hERG). The blockage leads to long QT syndrome (LQTS), which is a severe, life-threatening cardiac side effect. Therefore, a virtual screening method to predict drug-induced hERG-related cardiotoxicity could facilitate drug discovery by filtering out toxic drug candidates.
Result: In this study, we generated a reliable hERG-related cardiotoxicity dataset composed of 2130 compounds, which were tested under constant conditions. Based on our dataset, we developed a computational hERG-related cardiotoxicity prediction model. The neural network model achieved an area under the receiver operating characteristic curve (AUC) of 0.764, with an accuracy of 90.1%, a Matthews correlation coefficient (MCC) of 0.368, a sensitivity of 0.321, and a specificity of 0.967, when ten-fold cross-validation was performed. The model was further evaluated using ten drug compounds tested on guinea pigs and showed an accuracy of 80.0%, an MCC of 0.655, a sensitivity of 0.600, and a specificity of 1.000, which were better than the performances of existing hERG-toxicity prediction models.
Conclusion: The neural network model can predict hERG-related cardiotoxicity of chemical compounds with high accuracy. Therefore, the model can be applied to virtual high-throughput screening for drug candidates that do not cause cardiotoxicity. The prediction tool is available as a web tool at http://ssbio.cau.ac.kr/CardPred....
Background: Networks have been widely used to model the structures of various biological systems. The ultimate aim of research on biological networks is to steer biological system structures to desired states by manipulating signals. Despite great advances in the linear control of single-layer networks, it has been observed that many complex biological systems have a multilayer networked structure and extremely complicated nonlinear processes.
Result: In this study, we propose a general framework for controlling nonlinear dynamical systems with multilayer networked structures by formulating the problem as a minimum union optimization problem. In particular, we offer a novel approach for identifying the minimal driver nodes that can steer a multilayered nonlinear dynamical system toward any desired dynamical attractor. Three disease-related biological multilayer networks are used to demonstrate the effectiveness of our approach. Moreover, within the set of minimum driver nodes identified by the proposed algorithm, we confirmed that some nodes can act as drug targets in biological experiments. Other nodes have not been reported as drug targets; however, according to the existing literature, they are also involved in important biological processes.
Conclusions: The proposed method could be a promising tool for determining higher drug target enrichment or more meaningful steering nodes for studying complex diseases....
Background: The selection of reference genes is essential for quantifying gene expression. Theoretically, reference genes should be expressed stably and not be regulated by experimental or pathological conditions. However, the identification and validation of reference genes for human cancer research remain a critical issue, because cancerous tissues often exhibit genetic instability and heterogeneity. Recent pan-cancer studies have demonstrated the importance of the appropriate selection of reference genes for use as internal controls for the normalization of gene expression; however, no stably expressed, consensus reference genes valid for a range of different human cancers have yet been identified.
Results: In the present study, we used large-scale cancer gene expression datasets from The Cancer Genome Atlas (TCGA) database, which contains 10,028 (9,364 cancerous and 664 normal) samples from 32 different cancer types, to confirm that the expression of the most commonly used reference genes is not consistent across a range of cancer types. Furthermore, we identified 38 novel candidate reference genes for the normalization of gene expression, independent of cancer type. These genes were found to be highly expressed and highly connected to relevant gene networks, and to be enriched in transcription-translation regulation processes. The expression stability of the newly identified reference genes across 29 cancerous and matched normal tissues was validated via quantitative reverse transcription PCR (RT-qPCR).
Conclusions: We reveal that the most commonly used reference genes in current cancer studies are not appropriate as representative control genes for quantifying cancer-related gene expression levels, and we propose three potential reference genes (HNRNPL, PCBP1, and RER1) as the most stably expressed across various cancerous and normal human tissues....
Background: Immunotherapy is an emerging approach in cancer treatment that activates the host immune system to destroy cancer cells expressing unique peptide signatures (neoepitopes). Administration of cancer-specific neoepitopes in the form of synthetic peptide vaccines has proven effective in both mouse models and human patients. Because only a tiny fraction of cancer-specific neoepitopes actually elicits an immune response, selection of potent, immunogenic neoepitopes remains a challenging step in cancer vaccine development. A basic approach for immunogenicity prediction is based on the premise that an effective neoepitope should bind to the Major Histocompatibility Complex (MHC) with high affinity.
Results: In this study, we developed MHCSeqNet, an open-source deep learning model, which not only outperforms state-of-the-art predictors on both MHC binding affinity and MHC ligand peptidome datasets but also exhibits promising generalization to unseen MHC class I alleles. MHCSeqNet employs neural network architectures developed for natural language processing to model amino acid sequence representations of MHC alleles and epitope peptides as sentences, with amino acids as individual words. This design allows MHCSeqNet to accept new MHC alleles as well as peptides of any length.
Conclusions: The improved performance and the flexibility offered by MHCSeqNet should make it a valuable tool for screening effective neoepitopes in cancer vaccine development....
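The idea of treating a peptide as a sentence whose words are amino acids can be illustrated with a small tokenization sketch in Python. This is not MHCSeqNet's code: the vocabulary layout, the padding scheme, and the names `tokenize` and `pad_batch` are assumptions chosen for illustration, and the three peptides are arbitrary examples of different lengths. The point is only that a word-level encoding places no constraint on sequence length.

```python
# The 20 standard amino acids; index 0 is reserved for padding (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = 0
VOCAB = {aa: index + 1 for index, aa in enumerate(AMINO_ACIDS)}

def tokenize(sequence):
    """Map each residue ("word") of a peptide to its integer vocabulary index."""
    try:
        return [VOCAB[aa] for aa in sequence.upper()]
    except KeyError as err:
        raise ValueError(
            f"unknown residue {err.args[0]!r} in {sequence!r}"
        ) from None

def pad_batch(sequences):
    """Tokenize peptides of any length and right-pad them to a common width."""
    tokens = [tokenize(seq) for seq in sequences]
    width = max(len(row) for row in tokens)
    return [row + [PAD] * (width - len(row)) for row in tokens]

# Peptides of length 8, 9 and 10 share one rectangular batch after padding.
batch = pad_batch(["SIINFEKL", "GILGFVFTL", "ELAGIGILTV"])
for row in batch:
    print(row)
```

In a sequence model, these integer indices would typically feed an embedding layer followed by a recurrent or attention-based encoder, with the padding index masked out.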