Current Issue : January - March Volume : 2015 Issue Number : 1 Articles : 7 Articles
Background: Human leukocyte antigen (HLA) genes are critical in several biomedical areas, including organ transplantation, autoimmune diseases and infectious diseases. The gene family contains the most polymorphic genes in humans, and in many cases two alleles differ by only a single base-pair substitution. Next-generation sequencing (NGS) technologies can be used for high-throughput HLA typing, but in silico methods are still needed to correctly assign the alleles of a sample. Computer scientists have developed such methods for various NGS platforms, such as Illumina, Roche 454 and Ion Torrent, based on the characteristics of the reads they generate. However, methods for PacBio reads have received less attention, probably owing to the platform's high error rates. The PacBio system has the longest read length among available NGS platforms, and is therefore the only platform capable of capturing exon 2 and exon 3 of HLA genes on the same read, which unequivocally resolves the ambiguity caused by the "phasing" issue.
Results: We propose a new method, BayesTyping1, to assign HLA alleles for PacBio circular consensus sequencing reads using Bayes' theorem. The method was applied to simulated data for the three loci HLA-A, HLA-B and HLA-DRB1. The experimental results showed that it tolerates the disturbance caused by sequencing errors and external noise reads.
Conclusions: The BayesTyping1 method can overcome, to some extent, the problems of HLA typing using PacBio reads, which mostly arise from the sequencing errors of PacBio reads and the divergence of HLA genes....
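The use of Bayes' theorem for allele assignment can be illustrated with a toy example. The following is a minimal sketch, not the BayesTyping1 method itself: it assumes a single allele per sample, equal-length reads already aligned to each candidate, a uniform per-base error rate, and hypothetical allele names and sequences.

```python
import math

def allele_posteriors(reads, alleles, error_rate=0.1, priors=None):
    """Posterior probability of each candidate allele given aligned reads.

    Assumption (not from the source): each base is read correctly with
    probability 1 - error_rate and miscalled as any other base with
    probability error_rate / 3.
    """
    names = list(alleles)
    if priors is None:
        # Uniform prior over the candidate alleles.
        priors = {name: 1.0 / len(names) for name in names}
    log_post = {}
    for name in names:
        total = math.log(priors[name])
        for read in reads:
            for read_base, allele_base in zip(read, alleles[name]):
                if read_base == allele_base:
                    total += math.log(1.0 - error_rate)
                else:
                    total += math.log(error_rate / 3.0)
        log_post[name] = total
    # Normalize in log space to avoid underflow with many reads.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    return {name: math.exp(v - peak) / norm for name, v in log_post.items()}

# Hypothetical candidates differing by one base, and three noisy reads.
alleles = {"allele_1": "ACGTACGT", "allele_2": "ACGTACGA"}
reads = ["ACGTACGT", "ACGTACGT", "ACCTACGT"]
posteriors = allele_posteriors(reads, alleles)
```

Even though the third read contains a sequencing error, the posterior still favors `allele_1`, because the error is equally unlikely under both candidates while the final base separates them.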
Biometric technologies are increasingly important in a wide range of applications. Fingerprint recognition is considered one of the most reliable biometric technologies and has been used extensively for personal identification. Fingerprints vary widely, since no two people have identical fingerprints; this uniqueness comes from their features. By extracting features from a fingerprint, separating them into components such as the continuous component and the spiral component, and mixing them with components of another fingerprint, a new identity can be generated. During enrollment, two fingerprints are captured from two different fingers. We extract the minutiae positions from one fingerprint and the orientation from the other. Based on this extracted information, our proposed combined template is generated and stored in a database. During authentication, the system requires two query fingerprints from the same two fingers used in enrollment. The same information is extracted from both query fingerprints and matched against the corresponding template stored in the database. The system then reports whether or not the query matches....
Background: DNA methylation is a widely studied epigenetic phenomenon; alterations in methylation patterns influence human phenotypes and risk of disease. As part of the Atherosclerosis Risk in Communities (ARIC) study, the Illumina Infinium HumanMethylation450 (HM450) BeadChip was used to measure DNA methylation in peripheral blood obtained from ~3000 African American study participants. Over 480,000 cytosine-guanine (CpG) dinucleotide sites were surveyed on the HM450 BeadChip. To evaluate the impact of technical variation, 265 technical replicates from 130 participants were included in the study.
Results: For each CpG site, we calculated the intraclass correlation coefficient (ICC), which ranges between 0 and 1, to compare the variation of methylation levels within and between replicate pairs. We modeled the distribution of ICC as a mixture of censored or truncated normal and normal distributions using an EM algorithm. The CpG sites were clustered into low- and high-reliability groups according to the calculated posterior probabilities. We also demonstrated the performance of this clustering when applied to a study of association between methylation levels and smoking status of individuals. Of the CpG sites showing genome-wide significant association with smoking status, most (~96%) were in the high-reliability cluster.
Conclusions: We suggest that CpG sites with low ICC may be excluded from subsequent association analyses, or that extra caution be taken for associations at such sites...
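The per-site reliability calculation can be sketched as follows. This is a minimal illustration that assumes the one-way random-effects form ICC(1,1) and an equal number of replicates per participant; the abstract does not state which ICC variant was used, and the methylation values below are made up for the example.

```python
def icc_oneway(groups):
    """One-way random-effects ICC(1,1) for one CpG site.

    `groups` holds one tuple of replicate measurements per participant.
    Assumption (not from the source): every participant has the same
    number of replicates, k.
    """
    n = len(groups)
    k = len(groups[0])
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    # Between-participant and within-participant mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical methylation levels for three participants, two replicates each.
reliable_site = [(0.10, 0.12), (0.50, 0.48), (0.90, 0.91)]
unreliable_site = [(0.10, 0.90), (0.90, 0.10), (0.50, 0.50)]
icc_reliable = icc_oneway(reliable_site)
icc_unreliable = icc_oneway(unreliable_site)
```

Note that this estimator can return negative values when replicates disagree more than participants do, so values outside the 0 to 1 range would need separate handling before any mixture modeling.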
Background: DNA-binding proteins are vital for the study of cellular processes. In recent genome engineering studies, the identification of proteins with certain functions has become increasingly important and needs to be performed rapidly and efficiently. In previous years, several approaches have been developed to improve the identification of DNA-binding proteins. However, the currently available resources are insufficient to identify these proteins accurately. As a result, previous research has been limited by the relatively unbalanced accuracy rate and the low identification success of current methods.
Results: In this paper, we explored the practicality of modelling DNA-binding identification with an ensemble classifier, and designed a new predictor (nDNA-Prot). The presented framework comprises two stages: a 188-dimension feature extraction method to obtain the protein structure, and an ensemble classifier designated imDC. Experiments using different datasets showed that our method is more successful than the traditional methods in identifying DNA-binding proteins. Features were selected using the minimum Redundancy and Maximum Relevance (mRMR) criterion. An accuracy rate of 95.80% and an Area Under the Curve (AUC) value of 0.986 were obtained in cross-validation. On a test data set, our method achieved 86% accuracy, versus 76% using iDNA-Prot and 68% using DNA-Prot.
Conclusions: Our method can help to accurately identify DNA-binding proteins, and the web server is accessible at http://datamining.xmu.edu.cn/~songli/nDNA. In addition, we predicted possible DNA-binding protein sequences among all of the sequences in the UniProtKB/Swiss-Prot database....
Background: Understanding the relationship between diseases based on the underlying biological mechanisms is one of the greatest challenges in modern biology and medicine. Exploring disease-disease associations using system-level biological data is expected to improve our current knowledge of disease relationships, which may lead to further improvements in disease diagnosis, prognosis and treatment.
Results: We took advantage of diverse biological data, including disease-gene associations and a large-scale molecular network, to gain novel insights into disease relationships. We analysed and compared four publicly available disease-gene association datasets, then applied three disease similarity measures (an annotation-based measure, a function-based measure and a topology-based measure) to estimate the similarity scores between diseases. We systematically evaluated the disease associations obtained by these measures against a statistical measure of comorbidity derived from a large number of medical patient records. Our results show that the correlation between our similarity measures and comorbidity scores is substantially higher than expected at random, confirming that our similarity measures are able to recover comorbidity associations. We also demonstrated that our predicted disease associations correlate with disease associations generated from genome-wide association studies significantly more strongly than expected at random. Furthermore, we evaluated our predicted disease associations by mining the literature on PubMed, and presented case studies to demonstrate how these novel disease associations can be used to enhance our current knowledge of disease relationships.
Conclusions: We present three similarity measures for predicting disease associations. The strong correlation between our predictions and known disease associations demonstrates the ability of our measures to provide novel insights into disease relationships....
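As a rough illustration of how a similarity score can be derived from disease-gene associations, the sketch below uses the Jaccard index over the gene sets of two diseases. The abstract does not define its annotation-based measure, so the Jaccard index is a stand-in chosen for this example, and the disease and gene names are hypothetical placeholders.

```python
def jaccard_similarity(genes_a, genes_b):
    """Share of associated genes that two diseases have in common.

    Assumption (not from the source): similarity is the size of the
    intersection of the two gene sets divided by the size of their union.
    """
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        return 0.0  # No annotations at all: treat as no evidence of similarity.
    return len(a & b) / len(a | b)

# Hypothetical disease-gene associations.
disease_genes = {
    "disease_X": {"GENE_1", "GENE_2", "GENE_3"},
    "disease_Y": {"GENE_2", "GENE_3", "GENE_4"},
    "disease_Z": {"GENE_5"},
}
sim_xy = jaccard_similarity(disease_genes["disease_X"], disease_genes["disease_Y"])
sim_xz = jaccard_similarity(disease_genes["disease_X"], disease_genes["disease_Z"])
```

Scores of this kind could then be compared against comorbidity scores, for instance with a rank correlation, to check whether gene-level overlap tracks co-occurrence in patient records.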
Background: Vision-based surveillance and monitoring is a potential alternative for early detection of respiratory disease outbreaks in urban areas, complementing molecular diagnostics and alert systems based on hospital and doctor visits. Visible actions representing typical flu-like symptoms include sneezing and coughing, which are associated with changing patterns of hand-to-head distance, among others. The technical difficulties lie in the high complexity and large variation of those actions, as well as in numerous similar background actions such as scratching the head, using a cell phone, eating and drinking.
Results: In this paper, we make a first attempt at the challenging problem of recognizing flu-like symptoms from videos. Since no related dataset was available, we created a new public health dataset for action recognition that includes two major flu-like symptom related actions (sneeze and cough) and a number of background actions. We also developed a novel algorithm by introducing two types of Action Matching Kernels, both of which aim to integrate two aspects of local features, namely the space-time layout and the Bag-of-Words representations. In particular, we show that the Pyramid Match Kernel and Spatial Pyramid Matching are both special cases of our proposed kernels. Besides experiments on a standard testbed, the proposed algorithm is also evaluated on the new sneeze and cough set. Empirically, we observe that our approach achieves competitive performance compared with the state of the art, while recognition on the new public health dataset is shown to be a non-trivial task even with a simple, unobstructed view of a single person.
Conclusions: Our sneeze and cough video dataset and newly developed action recognition algorithm are the first of their kind and aim to kick-start the field of action recognition of flu-like symptoms from videos. It will be challenging but necessary in future developments to consider the more complex real-life scenario of detecting these actions simultaneously from multiple persons in possibly crowded environments...
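For readers unfamiliar with the Pyramid Match Kernel mentioned above, the following minimal sketch shows its unnormalized form for one-dimensional, non-negative integer features. It is not the authors' Action Matching Kernel; the feature values are hypothetical, and real action recognition features would be multi-dimensional.

```python
from collections import Counter

def histogram_intersection(xs, ys, bin_width):
    """Number of features from xs and ys that fall into the same bins."""
    hist_x = Counter(x // bin_width for x in xs)
    hist_y = Counter(y // bin_width for y in ys)
    return sum(min(count, hist_y[b]) for b, count in hist_x.items())

def pyramid_match(xs, ys, levels):
    """Unnormalized pyramid match score between two feature sets.

    Bins double in width at each level; matches that first appear at
    level i are weighted by 1 / 2**i, so matches found in finer bins
    count more than matches found only in coarse bins.
    """
    score = 0.0
    previous = 0
    for i in range(levels + 1):
        matches = histogram_intersection(xs, ys, 2 ** i)
        score += (matches - previous) / (2 ** i)
        previous = matches
    return score

# Hypothetical feature sets: near neighbours match at level 1 (bin width 2).
near_score = pyramid_match([0, 5, 12], [1, 4, 13], levels=2)
self_score = pyramid_match([0, 5, 12], [0, 5, 12], levels=2)
```

In this example the two sets share no exact values, so all three matches first appear at level 1 and each contributes one half, whereas a set matched against itself scores one per feature.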
Background: Recent advances in deep digital sequencing have unveiled an unprecedented degree of clonal heterogeneity within a single tumor DNA sample. Resolving such heterogeneity depends on accurate estimation of the fractions of alleles that harbor somatic mutations. Unlike substitutions or small indels, structural variants such as deletions, duplications, inversions and translocations involve segments of DNA and potentially allow more accurate allele fraction estimation. However, no systematic method exists that can support such analysis.
Results: In this paper, we present a novel maximum-likelihood method that estimates the allele fractions of structural variants by integrating various forms of alignment signals. We develop a tool, BreakDown, to estimate the allele fractions of most structural variants, including medium-size (from 1 kilobase to 1 megabase) deletions and duplications, and balanced inversions and translocations.
Conclusions: Evaluation based on both simulated and real data indicates that our method systematically enables the use of structural variants for clonal heterogeneity analysis and can greatly enhance the characterization of genomically unstable tumors....
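The idea of a maximum-likelihood allele fraction estimate that pools several alignment signals can be sketched as below. This is not the BreakDown model: it assumes each signal type contributes an independent binomial count of variant-supporting reads out of total reads with one shared allele fraction, and the signal types and counts are hypothetical.

```python
import math

def log_likelihood(fraction, signals):
    """Binomial log-likelihood of a shared allele fraction.

    `signals` is a list of (supporting_reads, total_reads) pairs, one per
    signal type. The binomial coefficient is omitted because it does not
    depend on `fraction`.
    """
    return sum(
        k * math.log(fraction) + (n - k) * math.log(1.0 - fraction)
        for k, n in signals
    )

def estimate_allele_fraction(signals, grid_size=1000):
    """Grid search for the fraction that maximizes the log-likelihood."""
    candidates = [i / grid_size for i in range(1, grid_size)]
    return max(candidates, key=lambda f: log_likelihood(f, signals))

# Hypothetical counts for three signal types, e.g. discordant read pairs,
# split reads and read depth (names are illustrative assumptions).
signals = [(12, 40), (9, 30), (6, 30)]
estimate = estimate_allele_fraction(signals)
```

Under this simple model the maximum has a closed form, the pooled supporting reads divided by the pooled total (here 27 out of 100), so the grid search serves only to make the likelihood maximization explicit.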