Computational Knowledge Discovery for Bioinformatics Research
Latest Publications



Published By IGI Global

9781466617858, 9781466617865

Author(s):  
Sven Ulrich ◽  
Pierre Baumann ◽  
Andreas Conca ◽  
Hans-Joachim Kuss ◽  
Viktoria Stieffenhofer ◽  
...  

Therapeutic drug monitoring (TDM) has consistently been shown to be useful for optimizing drug therapy. For the first time, a method has been developed for the text analysis of TDM in summaries of product characteristics (SPCs): the catalogue SPC-ContentTDM (SPCCTDM) codifies the TDM content of SPCs. It consists of six structure-related items (dose, adverse drug reactions, drug interactions, overdose, pregnancy/breast feeding, and pharmacokinetics), which record implicit or explicit references to TDM in the corresponding paragraphs of the SPC, and four theory-guided items, which record the information about ranges of plasma concentrations and any recommendation of TDM in the SPC. The catalogue is regarded as valid for the text analysis of SPCs with respect to TDM. It can be used to compare SPCs with one another, to compare them with medico-scientific evidence, and to estimate how readers perceive TDM in SPCs. As a model of text mining, the approach may be extended to evaluate other aspects reported in SPCs.


Author(s):  
Young-Rae Cho ◽  
Aidong Zhang

High-throughput techniques enable large-scale detection of protein-protein interactions. From a genome-scale perspective, the resulting interaction data set is structured into an interactome network. Since the interaction evidence represents functional linkage, various graph-theoretic computational approaches have been applied to interactome networks for functional characterization. However, these data are generally unreliable, and typical genome-wide interactome networks have complex connectivity. In this paper, the authors explore systematic analysis of protein interactome networks and propose a $k$-round signal flow simulation algorithm to measure interaction reliability from the connection patterns of the interactome networks. This algorithm quantitatively characterizes functional links between proteins by simulating the propagation of information signals through complex connections. In this way, the algorithm efficiently estimates the strength of alternative paths for each interaction. The authors also present an algorithm for mining the complex interactome network structure. The algorithm restructures the network by hierarchical ordering of nodes, and this restructuring process reveals hub proteins in the interactome networks. This paper demonstrates that two rounds of simulation accurately score interaction reliability in terms of ontological correlation and functional consistency. Finally, the authors validate that the selected structural hubs represent functional core proteins.


Author(s):  
Tsuyoshi Kato ◽  
Kinya Okada ◽  
Hisashi Kashima ◽  
Masashi Sugiyama

The authors’ algorithm was favorably examined on two kinds of biological networks: a metabolic network and a protein interaction network. A statistical test confirmed that the weight the algorithm assigned to each assay was meaningful.


Author(s):  
Oruganty Krishnadev ◽  
Shveta Bisht ◽  
Narayanaswamy Srinivasan

The genomes of many human pathogens have been sequenced, but the protein-protein interactions between a pathogen and its human host are still poorly understood. The authors apply a simple homology-based method to predict protein-protein interactions between the human host and two mycobacterial organisms, M. tuberculosis and M. leprae. They focus on secreted proteins of the pathogens and cellular membrane proteins in order to restrict the predictions to biologically significant and feasible interactions. Predicted interactions include five mycobacterial proteins of as yet unknown function, suggesting a role for these proteins in pathogenesis. The authors predict interaction partners for secreted mycobacterial antigens such as MPT70, serine proteases and other proteins interacting with human proteins that are implicated in pathogenesis, such as toll-like receptors, ras signalling proteins and immune maintenance proteins. These results suggest that the list of predicted interactions is suitable for further analysis and forms a useful step in understanding the pathogenesis of these mycobacterial organisms.
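
The abstract does not spell out the homology-based method. A minimal sketch of one common form of homology transfer is given below, assuming a set of known template interactions and precomputed homolog tables; the function name, argument names, and protein identifiers are all hypothetical placeholders, not the authors' data or implementation.

```python
def predict_interactions(template_pairs, pathogen_homologs, host_homologs):
    """Transfer known interactions to pathogen-host pairs by homology.

    template_pairs:    iterable of (protein_a, protein_b) known to interact
    pathogen_homologs: dict mapping a template protein to pathogen proteins
                       judged homologous to it (assumed precomputed)
    host_homologs:     dict mapping a template protein to host proteins
                       judged homologous to it (assumed precomputed)
    """
    predictions = set()
    for a, b in template_pairs:
        # Interactions are undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            for p in pathogen_homologs.get(x, ()):
                for h in host_homologs.get(y, ()):
                    predictions.add((p, h))
    return predictions


# Placeholder identifiers for illustration only.
templates = [("T1", "T2")]
pathogen = {"T1": ["pathogen_X"]}
host = {"T2": ["host_Y"]}
print(predict_interactions(templates, pathogen, host))
```

Restricting the inputs to secreted pathogen proteins and host membrane proteins, as the abstract describes, would amount to filtering the two homolog tables before calling the function.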


Author(s):  
Susan Fairley ◽  
John D. McClure ◽  
Neil Hanlon ◽  
Rob Irving ◽  
Martin W. McBride ◽  
...  

A probe mapping technique using a novel implementation of a persistent q-gram index was developed. It is guaranteed to find all matches that meet certain definitions. These include exact matching of the central 19 bases of 25-base probes, matching the central 19 bases with at most one or three mismatches, and exact matching of any 16 bases. In comparison with BLAST and BLAT, the new methods were either significantly faster or identified matches missed by the heuristics. The 16 bp method was used to map the 342,410 perfect match probes from the Affymetrix GeneChip Rat Genome 230 2.0 Array to the genome. When compared with the mapping from Ensembl, the new mapping included over seven million novel matches, providing additional evidence for researchers wishing to further investigate the sources of signals measured in microarray experiments. The results demonstrate the practicality of the index, which could support other q-gram based algorithms.
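
To illustrate the idea of a q-gram index, here is a minimal in-memory sketch of the exact central-core matching case. It is not the authors' persistent index; the function names and the toy sequences are assumptions made for the example, and the mismatch-tolerant and 16-base variants are not covered.

```python
from collections import defaultdict


def build_qgram_index(text, q):
    """Map every q-gram in `text` to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def map_probe_central(index, probe, q=19):
    """Return implied probe start positions where the central q bases match exactly.

    The flanking bases are not compared, so a reported position may be
    negative or run past the end of the text when the core matches near an edge.
    """
    offset = (len(probe) - q) // 2
    core = probe[offset:offset + q]
    return [pos - offset for pos in index.get(core, [])]


# Toy example with a 9-base probe and a 5-base core instead of 25 and 19.
genome = "TTACGTAGGCATCGA"
index = build_qgram_index(genome, q=5)
print(map_probe_central(index, "AAGTAGGTT", q=5))  # core "GTAGG" starts at 4, so probe start is 2
```

Because every q-gram of the text is indexed, a lookup returns all exact occurrences of the core, which is what gives this kind of index its completeness guarantee compared with heuristic search.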


Author(s):  
Richipal Singh Bindra ◽  
Jason T. L. Wang ◽  
Paramjeet Singh Bagga

MicroRNAs (miRNAs) are short single-stranded RNA molecules of 21-22 nucleotides known to regulate post-transcriptional expression of protein-coding genes involved in most cellular processes. Prediction of miRNA targets is a challenging bioinformatics problem. AU-rich elements (AREs) are regulatory RNA motifs found in the 3’ untranslated regions (UTRs) of mRNAs, and they play dominant roles in the regulated decay of short-lived human mRNAs via specific interactions with proteins. In this paper, the authors review several miRNA target prediction tools and data sources, as well as computational methods used for the prediction of AREs. The authors discuss the connection between miRNA and ARE-mediated post-transcriptional gene regulation. Finally, a data mining method for identifying the co-occurrences of miRNA target sites in ARE-containing genes is presented.


Author(s):  
Miao Wang ◽  
Xuequn Shang ◽  
Shaohua Zhang ◽  
Zhanhuai Li

DNA microarray technology has generated a large amount of gene expression data. Biclustering is a methodology that clusters gene sets and condition sets simultaneously. It finds clusters of genes possessing similar characteristics together with the biological conditions that create these similarities. Almost all current biclustering algorithms find biclusters in a single microarray dataset. To reduce the influence of noise and find more biologically meaningful biclusters, the authors propose the FDCluster algorithm to mine frequent closed discriminative biclusters in multiple microarray datasets. FDCluster uses the Apriori property and several novel pruning techniques to mine biclusters efficiently. To improve space efficiency, FDCluster also uses several techniques to generate frequent closed biclusters without maintaining candidates in memory. The experimental results show that FDCluster is more effective than traditional methods on both a single microarray dataset and multiple microarray datasets. The paper tests biological significance using GO to show that the proposed method is able to produce biologically relevant biclusters.


Author(s):  
Erliang Zeng ◽  
Chengyong Yang ◽  
Tao Li ◽  
Giri Narasimhan

Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. These data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task of developing exploratory analytical techniques to deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm they developed performs clustering with multiple, but complete, sources of data. To deal with incomplete data sources, the authors adopted the MPCK-means clustering algorithm to perform exploratory analysis on one complete source and other potentially incomplete sources provided in the form of constraints. This paper presents a new clustering algorithm, MSC, to perform exploratory analysis using two or more diverse but complete data sources; studies the effectiveness of constraint sets and the robustness of the constrained clustering algorithm using multiple sources of incomplete biological data; and incorporates such incomplete data into the constrained clustering algorithm in the form of constraint sets.


Author(s):  
Majid Masso

A computational mutagenesis is detailed whereby each single residue substitution in a protein chain of primary sequence length N is represented as a sparse N-dimensional feature vector, whose M << N nonzero components locally quantify environmental perturbations occurring at the mutated position and its neighbors in the protein structure. The methodology makes use of both the Delaunay tessellation algorithm for representing protein structures, as well as a four-body, knowledge based, statistical contact potential. Feature vectors for each subset of mutants due to all possible residue substitutions at a particular position cohabit the same M-dimensional subspace, where the value of M and the identities of the M nonzero components are similarly position dependent. The approach is used to characterize a large experimental dataset of single residue substitutions in bacteriophage T4 lysozyme, each categorized as either unaffected or affected based on the measured level of mutant activity relative to that of the native protein. Performance of a single classifier trained with the collective set of mutants in N-space is compared to that of an ensemble of position-specific classifiers trained using disjoint mutant subsets residing in significantly smaller subspaces. Results suggest that significant improvements can be achieved through subspace modeling.


Author(s):  
Junming Shao ◽  
Klaus Hahn ◽  
Qinli Yang ◽  
Afra Wohlschläeger ◽  
Christian Boehm ◽  
...  

Diffusion tensor magnetic resonance imaging (DTI) provides a promising way of estimating the neural fiber pathways in the human brain non-invasively via white matter tractography. However, it is difficult to analyze the vast number of resulting tracts quantitatively. Automatic tract clustering would be useful for the neuroscience community, as it can contribute to accurate neurosurgical planning, tract-based analysis, or white matter atlas creation. In this paper, the authors propose a new framework for automatic white matter tract clustering using a hierarchical density-based approach. A novel fiber similarity measure based on dynamic time warping allows for an effective and efficient evaluation of fiber similarity. A lower bounding technique is used to further speed up the computation. The OPTICS algorithm is then applied to sort the data into a reachability plot, which visualizes the clustering structure of the data. Finally, interactive and automatic clustering algorithms are introduced to obtain the clusters. Extensive experiments on synthetic and real data demonstrate the effectiveness and efficiency of the proposed fiber similarity measure and show that the hierarchical density-based clustering method can group these tracts into meaningful bundles on multiple scales and eliminate noisy fibers.
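
As a rough illustration of the building blocks named above, the following sketch computes a textbook dynamic time warping (DTW) distance between two fibers represented as sequences of 3D points, together with a cheap lower bound. The abstract does not specify the authors' exact similarity measure or which lower bound they use, so the endpoint bound, the function names, and the toy fibers here are assumptions for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic O(n*m) DTW between two point sequences using Euclidean point cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def endpoint_lower_bound(a, b):
    """Lower bound on DTW: every warping path aligns first-to-first and last-to-last."""
    first = math.dist(a[0], b[0])
    if len(a) == 1 and len(b) == 1:
        return first  # the path has a single cell, so count it once
    return first + math.dist(a[-1], b[-1])


# Two toy parallel fibers, one unit apart.
fiber_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
fiber_b = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]
print(endpoint_lower_bound(fiber_a, fiber_b), dtw_distance(fiber_a, fiber_b))
```

A lower bound like this can be evaluated in constant time, so a pair of fibers whose bound already exceeds a distance threshold can be skipped without running the full quadratic DTW computation. The resulting pairwise distances could then be passed to a density-based method such as OPTICS.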

