The goal of the Bioinformatics and Pathways Core is to aid in the analysis, interpretation and presentation of data within the experimental systems being studied by the COBRE Junior Investigators. The Core also provides assistance in experimental design. The Director is Dr. Jonathan Wren, Associate Member in the Arthritis and Clinical Immunology Program at OMRF. Dr. Wren is assisted by Dr. Constantin Georgescu, a trained statistician.
The Specific Aims of the Core are 1) to provide complete bioinformatics analysis and biological interpretation of high-throughput data, 2) to identify key genes and biomarkers involved in processes of interest and predict gene functions, phenotypes, and disease relevance, and 3) to create a sustainable core facility that can be used institution-wide. The Core's services are available free of charge to COBRE investigators.
In addition to providing conventional statistical analyses, the Bioinformatics and Pathways Core offers novel software developed by Dr. Wren. One program, GAMMA (Global Microarray Meta-Analysis), predicts the functions of poorly annotated genes based on co-expression data and mutual information metrics. GAMMA aids in interpreting the functional significance of these genes in biological studies. Conversely, it can identify candidate genes relevant to an experimental system under study (e.g., meiosis or hematopoiesis) and screen for those with no publications linking the gene to the system, enabling new investigators to discover novel associations.
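The co-expression idea behind this kind of prediction can be illustrated with a small sketch. This is not GAMMA's implementation: it substitutes a plain Pearson correlation for GAMMA's mutual information metrics, and the gene names, expression values, annotations, and the `predict_functions` helper are all hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def predict_functions(target, expression, annotations, top_k=2):
    """Rank annotation terms carried by the top_k annotated genes
    most strongly co-expressed with the target gene."""
    scored = sorted(
        (
            (pearson(expression[target], profile), gene)
            for gene, profile in expression.items()
            if gene != target and gene in annotations
        ),
        reverse=True,
    )
    votes = Counter()
    for r, gene in scored[:top_k]:
        for term in annotations[gene]:
            votes[term] += r  # weight each vote by correlation strength
    return votes.most_common()


# Toy data (hypothetical): GENE_X is the poorly annotated gene.
expression = {
    "GENE_X": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_A": [2.0, 4.1, 6.0, 8.2, 9.9],
    "GENE_B": [1.1, 1.9, 3.2, 3.9, 5.1],
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],
}
annotations = {
    "GENE_A": ["meiosis"],
    "GENE_B": ["meiosis", "DNA repair"],
    "GENE_C": ["hematopoiesis"],
}
predictions = predict_functions("GENE_X", expression, annotations)
```

In this toy case the two genes most correlated with GENE_X both carry the "meiosis" annotation, so that term ranks first among the predictions.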
A second novel software program, IRIDESCENT (Implicit Relationship IDEntification by Software Construction of an Entity-based Network from Text), is designed for large-scale analysis of PubMed abstracts. IRIDESCENT automates the identification and analysis of relationships within the published literature by identifying simple relationships between terms, along with the relative strength of association between them. This large network of relationships between genes, diseases, phenotypes, chemical compounds, ontological categories and FDA-approved drugs serves as a basis for analyzing lists (e.g., microarray data) and for identifying implied relationships. That is, given two terms that are not directly related, IRIDESCENT can be used to identify the terms they have in common. By evaluating the statistical significance of these shared terms, a measure of the strength of their relatedness (relative to known relationships) can be developed.
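The notion of an implied relationship can be sketched on a toy network. This is an illustration under stated assumptions, not IRIDESCENT's method: the term names and the `build_network`, `shared_terms`, and `overlap_p_value` helpers are hypothetical, and the hypergeometric test is simply one common way to ask whether two terms share more neighbors than chance would suggest.

```python
from math import comb


def build_network(pairs):
    """Build an undirected network (term -> set of related terms)."""
    network = {}
    for x, y in pairs:
        network.setdefault(x, set()).add(y)
        network.setdefault(y, set()).add(x)
    return network


def shared_terms(network, a, b):
    """Terms directly related to both a and b."""
    return network[a] & network[b]


def overlap_p_value(network, a, b):
    """Hypergeometric P(overlap >= observed), treating b's neighbors
    as a random draw from all terms other than a and b."""
    universe = len(network) - 2
    na = len(network[a] - {b})
    nb = len(network[b] - {a})
    k = len(shared_terms(network, a, b))
    tail = sum(
        comb(na, i) * comb(universe - na, nb - i)
        for i in range(k, min(na, nb) + 1)
    )
    return tail / comb(universe, nb)


# Toy co-mention pairs (hypothetical): drug_D and disease_Z are never
# mentioned together, but both are linked to gene_1 and gene_2.
pairs = [
    ("drug_D", "gene_1"), ("drug_D", "gene_2"), ("drug_D", "phenotype_P"),
    ("disease_Z", "gene_1"), ("disease_Z", "gene_2"), ("disease_Z", "compound_C"),
    ("gene_3", "compound_C"), ("gene_4", "phenotype_P"), ("gene_5", "gene_3"),
]
network = build_network(pairs)
common = shared_terms(network, "drug_D", "disease_Z")
p_value = overlap_p_value(network, "drug_D", "disease_Z")
```

Here `common` holds the two intermediate genes, and `p_value` is the chance of seeing at least that many shared neighbors in a network this small. A real literature network would be far larger, and edges would be weighted by strength of association.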
Finally, Genome Runner is a tool for automating genome exploration. It performs annotation and enrichment analyses of user-provided genomic regions (SNPs, ChIP-seq binding sites, etc.) against more than 6,000 epigenomic features for the human genome available from the UCSC Genome Browser. It gives a detailed annotation of each genomic region in the input data and can be used to prioritize individual genomic regions by the total number of epigenomic features they co-localize with. It also provides p-values for statistically significant co-localizations of genome-wide input data with the genome annotation features selected for analysis. These p-values can be used to prioritize the epigenomic features associated with user data.
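A co-localization p-value of this kind can be sketched as follows. This is a simplified illustration, not Genome Runner's actual procedure: it assumes point-like input regions (such as SNPs) on a single toy chromosome, non-overlapping feature intervals, and a one-sided binomial test. The coordinates and the `count_overlaps` and `binomial_enrichment_p` helpers are hypothetical.

```python
from math import comb


def count_overlaps(points, intervals):
    """Number of points falling inside any half-open [start, end) interval."""
    return sum(
        any(start <= p < end for start, end in intervals) for p in points
    )


def binomial_enrichment_p(points, intervals, genome_length):
    """One-sided binomial P(overlaps >= observed), assuming each point
    lands in a feature with probability equal to the covered fraction.
    Assumes the feature intervals do not overlap one another."""
    covered = sum(end - start for start, end in intervals)
    frac = covered / genome_length
    n = len(points)
    k = count_overlaps(points, intervals)
    return sum(
        comb(n, i) * frac**i * (1 - frac) ** (n - i) for i in range(k, n + 1)
    )


# Toy data (hypothetical): a 1,000-base "genome", one feature type
# covering 20% of it, and four input SNP positions.
feature_intervals = [(100, 200), (500, 600)]
snp_positions = [120, 150, 550, 900]

overlaps = count_overlaps(snp_positions, feature_intervals)
p_value = binomial_enrichment_p(snp_positions, feature_intervals, 1000)
```

Repeating the test for each feature type and sorting by p-value gives one simple way to prioritize the features associated with the input data. With thousands of features, the resulting p-values would need correction for multiple testing.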