

Bioinformatics Challenge Project


The Bioinformatics Challenge Project is a two-semester team project course in which all first-year Bioinformatics PhD students conduct research with high-throughput data on problems proposed by biology and medical school faculty members. Students work with limited supervision to generate their own ideas and approaches, make functional predictions, develop computational tools, write progress reports, design validation experiments, and present their findings at our Systems Biology seminar.

This year’s three Challenge Projects are described below. An important aspect of the Challenge Project is that students have opportunities to present their work in a public science forum. In the second semester, each team creates a poster describing its approach and findings. The posters are entered in the Boston University Science & Engineering Research Symposium in March, and the students are encouraged to submit their posters elsewhere. This year, the team for Project 3 won a Best Poster award at the Science & Engineering Research Symposium. The team for Project 2 had its poster accepted to the ISMB 2012 Conference (Intelligent Systems for Molecular Biology), and two of the students will attend the conference to present it. PDF copies of posters from the projects are included as part of this summary.

We consider the Challenge Project to be one of the outstanding learning opportunities for interdisciplinary research that has resulted from the IGERT grant.

Project 1: Using Dynamic Programming and Phylogenetic Clustering to Reveal Structure/Function Relationships in the Y-Family DNA Polymerases That Bypass DNA Damage Caused by Mutagenic Carcinogens.

Cells suffer damage to their DNA from various environmental agents, including ionizing radiation and DNA-binding chemicals, and such damage can cause mutations and ultimately cancer. If unrepaired, DNA damage typically blocks the DNA replication process during cell division, leading to cell death. To compensate, living systems have evolved special DNA polymerases that can correctly copy damaged DNA, thereby fixing the damage in the new copy. Two classes of these damage-bypassing polymerases belong to the Y-family: the Pol IV class, which fixes modifications that protrude into the minor groove of the DNA double helix, and the Pol V class, which corrects for UV damage. The goal of this project is to understand which structural differences between these classes account for their different roles. The Y-family polymerases have a unique “little finger” structural domain, which binds DNA and is known to play a significant part in determining the type of damage corrected. Conventional computational approaches are unable to properly align the amino acids in the little finger region. Dynamic programming techniques and phylogenetic analysis have been applied in this project to understand how amino acid sequence dictates little finger structure and function.
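As an illustration of the dynamic programming idea mentioned above, the following is a minimal sketch of global pairwise alignment scoring in the style of Needleman-Wunsch. It is not the team's method, which this summary does not describe in detail. The function name and the match, mismatch, and gap scores are assumptions chosen for illustration; real protein alignment would normally use a substitution matrix rather than a flat match/mismatch score.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Illustrative scoring only: the default match/mismatch/gap values are
    assumptions, not parameters taken from the project.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]
```

The table is filled row by row, so each cell depends only on cells already computed. Running time and memory both grow with the product of the two sequence lengths.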

Project 2: Tracking Antiviral Responses Following Infection With Hemorrhagic Fever Viruses.

Lassa fever is an acute viral hemorrhagic fever caused by the Lassa virus. The virus infects from 300,000 to 500,000 people a year, and has a mortality rate of 15% in hospitalized patients. Its symptoms go unnoticed in 80% of the cases, or are confused with those of other hemorrhagic fevers or the common flu. Due to the difficulty in diagnosing the disease, there is considerable interest in developing portable tests that can detect the presence of the virus even during early stages of infection. Data for this project were obtained by isolating peripheral blood mononuclear cells (PBMCs) from non-human primates that were inoculated with the Lassa virus, extracting their RNA at different time points during infection, and quantifying the amount of expression of each gene using microarrays. Differential gene expression analysis was conducted on groups of microarrays corresponding to combined PBMC samples, as well as samples containing a single PBMC cell type. The goal was to find genes that could be used as early response biomarkers, with a pattern of expression that is consistent enough to be detected experimentally and uniquely characteristic of Lassa fever. Several immune response genes were detected that meet these criteria and that are strong candidates for experimental validation.
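The differential expression step can be sketched as ranking genes by a two-sample statistic that compares infected and baseline samples. The summary does not say which statistical test the team used, so the Welch t-statistic below is an assumption made for illustration. The function names and the input layout (a dictionary mapping each gene to a list of log-scale expression values) are also hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(infected, baseline):
    """Welch's t-statistic for two lists of expression values.

    Each list needs at least two values, and the two groups must not both
    have zero variance (that would make the standard error zero).
    """
    standard_error = math.sqrt(
        variance(infected) / len(infected) + variance(baseline) / len(baseline)
    )
    return (mean(infected) - mean(baseline)) / standard_error


def rank_genes(expr_infected, expr_baseline):
    """Return gene names ordered from most to least differentially expressed.

    Both arguments map a gene name to a list of expression values
    (hypothetical layout, one value per microarray).
    """
    t_stats = {
        gene: welch_t(expr_infected[gene], expr_baseline[gene])
        for gene in expr_infected
    }
    return sorted(t_stats, key=lambda gene: abs(t_stats[gene]), reverse=True)
```

A real analysis would also correct for multiple testing across thousands of genes and would check that a candidate's expression pattern is consistent across time points, as the project goal requires.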

Project 3: Genomic Signatures of Carcinogenicity.

There are around 75,000 chemical compounds used in industry, many of which are suspected carcinogens. Standard approaches to carcinogen testing are costly and time-consuming. As a result, only approximately 1,500 of the chemicals in commercial use have been tested. Additionally, some chemicals can have synergistic effects, making the characterization of carcinogenic compounds even more difficult because combinations have to be considered. The goals of this project are to develop computational models of carcinogenicity based on gene expression profiles and to classify the carcinogenic potential of individual or complex mixtures of environmental pollutants and/or therapeutics. Analysis was conducted on 3,610 gene expression microarray profiles from rats treated with 188 well-characterized chemicals, including genotoxic and non-genotoxic carcinogens, as well as non-carcinogens. Five different tissue types were profiled: liver, kidney, heart, thigh muscle, and cell-cultured hepatocytes, at multiple doses and durations of exposure. The analysis was aimed at identifying gene expression signatures distinguishing benign from carcinogenic substances, as well as genotoxic from non-genotoxic carcinogens. An ensemble classifier was built that uses gene expression-based and chemical structure-based classification models, and takes into account dataset substructure, to predict compound carcinogenicity and genotoxicity. Integration of toxicogenomic-based models with structure-based models leads to increased carcinogenicity prediction accuracy.
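One simple way to combine an expression-based model with a structure-based model is soft voting, that is, a weighted average of the two predicted probabilities. The sketch below shows that idea only. The summary does not specify how the team's ensemble combines its models or how it accounts for dataset substructure, so the function name, the weight, and the decision threshold are all assumptions.

```python
def soft_vote(p_expression, p_structure, w_expression=0.5, threshold=0.5):
    """Combine two predicted probabilities of carcinogenicity.

    p_expression: probability from a gene expression-based model.
    p_structure:  probability from a chemical structure-based model.
    w_expression: weight on the expression model (assumed value, in [0, 1]).
    Returns (combined probability, True if predicted carcinogenic).
    """
    if not 0.0 <= w_expression <= 1.0:
        raise ValueError("w_expression must lie between 0 and 1")
    combined = w_expression * p_expression + (1.0 - w_expression) * p_structure
    return combined, combined >= threshold
```

For example, `soft_vote(0.8, 0.4)` averages the two probabilities with equal weight and labels the compound as carcinogenic because the average exceeds the threshold. In practice the weight would be tuned on held-out compounds rather than fixed in advance.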

Address Goals

This activity primarily addresses the goal of cultivating an outstanding scientific workforce. It emphasizes the goals of developing strong interdisciplinary research collaborations among our faculty and students and encourages creativity, independence, and quality in student research. With respect to the secondary goal of advancing the frontier of knowledge, the work started by the students in the Challenge Projects will be carried forward and ultimately advance our knowledge in these diverse areas.