[Socbin] Student project at EBI Microarray group

Margus Lukk lukk at cs.helsinki.fi
Wed Nov 18 21:30:38 CET 2009


Dear All,

Please feel free to distribute the student project below.

Best regards,
Margus




NEW Interface for Human Gene Expression Map

Project Description:

Thousands of microarray experiments are performed every year to estimate 
the relative abundance of different RNA molecules in different biological 
conditions. Most of these experiments compare a small number of
biological conditions, such as a particular disease state against a
normal condition. However, many more biological states exist, such as
rare diseases or particular cell subtypes. It is impractical for a single
dedicated experiment to generate a comprehensive expression data set
covering all biological conditions, but this can be achieved by
computationally integrating the wealth of experiments that have already
been performed and are available from public databases.

We collected, from the public microarray repositories GEO [1] and 
ArrayExpress [2], over 9000 raw data files generated from the human gene 
expression array Affymetrix U133A and derived a global Human Gene 
Expression Map by integrating raw data and harmonizing annotations from 
thousands of samples. For each gene we then computed a profile
characterizing its expression under each of the studied biological
conditions and sample classes. This project is part of an ongoing effort
of integrative analysis across very large collections of microarray 
datasets, which has resulted in the release of a new database, the Gene 
Expression Atlas [2].

This project aims at implementing, based on the current prototype, a 
public interface for the Human Gene Expression Map. Through this 
interface, users will be able to visualize the map, explore expression 
profiles and differentially expressed genes under particular experimental 
and biological conditions, and link to other bioinformatics resources
that provide functional annotation of genes.

Project Learning Summary:

The student will acquire experience in the development of web interfaces 
for explorative bioinformatics analysis, a deeper understanding of 
microarray data analysis, as well as meta-analytical approaches and 
statistical methods for data mining of large-scale datasets. The student 
will also gain insights into the production process of a large European 
transcriptomics database.

Objectives:

1. Further develop visualization and browsing features in the existing 
prototype (http://wwwdev.ebi.ac.uk/microarray/hge/HGE.jsp);
2. Implement a more advanced interface for exploring the Human Gene 
Expression Map;
3. Implement basic descriptive statistics for data exploration; and
4. Retrieve, or link to, gene annotation information from other
bioinformatics resources.

Required Student Knowledge:

1. Good to expert level of programming ability with Java and R;
2. Familiarity with some aspects of bioinformatics, especially
transcriptomics and microarray data analysis, would be advantageous; and
3. Ability to document code and work progress.

Please note that candidates must be registered at a university for the
entire duration of their stay at EBI, and the work completed at EBI
should form part of the candidate's final-year studies.

Project duration:

Approximately 6 months

Please email a CV and a cover letter to Gabriella Rustici
(gabry at ebi.ac.uk).


References:

1. Barrett, T. et al. NCBI GEO: archive for high-throughput functional
genomic data. Nucleic Acids Res 37, D885-90 (2009).
2. Parkinson, H. et al. ArrayExpress update--from an archive of 
functional genomics experiments to the atlas of gene expression. Nucleic 
Acids Res 37, D868-72 (2009).

