Databases & Webservers

CyanoEXpress is a web database for interactive exploration and visualisation of transcriptional response patterns in Synechocystis sp. PCC6803. It comprises expression data from more than 700 transcriptome measurements carried out in over 30 independent studies. Notably, changes in expression during both environmental and genetic perturbations are included in the integrated data set. The current version enables the inspection of transcriptional responses of a defined set of curated processes in Synechocystis as well as user-defined gene clusters.

HeartEXpress is a web-based platform for the analysis of integrated expression datasets associated with cardiomyogenesis. The current version comprises data from 120 microarray measurements for stem cell differentiation, in vitro or in vivo reprogramming in the context of cardiomyogenesis, and heart development.

HeartmiR is a web server for querying microRNAs (miRs) or mRNAs to identify relevant miR-mRNA interactions in late heart development in mouse (Mus musculus). It is designed to serve as a comprehensive tool for the identification, visualisation and assessment of miR-mRNA interactions in heart development. The current version of HeartmiR is based on the integration of transcriptomic data from our study of the murine embryonic heart and a set of 102,083 experimentally detected or computationally predicted interactions between 386 miRs and 9,211 target genes.

Huntington's Disease Network Database (HDNetDB) enables the network-based analysis of Huntington's disease (HD). It is based on the integration of human molecular interactions with data and information relevant to HD. Interaction partners of given genes or proteins can be queried and visualized in the form of interaction networks. HDNetDB was especially designed as a platform for investigations into the molecular mechanisms associated with HD. Retrieved networks and their components can be analyzed with respect to functional relevance, expression patterns in human HD patients and disease models, as well as known associations with HD.

StemChecker is a web-based tool that enables researchers to rapidly check whether a given list of genes can be linked to stemness. For this purpose, we curated numerous published stemness signatures derived by alternative approaches. StemChecker examines whether genes uploaded by the user are included in the curated set of stemness signatures and evaluates the statistical significance of the overlap. The results are displayed in alternative formats, showing the potential association with stemness signatures of individual genes as well as of the whole set of input genes. Additionally, StemChecker indicates whether genes are targeted by a set of transcription factors linked to pluripotency and stem cell maintenance.
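
To illustrate what evaluating the significance of such an overlap can look like, here is a minimal Python sketch. It assumes a one-sided hypergeometric test, which is a common choice for gene list overlaps; the text above does not state which test StemChecker actually applies. The function name, the gene identifiers and the background size are all hypothetical.

```python
from math import comb


def overlap_p_value(query_genes, signature_genes, background_size):
    """Return (overlap size, P(overlap >= observed)) under a hypergeometric model.

    Assumes both gene lists are drawn from the same background of
    `background_size` genes (an illustrative assumption, not StemChecker's API).
    """
    query = set(query_genes)
    signature = set(signature_genes)
    k = len(query & signature)   # observed overlap
    n = len(query)               # size of the user's gene list
    K = len(signature)           # size of the stemness signature
    N = background_size          # number of genes that could have been drawn

    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return k, tail / comb(N, n)


# Hypothetical identifiers, for illustration only.
overlap, p = overlap_p_value(
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_A", "GENE_B", "GENE_D", "GENE_E"],
    background_size=10,
)
```

In practice the background would be the full set of genes measurable in the experiment, and p-values would need correction for testing many signatures at once.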

StemCellNet is an interactive web server for network analysis and visualization in stem cell biology. It gives access to a large collection of curated physical and regulatory interactions identified in human and murine stem cells and features various easy-to-use tools for selection and prioritization of network components, as well as for integration of expression data. StemCellNet can indicate novel candidate genes by evaluating their connectivity patterns. It is currently the only platform that allows the screening of networks for stemness-associated genes and potential target candidates. With its comprehensive coverage of the human interactome, it is a powerful tool not only for stem cell researchers, but also for researchers working on degenerative diseases and on cancer who wish to identify stemness signatures in molecular networks of interest.

StemMapper is a manually curated database containing gene expression data for different lineages of stem cells (SCs). StemMapper currently holds almost 1000 human and murine transcriptome measurements, collected from NCBI's Gene Expression Omnibus (GEO), and provides comprehensive coverage of expression profiles for 51 types of murine SCs, progenitor cells and their progeny as well as 19 types of human SCs, progenitor cells and their progeny. Transcriptomics datasets underwent standardized processing and stringent quality control to minimize artefacts. StemMapper features an intuitive interface for visual comparison of gene expression across different SC types. Researchers can easily check previously reported expression values for their genes of interest and compare them across different SC lineages.

Unified Human Interactome (UniHI) is a comprehensive platform for the retrieval and analysis of human molecular interactions. Currently, UniHI integrates human protein-protein, transcriptional regulatory and drug-target interactions from 16 resources. In total, almost 400,000 unique molecular interactions are included. Additionally, various phenotypic information and disease associations have been integrated. The UniHI web server includes tools (i) to search for molecular interaction partners of query genes or proteins in the integrated dataset, (ii) to inspect the origin, evidence and functional annotation of retrieved proteins and interactions, (iii) to visualize and adjust the resulting interaction network, (iv) to filter interactions based on method of derivation, evidence and type of experiment as well as based on gene expression data or gene lists and (v) to analyze the functional composition of interaction networks.

Huntington's disease (HD) is a fatal neurodegenerative disease for which no cure has yet been found. Among other mechanisms, the unfolded protein response (UPR) has been proposed to play a role in the pathogenesis of HD. As an adaptive response, the UPR can counterbalance the accumulation of misfolded proteins, but it can also trigger cell death if it persists. Our analysis published in F1000Research suggested that the UPR is activated in various HD models and presents an attractive target for therapeutic interventions. To help independent researchers study the role of the UPR in HD, we established the online tool UPR-HD. It enables researchers to search, interactively analyze and visualize the expression patterns of UPR-associated genes across various HD expression data sets.

Some R packages for gene expression analysis

We also maintain some Bioconductor packages for the analysis of high-throughput gene expression data:

The cycle package enables a more reliable detection of periodic patterns in gene expression time series data through the use of more appropriate background models. Many methods for the detection of periodicity rely on data normality or on the extensive use of permutation tests, neglecting the fact that time series data generally exhibit considerable autocorrelation, i.e. correlation between successive measurements. Therefore, neither the assumption of data normality nor the assumptions underlying randomization may hold. This failure can substantially interfere with significance testing and typically leads to overestimating the number of periodically expressed genes, as our analysis published in Bioinformatics (2008) showed. The cycle package includes functions for the detection of periodic patterns in gene expression time series data and the calculation of statistical significance based on different background models. The current version of the cycle package can be downloaded from the Bioconductor repository.
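
The effect of the background model can be sketched in a few lines of Python. The example below is not the cycle package's interface (which is written in R); it is an illustrative sketch that assumes a Fourier-type score as the periodicity measure and a first-order autoregressive, AR(1), process as the autocorrelated background. All function names and parameters are hypothetical.

```python
import math
import random


def standardize(series):
    """Scale a series to zero mean and unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if sd == 0.0:
        return [0.0] * n
    return [(x - mean) / sd for x in series]


def fourier_score(series, period, dt=1.0):
    """Magnitude of the Fourier component of `series` at the given period."""
    omega = 2.0 * math.pi / period
    s = sum(x * math.sin(omega * i * dt) for i, x in enumerate(series))
    c = sum(x * math.cos(omega * i * dt) for i, x in enumerate(series))
    return math.hypot(s, c)


def lag1_autocorrelation(series):
    """Estimate the correlation between successive measurements."""
    z = standardize(series)
    denom = sum(v * v for v in z)
    if denom == 0.0:
        return 0.0
    return sum(z[i] * z[i + 1] for i in range(len(z) - 1)) / denom


def simulate_ar1(phi, n, rng):
    """Draw an AR(1) series x[t] = phi * x[t-1] + noise with unit variance."""
    noise_sd = math.sqrt(max(1.0 - phi * phi, 0.0))
    x = rng.gauss(0.0, 1.0)
    out = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0.0, noise_sd)
        out.append(x)
    return out


def empirical_p_value(series, period, n_sim=1000, seed=0):
    """Share of AR(1) background series scoring at least as high as the data."""
    rng = random.Random(seed)
    observed = fourier_score(standardize(series), period)
    phi = lag1_autocorrelation(series)
    hits = 0
    for _ in range(n_sim):
        background = standardize(simulate_ar1(phi, len(series), rng))
        if fourier_score(background, period) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)
```

Because the simulated background series share the autocorrelation of the measured profile, a smooth but non-periodic profile is less likely to be called significant than it would be against shuffled data, where the autocorrelation is destroyed.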

Mfuzz is a Bioconductor package for the soft clustering of gene expression data. Soft clustering contrasts with conventional hard clustering, which assigns each gene or protein to exactly one cluster. Hard clustering is favourable if clusters are well separated. However, this is generally not the case for gene expression data, where gene/protein clusters frequently overlap. Mfuzz overcomes this limitation by using fuzzy c-means clustering, which allows genes/proteins to be assigned to several clusters. This can be used for a more robust detection of expression patterns and a targeted search for regulatory elements, as our publication in JBCB demonstrated. Recent versions of Mfuzz also include a graphical user interface (described in an article in Bioinformation) for convenient use. The current version of Mfuzz can be downloaded from the Bioconductor repository.
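
The idea behind fuzzy c-means can be shown with a small, self-contained Python sketch. It is not the Mfuzz interface (Mfuzz is an R package); it implements the textbook fuzzy c-means updates on plain lists, and the function names, the default fuzzifier m = 2 and the random initialisation are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_memberships(data, centers, m):
    """Membership of each profile in each cluster; every row sums to 1."""
    exponent = 1.0 / (m - 1.0)  # applied to squared distances
    memberships = []
    for point in data:
        d2 = [squared_distance(point, c) for c in centers]
        if any(d == 0.0 for d in d2):
            # The profile coincides with a center: full membership there.
            raw = [1.0 if d == 0.0 else 0.0 for d in d2]
        else:
            raw = [d ** -exponent for d in d2]
        total = sum(raw)
        memberships.append([v / total for v in raw])
    return memberships


def update_centers(data, memberships, n_clusters, m):
    """Cluster centers as membership-weighted means of the profiles."""
    dim = len(data[0])
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, data)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Alternate center and membership updates until memberships stabilise."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, n_clusters)]
    memberships = update_memberships(data, centers, m)
    for _ in range(n_iter):
        centers = update_centers(data, memberships, n_clusters, m)
        updated = update_memberships(data, centers, m)
        shift = max(
            abs(a - b)
            for new_row, old_row in zip(updated, memberships)
            for a, b in zip(new_row, old_row)
        )
        memberships = updated
        if shift < tol:
            break
    return centers, memberships
```

Each row of the returned membership matrix gives the graded assignment of one gene to all clusters. A hard clustering corresponds to the special case in which every row contains a single 1, and genes with no dominant membership can be flagged as lying between clusters rather than being forced into one.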

OLIN is a package for optimised normalization of two-color microarrays. It enables correction for location- and intensity-dependent biases through iterative local regression and model selection. The package also includes functions to assess the presence of systematic errors in two-color microarray data. The underlying approach was published in Genome Biology. The complementary OLINgui package provides a graphical user interface to OLIN and is described in a publication in Bioinformatics. Current versions of OLIN and OLINgui can be downloaded from the Bioconductor repository: OLIN, OLINgui.
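
As a rough illustration of correcting an intensity-dependent bias by local regression, the following Python sketch fits the log-ratio M as a smooth function of the average log-intensity A and subtracts the fit. It covers only this single step under simplifying assumptions: tricube-weighted local linear regression with a fixed span, no spatial (location-dependent) term, no iteration and no model selection. It is not the OLIN interface, and all names are hypothetical.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(x, y, x0, span=0.5):
    """Fitted value at x0 from a weighted line through the nearest points."""
    n = len(x)
    k = min(n, max(2, math.ceil(span * n)))      # neighbours in the window
    h = sorted(abs(xi - x0) for xi in x)[k - 1]  # bandwidth: k-th distance
    if h == 0.0:
        weights = [1.0 if xi == x0 else 0.0 for xi in x]
    else:
        weights = [tricube((xi - x0) / h) for xi in x]
    sw = sum(weights)
    if sw == 0.0:  # all neighbours lie exactly on the window edge
        weights = [1.0 if abs(xi - x0) <= h else 0.0 for xi in x]
        sw = sum(weights)

    # Weighted least squares on x centred at x0, so the intercept is the fit.
    swx = sum(w * (xi - x0) for w, xi in zip(weights, x))
    swy = sum(w * yi for w, yi in zip(weights, y))
    swxx = sum(w * (xi - x0) ** 2 for w, xi in zip(weights, x))
    swxy = sum(w * (xi - x0) * yi for w, xi, yi in zip(weights, x, y))
    denom = sw * swxx - swx * swx
    if denom < 1e-12:  # no spread in x: fall back to the weighted mean
        return swy / sw
    slope = (sw * swxy - swx * swy) / denom
    return (swy - slope * swx) / sw


def normalize_intensity_bias(a_values, m_values, span=0.5):
    """Subtract the locally fitted trend of M versus A from each spot."""
    return [
        m - local_linear_fit(a_values, m_values, a, span)
        for a, m in zip(a_values, m_values)
    ]
```

After this step the residual log-ratios should show no remaining trend along A. The span controls how flexible the fitted curve is, and choosing it by hand, as done here, is exactly the kind of decision that a model selection procedure is meant to replace.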