Function prediction using Markov random fields
A large amount of data from a variety of experiments is available on possible protein interactions. We are interested in how these data can be used to predict the functions that these proteins perform in the organism. Even without knowing the details of how the proteins interact, there is statistical evidence that interacting proteins are more likely to be functionally related. Using probabilistic models such as Markov random fields, we estimate statistical parameters that identify candidates for new functional assignments.

Soft clustering and biclustering

Clustering gene expression data involves identifying genes that have similar expression patterns over a variety of experiments. Traditionally, such clustering assigns a discrete cluster label to each gene. We are investigating clustering methods based on multidimensional scaling. In this method, genes are assigned coordinates in a low-dimensional space in such a way that genes with similar expression patterns are placed close to each other. Applied in two dimensions, this creates a planar map in which clusters can be identified visually and relations between clusters investigated interactively. By using this method to map both genes and experiments, we are looking for characteristic patterns in gene expression data that can serve as input to network inference applications.
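The mapping of genes to a plane described above can be illustrated with a minimal sketch. The text does not say which variant of multidimensional scaling is used, so the code below assumes classical (Torgerson) scaling and a correlation-based distance between expression profiles. The function names and the toy data are hypothetical and serve only as an illustration.

```python
import numpy as np


def correlation_distance(expression):
    """Distance between gene profiles (rows): sqrt(2 * (1 - Pearson r)).

    This equals the Euclidean distance between standardized profiles,
    up to a constant factor. The choice of distance is an assumption.
    """
    corr = np.corrcoef(expression)
    return np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))


def classical_mds(distances, n_components=2):
    """Classical (Torgerson) MDS: embed points so that Euclidean distances
    in the low-dimensional map approximate the given distances."""
    d2 = np.asarray(distances, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy data (synthetic): six genes measured in six experiments,
# three genes following pattern A and three following pattern B.
rng = np.random.default_rng(0)
pattern_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
pattern_b = np.array([6.0, 1.0, 5.0, 2.0, 4.0, 3.0])
expression = np.vstack(
    [pattern_a + rng.normal(0.0, 0.1, 6) for _ in range(3)]
    + [pattern_b + rng.normal(0.0, 0.1, 6) for _ in range(3)]
)

# One row of planar coordinates per gene.
coords = classical_mds(correlation_distance(expression), n_components=2)
```

Passing the transposed matrix (`expression.T`) to the same functions would map experiments instead of genes, which corresponds to using the method for both, as mentioned above.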
Biclustering and feature selection methods extend this approach: we try to identify features that are characteristic of certain clusters and to partition the feature space in such a way that correlations and regulatory relationships become visible.

Network inference using Gaussian processes

Large-scale gene expression data provide us with information about how the expression values of different genes are correlated in a variety of experimental settings. However, such a correlation does not immediately imply a functional relation. Graphical models are tools for deriving functional relations by matching a probabilistic model of genetic regulation to actual experimental data. We investigate models based on linear correlations and Gaussian processes to describe the regulatory relationships between genes. Such models are conceptually simple, as they describe well-known linear regulation, but they become computationally expensive when a large number of genes are involved.
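The point that correlation does not immediately imply a direct relation can be made concrete with a small sketch. It does not implement the Gaussian process models mentioned above. As a simpler stand-in, it assumes a linear Gaussian graphical model and compares plain correlations with partial correlations obtained from the inverse covariance matrix. The function name and the simulated regulatory chain are hypothetical.

```python
import numpy as np


def partial_correlations(samples):
    """Partial correlation of each gene pair given all other genes.

    samples: array of shape (n_samples, n_genes).
    Uses the inverse covariance (precision) matrix P:
    pcorr[i, j] = -P[i, j] / sqrt(P[i, i] * P[j, j]).
    """
    cov = np.cov(samples, rowvar=False)
    precision = np.linalg.inv(cov)  # cubic cost in the number of genes
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


# Simulated linear chain (synthetic): gene A regulates B, B regulates C.
# A and C are correlated, but only indirectly, through B.
rng = np.random.default_rng(1)
n = 5000
gene_a = rng.normal(size=n)
gene_b = 0.8 * gene_a + rng.normal(scale=0.5, size=n)
gene_c = 0.8 * gene_b + rng.normal(scale=0.5, size=n)
samples = np.column_stack([gene_a, gene_b, gene_c])

corr = np.corrcoef(samples, rowvar=False)  # marginal correlations
pcorr = partial_correlations(samples)      # direct (conditional) relations

print("corr(A, C)         =", round(corr[0, 2], 3))
print("partial corr(A, C) =", round(pcorr[0, 2], 3))
```

In this toy chain the marginal correlation between A and C is large, while their partial correlation is close to zero, which is what a model that separates direct from indirect relations should report. The matrix inversion also hints at the computational cost noted above: it scales cubically with the number of genes.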