News
- 28.04.2009: No-ranking option available: --ranking=0 (this option saves time and memory)
- 23.04.2009: New code versions with improved efficiency available (see Download)
- 06.02.2009: The article Enumeration of condition-dependent dense modules in protein interaction networks is accepted for publication in Bioinformatics.
Introduction
DME (Dense Module Enumeration) is a C++ software for extracting dense modules from a weighted interaction network. It allows to incorporate constraints with respect to additional data sources.
Given an undirected weighted graph with node set V, let W be the symmetric matrix that represents the interaction weights between all pairs of nodes (if no edge is given, the interaction weight is zero by default). Let U be a subset of nodes. Then the density of U with respect to W is defined as the average pairwise interaction weight between all members of U. DME detects all node subsets that satisfy a user-defined minimum density threshold. We refer to these node subsets as dense modules.
In addition, the module search can respect consistency constraints with respect to external profile data. Given some discrete profile for each node, a module is called consistent if there exists a subprofile which is shared by all member nodes. For example, if each node corresponds to a protein, the profile could indicate its presence or absence across multiple cellular conditions. For each of the discrete states (for example, presence or absence), the user can define the minimum required number of profile conditions for which all module members are in this state. DME yields all dense modules that satisfy the specified consistency constraints.
The method only outputs locally maximal solutions, i.e. modules where all direct supermodules (containing one additional node) do not satisfy the minimum density threshold. The obtained modules are ranked according to the probability that a random selection of the same number of nodes produces a module with at least the same density. For a detailed description of the method, please see the publications section.
Note:
For the special case of dense module enumeration in an unweighted graph and without additional constraints and module ranking,
it is most efficient to use the PCE software by T. Uno (Method description).
Download
License: DME is free software under the GNU General Public License, version 2. A copy of the license document is included in the DME distribution.
- Current DME Version (improved efficiency): A zip file including the DME source code, example files and a README file.
- DME Version 0: A zip file including the DME source code, example files and a README file.
If the above version consumes too much memory for your application, please use the --ranking=0 option (no ranking) or, to obtain the module-ranking in a memory-efficient way, try the following version (which might be more time-consuming).
- DME with Memory-Efficient Ranking: A zip file including the DME source code, example files and a README file.
PCE (pseudo clique enumerator) by T. Uno: Very efficient implementation for the special case of dense module enumeration in an unweighted graph and without additional constraints and module ranking.
DME Manual
Compilation:
>> make
Usage:
Basic module enumeration method:
>> ./dme m <input_file> <output_file> <density_threshold> [options]
With consistency constraints:
>> ./dme c <input_file> <output_file> <density_threshold>
<profile_file>
<number_of_states_s> <threshold_state_0> ... <threshold_state_(s-1)> [options]
-
Input file:
The input file contains the weights of non-zero interactions in tab-delimited format, where the first column contains a node index, the second column contains a node index, and the third column contains the corresponding interaction weight. Node indices start from 0. Interaction weights can be positive or negative, but the minWeight option (see below) currently only works for non-negative data.
Example:0 1 1
0 2 0.9
1 2 0.8 -
Optional profile file:
The profile file for consistency constraints contains the node profiles in tab-delimited format. Each line contains the profile of the corresponding node index (first line corresponds to node 0 etc.). Each column corresponds to one condition. Each entry corresponds to the state of the given node under the given condition. The states (e.g. "present" and "absent") are encoded as integers from 0 to s-1. As additional parameters, DME expects the number of different states, s, as well as the desired minimum number of consistent conditions for each individual state. Given a module, a condition is consistent with respect to state i (0<=i<=s-1) if all module members are in state i.
Example profile (with two states, three nodes, and four conditions):1 1 1 0
1 1 1 0
1 1 0 0 -
Output file:
The output file lists the obtained module results in tab-delimited format. The columns contain the following information:- Ranking score (negative log10 of the probability value)
(note: this column is missing if the --ranking=0 option is used) - Density (relative to the maximum interaction weight, thus between 0 and 1)
- Module size
- Module nodes (comma-separated, indices starting from 0)
- Ranking score (negative log10 of the probability value)
-
Density threshold:
Minimum density threshold for modules (between 0 and 1). The density is defined as the average pairwise interaction weight within the cluster, relative to the maximum interaction weight in the given network. Thus, the maximum density is 1. In principle, this threshold can be set to any value >0. However, depending on the size and the density of the given network and on the type of consistency constraints, the computation might take long. Thus, it is recommended to start with a density threshold close to 1, which then can be lowered step by step. -
Options:
--minSize=%d Minimal size of a module
--maxSize=%d Maximal size of a module
--minWeight=%f Weighted degree threshold: must be exceeded by each node within a module
--ranking=0 Do not perform probability-based ranking (this option saves time and memory)
Examples:
>> ./dme m data/ppi.txt res/output.txt 0.75 --minSize=3
Please note that consistency constraints have to be specified for each possible state individually.
For states where you do not want to impose a constraint, you can enter a 0.
>> ./dme m data/ppi.txt res/output1.txt 0.75 --minSize=3 --ranking=0
>> ./dme c data/ppi.txt res/output2.txt 0.75 data/nodeInfo.txt 2 0 3
>> ./dme c data/ppi.txt res/output3.txt 0.75 data/nodeInfo.txt 2 1 1
Publications
- Elisabeth Georgii, Sabine Dietmann, Takeaki Uno, Philipp Pagel, Koji Tsuda (2009) Enumeration of condition-dependent dense modules in protein interaction networks. Bioinformatics 25(7):933-940. Link
Contact