Motif finding
The first step in MMF is launching motif discovery programs. Currently, four of them are available:
  • BioProspector
BioProspector description
  • MDscan
MDscan description
  • MEME
MEME description
  • Weeder
Weeder description
Each of these programs can be downloaded and used locally.

A shared set of parameters can be set for all four programs at once. At the end of this step, the programs' results are gathered together.
[+ TODO: Inner filtering]

BioProspector uses a Gibbs sampling strategy and a Markov background model to capture the base dependencies of non-motif bases.
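
The idea of Gibbs sampling against a Markov background can be illustrated with a minimal sketch. This is not BioProspector's implementation: the function names, the first-order background, the pseudocounts, and the one-site-per-sequence assumption are all choices made for this example, and the input is assumed to contain only A, C, G and T.

```python
import math
import random

ALPHABET = "ACGT"

def markov_background(seqs, pseudo=1.0):
    """First-order Markov model P(base | previous base), estimated from seqs."""
    counts = {a: {b: pseudo for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        for prev, cur in zip(s, s[1:]):
            counts[prev][cur] += 1
    return {a: {b: counts[a][b] / sum(counts[a].values()) for b in ALPHABET}
            for a in ALPHABET}

def motif_profile(seqs, starts, width, skip, pseudo=0.5):
    """Column-wise base frequencies of the current sites, leaving sequence `skip` out."""
    cols = [{b: pseudo for b in ALPHABET} for _ in range(width)]
    for i, (s, p) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue
        for j in range(width):
            cols[j][s[p + j]] += 1
    return [{b: c[b] / sum(c.values()) for b in ALPHABET} for c in cols]

def site_weight(seq, pos, profile, bg):
    """Motif/background likelihood ratio for the window starting at pos."""
    log_ratio = 0.0
    for j, col in enumerate(profile):
        base = seq[pos + j]
        # The background conditions on the preceding base when there is one.
        bg_p = bg[seq[pos + j - 1]][base] if pos + j > 0 else 0.25
        log_ratio += math.log(col[base] / bg_p)
    return math.exp(log_ratio)

def gibbs_sample(seqs, width, iterations=200, seed=0):
    """Return one sampled motif start position per sequence."""
    rng = random.Random(seed)
    bg = markov_background(seqs)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, s in enumerate(seqs):
            # Rebuild the motif from all other sequences, then resample site i.
            profile = motif_profile(seqs, starts, width, skip=i)
            weights = [site_weight(s, p, profile, bg)
                       for p in range(len(s) - width + 1)]
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

Because the sampler is stochastic, the returned positions depend on the seed; in practice several runs are usually compared.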

Reference: Liu X, Brutlag DL, Liu JS. BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pacific Symposium on Biocomputing 2001:127-38.

Website: http://robotics.stanford.edu/~xsliu/BioProspector/

License: MIT license


MDscan was designed specifically for ChIP-array experiments, but it can also be used in other experiments where some of the sequences may contain motif sites. The algorithm combines the advantages of two search strategies: word enumeration and iterative updating of the motif's PSSM (position-specific scoring matrix).
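
How the two strategies can be combined is shown in the simplified sketch below. It is only an illustration under stated assumptions, not MDscan's actual procedure or scoring function: the names seed_and_refine, min_match and rounds are hypothetical, the background is uniform, and each sequence contributes exactly one site during refinement.

```python
import math
from collections import Counter

def kmers(seq, w):
    """All overlapping words of length w in seq."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]

def matches(a, b):
    """Number of positions at which two equal-length words agree."""
    return sum(x == y for x, y in zip(a, b))

def build_pssm(sites, pseudo=0.5):
    """Log-odds matrix of the aligned sites against a uniform background."""
    total = len(sites) + 4 * pseudo
    pssm = []
    for j in range(len(sites[0])):
        counts = Counter(s[j] for s in sites)
        pssm.append({b: math.log(((counts[b] + pseudo) / total) / 0.25)
                     for b in "ACGT"})
    return pssm

def score(word, pssm):
    return sum(col[b] for col, b in zip(pssm, word))

def seed_and_refine(top_seqs, other_seqs, w, min_match, rounds=3):
    # Word enumeration: every w-mer in the top sequences is a candidate seed;
    # keep the seed with the most similar words (>= min_match agreeing positions).
    words = [k for s in top_seqs for k in kmers(s, w)]
    best_seed = max(sorted(set(words)),
                    key=lambda seed: sum(matches(seed, x) >= min_match for x in words))
    sites = [x for x in words if matches(best_seed, x) >= min_match]
    # Iterative updating: rebuild the PSSM, then re-pick the best-scoring
    # word from every sequence, including those not used for seeding.
    for _ in range(rounds):
        pssm = build_pssm(sites)
        sites = [max(kmers(s, w), key=lambda k: score(k, pssm))
                 for s in top_seqs + other_seqs if len(s) >= w]
    return best_seed, build_pssm(sites)
```

Seeding from the top-ranked sequences only reflects the ChIP-array setting, where the strongest signals are the most likely to contain the motif.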

Reference: Liu XS, Brutlag DL, Liu JS. An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nature Biotechnology 2002 Aug;20(8):835-9.

Website: http://ai.stanford.edu/~xsliu/MDscan/

License: MIT license


MEME (Multiple EM for Motif Elicitation) uses a statistical method, Expectation Maximisation (EM), to identify highly conserved regions.
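
The EM idea can be sketched as follows. This is a deliberately reduced illustration, not MEME's model: it assumes exactly one motif occurrence per sequence, a uniform background, and a starting point supplied as a seed word; em_motif, init_site and the pseudocount are names and values chosen for this example. It requires Python 3.8 or later for math.prod.

```python
import math

def em_motif(seqs, width, init_site, iterations=20, pseudo=0.1):
    """Estimate a position weight matrix by EM, one motif site per sequence."""
    # Start from a matrix that favours the seed word at every position.
    pwm = [{b: (0.7 if b == init_site[j] else 0.1) for b in "ACGT"}
           for j in range(width)]
    for _ in range(iterations):
        counts = [{b: pseudo for b in "ACGT"} for _ in range(width)]
        for s in seqs:
            # E-step: posterior probability of each start position,
            # from the motif/background likelihood ratio of its window.
            lik = [math.prod(pwm[j][s[p + j]] / 0.25 for j in range(width))
                   for p in range(len(s) - width + 1)]
            total = sum(lik)
            for p, l in enumerate(lik):
                z = l / total
                for j in range(width):
                    counts[j][s[p + j]] += z
        # M-step: re-estimate each column from the expected base counts.
        pwm = [{b: c[b] / sum(c.values()) for b in "ACGT"} for c in counts]
    return pwm
```

Unlike Gibbs sampling, every start position contributes to the update in proportion to its posterior probability, so the result is deterministic for a given starting matrix.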

References: Timothy L. Bailey, Charles Elkan, Fitting a mixture model by expectation maximization to discover motifs in biopolymers, Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, (28-36), AAAI Press, 1994.

Timothy L. Bailey, Nadya Williams, Chris Misleh, and Wilfred W. Li, MEME: discovering and analyzing DNA and protein sequence motifs, Nucleic Acids Research, Vol. 34, pp. W369-W373, 2006.

Website: http://meme.sdsc.edu/meme/meme.html

License: MEME is copyrighted software and can be licensed for commercial use.


Weeder searches for candidate motifs by scanning a suffix tree built from the input sequences. Additionally, the program uses a background model based on pre-computed frequencies of all possible 6- and 8-bp subsequences in several of the most important organisms.
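
The suffix-tree scan can be illustrated with a depth-limited suffix trie and a mismatch-tolerant lookup. This is a sketch under simplifying assumptions, not Weeder's data structure or search strategy: the node layout, the function names, and the fixed mismatch budget are invented for the example, and the background model is left out.

```python
def new_node():
    return {"children": {}, "seqs": set()}

def build_suffix_trie(seqs, max_depth):
    """Trie of all suffixes of all sequences, truncated at max_depth.

    Each node records which sequences contain the word spelled by its path.
    """
    root = new_node()
    for idx, s in enumerate(seqs):
        for start in range(len(s)):
            node = root
            for ch in s[start:start + max_depth]:
                node = node["children"].setdefault(ch, new_node())
                node["seqs"].add(idx)
    return root

def sequences_with_match(root, pattern, max_mismatches):
    """Indices of sequences containing pattern with at most max_mismatches substitutions."""
    found = set()
    stack = [(root, 0, 0)]  # (node, depth reached, mismatches so far)
    while stack:
        node, depth, errors = stack.pop()
        if depth == len(pattern):
            found |= node["seqs"]
            continue
        for ch, child in node["children"].items():
            e = errors + (ch != pattern[depth])
            if e <= max_mismatches:  # prune branches over the mismatch budget
                stack.append((child, depth + 1, e))
    return found
```

A candidate pattern would then be kept when it matches in enough of the input sequences; the trie must be built with max_depth at least as large as the pattern length.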

Reference: Giulio Pavesi, Giancarlo Mauri, Graziano Pesole, An algorithm for finding signals of unknown length in DNA sequences, Bioinformatics, Vol. 17, Suppl. 1, June 2001, Pages: S207-S214.

Website: http://159.149.109.16:8080/weederWeb/

License: Please see Weeder license.

