********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.3.3 (Release date: Sun Feb 7 15:39:52 2021 -0800)

For further information on how to interpret these results please access https://meme-suite.org/meme.
To get a copy of the MEME Suite software please access https://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= ../result/final_prediction/fly/fasta/RankLinear8.0_60/pan.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
chrM:6072-6192           1.0000    120  chrM:4487-4607           1.0000    120
chrM:12471-12591         1.0000    120  chrM:14134-14254         1.0000    120
chrM:14406-14526         1.0000    120
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme -oc ../result/final_prediction/fly/inference_raw/MEME/RankLinear8.0_60_pan/ -dna -nmotifs 5 -w 8 -maxsize 250000 -nostatus ../result/final_prediction/fly/fasta/RankLinear8.0_60/pan.fasta

model:  mod=         zoops    nmotifs=         5    evt=           inf
objective function:
            em=       E-value of product of p-values
            starts=   E-value of product of p-values
strands: +
width:  minw=            8    maxw=            8
nsites: minsites=        2    maxsites=        5    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
trim:   wg=             11    ws=              1    endgaps=       yes
data:   n=             600    N=               5
sample: seed=            0    hsfrac=          0
        searchsize=    600    norand=         no    csites=       1000
Letter frequencies in dataset:
A 0.358 C 0.15 G 0.137 T 0.355
Background letter frequencies (from file dataset with add-one prior applied):
A 0.358 C 0.151 G 0.137 T 0.354

Background model order: 0
********************************************************************************


********************************************************************************
MOTIF WCGACSKD MEME-1	width =   8  sites =   5  llr = 44  E-value = 3.9e+000
********************************************************************************
--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  4::a:::2
pos.-specific     C  :8::a6::
probability       G  22a::444
matrix            T  4:::::64

         bits    2.9   * *
                 2.6   * *
                 2.3   * *
                 2.0  ** *
Relative         1.7  ** **
Entropy          1.4  *****
(12.8 bits)      1.1  ******
                 0.9  ******
                 0.6  *******
                 0.3 ********
                 0.0 --------

Multilevel           ACGACCTG
consensus            TG   GGT
sequence             G      A

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chrM:14134-14254             61  3.08e-06 AATAATAAGA GCGACGGG CGATGTGTAC
chrM:4487-4607               30  1.21e-05 GAAATATTAT TCGACCTG GAACATTAGC
chrM:14406-14526              5  3.70e-05       GATA ACGACGGT ATATAAACTG
chrM:6072-6192               19  1.15e-04 GCAATTAGTT TCGACCTA ATCTTAGGTA
chrM:12471-12591             66  1.60e-04 CAACTTTATT AGGACCTT TACGAATTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:14134-14254                  3.1e-06  60_[+1]_52
chrM:4487-4607                    1.2e-05  29_[+1]_83
chrM:14406-14526                  3.7e-05  4_[+1]_108
chrM:6072-6192                    0.00012  18_[+1]_94
chrM:12471-12591                  0.00016  65_[+1]_47
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF WCGACSKD width=8 seqs=5
chrM:14134-14254         (   61) GCGACGGG  1
chrM:4487-4607           (   30) TCGACCTG  1
chrM:14406-14526         (    5) ACGACGGT  1
chrM:6072-6192           (   19) TCGACCTA  1
chrM:12471-12591         (   66) AGGACCTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 565 bayes= 7.06095 E= 3.9e+000
    16   -897     54     17
  -897    241     54   -897
  -897   -897    286   -897
   148   -897   -897   -897
  -897    273   -897   -897
  -897    199    154   -897
  -897   -897    154     76
   -84   -897    154     17
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 5 E= 3.9e+000
 0.400000  0.000000  0.200000  0.400000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.600000  0.400000  0.000000
 0.000000  0.000000  0.400000  0.600000
 0.200000  0.000000  0.400000  0.400000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WCGACSKD MEME-1 regular expression
--------------------------------------------------------------------------------
[ATG][CG]GAC[CG][TG][GTA]
--------------------------------------------------------------------------------




Time  0.40 secs.
********************************************************************************


********************************************************************************
MOTIF GGWCCWTC MEME-2	width =   8  sites =   2  llr = 23  E-value = 1.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 Description
--------------------------------------------------------------------------------
Simplified        A  ::5::5::
pos.-specific     C  :::aa::a
probability       G  aa::::::
matrix            T  ::5::5a:

         bits    2.9 ** **  *
                 2.6 ** **  *
                 2.3 ** **  *
                 2.0 ** **  *
Relative         1.7 ** **  *
Entropy          1.4 ** ** **
(16.4 bits)      1.1 ** ** **
                 0.9 ** ** **
                 0.6 ********
                 0.3 ********
                 0.0 --------

Multilevel           GGACCATC
consensus              T  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chrM:14406-14526             40  1.16e-05 ATTTAAGTAA GGTCCATC GTGGATTATC
chrM:4487-4607              109  1.16e-05 AGGAAATACA GGACCTTC TATA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:14406-14526                  1.2e-05  39_[+2]_73
chrM:4487-4607                    1.2e-05  108_[+2]_4
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GGWCCWTC width=8 seqs=2
chrM:14406-14526         (   40) GGTCCATC  1
chrM:4487-4607           (  109) GGACCTTC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 565 bayes= 8.13699 E= 1.9e+002
  -765   -765    286   -765
  -765   -765    286   -765
    48   -765   -765     49
  -765    272   -765   -765
  -765    272   -765   -765
    48   -765   -765     49
  -765   -765   -765    149
  -765    272   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.9e+002
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGWCCWTC MEME-2 regular expression
--------------------------------------------------------------------------------
GG[AT]CC[AT]TC
--------------------------------------------------------------------------------




Time  0.84 secs.
********************************************************************************


********************************************************************************
MOTIF CASKTTCC MEME-3	width =   8  sites =   2  llr = 22  E-value = 8.4e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  :a::::::
pos.-specific     C  a:5:::aa
probability       G  ::55::::
matrix            T  :::5aa::

         bits    2.9 *     **
                 2.6 *     **
                 2.3 *     **
                 2.0 *     **
Relative         1.7 * *   **
Entropy          1.4 *** ****
(15.6 bits)      1.1 ********
                 0.9 ********
                 0.6 ********
                 0.3 ********
                 0.0 --------

Multilevel           CACGTTCC
consensus              GT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chrM:14406-14526             68  2.90e-06 GATTAAAAAA CAGGTTCC TCTAGATAGA
chrM:14134-14254              9  2.18e-05   TCTAGATA CACTTTCC AGTACATCTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:14406-14526                  2.9e-06  67_[+3]_45
chrM:14134-14254                  2.2e-05  8_[+3]_104
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CASKTTCC width=8 seqs=2
chrM:14406-14526         (   68) CAGGTTCC  1
chrM:14134-14254         (    9) CACTTTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 565 bayes= 8.13699 E= 8.4e+002
  -765    272   -765   -765
   148   -765   -765   -765
  -765    173    186   -765
  -765   -765    186     49
  -765   -765   -765    149
  -765   -765   -765    149
  -765    272   -765   -765
  -765    272   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 8.4e+002
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CASKTTCC MEME-3 regular expression
--------------------------------------------------------------------------------
CA[CG][GT]TTCC
--------------------------------------------------------------------------------




Time  1.29 secs.
********************************************************************************


********************************************************************************
MOTIF RCTKTTCG MEME-4	width =   8  sites =   2  llr = 21  E-value = 1.3e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 Description
--------------------------------------------------------------------------------
Simplified        A  5:::::::
pos.-specific     C  :a::::a:
probability       G  5::5:::a
matrix            T  ::a5aa::

         bits    2.9  *    **
                 2.6  *    **
                 2.3  *    **
                 2.0  *    **
Relative         1.7  *    **
Entropy          1.4  ** ****
(15.2 bits)      1.1 ********
                 0.9 ********
                 0.6 ********
                 0.3 ********
                 0.0 --------

Multilevel           ACTGTTCG
consensus            G  T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chrM:4487-4607               46  2.60e-06 TGGAACATTA GCTGTTCG ATTAACTGCT
chrM:12471-12591             97  3.37e-05 ATATCCTAAA ACTTTTCG TTCTAATAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:4487-4607                    2.6e-06  45_[+4]_67
chrM:12471-12591                  3.4e-05  96_[+4]_16
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF RCTKTTCG width=8 seqs=2
chrM:4487-4607           (   46) GCTGTTCG  1
chrM:12471-12591         (   97) ACTTTTCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 565 bayes= 8.13699 E= 1.3e+003
    48   -765    186   -765
  -765    272   -765   -765
  -765   -765   -765    149
  -765   -765    186     49
  -765   -765   -765    149
  -765   -765   -765    149
  -765    272   -765   -765
  -765   -765    286   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.3e+003
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif RCTKTTCG MEME-4 regular expression
--------------------------------------------------------------------------------
[AG]CT[GT]TTCG
--------------------------------------------------------------------------------




Time  1.80 secs.
********************************************************************************


********************************************************************************
MOTIF WGAGGMAT MEME-5	width =   8  sites =   2  llr = 20  E-value = 1.4e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 Description
--------------------------------------------------------------------------------
Simplified        A  5:a::5a:
pos.-specific     C  :::::5::
probability       G  :a:aa:::
matrix            T  5::::::a

         bits    2.9  * **
                 2.6  * **
                 2.3  * **
                 2.0  * **
Relative         1.7  * **
Entropy          1.4  **** **
(14.7 bits)      1.1  *******
                 0.9  *******
                 0.6 ********
                 0.3 ********
                 0.0 --------

Multilevel           AGAGGAAT
consensus            T    C
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chrM:6072-6192               69  1.25e-05 GAAGCCAAAA AGAGGCAT ATCACTGTTA
chrM:12471-12591             37  4.23e-05 ACAAAAAGGT TGAGGAAT TCCTATTAAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:6072-6192                    1.3e-05  68_[+5]_44
chrM:12471-12591                  4.2e-05  36_[+5]_76
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF WGAGGMAT width=8 seqs=2
chrM:6072-6192           (   69) AGAGGCAT  1
chrM:12471-12591         (   37) TGAGGAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 565 bayes= 8.13699 E= 1.4e+003
    48   -765   -765     49
  -765   -765    286   -765
   148   -765   -765   -765
  -765   -765    286   -765
  -765   -765    286   -765
    48    173   -765   -765
   148   -765   -765   -765
  -765   -765   -765    149
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.4e+003
 0.500000  0.000000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif WGAGGMAT MEME-5 regular expression
--------------------------------------------------------------------------------
[AT]GAGG[AC]AT
--------------------------------------------------------------------------------




Time  2.19 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:6072-6192                   4.19e-03  68_[+5(1.25e-05)]_44
chrM:4487-4607                   2.23e-06  29_[+1(1.21e-05)]_8_[+4(2.60e-06)]_\
    55_[+2(1.16e-05)]_4
chrM:12471-12591                 3.21e-05  36_[+5(4.23e-05)]_52_[+4(3.37e-05)]_\
    16
chrM:14134-14254                 6.09e-04  8_[+3(2.18e-05)]_44_[+1(3.08e-06)]_\
    52
chrM:14406-14526                 4.56e-06  4_[+1(3.70e-05)]_27_[+2(1.16e-05)]_\
    20_[+3(2.90e-06)]_45
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (5) found.
********************************************************************************

CPU: c27n10.farnam.hpc.yale.internal

********************************************************************************