********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.3.3 (Release date: Sun Feb 7 15:39:52 2021 -0800)

For further information on how to interpret these results please access https://meme-suite.org/meme.
To get a copy of the MEME Suite software please access https://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= ../result/final_prediction/fly/fasta/RankLinear8.0_60/usp.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
chrM:12831-12951         1.0000    120  chrM:2408-2528           1.0000    120
chr2L:8162151-8162271    1.0000    120  chr3L:15141271-15141391  1.0000    120
chrM:9767-9887           1.0000    120  chrM:6073-6193           1.0000    120
chrM:12483-12603         1.0000    120  chrM:12018-12138         1.0000    120
chrM:4908-5028           1.0000    120  chr3L:17972627-17972747  1.0000    120
chrM:13255-13375         1.0000    120  chr2L:15911709-15911829  1.0000    120
chrM:14124-14244         1.0000    120  chr2R:9548127-9548247    1.0000    120
chrM:11219-11339         1.0000    120  chrM:7424-7544           1.0000    120
chrX:1951582-1951702     1.0000    120
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme -oc ../result/final_prediction/fly/inference_raw/MEME/RankLinear8.0_60_usp/ -dna -nmotifs 5 -w 8 -maxsize 250000 -nostatus ../result/final_prediction/fly/fasta/RankLinear8.0_60/usp.fasta model: mod= zoops nmotifs= 5 evt= inf objective function: em= E-value of product of p-values starts= E-value of product of p-values strands: + width: minw= 8 maxw= 8 nsites: minsites= 2 maxsites= 17 wnsites= 0.8 theta: spmap= uni spfuzz= 0.5 em: prior= dirichlet b= 0.01 maxiter= 50 distance= 1e-05 trim: wg= 11 ws= 1 endgaps= yes data: n= 2040 N= 17 sample: seed= 0 hsfrac= 0 searchsize= 2040 norand= no csites= 1000 Letter frequencies in dataset: A 0.337 C 0.185 G 0.171 T 0.307 Background letter frequencies (from file dataset with add-one prior applied): A 0.337 C 0.185 G 0.171 T 0.307 Background model order: 0 ******************************************************************************** ******************************************************************************** MOTIF MGCTGYAG MEME-1 width = 8 sites = 5 llr = 52 E-value = 1.8e+001 ******************************************************************************** -------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 Description -------------------------------------------------------------------------------- Simplified A 4:::::8: pos.-specific C 6:a::62: probability G :a::a::a matrix T :::a:4:: bits 2.5 ** * * 2.3 ** * * 2.0 ** * * 1.8 **** * Relative 1.5 **** * Entropy 1.3 ***** * (15.1 bits) 1.0 ******** 0.8 ******** 0.5 ******** 0.3 ******** 0.0 -------- Multilevel CGCTGCAG consensus A TC sequence -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------- 
chr3L:17972627-17972747 72 3.28e-06 CATGCAAATT CGCTGCAG TTCAGGCAGa chrX:1951582-1951702 46 5.07e-06 ACAGACTGCG CGCTGCCG AGTTCAACAA chr2L:15911709-15911829 74 1.05e-05 CAGCGCGTGA CGCTGTAG AAACGGTGCT chrM:7424-7544 90 1.65e-05 GAGCAGCTAT AGCTGCAG GTAACCAAGA chr2L:8162151-8162271 8 3.26e-05 TTGTACA AGCTGTAG AGCTCCAAAA -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- chr3L:17972627-17972747 3.3e-06 71_[+1]_41 chrX:1951582-1951702 5.1e-06 45_[+1]_67 chr2L:15911709-15911829 1.1e-05 73_[+1]_39 chrM:7424-7544 1.6e-05 89_[+1]_23 chr2L:8162151-8162271 3.3e-05 7_[+1]_105 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF MGCTGYAG width=8 seqs=5 chr3L:17972627-17972747 ( 72) CGCTGCAG 1 chrX:1951582-1951702 ( 46) CGCTGCCG 1 chr2L:15911709-15911829 ( 74) CGCTGTAG 1 chrM:7424-7544 ( 90) AGCTGCAG 1 chr2L:8162151-8162271 ( 8) AGCTGTAG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 8 n= 1921 bayes= 8.83409 E= 1.8e+001 25 170 -897 -897 -897 -897 254 -897 -897 243 -897 -897 -897 -897 -897 170 -897 -897 254 -897 -897 170 -897 38 125 11 -897 -897 -897 -897 254 -897 -------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 8 nsites= 5 E= 1.8e+001 0.400000 0.600000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.600000 0.000000 0.400000 0.800000 0.200000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif MGCTGYAG MEME-1 regular expression -------------------------------------------------------------------------------- [CA]GCTG[CT][AC]G -------------------------------------------------------------------------------- Time 0.49 secs. ******************************************************************************** ******************************************************************************** MOTIF SGRCVGGY MEME-2 width = 8 sites = 5 llr = 48 E-value = 9.8e+002 ******************************************************************************** -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 Description -------------------------------------------------------------------------------- Simplified A ::4:2::: pos.-specific C 6::a4:24 probability G 4a6:4a8: matrix T :::::::6 bits 2.5 * * * 2.3 * * * 2.0 * * * 1.8 * * ** Relative 1.5 ** * ** Entropy 1.3 **** ** (13.8 bits) 1.0 **** *** 0.8 ******** 0.5 ******** 0.3 ******** 0.0 -------- Multilevel CGGCCGGT consensus G A G CC sequence A -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 sites sorted by position 
p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------- chrX:1951582-1951702 8 9.86e-06 CAGACAA GGGCCGGT GCCCAACAAG chrM:14124-14244 72 1.17e-05 ATAATAAGAG CGACGGGC GATGTGTACA chr2L:15911709-15911829 100 1.67e-05 CTTTCTTTTT CGACGGGT CAACGAACCC chr2L:8162151-8162271 68 2.74e-05 TTAACATCTT CGGCCGCC TGGTTGAATT chrM:13255-13375 84 4.76e-05 AATTTTCAGT GGGCAGGT TAGACTTTAT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- chrX:1951582-1951702 9.9e-06 7_[+2]_105 chrM:14124-14244 1.2e-05 71_[+2]_41 chr2L:15911709-15911829 1.7e-05 99_[+2]_13 chr2L:8162151-8162271 2.7e-05 67_[+2]_45 chrM:13255-13375 4.8e-05 83_[+2]_29 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF SGRCVGGY width=8 seqs=5 chrX:1951582-1951702 ( 8) GGGCCGGT 1 chrM:14124-14244 ( 72) CGACGGGC 1 chr2L:15911709-15911829 ( 100) CGACGGGT 1 chr2L:8162151-8162271 ( 68) CGGCCGCC 1 chrM:13255-13375 ( 84) GGGCAGGT 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 8 n= 1921 bayes= 9.52718 E= 9.8e+002 -897 170 122 -897 -897 -897 254 -897 25 -897 181 -897 -897 243 -897 -897 -75 111 122 -897 
-897 -897 254 -897 -897 11 222 -897 -897 111 -897 96 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 8 nsites= 5 E= 9.8e+002 0.000000 0.600000 0.400000 0.000000 0.000000 0.000000 1.000000 0.000000 0.400000 0.000000 0.600000 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.400000 0.400000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.400000 0.000000 0.600000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif SGRCVGGY MEME-2 regular expression -------------------------------------------------------------------------------- [CG]G[GA]C[CGA]G[GC][TC] -------------------------------------------------------------------------------- Time 0.92 secs. 
********************************************************************************


********************************************************************************
MOTIF GAGCGAGC MEME-3	width =   8  sites =   2  llr = 25  E-value = 2.9e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  :a:::a::
pos.-specific     C  :::a:::a
probability       G  a:a:a:a:
matrix            T  ::::::::

         bits    2.5 * *** **
                 2.3 * *** **
                 2.0 * *** **
                 1.8 * *** **
Relative         1.5 ********
Entropy          1.3 ********
(18.2 bits)      1.0 ********
                 0.8 ********
                 0.5 ********
                 0.3 ********
                 0.0 --------

Multilevel           GAGCGAGC
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chrX:1951582-1951702         25  3.32e-06 TGCCCAACAA GAGCGAGC GAGACAGACT
chr3L:17972627-17972747      34  3.32e-06 ACGAAGTAAA GAGCGAGC TGTGCTCTCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrX:1951582-1951702              3.3e-06  24_[+3]_88
chr3L:17972627-17972747           3.3e-06  33_[+3]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GAGCGAGC width=8 seqs=2
chrX:1951582-1951702     (   25) GAGCGAGC  1
chr3L:17972627-17972747  (   34) GAGCGAGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1921 bayes= 9.90614 E= 2.9e+002
  -765   -765    254   -765
   157   -765   -765   -765
  -765   -765    254   -765
  -765    243   -765   -765
  -765   -765    254   -765
   157   -765   -765   -765
  -765   -765    254   -765
  -765    243   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 2.9e+002
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GAGCGAGC MEME-3 regular expression
--------------------------------------------------------------------------------
GAGCGAGC
--------------------------------------------------------------------------------

Time 1.31 secs.

********************************************************************************


********************************************************************************
MOTIF CACCGGTT MEME-4	width =   8  sites =   2  llr = 24  E-value = 1.1e+003
********************************************************************************
--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 Description
--------------------------------------------------------------------------------
Simplified        A  :a::::::
pos.-specific     C  a:aa::::
probability       G  ::::aa::
matrix            T  ::::::aa

         bits    2.5 * ****
                 2.3 * ****
                 2.0 * ****
                 1.8 * ******
Relative         1.5 ********
Entropy          1.3 ********
(17.4 bits)      1.0 ********
                 0.8 ********
                 0.5 ********
                 0.3 ********
                 0.0 --------

Multilevel           CACCGGTT
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr2R:9548127-9548247        71  5.88e-06 CGCGTTGGCA CACCGGTT CATTGAACTT
chrM:12831-12951              7  5.88e-06     GGCTTA CACCGGTT TGAACTCAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr2R:9548127-9548247             5.9e-06  70_[+4]_42
chrM:12831-12951                  5.9e-06  6_[+4]_106
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CACCGGTT width=8 seqs=2
chr2R:9548127-9548247    (   71) CACCGGTT  1
chrM:12831-12951         (    7) CACCGGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1921 bayes= 9.90614 E= 1.1e+003
  -765    243   -765   -765
   157   -765   -765   -765
  -765    243   -765   -765
  -765    243   -765   -765
  -765   -765    254   -765
  -765   -765    254   -765
  -765   -765   -765    170
  -765   -765   -765    170
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.1e+003
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CACCGGTT MEME-4 regular expression
--------------------------------------------------------------------------------
CACCGGTT
--------------------------------------------------------------------------------

Time 1.70 secs.

******************************************************************************** ******************************************************************************** MOTIF GGTKTAGG MEME-5 width = 8 sites = 5 llr = 46 E-value = 3.1e+003 ******************************************************************************** -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 Description -------------------------------------------------------------------------------- Simplified A :::::a:: pos.-specific C :2::::2: probability G a8:42:6a matrix T ::a68:2: bits 2.5 * * 2.3 * * 2.0 * * 1.8 *** * Relative 1.5 *** * * Entropy 1.3 *** ** * (13.4 bits) 1.0 ******** 0.8 ******** 0.5 ******** 0.3 ******** 0.0 -------- Multilevel GGTTTAGG consensus C GG C sequence T -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 sites sorted by position p-value -------------------------------------------------------------------------------- Sequence name Start P-value Site ------------- ----- --------- -------- chrM:7424-7544 73 4.64e-06 AGCAGAAACA GGTGTAGG AGCAGCTATA chrM:12483-12603 22 2.02e-05 ATCACAAAAA GGTTGAGG AATTCCTATT chrM:4908-5028 62 3.93e-05 AGTAACTATT GGTTTACG ATGAGGAATA chr3L:17972627-17972747 110 5.66e-05 tatatatttg ggtgtatg tat chrM:2408-2528 80 5.66e-05 TATTTTATGA GCTTTAGG ATTTGTTTTT -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 block diagrams -------------------------------------------------------------------------------- SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM ------------- ---------------- ------------- chrM:7424-7544 4.6e-06 72_[+5]_40 chrM:12483-12603 2e-05 21_[+5]_91 chrM:4908-5028 3.9e-05 61_[+5]_51 chr3L:17972627-17972747 5.7e-05 109_[+5]_3 chrM:2408-2528 
5.7e-05 79_[+5]_33 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 in BLOCKS format -------------------------------------------------------------------------------- BL MOTIF GGTKTAGG width=8 seqs=5 chrM:7424-7544 ( 73) GGTGTAGG 1 chrM:12483-12603 ( 22) GGTTGAGG 1 chrM:4908-5028 ( 62) GGTTTACG 1 chr3L:17972627-17972747 ( 110) GGTGTATG 1 chrM:2408-2528 ( 80) GCTTTAGG 1 // -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 position-specific scoring matrix -------------------------------------------------------------------------------- log-odds matrix: alength= 4 w= 8 n= 1921 bayes= 9.52718 E= 3.1e+003 -897 -897 254 -897 -897 11 222 -897 -897 -897 -897 170 -897 -897 122 96 -897 -897 22 138 157 -897 -897 -897 -897 11 181 -62 -897 -897 254 -897 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 position-specific probability matrix -------------------------------------------------------------------------------- letter-probability matrix: alength= 4 w= 8 nsites= 5 E= 3.1e+003 0.000000 0.000000 1.000000 0.000000 0.000000 0.200000 0.800000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.400000 0.600000 0.000000 0.000000 0.200000 0.800000 1.000000 0.000000 0.000000 0.000000 0.000000 0.200000 0.600000 0.200000 0.000000 0.000000 1.000000 0.000000 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Motif GGTKTAGG MEME-5 regular expression -------------------------------------------------------------------------------- G[GC]T[TG][TG]A[GCT]G 
--------------------------------------------------------------------------------

Time 2.09 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chrM:12831-12951                 1.68e-02  6_[+4(5.88e-06)]_106
chrM:2408-2528                   6.25e-02  79_[+5(5.66e-05)]_33
chr2L:8162151-8162271            4.94e-05  7_[+1(3.26e-05)]_52_[+2(2.74e-05)]_\
    45
chr3L:15141271-15141391          5.30e-01  120
chrM:9767-9887                   1.00e+00  120
chrM:6073-6193                   9.99e-01  120
chrM:12483-12603                 1.32e-01  21_[+5(2.02e-05)]_91
chrM:12018-12138                 2.84e-01  120
chrM:4908-5028                   2.10e-01  61_[+5(3.93e-05)]_51
chr3L:17972627-17972747          2.60e-06  33_[+3(3.32e-06)]_30_[+1(3.28e-06)]_\
    30_[+5(5.66e-05)]_3
chrM:13255-13375                 1.29e-01  83_[+2(4.76e-05)]_29
chr2L:15911709-15911829          2.34e-05  73_[+1(1.05e-05)]_18_[+2(1.67e-05)]_\
    13
chrM:14124-14244                 4.52e-02  71_[+2(1.17e-05)]_41
chr2R:9548127-9548247            2.70e-02  70_[+4(5.88e-06)]_42
chrM:11219-11339                 8.34e-01  120
chrM:7424-7544                   3.56e-05  72_[+5(4.64e-06)]_9_[+1(1.65e-05)]_\
    23
chrX:1951582-1951702             1.02e-06  7_[+2(9.86e-06)]_9_[+3(3.32e-06)]_\
    13_[+1(5.07e-06)]_67
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (5) found.
********************************************************************************

CPU: c22n12.farnam.hpc.yale.internal

********************************************************************************