********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.3.3 (Release date: Sun Feb 7 15:39:52 2021 -0800)

For further information on how to interpret these results please access https://meme-suite.org/meme.
To get a copy of the MEME Suite software please access https://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= ../result/final_prediction/K562/fasta/RankLinear0.6_40/RBM17.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
chr3:75785723-75785803   1.0000     80  chr7:32619626-32619706   1.0000     80
chr17:5185397-5185477    1.0000     80  chr2:179301490-179301570 1.0000     80
chr13:81242037-81242117  1.0000     80  chr11:65273560-65273640  1.0000     80
chr13:91353986-91354066  1.0000     80  chr22:22404870-22404950  1.0000     80
chr22:19419266-19419346  1.0000     80  chr20:1447514-1447594    1.0000     80
chr1:17231900-17231980   1.0000     80  chr22:19419957-19420037  1.0000     80
chr18:1573014-1573094    1.0000     80  chr1:189605462-189605542 1.0000     80
chr2:178058255-178058335 1.0000     80  chr7:152161507-152161587 1.0000     80
chr17:56756748-56756828  1.0000     80  chr1:78470007-78470087   1.0000     80
chr7:148581772-148581852 1.0000     80  chr13:91299548-91299628  1.0000     80
chr22:17296299-17296379  1.0000     80  chr2:172563497-172563577 1.0000     80
chr11:65272129-65272209  1.0000     80
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme -oc ../result/final_prediction/K562/inference_raw/MEME/RankLinear0.6_40_RBM17/ -dna -nmotifs 5 -w 8 -maxsize 250000 -nostatus ../result/final_prediction/K562/fasta/RankLinear0.6_40/RBM17.fasta

model:  mod=         zoops    nmotifs=         5    evt=           inf
objective function:
        em=       E-value of product of p-values
        starts=   E-value of product of p-values
strands: +
width:  minw=            8    maxw=            8
nsites: minsites=        2    maxsites=       23    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
trim:   wg=             11    ws=              1    endgaps=       yes
data:   n=            1840    N=              23
sample: seed=            0    hsfrac=          0
        searchsize=   1840    norand=         no    csites=       1000
Letter frequencies in dataset:
A 0.279 C 0.237 G 0.235 T 0.248
Background letter frequencies (from file dataset with add-one prior applied):
A 0.279 C 0.237 G 0.235 T 0.248
Background model order: 0
********************************************************************************


********************************************************************************
MOTIF CGGGCSGG MEME-1  width =   8  sites =   4  llr = 39  E-value = 9.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::
pos.-specific     C  83::a5::
probability       G  38aa:5aa
matrix            T  ::::::::

         bits    2.1   *** **
                 1.9   *** **
                 1.7   *** **
                 1.5   *** **
Relative         1.3 ***** **
Entropy          1.0 ********
(14.0 bits)      0.8 ********
                 0.6 ********
                 0.4 ********
                 0.2 ********
                 0.0 --------

Multilevel           CGGGCCGG
consensus            GC   G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr7:148581772-148581852     53  1.90e-05 GCAGAGTTCC CGGGCGGG GGAGGGGCGG
chr22:19419957-19420037      73  1.90e-05 GGGGGCACCG CGGGCCGG
chr7:152161507-152161587     53  5.70e-05 GGAAGCTGCG GGGGCCGG GGTCGCACTA
chr22:19419266-19419346      25  5.70e-05 CCAAATCGCC CCGGCGGG AAACCGCTCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr7:148581772-148581852          1.9e-05  52_[+1]_20
chr22:19419957-19420037           1.9e-05  72_[+1]
chr7:152161507-152161587          5.7e-05  52_[+1]_20
chr22:19419266-19419346           5.7e-05  24_[+1]_48
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CGGGCSGG width=8 seqs=4
chr7:148581772-148581852 (   53) CGGGCGGG  1
chr22:19419957-19420037  (   73) CGGGCCGG  1
chr7:152161507-152161587 (   53) GGGGCCGG  1
chr22:19419266-19419346  (   25) CCGGCGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1679 bayes= 9.44829 E= 9.4e+002
  -865    166      9   -865
  -865      8    167   -865
  -865   -865    208   -865
  -865   -865    208   -865
  -865    207   -865   -865
  -865    108    109   -865
  -865   -865    208   -865
  -865   -865    208   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 4 E= 9.4e+002
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.250000  0.750000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CGGGCSGG MEME-1 regular expression
--------------------------------------------------------------------------------
[CG][GC]GGC[CG]GG
--------------------------------------------------------------------------------




Time  0.52 secs.

********************************************************************************


********************************************************************************
MOTIF GCTGGCRG MEME-2  width =   8  sites =   8  llr = 65  E-value = 5.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::4:
pos.-specific     C  191::61:
probability       G  9::aa34a
matrix            T  :19::11:

         bits    2.1    ** *
                 1.9    ** *
                 1.7    ** *
                 1.5 ***** *
Relative         1.3 ***** *
Entropy          1.0 ***** *
(11.7 bits)      0.8 ****** *
                 0.6 ****** *
                 0.4 ****** *
                 0.2 ********
                 0.0 --------

Multilevel           GCTGGCAG
consensus                 GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr22:19419266-19419346      73  2.18e-05 CGGCTCCTCC GCTGGCAG
chr3:75785723-75785803       68  2.18e-05 aatgtttgaa gctggcag aggtt
chr20:1447514-1447594        65  4.35e-05 GGGCCCAGCA GCTGGGAG CCCGCCCG
chr11:65273560-65273640      31  7.45e-05 AGCAAAAGAT GCTGGTGG TTGGCACTCC
chr7:148581772-148581852     14  1.17e-04 GGGACGAAGC GCCGGCGG CTCTTGGCGG
chr7:152161507-152161587     37  1.17e-04 GCGAGCCTGG GTTGGCGG AAGCTGCGGG
chr22:19419957-19420037      33  1.37e-04 ACGCCTGAGC GCTGGGCG CCTGGTCTGC
chr18:1573014-1573094         8  3.22e-04    ACAGCAT CCTGGCTG AAATTCCTGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr22:19419266-19419346           2.2e-05  72_[+2]
chr3:75785723-75785803            2.2e-05  67_[+2]_5
chr20:1447514-1447594             4.3e-05  64_[+2]_8
chr11:65273560-65273640           7.5e-05  30_[+2]_42
chr7:148581772-148581852          0.00012  13_[+2]_59
chr7:152161507-152161587          0.00012  36_[+2]_36
chr22:19419957-19420037           0.00014  32_[+2]_40
chr18:1573014-1573094             0.00032  7_[+2]_65
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GCTGGCRG width=8 seqs=8
chr22:19419266-19419346  (   73) GCTGGCAG  1
chr3:75785723-75785803   (   68) GCTGGCAG  1
chr20:1447514-1447594    (   65) GCTGGGAG  1
chr11:65273560-65273640  (   31) GCTGGTGG  1
chr7:148581772-148581852 (   14) GCCGGCGG  1
chr7:152161507-152161587 (   37) GTTGGCGG  1
chr22:19419957-19420037  (   33) GCTGGGCG  1
chr18:1573014-1573094    (    8) CCTGGCTG  1
//

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1679 bayes= 8.44622 E= 5.5e+002
  -965    -92    189   -965
  -965    188   -965    -99
  -965    -92   -965    182
  -965   -965    209   -965
  -965   -965    209   -965
  -965    140      9    -99
    42    -92     67    -99
  -965   -965    209   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 8 E= 5.5e+002
 0.000000  0.125000  0.875000  0.000000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.625000  0.250000  0.125000
 0.375000  0.125000  0.375000  0.125000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGGCRG MEME-2 regular expression
--------------------------------------------------------------------------------
GCTGG[CG][AG]G
--------------------------------------------------------------------------------




Time  0.91 secs.

********************************************************************************


********************************************************************************
MOTIF GGRCGGGG MEME-3  width =   8  sites =   2  llr = 22  E-value = 1.8e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  ::5:::::
pos.-specific     C  :::a::::
probability       G  aa5:aaaa
matrix            T  ::::::::

         bits    2.1 ** *****
                 1.9 ** *****
                 1.7 ** *****
                 1.5 ** *****
Relative         1.3 ** *****
Entropy          1.0 ********
(15.6 bits)      0.8 ********
                 0.6 ********
                 0.4 ********
                 0.2 ********
                 0.0 --------

Multilevel           GGACGGGG
consensus              G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr7:148581772-148581852     65  9.38e-06 GGCGGGGGAG GGGCGGGG CGGGTGCT
chr11:65273560-65273640      58  2.05e-05 CTGGTTTCCA GGACGGGG TTCAAATCCC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr7:148581772-148581852          9.4e-06  64_[+3]_8
chr11:65273560-65273640           2.1e-05  57_[+3]_15
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GGRCGGGG width=8 seqs=2
chr7:148581772-148581852 (   65) GGGCGGGG  1
chr11:65273560-65273640  (   58) GGACGGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1679 bayes= 9.71167 E= 1.8e+003
  -765   -765    208   -765
  -765   -765    208   -765
    84   -765    108   -765
  -765    207   -765   -765
  -765   -765    208   -765
  -765   -765    208   -765
  -765   -765    208   -765
  -765   -765    208   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.8e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GGRCGGGG MEME-3 regular expression
--------------------------------------------------------------------------------
GG[AG]CGGGG
--------------------------------------------------------------------------------




Time  1.29 secs.

********************************************************************************


********************************************************************************
MOTIF GCARGKGG MEME-4  width =   8  sites =   3  llr = 30  E-value = 7.8e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 Description
--------------------------------------------------------------------------------
Simplified        A  ::a3::::
pos.-specific     C  :a::::::
probability       G  a::7a3aa
matrix            T  :::::7::

         bits    2.1 **  * **
                 1.9 *** * **
                 1.7 *** * **
                 1.5 *** * **
Relative         1.3 *** * **
Entropy          1.0 ********
(14.5 bits)      0.8 ********
                 0.6 ********
                 0.4 ********
                 0.2 ********
                 0.0 --------

Multilevel           GCAGGTGG
consensus               A G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr1:189605462-189605542     37  1.18e-05 TTCCTAATAG GCAGGTGG GTGAAATATA
chr22:19419957-19420037      60  2.29e-05 CCGGACCTCA GCAGGGGG CACCGCGGGC
chr11:65272129-65272209      66  3.68e-05 AATTAAACTG GCAAGTGG AAATGTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr1:189605462-189605542          1.2e-05  36_[+4]_36
chr22:19419957-19420037           2.3e-05  59_[+4]_13
chr11:65272129-65272209           3.7e-05  65_[+4]_7
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GCARGKGG width=8 seqs=3
chr1:189605462-189605542 (   37) GCAGGTGG  1
chr22:19419957-19420037  (   60) GCAGGGGG  1
chr11:65272129-65272209  (   66) GCAAGTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1679 bayes= 9.57399 E= 7.8e+003
  -823   -823    208   -823
  -823    207   -823   -823
   184   -823   -823   -823
    25   -823    150   -823
  -823   -823    208   -823
  -823   -823     50    142
  -823   -823    208   -823
  -823   -823    208   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 3 E= 7.8e+003
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.333333  0.000000  0.666667  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCARGKGG MEME-4 regular expression
--------------------------------------------------------------------------------
GCA[GA]G[TG]GG
--------------------------------------------------------------------------------




Time  1.65 secs.

********************************************************************************


********************************************************************************
MOTIF TTGGCGSG MEME-5  width =   8  sites =   2  llr = 22  E-value = 4.0e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::
pos.-specific     C  ::::a:5:
probability       G  ::aa:a5a
matrix            T  aa::::::

         bits    2.1 ****** *
                 1.9 ****** *
                 1.7 ****** *
                 1.5 ****** *
Relative         1.3 ****** *
Entropy          1.0 ********
(15.5 bits)      0.8 ********
                 0.6 ********
                 0.4 ********
                 0.2 ********
                 0.0 --------

Multilevel           TTGGCGCG
consensus                  G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr7:148581772-148581852     25  2.10e-05 CCGGCGGCTC TTGGCGGG AACCGGCGCC
chr7:152161507-152161587     72  2.10e-05 GTCGCACTAC TTGGCGCG G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr7:148581772-148581852          2.1e-05  24_[+5]_48
chr7:152161507-152161587          2.1e-05  71_[+5]_1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF TTGGCGSG width=8 seqs=2
chr7:148581772-148581852 (   25) TTGGCGGG  1
chr7:152161507-152161587 (   72) TTGGCGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1679 bayes= 9.71167 E= 4.0e+003
  -765   -765   -765    200
  -765   -765   -765    200
  -765   -765    208   -765
  -765   -765    208   -765
  -765    207   -765   -765
  -765   -765    208   -765
  -765    107    108   -765
  -765   -765    208   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 4.0e+003
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TTGGCGSG MEME-5 regular expression
--------------------------------------------------------------------------------
TTGGCG[CG]G
--------------------------------------------------------------------------------




Time  2.01 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr3:75785723-75785803           5.30e-02  67_[+2(2.18e-05)]_5
chr7:32619626-32619706           5.68e-01  80
chr17:5185397-5185477            1.00e+00  80
chr2:179301490-179301570         7.31e-01  80
chr13:81242037-81242117          9.78e-01  80
chr11:65273560-65273640          3.83e-05  30_[+2(7.45e-05)]_19_[+3(2.05e-05)]_\
    15
chr13:91353986-91354066          1.00e+00  80
chr22:22404870-22404950          1.00e+00  80
chr22:19419266-19419346          6.79e-05  24_[+1(5.70e-05)]_40_[+2(2.18e-05)]
chr20:1447514-1447594            2.30e-03  64_[+2(4.35e-05)]_8
chr1:17231900-17231980           1.00e+00  80
chr22:19419957-19420037          9.24e-06  59_[+4(2.29e-05)]_5_[+1(1.90e-05)]
chr18:1573014-1573094            5.55e-01  80
chr1:189605462-189605542         4.76e-03  36_[+4(1.18e-05)]_36
chr2:178058255-178058335         1.00e+00  80
chr7:152161507-152161587         7.86e-07  54_[+3(3.99e-05)]_9_[+5(2.10e-05)]_\
    1
chr17:56756748-56756828          8.95e-01  80
chr1:78470007-78470087           1.00e+00  80
chr7:148581772-148581852         3.05e-08  24_[+5(2.10e-05)]_21_[+3(9.38e-06)]_\
    3_[+3(9.38e-06)]_8
chr13:91299548-91299628          1.00e+00  80
chr22:17296299-17296379          9.88e-01  80
chr2:172563497-172563577         9.90e-01  80
chr11:65272129-65272209          1.30e-01  65_[+4(3.68e-05)]_7
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (5) found.
********************************************************************************

CPU: c22n04.farnam.hpc.yale.internal

********************************************************************************