********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.3.3 (Release date: Sun Feb 7 15:39:52 2021 -0800)

For further information on how to interpret these results please access https://meme-suite.org/meme.
To get a copy of the MEME Suite software please access https://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= ../result/final_prediction/K562/fasta/RankLinear0.6_40/SRSF9.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name             Weight Length  Sequence name             Weight Length
-------------             ------ ------  -------------             ------ ------
chr22:22719466-22719546   1.0000     80  chr13:90678254-90678334   1.0000     80
chr8:9140756-9140836      1.0000     80  chr13:81416244-81416324   1.0000     80
chr8:9140955-9141035      1.0000     80  chr22:22719675-22719755   1.0000     80
chr13:81416513-81416593   1.0000     80  chr13:91177456-91177536   1.0000     80
chr7:82011683-82011763    1.0000     80  chr7:111289800-111289880  1.0000     80
chr17:45214242-45214322   1.0000     80  chr13:91087417-91087497   1.0000     80
chr22:17295244-17295324   1.0000     80  chr22:22352912-22352992   1.0000     80
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme -oc ../result/final_prediction/K562/inference_raw/MEME/RankLinear0.6_40_SRSF9/ -dna -nmotifs 5 -w 8 -maxsize 250000 -nostatus ../result/final_prediction/K562/fasta/RankLinear0.6_40/SRSF9.fasta

model:  mod=         zoops    nmotifs=         5    evt=           inf
objective function:
            em=       E-value of product of p-values
            starts=   E-value of product of p-values
strands: +
width:  minw=            8    maxw=            8
nsites: minsites=        2    maxsites=       14    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
trim:   wg=             11    ws=              1    endgaps=       yes
data:   n=            1120    N=              14
sample: seed=            0    hsfrac=          0
        searchsize=   1120    norand=         no    csites=       1000
Letter frequencies in dataset:
A 0.327 C 0.178 G 0.212 T 0.283
Background letter frequencies (from file dataset with add-one prior applied):
A 0.327 C 0.178 G 0.213 T 0.283
Background model order: 0
********************************************************************************


********************************************************************************
MOTIF CWRCCTGG MEME-1  width = 8  sites = 3  llr = 32  E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  :37:::::
pos.-specific     C  a::aa:::
probability       G  ::3:::aa
matrix            T  :7:::a::

         bits    2.5 *  **
                 2.2 *  ** **
                 2.0 *  ** **
                 1.7 *  *****
Relative         1.5 *  *****
Entropy          1.2 *  *****
(15.5 bits)      1.0 * ******
                 0.7 ********
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           CTACCTGG
consensus             AG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr13:91087417-91087497      32  6.70e-06 tgtgtcatgg ctacctgg tacaaagtgt
chr13:90678254-90678334      69  6.70e-06 actgtgctca ctacctgg caga
chr22:22719466-22719546      12  2.39e-05 agttcaagaa cagcctgg gcaacaaagc
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr13:91087417-91087497           6.7e-06  31_[+1]_41
chr13:90678254-90678334           6.7e-06  68_[+1]_4
chr22:22719466-22719546           2.4e-05  11_[+1]_61
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CWRCCTGG width=8 seqs=3
chr13:91087417-91087497  (   32) CTACCTGG  1
chr13:90678254-90678334  (   69) CTACCTGG  1
chr22:22719466-22719546  (   12) CAGCCTGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1022 bayes= 8.85657 E= 2.4e+002
  -823    249   -823   -823
     3   -823   -823    123
   103   -823     65   -823
  -823    249   -823   -823
  -823    249   -823   -823
  -823   -823   -823    182
  -823   -823    223   -823
  -823   -823    223   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 3 E= 2.4e+002
 0.000000  1.000000  0.000000  0.000000
 0.333333  0.000000  0.000000  0.666667
 0.666667  0.000000  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CWRCCTGG MEME-1 regular expression
--------------------------------------------------------------------------------
C[TA][AG]CCTGG
--------------------------------------------------------------------------------

Time  0.36 secs.

********************************************************************************


********************************************************************************
MOTIF CSCTGTCT MEME-2  width = 8  sites = 2  llr = 23  E-value = 5.0e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::
pos.-specific     C  a5a:::a:
probability       G  :5::a:::
matrix            T  :::a:a:a

         bits    2.5 * *   *
                 2.2 * * * *
                 2.0 * * * *
                 1.7 * ******
Relative         1.5 * ******
Entropy          1.2 ********
(16.5 bits)      1.0 ********
                 0.7 ********
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           CCCTGTCT
consensus             G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr22:22719466-22719546      34  4.85e-06 caaagcaagg ccctgtct ctacaaaaag
chr7:111289800-111289880     51  1.06e-05 CTCAATCCTA CGCTGTCT TTCAAAAAGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr22:22719466-22719546           4.8e-06  33_[+2]_39
chr7:111289800-111289880          1.1e-05  50_[+2]_22
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CSCTGTCT width=8 seqs=2
chr22:22719466-22719546  (   34) CCCTGTCT  1
chr7:111289800-111289880 (   51) CGCTGTCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1022 bayes= 8.99435 E= 5.0e+002
  -765    248   -765   -765
  -765    149    123   -765
  -765    248   -765   -765
  -765   -765   -765    182
  -765   -765    223   -765
  -765   -765   -765    182
  -765    248   -765   -765
  -765   -765   -765    182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 5.0e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CSCTGTCT MEME-2 regular expression
--------------------------------------------------------------------------------
C[CG]CTGTCT
--------------------------------------------------------------------------------

Time  0.67 secs.

********************************************************************************


********************************************************************************
MOTIF TWCTGCAG MEME-3  width = 8  sites = 3  llr = 32  E-value = 2.8e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  :7::::a:
pos.-specific     C  ::a::a::
probability       G  ::::a::a
matrix            T  a3:a::::

         bits    2.5   *  *
                 2.2   * ** *
                 2.0   * ** *
                 1.7 * **** *
Relative         1.5 * ******
Entropy          1.2 * ******
(15.5 bits)      1.0 * ******
                 0.7 ********
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           TACTGCAG
consensus             T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr22:17295244-17295324      49  1.23e-05 acactatctt tactgcag ctttgtggta
chr8:9140756-9140836         24  1.23e-05 CACTCATTCA TACTGCAG TACGCAAttc
chr13:81416513-81416593      60  2.30e-05 AGGAACAGCA TTCTGCAG TCAGTGATGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr22:17295244-17295324           1.2e-05  48_[+3]_24
chr8:9140756-9140836              1.2e-05  23_[+3]_49
chr13:81416513-81416593           2.3e-05  59_[+3]_13
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF TWCTGCAG width=8 seqs=3
chr22:17295244-17295324  (   49) TACTGCAG  1
chr8:9140756-9140836     (   24) TACTGCAG  1
chr13:81416513-81416593  (   60) TTCTGCAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1022 bayes= 8.85657 E= 2.8e+002
  -823   -823   -823    182
   103   -823   -823     24
  -823    249   -823   -823
  -823   -823   -823    182
  -823   -823    223   -823
  -823    249   -823   -823
   161   -823   -823   -823
  -823   -823    223   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 3 E= 2.8e+002
 0.000000  0.000000  0.000000  1.000000
 0.666667  0.000000  0.000000  0.333333
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TWCTGCAG MEME-3 regular expression
--------------------------------------------------------------------------------
T[AT]CTGCAG
--------------------------------------------------------------------------------

Time  0.97 secs.

********************************************************************************


********************************************************************************
MOTIF CTTGAGCA MEME-4  width = 8  sites = 2  llr = 23  E-value = 1.0e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 Description
--------------------------------------------------------------------------------
Simplified        A  ::::a::a
pos.-specific     C  a:::::a:
probability       G  :::a:a::
matrix            T  :aa:::::

         bits    2.5 *     *
                 2.2 *  * **
                 2.0 *  * **
                 1.7 **** **
Relative         1.5 ********
Entropy          1.2 ********
(16.3 bits)      1.0 ********
                 0.7 ********
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           CTTGAGCA
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr7:82011683-82011763       49  1.23e-05 TTGTGACTTG CTTGAGCA TGTTTCCCAG
chr22:22719675-22719755      32  1.23e-05 catttcaata cttgagca aataggaaac
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr7:82011683-82011763            1.2e-05  48_[+4]_24
chr22:22719675-22719755           1.2e-05  31_[+4]_41
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CTTGAGCA width=8 seqs=2
chr7:82011683-82011763   (   49) CTTGAGCA  1
chr22:22719675-22719755  (   32) CTTGAGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1022 bayes= 8.99435 E= 1.0e+003
  -765    248   -765   -765
  -765   -765   -765    182
  -765   -765   -765    182
  -765   -765    223   -765
   161   -765   -765   -765
  -765   -765    223   -765
  -765    248   -765   -765
   161   -765   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.0e+003
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CTTGAGCA MEME-4 regular expression
--------------------------------------------------------------------------------
CTTGAGCA
--------------------------------------------------------------------------------

Time  1.28 secs.

********************************************************************************


********************************************************************************
MOTIF CACACTAT MEME-5  width = 8  sites = 2  llr = 22  E-value = 1.3e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 Description
--------------------------------------------------------------------------------
Simplified        A  :a:a::a:
pos.-specific     C  a:a:a:::
probability       G  ::::::::
matrix            T  :::::a:a

         bits    2.5 * * *
                 2.2 * * *
                 2.0 * * *
                 1.7 * * ** *
Relative         1.5 ********
Entropy          1.2 ********
(16.0 bits)      1.0 ********
                 0.7 ********
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           CACACTAT
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
chr22:22352912-22352992      50  1.58e-05 AACAAGCATG CACACTAT CAACATTGTT
chr22:17295244-17295324      38  1.58e-05 gtgtcagtac cacactat ctttactgca
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr22:22352912-22352992           1.6e-05  49_[+5]_23
chr22:17295244-17295324           1.6e-05  37_[+5]_35
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CACACTAT width=8 seqs=2
chr22:22352912-22352992  (   50) CACACTAT  1
chr22:17295244-17295324  (   38) CACACTAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1022 bayes= 8.99435 E= 1.3e+003
  -765    248   -765   -765
   161   -765   -765   -765
  -765    248   -765   -765
   161   -765   -765   -765
  -765    248   -765   -765
  -765   -765   -765    182
   161   -765   -765   -765
  -765   -765   -765    182
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.3e+003
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CACACTAT MEME-5 regular expression
--------------------------------------------------------------------------------
CACACTAT
--------------------------------------------------------------------------------

Time  1.59 secs.
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr22:22719466-22719546          1.17e-04  11_[+1(2.39e-05)]_14_[+2(4.85e-06)]_\
    39
chr13:90678254-90678334          5.63e-03  68_[+1(6.70e-06)]_4
chr8:9140756-9140836             4.31e-02  23_[+3(1.23e-05)]_49
chr13:81416244-81416324          9.72e-01  80
chr8:9140955-9141035             7.07e-01  80
chr22:22719675-22719755          3.56e-02  31_[+4(1.23e-05)]_41
chr13:81416513-81416593          9.76e-02  59_[+3(2.30e-05)]_13
chr13:91177456-91177536          8.93e-01  80
chr7:82011683-82011763           9.51e-03  48_[+4(1.23e-05)]_24
chr7:111289800-111289880         2.21e-02  50_[+2(1.06e-05)]_22
chr17:45214242-45214322          3.20e-01  80
chr13:91087417-91087497          1.50e-02  31_[+1(6.70e-06)]_41
chr22:17295244-17295324          5.13e-05  37_[+5(1.58e-05)]_3_[+3(1.23e-05)]_\
    24
chr22:22352912-22352992          2.45e-02  49_[+5(1.58e-05)]_23
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (5) found.
********************************************************************************

CPU: c17n10.farnam.hpc.yale.internal

********************************************************************************