At a recognition level of approximately 50-55% of true promoter regions, the TSSG program gives about one false positive prediction per 5000 bp. This accuracy is similar to that obtained on the test sequences with Prestridge's method. We estimated the accuracy of locating the transcription start site (TSS) on 10 test genes for which both algorithms (ours and Prestridge's) found the promoter region:
Deviation of predicted TSS from the real TSS:
_____________________________________________________________________
Method / deviation      | 5 b   | 50 b  | 150 b | mean of observed
                        |       |       |       | deviations
________________________|_______|_______|_______|_____________________
Prestridge's            | 0     | 3     | 7     | 81.2 bases
________________________|_______|_______|_______|_____________________
TSSG                    | 7     | 3     | 0     | 7.3 bases
________________________|_______|_______|_______|_____________________
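
As a rough illustration of what the quoted false positive rate means for a long input sequence, the arithmetic can be sketched as below. The function name and its default argument are hypothetical and are not part of TSSG. The sketch also assumes that the rate of about one false positive per 5000 bp applies uniformly along the sequence, which holds only approximately and only at the 50-55% recognition level stated above.

```python
def expected_false_positives(length_bp: int,
                             bp_per_false_positive: float = 5000.0) -> float:
    """Rough expected number of false positive promoter predictions.

    Assumes a uniform rate of one false positive per `bp_per_false_positive`
    base pairs (the figure quoted in the text); this is an approximation.
    """
    return length_bp / bp_per_false_positive


# A 40000 bp sequence would be expected to carry about 8 false positives.
print(expected_false_positives(40000))  # 8.0
```

The estimate is useful mainly as a reminder that, in a sequence of several tens of kilobases, a number of the reported promoters are likely to be false positives.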
Name: Seq_name:all
First three lines of sequence:
TTANGATTCGTTTCCATGGAGCTGCCCATGACCATTTACACCATATACATACTGTCTCTGAGCAGAGATACGACA
CTCAGGCTGGTGATAAGGGAACACAGCTGTCAGGGGGCCAGAAGCAGCGTGTCGCCATAGCCCGAGCCATCATCC
GCAACCCCAAACTGTTGCTCCTGGACGAGGCCACGTCTGCGCTCGACACTGAGAGTGAGAAGGTGAGACTTTATT
tssg Thu Jun 3 20:39:13 CDT 1999
>Seq_name:all
Length of sequence- 39951
Threshold for LDF- 4.00
8 promoter(s) were predicted
Pos.: 20076 LDF- 9.94 TATA box predicted at 20044
Pos.: 25269 LDF- 9.57
Pos.: 29344 LDF- 9.32 TATA box predicted at 29314
Pos.: 29652 LDF- 7.93 TATA box predicted at 29637
Pos.: 37397 LDF- 6.36
Pos.: 33041 LDF- 6.16 TATA box predicted at 33010
Pos.: 17708 LDF- 4.56 TATA box predicted at 17678
Pos.: 21328 LDF- 4.46 TATA box predicted at 21310
Transcription factor binding sites:
for promoter at position - 20076
20076 (+) S01069 ACCNNNNNNGGT
19910 (-) S01027 ACGCCC
20063 (-) S00922 AGAGG
20057 (+) S01554 ANCCTCTCY
19830 (+) S00880 ATTGG
19933 (-) S01904 CACCTG
20014 (+) S00089 CANYYY
20055 (+) S00089 CANYYY
19897 (-) S00089 CANYYY
19833 (-) S00089 CANYYY
19815 (-) S01616 CATTW
19805 (-) S01616 CATTW
19834 (-) S00633 CCAAT
19999 (+) S01187 CCCCGCCC
20000 (+) S00801 CCCGCC
19999 (+) S01936 CCCMNSSS
19866 (-) S00245 CCGAAAC
20001 (+) S00802 CCGCCC
19888 (-) S00489 CGTCA
19792 (+) S01622 CWKKANNY
20016 (-) S01622 CWKKANNY
19860 (-) S01622 CWKKANNY
19815 (-) S01622 CWKKANNY
20064 (-) S00090 GAGAGGA
19835 (-) S01089 GCCAA
19998 (+) S00216 GCCCCGCC
19836 (-) S01738 GGCCAAT
19833 (+) S00437 GGCCG
20005 (-) S00781 GGCGGG
20006 (-) S00978 GGGCGG
20002 (-) S00974 GGGGC
20006 (-) S01193 GGGNGGRR
20025 (+) S01998 GRGRTTKCAY
20025 (+) S00159 GRGRTTYCAY
20007 (-) S00064 KGGGCGGRRY
20007 (-) S01542 KRGGCGKRRY
19848 (+) S02023 MAMAG
20068 (-) S02023 MAMAG
19976 (-) S02023 MAMAG
19862 (-) S02023 MAMAG
19927 (+) S01950 RCAGNTG
19982 (+) S01190 RYYWSGTG
20004 (-) S01964 SCGSSSC
19802 (-) S00435 TACAAA
19822 (+) S01424 TGANTMA
19828 (-) S01424 TGANTMA
20007 (-) S01375 TGGGC
20007 (-) S00323 TGGGCGGGGC
19944 (+) S00250 TGRMCC
19950 (+) S00250 TGRMCC
19956 (+) S00250 TGRMCC
19851 (-) S02000 TKNNGNAAK
19795 (+) S01974 TRTTTGY
20013 (-) S01974 TRTTTGY
19944 (+) S01629 WGNAMCYK
19950 (+) S01629 WGNAMCYK
19956 (+) S01629 WGNAMCYK
20033 (-) S01629 WGNAMCYK
19999 (+) S01081 YYCCGCCC
for promoter at position - 25269
25130 (+) S01152 AAGTGA
25130 (+) S01153 AARKGA
25142 (-) S01027 ACGCCC
25082 (+) S01249 ACGTMAC
25218 (-) S00922 AGAGG
25021 (-) S00922 AGAGG
24971 (-) S00922 AGAGG
25161 (-) S00392 AGGAAG
25204 (-) S00536 CAGCTGGC
25204 (-) S02128 CAGNTGGC
25147 (+) S00089 CANYYY
25181 (-) S00089 CANYYY
25134 (-) S00089 CANYYY
25050 (-) S00089 CANYYY
25115 (+) S01616 CATTW
25198 (+) S02113 CCAGCTG
25164 (-) S01003 CCCAG
25256 (+) S01187 CCCCGCCC
25257 (+) S00801 CCCGCC
25257 (+) S00256 CCCKCCCWCCT
25256 (+) S01936 CCCMNSSS
25258 (+) S00802 CCGCCC
25061 (+) S00040 CCTGC
25249 (+) S00489 CGTCA
25140 (+) S00753 CGTGAC
25210 (+) S00794 CTTTCC
25005 (+) S01622 CWKKANNY
25088 (+) S01622 CWKKANNY
25115 (+) S01622 CWKKANNY
25255 (-) S01622 CWKKANNY
25038 (-) S01622 CWKKANNY
25002 (+) S00038 GAACAG
25217 (-) S01502 GAGGAA
25168 (-) S00973 GAGGC
25168 (-) S02135 GAGGCC
25175 (-) S00539 GATGGCCG
25242 (-) S00741 GATTTC
25026 (-) S01089 GCCAA
24975 (-) S01089 GCCAA
25255 (+) S00216 GCCCCGCC
25038 (+) S00437 GGCCG
25172 (-) S00437 GGCCG
25262 (-) S00781 GGCGGG
25263 (-) S00978 GGGCGG
25259 (-) S00974 GGGGC
25263 (-) S01193 GGGNGGRR
25088 (-) S00399 GTKACGT
25088 (-) S00104 GTKACGW
25003 (+) S02023 MAMAG
25065 (+) S02023 MAMAG
25086 (+) S02023 MAMAG
25128 (+) S02023 MAMAG
25192 (+) S02023 MAMAG
25233 (+) S02023 MAMAG
25024 (-) S02023 MAMAG
25205 (-) S01950 RCAGNTG
25261 (-) S01964 SCGSSSC
25141 (+) S00143 STGACTMA
25035 (-) S00484 TATCTC
25142 (+) S01426 TGACTCA
25142 (+) S01424 TGANTMA
25148 (-) S01424 TGANTMA
25142 (+) S01935 TGASTMA
25148 (-) S01935 TGASTMA
24972 (+) S02137 TGGCA
25161 (+) S01375 TGGGC
25161 (+) S00044 TGGNNNNNNGCCA
25225 (+) S00864 TGTCCT
25148 (-) S01595 TKAGTCA
25006 (+) S01773 XGGAYGT
25015 (-) S01773 XGGAYGT
25241 (+) S00346 YCSCCMNSSS
25256 (+) S01081 YYCCGCCC
for promoter at position - 29344
29046 (-) S00192 AATAAAT
29344 (+) S00880 ATTGG
29272 (+) S00089 CANYYY
29166 (-) S00089 CANYYY
29140 (-) S00089 CANYYY
29085 (+) S01003 CCCAG
29281 (+) S01003 CCCAG
29325 (+) S00801 CCCGCC
29326 (+) S00802 CCGCCC
29268 (-) S00057 CCTGAWWA
29079 (+) S01622 CWKKANNY
29144 (+) S01622 CWKKANNY
29235 (+) S01622 CWKKANNY
29275 (+) S01622 CWKKANNY
29137 (-) S01622 CWKKANNY
29343 (+) S00780 GATTGG
29330 (-) S00781 GGCGGG
29331 (-) S00978 GGGCGG
29332 (-) S00974 GGGGC
29332 (-) S00979 GGGGCGGG
29332 (-) S00331 GGGGCGGGAC
29331 (-) S01193 GGGNGGRR
29332 (-) S00064 KGGGCGGRRY
29332 (-) S01542 KRGGCGKRRY
29343 (-) S02023 MAMAG
29326 (+) S01964 SCGSSSC
29334 (-) S01964 SCGSSSC
29131 (+) S00087 TATAAA
29129 (+) S01540 TATAWAW
29072 (+) S00483 TATCTT
29088 (-) S01375 TGGGC
29286 (+) S00250 TGRMCC
29145 (+) S01052 TTTAAA
29315 (+) S01052 TTTAAA
29320 (-) S01052 TTTAAA
29150 (-) S01052 TTTAAA
29286 (-) S02121 WCTGG
29090 (-) S02121 WCTGG
29184 (+) S00487 WCTRG
29286 (-) S00487 WCTRG
29090 (-) S00487 WCTRG
29266 (-) S00381 WGATAR
29273 (-) S01629 WGNAMCYK
29324 (+) S01081 YYCCGCCC
for promoter at position - 29652
29536 (+) S01153 AARKGA
29537 (+) S01090 AATGA
29590 (+) S01027 ACGCCC
29553 (-) S00392 AGGAAG
29368 (+) S00880 ATTGG
29562 (+) S00880 ATTGG
29387 (+) S00089 CANYYY
29473 (+) S00089 CANYYY
29500 (+) S00089 CANYYY
29525 (+) S00089 CANYYY
29600 (+) S00089 CANYYY
29614 (+) S00089 CANYYY
29540 (-) S00089 CANYYY
29420 (-) S00089 CANYYY
29540 (-) S01616 CATTW
29482 (-) S01616 CATTW
29566 (-) S00633 CCAAT
29372 (-) S00633 CCAAT
29593 (+) S01187 CCCCGCCC
29582 (+) S00801 CCCGCC
29594 (+) S00801 CCCGCC
29586 (+) S01936 CCCMNSSS
29587 (+) S01936 CCCMNSSS
29593 (+) S01936 CCCMNSSS
29583 (+) S00802 CCGCCC
29595 (+) S00802 CCGCCC
29357 (+) S00040 CCTGC
29607 (-) S01954 CGGAAGTG
29402 (+) S00489 CGTCA
29524 (-) S00794 CTTTCC
29576 (+) S01622 CWKKANNY
29622 (-) S01622 CWKKANNY
29482 (-) S01622 CWKKANNY
29547 (-) S01502 GAGGAA
29367 (+) S00780 GATTGG
29592 (+) S00216 GCCCCGCC
29599 (-) S00781 GGCGGG
29587 (-) S00781 GGCGGG
29600 (-) S00978 GGGCGG
29588 (-) S00978 GGGCGG
29596 (-) S00974 GGGGC
29589 (-) S00974 GGGGC
29589 (-) S00979 GGGGCGGG
29600 (-) S01193 GGGNGGRR
29601 (-) S00064 KGGGCGGRRY
29601 (-) S01542 KRGGCGKRRY
29400 (+) S00144 KWCGTCA
29533 (-) S02023 MAMAG
29506 (-) S02023 MAMAG
29445 (-) S02023 MAMAG
29414 (-) S02023 MAMAG
29609 (-) S01770 RNMGGAWGT
29583 (+) S01964 SCGSSSC
29598 (-) S01964 SCGSSSC
29479 (-) S00972 TAGGC
29449 (-) S00087 TATAAA
29406 (-) S01418 TGACGACA
29553 (+) S00869 TGACTTCT
29632 (-) S02137 TGGCA
29601 (-) S01375 TGGGC
29601 (-) S00323 TGGGCGGGGC
29610 (-) S02000 TKNNGNAAK
29373 (+) S02121 WCTGG
29373 (+) S00487 WCTRG
29523 (+) S01629 WGNAMCYK
29474 (-) S02003 YGTCAGC
29593 (+) S01081 YYCCGCCC
for promoter at position - 37397
37252 (-) S00922 AGAGG
37152 (-) S00922 AGAGG
37228 (+) S00392 AGGAAG
37346 (+) S01905 CACGTG
37351 (-) S01905 CACGTG
37116 (+) S00089 CANYYY
37158 (+) S00089 CANYYY
37209 (+) S00089 CANYYY
37245 (+) S00089 CANYYY
37254 (+) S00089 CANYYY
37396 (+) S00089 CANYYY
37378 (-) S00089 CANYYY
37285 (-) S00089 CANYYY
37236 (-) S00089 CANYYY
37241 (+) S00243 CCACCA
37392 (+) S00243 CCACCA
37334 (-) S00956 CCCCCGCCCC
37333 (-) S01187 CCCCGCCC
37332 (-) S00801 CCCGCC
37327 (-) S00801 CCCGCC
37343 (-) S01936 CCCMNSSS
37342 (-) S01936 CCCMNSSS
37341 (-) S01936 CCCMNSSS
37340 (-) S01936 CCCMNSSS
37339 (-) S01936 CCCMNSSS
37338 (-) S01936 CCCMNSSS
37337 (-) S01936 CCCMNSSS
37336 (-) S01936 CCCMNSSS
37335 (-) S01936 CCCMNSSS
37334 (-) S01936 CCCMNSSS
37333 (-) S01936 CCCMNSSS
37304 (-) S01936 CCCMNSSS
37331 (-) S00802 CCGCCC
37120 (+) S00040 CCTGC
37323 (-) S00040 CCTGC
37319 (-) S00040 CCTGC
37315 (-) S00040 CCTGC
37227 (+) S01622 CWKKANNY
37236 (+) S00741 GATTTC
37329 (-) S00216 GCCCCGCC
37137 (-) S00437 GGCCG
37322 (+) S00781 GGCGGG
37327 (+) S00781 GGCGGG
37326 (+) S00978 GGGCGG
37189 (+) S00974 GGGGC
37301 (+) S00974 GGGGC
37325 (+) S00974 GGGGC
37340 (+) S00974 GGGGC
37325 (+) S00979 GGGGCGGG
37325 (+) S00326 GGGGCGGGGG
37326 (+) S01193 GGGNGGRR
37330 (+) S01193 GGGNGGRR
37331 (+) S01193 GGGNGGRR
37332 (+) S01193 GGGNGGRR
37333 (+) S01193 GGGNGGRR
37334 (+) S01193 GGGNGGRR
37335 (+) S01193 GGGNGGRR
37336 (+) S01193 GGGNGGRR
37196 (-) S00608 GTCGCC
37244 (-) S00839 GTGGAAA
37225 (+) S02023 MAMAG
37269 (+) S02023 MAMAG
37274 (+) S02023 MAMAG
37226 (+) S01770 RNMGGAWGT
37227 (+) S02024 SAGGAAGY
37187 (+) S01964 SCGSSSC
37299 (+) S01964 SCGSSSC
37323 (+) S01964 SCGSSSC
37331 (-) S01964 SCGSSSC
37309 (-) S01964 SCGSSSC
37302 (-) S01964 SCGSSSC
37219 (+) S00079 TGCRCNC
37219 (+) S01987 TGCRCRC
37284 (+) S02137 TGGCA
37106 (+) S01375 TGGGC
37394 (-) S01375 TGGGC
37159 (-) S00250 TGRMCC
37257 (+) S02121 WCTGG
37150 (+) S00487 WCTRG
37257 (+) S00487 WCTRG
37361 (+) S01773 XGGAYGT
37343 (-) S00346 YCSCCMNSSS
37342 (-) S00346 YCSCCMNSSS
37341 (-) S00346 YCSCCMNSSS
37340 (-) S00346 YCSCCMNSSS
37339 (-) S00346 YCSCCMNSSS
37338 (-) S00346 YCSCCMNSSS
37337 (-) S00346 YCSCCMNSSS
37336 (-) S00346 YCSCCMNSSS
37335 (-) S00346 YCSCCMNSSS
37331 (-) S00346 YCSCCMNSSS
37365 (+) S02003 YGTCAGC
37333 (-) S01081 YYCCGCCC
for promoter at position - 33041
32923 (+) S00922 AGAGG
33030 (+) S00922 AGAGG
33040 (+) S00922 AGAGG
32872 (-) S00922 AGAGG
32958 (-) S01946 ANATGG
32949 (-) S00908 CAACCAC
32840 (-) S01904 CACCTG
32793 (+) S00089 CANYYY
32954 (+) S00089 CANYYY
32969 (-) S00089 CANYYY
32931 (-) S00089 CANYYY
32859 (-) S00089 CANYYY
32750 (-) S00089 CANYYY
32954 (+) S01616 CATTW
32970 (+) S00243 CCACCA
32875 (-) S01003 CCCAG
32826 (+) S00040 CCTGC
32830 (+) S00040 CCTGC
33024 (-) S00489 CGTCA
32976 (+) S01622 CWKKANNY
32774 (-) S01622 CWKKANNY
33031 (+) S00973 GAGGC
32871 (-) S00973 GAGGC
32844 (-) S00973 GAGGC
33031 (+) S02135 GAGGCC
33008 (-) S00437 GGCCG
32873 (+) S00974 GGGGC
33026 (-) S00144 KWCGTCA
32880 (+) S02023 MAMAG
32963 (+) S02023 MAMAG
33027 (+) S02023 MAMAG
32833 (+) S01190 RYYWSGTG
33011 (+) S00087 TATAAA
33011 (+) S00615 TATAAAA
33011 (+) S01540 TATAWAW
33000 (-) S00483 TATCTT
32972 (-) S02137 TGGCA
32836 (-) S02137 TGGCA
32870 (+) S02121 WCTGG
32975 (+) S02121 WCTGG
33039 (-) S02121 WCTGG
33011 (-) S02121 WCTGG
32856 (-) S02121 WCTGG
32870 (+) S00487 WCTRG
32975 (+) S00487 WCTRG
33039 (-) S00487 WCTRG
33011 (-) S00487 WCTRG
32856 (-) S00487 WCTRG
32758 (-) S00487 WCTRG
33026 (-) S02101 WTCGTCA
32926 (+) S01773 XGGAYGT
32977 (+) S01773 XGGAYGT
32827 (-) S01773 XGGAYGT
for promoter at position - 17708
17684 (+) S01153 AARKGA
17634 (-) S01153 AARKGA
17633 (-) S01090 AATGA
17572 (+) S00922 AGAGG
17644 (-) S00922 AGAGG
17590 (+) S00392 AGGAAG
17425 (+) S00395 CACGCW
17501 (-) S00395 CACGCW
17620 (+) S00089 CANYYY
17638 (+) S00089 CANYYY
17666 (-) S00089 CANYYY
17601 (-) S00089 CANYYY
17445 (-) S00089 CANYYY
17409 (-) S00089 CANYYY
17630 (+) S01616 CATTW
17554 (+) S00243 CCACCA
17529 (-) S00243 CCACCA
17561 (+) S02113 CCAGCTG
17447 (-) S01003 CCCAG
17673 (+) S00753 CGTGAC
17649 (-) S00481 CTATCA
17655 (-) S01622 CWKKANNY
17589 (+) S01502 GAGGAA
17688 (+) S00973 GAGGC
17688 (+) S02135 GAGGCC
17451 (+) S00539 GATGGCCG
17454 (+) S00437 GGCCG
17601 (+) S01445 GTGAGTCAG
17437 (+) S02023 MAMAG
17547 (+) S02023 MAMAG
17660 (+) S02023 MAMAG
17683 (+) S02023 MAMAG
17697 (-) S02023 MAMAG
17603 (-) S02023 MAMAG
17544 (-) S02023 MAMAG
17525 (-) S02023 MAMAG
17568 (-) S01950 RCAGNTG
17472 (-) S01950 RCAGNTG
17609 (-) S00143 STGACTMA
17658 (+) S00435 TACAAA
17634 (+) S00972 TAGGC
17608 (-) S01426 TGACTCA
17602 (+) S00476 TGAGTCAG
17602 (+) S01424 TGANTMA
17608 (-) S01424 TGANTMA
17602 (+) S01935 TGASTMA
17608 (-) S01935 TGASTMA
17543 (+) S02137 TGGCA
17416 (-) S02137 TGGCA
17418 (+) S01037 TGTTCT
17602 (+) S01595 TKAGTCA
17442 (+) S02121 WCTGG
17418 (-) S02121 WCTGG
17442 (+) S00487 WCTRG
17418 (-) S00487 WCTRG
17644 (+) S00381 WGATAR
17627 (-) S01629 WGNAMCYK
for promoter at position - 21328
21100 (+) S01153 AARKGA
21314 (+) S01153 AARKGA
21168 (-) S01153 AARKGA
21039 (+) S01090 AATGA
21253 (+) S01090 AATGA
21315 (+) S01090 AATGA
21126 (-) S01090 AATGA
21257 (+) S00534 ACGTCA
21260 (-) S00534 ACGTCA
21260 (-) S01257 ACGTCAT
21146 (-) S00922 AGAGG
21306 (+) S00880 ATTGG
21049 (+) S00014 CACACACACA
21149 (+) S00395 CACGCW
21083 (+) S00089 CANYYY
21091 (+) S00089 CANYYY
21179 (+) S00089 CANYYY
21196 (+) S00089 CANYYY
21275 (+) S00089 CANYYY
21247 (-) S00089 CANYYY
21123 (+) S01616 CATTW
21318 (-) S01616 CATTW
21256 (-) S01616 CATTW
21226 (-) S01051 CCAAGT
21310 (-) S00633 CCAAT
21075 (-) S00345 CCCCCGGC
21075 (-) S01936 CCCMNSSS
21258 (+) S00489 CGTCA
21259 (-) S00489 CGTCA
21294 (+) S00252 CTGATTA
21123 (+) S01622 CWKKANNY
21182 (+) S01622 CWKKANNY
21159 (-) S01622 CWKKANNY
21112 (-) S01622 CWKKANNY
21159 (+) S00741 GATTTC
21256 (+) S00144 KWCGTCA
21261 (-) S00144 KWCGTCA
21328 (-) S02023 MAMAG
21237 (-) S02023 MAMAG
21169 (-) S02023 MAMAG
21254 (+) S00559 NTGACGTCAN
21263 (-) S00559 NTGACGTCAN
21254 (+) S00153 RTGACGT
21273 (-) S01190 RYYWSGTG
21091 (+) S01205 SWATWWAG
21311 (+) S00435 TACAAA
21106 (+) S00087 TATAAA
21255 (+) S01059 TGACGT
21262 (-) S01059 TGACGT
21255 (+) S00969 TGACGTC
21262 (-) S00969 TGACGTC
21255 (+) S00072 TGACGTCA
21262 (-) S00072 TGACGTCA
21255 (+) S02107 TGACGTYW
21262 (-) S02107 TGACGTYW
21255 (+) S01940 TGACGYMR
21262 (-) S01940 TGACGYMR
21041 (+) S01424 TGANTMA
21295 (+) S01424 TGANTMA
21047 (-) S01424 TGANTMA
21293 (-) S01037 TGTTCT
21094 (+) S02000 TKNNGNAAK
21240 (+) S00563 TNNAKYNNKNNMTNATGA
21278 (+) S00487 WCTRG
21202 (+) S00381 WGATAR
21319 (+) S01629 WGNAMCYK
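
The output above is regular enough to be read by a small script. The following is a minimal sketch of such a parser, assuming the line layout is exactly as in the example shown (a "Pos.:" line per promoter, then a "for promoter at position -" header followed by one site per line). The names `Promoter`, `Site` and `parse_tssg` are hypothetical helpers introduced for illustration and are not part of the TSSG distribution.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class Promoter(NamedTuple):
    pos: int              # predicted TSS position
    ldf: float            # linear discriminant function score
    tata: Optional[int]   # TATA box position, or None if none was reported


class Site(NamedTuple):
    pos: int
    strand: str           # "+" or "-"
    accession: str        # site identifier, e.g. "S01069"
    motif: str            # consensus as printed, e.g. "CANYYY"


PROMOTER_RE = re.compile(
    r"Pos\.:\s*(\d+)\s+LDF-\s*([\d.]+)(?:\s+TATA box predicted at\s+(\d+))?"
)
SECTION_RE = re.compile(r"for promoter at position\s*-\s*(\d+)")
SITE_RE = re.compile(r"(\d+)\s+\(([+-])\)\s+(\S+)\s+(\S+)")


def parse_tssg(text: str) -> Tuple[List[Promoter], Dict[int, List[Site]]]:
    """Collect promoter lines and the binding sites listed per promoter."""
    promoters: List[Promoter] = []
    sites: Dict[int, List[Site]] = {}
    current: Optional[int] = None
    for raw in text.splitlines():
        line = raw.strip()
        m = PROMOTER_RE.match(line)
        if m:
            tata = int(m.group(3)) if m.group(3) else None
            promoters.append(Promoter(int(m.group(1)), float(m.group(2)), tata))
            continue
        m = SECTION_RE.match(line)
        if m:
            current = int(m.group(1))
            sites[current] = []
            continue
        m = SITE_RE.match(line)
        if m and current is not None:
            sites[current].append(
                Site(int(m.group(1)), m.group(2), m.group(3), m.group(4))
            )
    return promoters, sites


# A few lines copied from the example output above.
sample = """\
Pos.: 20076 LDF- 9.94 TATA box predicted at 20044
Pos.: 25269 LDF- 9.57
Transcription factor binding sites:
for promoter at position - 20076
20076 (+) S01069 ACCNNNNNNGGT
19910 (-) S01027 ACGCCC
for promoter at position - 25269
25130 (+) S01152 AAGTGA
"""

promoters, sites = parse_tssg(sample)
for p in promoters:
    # Print position, score, TATA box position and number of sites found.
    print(p.pos, p.ldf, p.tata, len(sites.get(p.pos, [])))
```

Keeping the sites in a dictionary keyed by promoter position makes it easy to look up, for example, all sites on the (+) strand for the highest-scoring promoter. If the program's output layout differs from the example, the regular expressions would need to be adjusted.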