Transition probabilities from selected texts: Difference between revisions

From Derek
A1133050 (talk | contribs)
Reference to bit score in HMMER manual
A1133050 (talk | contribs)
New data using new punctuation filter
Line 5: Line 5:
HMMER score<ref>ftp://selab.janelia.org/pub/software/hmmer/CURRENT/Userguide.pdf Page 43</ref> is the log (base 2) of Markov probability / null probability (1/26^44)
HMMER score<ref>ftp://selab.janelia.org/pub/software/hmmer/CURRENT/Userguide.pdf Page 43</ref> is the log (base 2) of Markov probability / null probability (1/26^44)


<!--
This software output has been formatted in html for quick entry into the project wiki
-->
==First order==
==First order==
===All letters===
===All letters===
<br/>(1984 - George Orwell.txt) All Letters
<br/>(..\Texts\1984 - George Orwell.txt) All Letters
<br/>Markov Probability: 1.4641414719132942E-71
<br/>Markov Probability: 1.4822672916815308E-71
<br/>Corrected Zeroes:    1
<br/>Corrected Zeroes:    1
<br/>HMMER Score:        -28.487492178774602
<br/>HMMER Score:        -28.46974151192516
<br/>
<br/>
<br/>(Les Orientales - Victor Hugo.txt) All Letters
<br/>(..\Texts\Les Orientales - Victor Hugo.txt) All Letters
<br/>Markov Probability: 1.1571661202766167E-78
<br/>Markov Probability: 7.955726018472886E-79
<br/>Corrected Zeroes:    2
<br/>Corrected Zeroes:    2
<br/>HMMER Score:        -52.08044781349958
<br/>HMMER Score:        -52.620978304803444
<br/>
<br/>
<br/>(Traumdeutung - Sigmund Freud.txt) All Letters
<br/>(..\Texts\Traumdeutung - Sigmund Freud.txt) All Letters
<br/>Markov Probability: 3.866259362091221E-77
<br/>Markov Probability: 3.749298888974187E-77
<br/>Corrected Zeroes:    1
<br/>Corrected Zeroes:    1
<br/>HMMER Score:        -47.018177286334875
<br/>HMMER Score:        -47.062494868234964
<br/>
<br/>
<br/>(Vigenere - 1984.txt) All Letters
<br/>(..\Texts\Vigenere - 1984.txt) All Letters
<br/>Markov Probability: 1.646391769425068E-70
<br/>Markov Probability: 1.646391769425068E-70
<br/>Corrected Zeroes:    0
<br/>Corrected Zeroes:    0
Line 28: Line 31:
<br/>
<br/>
===Initial letters===
===Initial letters===
<br/>(1984 - George Orwell.txt) Initial Letters
<br/>(..\Texts\1984 - George Orwell.txt) Initial Letters
<br/>Markov Probability: 1.9187432339606176E-56
<br/>Markov Probability: 2.0136596296001355E-56
<br/>Corrected Zeroes:    0
<br/>Corrected Zeroes:    0
<br/>HMMER Score:        21.731535947650737
<br/>HMMER Score:        21.80119412864152
<br/>
<br/>
<br/>(Les Orientales - Victor Hugo.txt) Initial Letters
<br/>(..\Texts\Les Orientales - Victor Hugo.txt) Initial Letters
<br/>Markov Probability: 7.809561685705767E-61
<br/>Markov Probability: 3.3267604806714393E-60
<br/>Corrected Zeroes:    0
<br/>Corrected Zeroes:    0
<br/>HMMER Score:        7.14697538897068
<br/>HMMER Score:        9.237779904103608
<br/>
<br/>
<br/>(Traumdeutung - Sigmund Freud.txt) Initial Letters
<br/>(..\Texts\Traumdeutung - Sigmund Freud.txt) Initial Letters
<br/>Markov Probability: 4.553994899282612E-68
<br/>Markov Probability: 3.820168061668581E-68
<br/>Corrected Zeroes:    1
<br/>Corrected Zeroes:    1
<br/>HMMER Score:        -16.88463017855261
<br/>HMMER Score:        -17.138126745609156
<br/>
<br/>
==Second order==
==Second order==
===All letters===
===All letters===
<br/>(1984 - George Orwell.txt) All Letters
<br/>(..\Texts\1984 - George Orwell.txt) All Letters
<br/>Markov Probability: 2.115089006082555E-99
<br/>Markov Probability: 3.9262648017739784E-100
<br/>Corrected Zeroes:    14
<br/>Corrected Zeroes:    15
<br/>HMMER Score:        -120.970815420271
<br/>HMMER Score:        -123.40030441377017
<br/>
<br/>
<br/>(Les Orientales - Victor Hugo.txt) All Letters
<br/>(..\Texts\Les Orientales - Victor Hugo.txt) All Letters
<br/>Markov Probability: 4.429249667205306E-106
<br/>Markov Probability: 2.1087630055723357E-106
<br/>Corrected Zeroes:    18
<br/>Corrected Zeroes:    18
<br/>HMMER Score:        -143.15796813874405
<br/>HMMER Score:        -144.22863349364334
<br/>
<br/>
<br/>(Traumdeutung - Sigmund Freud.txt) All Letters
<br/>(..\Texts\Traumdeutung - Sigmund Freud.txt) All Letters
<br/>Markov Probability: 3.796449095384246E-119
<br/>Markov Probability: 3.731464295941246E-119
<br/>Corrected Zeroes:    21
<br/>Corrected Zeroes:    21
<br/>HMMER Score:        -186.56544502943777
<br/>HMMER Score:        -186.5903538114491
<br/>
<br/>
<br/>(Vigenere - 1984.txt) All Letters
<br/>(..\Texts\Vigenere - 1984.txt) All Letters
<br/>Markov Probability: 1.669944098510842E-92
<br/>Markov Probability: 1.669944098510842E-92
<br/>Corrected Zeroes:    8
<br/>Corrected Zeroes:    8
Line 66: Line 69:
<br/>
<br/>
===Initial letters===
===Initial letters===
<br/>(1984 - George Orwell.txt) Initial Letters
<br/>(..\Texts\1984 - George Orwell.txt) Initial Letters
<br/>Markov Probability: 7.009981410871375E-61
<br/>Markov Probability: 7.555198589304339E-61
<br/>Corrected Zeroes:    2
<br/>Corrected Zeroes:    2
<br/>HMMER Score:        6.991144428568879
<br/>HMMER Score:        7.0992034873802545
<br/>
<br/>
<br/>(Les Orientales - Victor Hugo.txt) Initial Letters
<br/>(..\Texts\Les Orientales - Victor Hugo.txt) Initial Letters
<br/>Markov Probability: 2.9707877167599384E-89
<br/>Markov Probability: 1.0973476039668194E-80
<br/>Corrected Zeroes:    12
<br/>Corrected Zeroes:    9
<br/>HMMER Score:        -87.26140732840628
<br/>HMMER Score:        -58.80087939586076
<br/>
<br/>
<br/>(Traumdeutung - Sigmund Freud.txt) Initial Letters
<br/>(..\Texts\Traumdeutung - Sigmund Freud.txt) Initial Letters
<br/>Markov Probability: 2.9078792518323414E-100
<br/>Markov Probability: 1.457883499720296E-103
<br/>Corrected Zeroes:    17
<br/>Corrected Zeroes:    18
<br/>HMMER Score:        -123.83349452718522
<br/>HMMER Score:        -134.7953707374809
<br/>
<br/>


==References==
==References==

Revision as of 22:42, 13 July 2009

The Somerton Man's code (without the extra line) is 44 characters long. If the text were purely random, with each of the 26 letters equally likely (probability 1/26 per letter), the probability of obtaining this particular 44-letter string would be (1/26)^44 = 5.51027E-63. This null probability is the baseline against which the Markov probabilities below are compared.

Where a transition has p=0 in the source text, it has been corrected to p=0.0001 so that the overall Markov probability remains non-zero.
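
The zero correction can be illustrated with a minimal Python sketch of a first-order calculation. This is not the software that produced the results on this page. The function names are hypothetical, and the sketch assumes that only letter-to-letter transitions are multiplied (no starting probability for the first letter) and that non-letters are simply discarded; the original program may differ on both points.

```python
from collections import Counter

ZERO_CORRECTION = 0.0001  # value stated on this page for p=0 transitions


def letters_only(text):
    """Upper-case the text and keep only the letters A-Z (assumed filter)."""
    return [c for c in text.upper() if "A" <= c <= "Z"]


def transition_probs(training_text):
    """Estimate first-order transition probabilities P(next | current)."""
    seq = letters_only(training_text)
    pair_counts = Counter(zip(seq, seq[1:]))
    from_counts = Counter(seq[:-1])  # how often each letter starts a pair
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}


def markov_probability(code, probs):
    """Multiply the transition probabilities along the code.

    Transitions never seen in the training text (p=0) are replaced by
    ZERO_CORRECTION, and the number of replacements is returned as well.
    """
    seq = letters_only(code)
    p = 1.0
    corrected = 0
    for pair in zip(seq, seq[1:]):
        step = probs.get(pair, 0.0)
        if step == 0.0:
            step = ZERO_CORRECTION
            corrected += 1
        p *= step
    return p, corrected


# Toy example: "BC" never occurs in the training text, so it is corrected.
probs = transition_probs("ABABAC")
p, corrected = markov_probability("ABC", probs)
```

In the toy example the transition A to B has probability 2/3 and B to C is unseen, so the result is (2/3) x 0.0001 with one corrected zero.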

The HMMER score[1] is the base-2 logarithm of the ratio of the Markov probability to the null probability, (1/26)^44.
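
As a check on this definition, the score can be recomputed from a listed Markov probability. The short Python sketch below uses hypothetical names and assumes the null probability is exactly (1/26)^44, as stated above.

```python
import math

CODE_LENGTH = 44                            # letters in the code, per this page
NULL_PROBABILITY = (1 / 26) ** CODE_LENGTH  # about 5.51027E-63


def hmmer_score(markov_probability):
    """Base-2 log of the ratio of Markov probability to null probability."""
    return math.log2(markov_probability / NULL_PROBABILITY)


# First-order, all-letters Markov probability listed for the 1984 text.
score = hmmer_score(1.4822672916815308e-71)
print(score)  # approximately -28.47
```

A positive score means the code is more probable under the text's Markov model than under the uniform random model, and a negative score means it is less probable.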

First order

All letters


(..\Texts\1984 - George Orwell.txt) All Letters
Markov Probability: 1.4822672916815308E-71
Corrected Zeroes: 1
HMMER Score: -28.46974151192516

(..\Texts\Les Orientales - Victor Hugo.txt) All Letters
Markov Probability: 7.955726018472886E-79
Corrected Zeroes: 2
HMMER Score: -52.620978304803444

(..\Texts\Traumdeutung - Sigmund Freud.txt) All Letters
Markov Probability: 3.749298888974187E-77
Corrected Zeroes: 1
HMMER Score: -47.062494868234964

(..\Texts\Vigenere - 1984.txt) All Letters
Markov Probability: 1.646391769425068E-70
Corrected Zeroes: 0
HMMER Score: -24.99631136880728

Initial letters


(..\Texts\1984 - George Orwell.txt) Initial Letters
Markov Probability: 2.0136596296001355E-56
Corrected Zeroes: 0
HMMER Score: 21.80119412864152

(..\Texts\Les Orientales - Victor Hugo.txt) Initial Letters
Markov Probability: 3.3267604806714393E-60
Corrected Zeroes: 0
HMMER Score: 9.237779904103608

(..\Texts\Traumdeutung - Sigmund Freud.txt) Initial Letters
Markov Probability: 3.820168061668581E-68
Corrected Zeroes: 1
HMMER Score: -17.138126745609156

Second order

All letters


(..\Texts\1984 - George Orwell.txt) All Letters
Markov Probability: 3.9262648017739784E-100
Corrected Zeroes: 15
HMMER Score: -123.40030441377017

(..\Texts\Les Orientales - Victor Hugo.txt) All Letters
Markov Probability: 2.1087630055723357E-106
Corrected Zeroes: 18
HMMER Score: -144.22863349364334

(..\Texts\Traumdeutung - Sigmund Freud.txt) All Letters
Markov Probability: 3.731464295941246E-119
Corrected Zeroes: 21
HMMER Score: -186.5903538114491

(..\Texts\Vigenere - 1984.txt) All Letters
Markov Probability: 1.669944098510842E-92
Corrected Zeroes: 8
HMMER Score: -98.05823732223358

Initial letters


(..\Texts\1984 - George Orwell.txt) Initial Letters
Markov Probability: 7.555198589304339E-61
Corrected Zeroes: 2
HMMER Score: 7.0992034873802545

(..\Texts\Les Orientales - Victor Hugo.txt) Initial Letters
Markov Probability: 1.0973476039668194E-80
Corrected Zeroes: 9
HMMER Score: -58.80087939586076

(..\Texts\Traumdeutung - Sigmund Freud.txt) Initial Letters
Markov Probability: 1.457883499720296E-103
Corrected Zeroes: 18
HMMER Score: -134.7953707374809


References

1. ftp://selab.janelia.org/pub/software/hmmer/CURRENT/Userguide.pdf, page 43
