Transition probabilities from selected texts: Difference between revisions
Reference to bit score in HMMER manual
New data using new punctuation filter
Revision as of 22:42, 13 July 2009
The Somerton Man's code (without the extra line) is 44 characters long. If the text were purely random, with each letter having a 1/26 chance of appearing, the probability of obtaining this particular string of 44 letters would be (1/26)^44 = 5.51027E-63. This null probability serves as the baseline for comparison.
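The null probability can be checked with a few lines of Python. This is only a sketch; the constant names are illustrative and not taken from the project software.

```python
from fractions import Fraction

CODE_LENGTH = 44     # characters in the code, excluding the extra line
ALPHABET_SIZE = 26   # letters A-Z, each assumed equally likely

# Exact rational value of (1/26)^44, converted to a float only for display.
null_probability = Fraction(1, ALPHABET_SIZE) ** CODE_LENGTH

# Printed in the same scientific notation as the results on this page.
print(f"{float(null_probability):.5E}")
```

Using `Fraction` keeps the power exact until the final conversion, although plain floating-point arithmetic would be accurate enough here.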
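A transition that never occurs in the source text has p=0, which would make the whole Markov probability zero. Each such transition has therefore been corrected to p=0.0001 to obtain a non-zero Markov probability; the number of corrections made is reported as "Corrected Zeroes".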
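The zero correction can be illustrated with a minimal first-order sketch. It is not the project's software: the function names are hypothetical, and it assumes that letters are reduced to A-Z, that transition probabilities are simple relative frequencies, and that only transitions (not the probability of the first letter) are multiplied together.

```python
from collections import Counter, defaultdict

ZERO_CORRECTION = 0.0001  # replacement value for transitions with p = 0

def transition_probabilities(text):
    """Estimate first-order letter transition probabilities from a text."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    counts = defaultdict(Counter)
    for current, following in zip(letters, letters[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }

def markov_probability(sequence, probs):
    """Return (probability, corrected_zeroes) for a letter sequence."""
    probability = 1.0
    corrected_zeroes = 0
    for current, following in zip(sequence, sequence[1:]):
        p = probs.get(current, {}).get(following, 0.0)
        if p == 0.0:
            # Unseen transition: substitute the small correction value.
            p = ZERO_CORRECTION
            corrected_zeroes += 1
        probability *= p
    return probability, corrected_zeroes

# Toy example: the transition B -> C never occurs in the training text.
probs = transition_probabilities("ABAB AC")
probability, corrected_zeroes = markov_probability("ABC", probs)
print(probability, corrected_zeroes)
```

A second-order version would condition each letter on the two preceding letters instead of one, which leaves many more transitions unseen and is consistent with the larger "Corrected Zeroes" counts in the second-order results.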
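The HMMER score (see the HMMER manual, ftp://selab.janelia.org/pub/software/hmmer/CURRENT/Userguide.pdf, page 43) is the base-2 logarithm of the ratio of the Markov probability to the null probability: score = log2(Markov probability / (1/26)^44). A positive score means the Markov model derived from the text fits the code better than the random model; a negative score means it fits worse.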
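As a rough check of this definition, the score can be recomputed from a Markov probability listed on this page. The function name and its default parameters are assumptions made for illustration; the logarithms are subtracted rather than dividing the probabilities directly, which is the same quantity but avoids dividing two very small numbers.

```python
import math

def hmmer_score(markov_probability, length=44, alphabet_size=26):
    """Base-2 log of markov_probability / (1/alphabet_size)**length."""
    null_log2 = -length * math.log2(alphabet_size)
    return math.log2(markov_probability) - null_log2

# First-order, all letters, 1984 - George Orwell: the page lists a score
# of about -28.47 for this Markov probability.
print(hmmer_score(1.4822672916815308e-71))
```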
First order
All letters
(..\Texts\1984 - George Orwell.txt) All Letters
Markov Probability: 1.4822672916815308E-71
Corrected Zeroes: 1
HMMER Score: -28.46974151192516
(..\Texts\Les Orientales - Victor Hugo.txt) All Letters
Markov Probability: 7.955726018472886E-79
Corrected Zeroes: 2
HMMER Score: -52.620978304803444
(..\Texts\Traumdeutung - Sigmund Freud.txt) All Letters
Markov Probability: 3.749298888974187E-77
Corrected Zeroes: 1
HMMER Score: -47.062494868234964
(..\Texts\Vigenere - 1984.txt) All Letters
Markov Probability: 1.646391769425068E-70
Corrected Zeroes: 0
HMMER Score: -24.99631136880728
Initial letters
(..\Texts\1984 - George Orwell.txt) Initial Letters
Markov Probability: 2.0136596296001355E-56
Corrected Zeroes: 0
HMMER Score: 21.80119412864152
(..\Texts\Les Orientales - Victor Hugo.txt) Initial Letters
Markov Probability: 3.3267604806714393E-60
Corrected Zeroes: 0
HMMER Score: 9.237779904103608
(..\Texts\Traumdeutung - Sigmund Freud.txt) Initial Letters
Markov Probability: 3.820168061668581E-68
Corrected Zeroes: 1
HMMER Score: -17.138126745609156
Second order
All letters
(..\Texts\1984 - George Orwell.txt) All Letters
Markov Probability: 3.9262648017739784E-100
Corrected Zeroes: 15
HMMER Score: -123.40030441377017
(..\Texts\Les Orientales - Victor Hugo.txt) All Letters
Markov Probability: 2.1087630055723357E-106
Corrected Zeroes: 18
HMMER Score: -144.22863349364334
(..\Texts\Traumdeutung - Sigmund Freud.txt) All Letters
Markov Probability: 3.731464295941246E-119
Corrected Zeroes: 21
HMMER Score: -186.5903538114491
(..\Texts\Vigenere - 1984.txt) All Letters
Markov Probability: 1.669944098510842E-92
Corrected Zeroes: 8
HMMER Score: -98.05823732223358
Initial letters
(..\Texts\1984 - George Orwell.txt) Initial Letters
Markov Probability: 7.555198589304339E-61
Corrected Zeroes: 2
HMMER Score: 7.0992034873802545
(..\Texts\Les Orientales - Victor Hugo.txt) Initial Letters
Markov Probability: 1.0973476039668194E-80
Corrected Zeroes: 9
HMMER Score: -58.80087939586076
(..\Texts\Traumdeutung - Sigmund Freud.txt) Initial Letters
Markov Probability: 1.457883499720296E-103
Corrected Zeroes: 18
HMMER Score: -134.7953707374809