Transition probabilities from selected texts

The Somerton Man's code (without the extra line) is 44 characters long. If the text were purely random, with each of the 26 letters equally likely at every position (probability 1/26), the probability of obtaining this particular 44-letter string would be (1/26)^44 = 5.51027E-63. This serves as a baseline against which the Markov probabilities below can be compared.
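The baseline figure can be reproduced with a few lines of Python. This is only a check of the arithmetic; the names ALPHABET_SIZE and CODE_LENGTH are introduced here for illustration. The log10 value is included because products of many small probabilities are usually easier to compare on a log scale.

```python
import math

ALPHABET_SIZE = 26   # uniform model: every letter equally likely
CODE_LENGTH = 44     # length of the code without the extra line

# Probability of one specific 44-letter string under the uniform model.
p_uniform = (1 / ALPHABET_SIZE) ** CODE_LENGTH

# Same quantity as a base-10 logarithm.
log10_p = -CODE_LENGTH * math.log10(ALPHABET_SIZE)

print(f"{p_uniform:.5E}")
print(f"log10 = {log10_p:.4f}")
```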

First Order Transition Probabilities

English (1984 - George Orwell)
Markov Probability: 1.4641414719132793E-67
Corrected Zeroes: 1

French (Les Orientales - Victor Hugo)
Markov Probability: 1.1571661202766258E-70
Corrected Zeroes: 2

Vigenere Cipher (1984 - George Orwell, Keyword LEMON)
Markov Probability: 1.646391769425068E-70
Corrected Zeroes: 0

German (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character
Markov Probability: 3.8662593620911806E-73
Corrected Zeroes: 1

English Initial Letters (1984 - George Orwell)
Markov Probability: 1.9187432339606176E-56
Corrected Zeroes: 0

French Initial Letters (Les Orientales - Victor Hugo)
Counting words like l'hopital as two words (le followed by hopital):
Markov Probability: 7.809561685705767E-61
Corrected Zeroes: 0
Discounting the l' (considering only hopital):
Markov Probability: 1.1841007473332175E-60
Corrected Zeroes: 0
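The two conventions for l' used in the French initial-letter figures above can be illustrated with a minimal sketch. The function name initial_letters and its split_elision flag are hypothetical, not taken from the original analysis, and the sketch assumes only the elided article l' needs special handling. Accented letters are not folded to the 26-letter alphabet here.

```python
def initial_letters(text, split_elision=True):
    """Return the initial letters of the words in `text` as one string.

    split_elision=True  : l'hopital counts as two words (L, then H).
    split_elision=False : the l' is ignored and only H is counted.
    """
    out = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"()-")
        if not word:
            continue
        # Elided article: "l" followed by a straight or typographic apostrophe.
        if len(word) > 2 and word[0] == "l" and word[1] in "'\u2019":
            if split_elision:
                out.append("l")
            first = word[2]
        else:
            first = word[0]
        if first.isalpha():          # skip tokens that start with a digit etc.
            out.append(first)
    return "".join(out).upper()


print(initial_letters("il va a l'hopital", split_elision=True))
print(initial_letters("il va a l'hopital", split_elision=False))
```

The resulting letter string would then be used as the training sequence for the transition counts in place of the full text.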


German Initial Letters (Traumdeutung - Sigmund Freud)
Note: does not account for the Eszett (sharp s) character, although no German word begins with it, so the initial-letter counts should be unaffected
Markov Probability: 4.29592233581315E-64
Corrected Zeroes: 1
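For readers who want to reproduce figures of this kind, the following is a minimal sketch of a first-order calculation. It rests on assumptions not stated in the original: that the first letter is scored by its overall frequency, that each later letter is scored by P(next | previous) estimated from bigram counts in the training text, and that "Corrected Zeroes" counts the steps whose estimated probability was zero and was replaced by a small floor. The function name first_order_probability and the floor value 1e-6 are hypothetical.

```python
from collections import Counter
import string


def first_order_probability(training_text, message, floor=1e-6):
    """Return (probability, corrected_zeroes) for `message` under a
    first-order letter model estimated from `training_text`.
    """
    keep = set(string.ascii_uppercase)
    letters = [c for c in training_text.upper() if c in keep]
    msg = [c for c in message.upper() if c in keep]
    if not letters or not msg:
        raise ValueError("training text and message must contain letters")

    unigrams = Counter(letters)
    bigrams = Counter(zip(letters, letters[1:]))
    prev_counts = Counter(letters[:-1])   # how often each letter has a successor

    # (numerator, denominator) for each step of the chain.
    steps = [(unigrams[msg[0]], len(letters))]
    steps += [(bigrams[(a, b)], prev_counts[a]) for a, b in zip(msg, msg[1:])]

    prob, corrected = 1.0, 0
    for num, den in steps:
        p = num / den if num and den else 0.0
        if p == 0.0:              # unseen event: apply the assumed floor
            p = floor
            corrected += 1
        prob *= p
    return prob, corrected


# Toy data only; this is not the Somerton Man's code.
prob, zeroes = first_order_probability("ABABAB", "AAB")
print(prob, zeroes)
```

A second-order version would condition each letter on the two preceding letters instead of one. With far more possible contexts, more of them go unseen in the training text, which is consistent with the larger zero counts reported in the second-order results.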

Second Order Transition Probabilities

English (1984 - George Orwell)
Markov Probability: 2.115089006082431E-43
Corrected Zeroes: 14

German (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character
Markov Probability: 3.79644909538402E-35
Corrected Zeroes: 21

French (Les Orientales - Victor Hugo)
Markov Probability: 4.429249667204738E-34
Corrected Zeroes: 18

Vigenere Cipher (English, 1984 - George Orwell)
Markov Probability: 1.6699440985106574E-60
Corrected Zeroes: 8

English Initial Letters (1984 - George Orwell)
Markov Probability: 7.009981410871232E-53
Corrected Zeroes: 2

German Initial Letters (Traumdeutung - Sigmund Freud)
Note: does not account for Eszett (sharp s) character
Markov Probability: 2.908650572588623E-32
Corrected Zeroes: 17

French Initial Letters (Les Orientales - Victor Hugo)
Not counting l' as a word (but counting the word contracted with it):
Markov Probability: 1.0762921500526206E-40
Corrected Zeroes: 12
Counting l' as one word and the word contracted with it as another:
Markov Probability: 2.970787716759867E-41
Corrected Zeroes: 12