Cipher Cross-off List

Purpose

Previous studies into the Tamam Shud case have concluded that the mysterious code left behind is not simply a random sequence of letters; it is in fact a code. This raises the question: what cipher was used to encrypt it? This page aims to address that question. The Cipher Cross-off List records the cipher schemes that have been identified as candidates for creating the Somerton Man's code. As part of our project, we will methodically investigate many of the listed ciphers to see whether each can be ruled out as the scheme used to encrypt the code.

Cipher Cross-off List

  • Random Sequence of letters
  • Initial letters of an unordered list
  • Initial letters of a sentence
  • Anagram/Transposition Cipher
  • Rail Fence Cipher

Substitution Ciphers

  • Playfair Cipher
  • Trifid Cipher
  • Bifid Cipher
  • Vigenere Cipher
  • Hill Cipher
  • One Time Pad
  • Two-square Cipher
  • Four-square Cipher

First Order Substitution Ciphers

  • Alphabet Reversal Cipher
  • Shift Cipher
  • Affine Cipher
  • Book Cipher
  • Null Cipher

Stream Ciphers

  • Auto-Key Cipher

Substitution and Transposition Ciphers

  • ADFGVX Cipher
  • VIC Cipher

Reasoning

The following section contains the explanations and/or proofs behind the ruled-out ciphers.

Random Sequence of Letters

As part of their work, the students undertaking this project in 2009 and 2010 surveyed both sober and intoxicated people to see whether the letter frequencies they produced were similar to the letter frequencies in the Somerton Man's code. Neither group's survey results were consistent with the code, and it was therefore concluded that the code is not simply a random sequence of letters. The relevant sections of the previous groups' work can be seen at the following links:

Anagram/Transposition Cipher

By looking at letter frequency plots of various languages against the code, and by identifying other anomalies such as the existence of a 'Q' but no 'U' in the code, the Honours students in 2009 concluded the code did not use a Transposition Cipher alone. The relevant section of their report can be seen here.

Playfair Cipher

The 2009 students concluded that the cipher used was not likely to be the Playfair Cipher based on an empirical test they performed. Their conclusion can be seen here: 2009 Playfair Cipher Conclusion

Vigenere Cipher

The students in 2009 empirically examined the Vigenere Cipher and concluded that it was not likely to have been used. Their summary can be seen here: 2009 Vigenere Cipher Conclusion

One Time Pad

The One Time Pad was investigated by the 2009 group. They experimented using the Rubaiyat of Omar Khayyam (which is closely linked with the case) and the King James Bible (common at the time) as key texts and found no key that deciphers the code. Their conclusions can be seen here: 2009 One-Time Pad Conclusion

Alphabet Reversal Cipher

The Alphabet Reversal cipher is a substitution cipher where A becomes Z, B becomes Y, C becomes X etc. This leads to the following encoding and decoding key (read in vertical order):

ABCDEFGHIJKLMNOPQRSTUVWXYZ

ZYXWVUTSRQPONMLKJIHGFEDCBA

Thus, for example, "HELLO" becomes "SVOOL". This cipher was tested on the Tamam Shud code by the 2011 group. A small Java program was written that takes input from the command line or from a text file and outputs it in alphabet-reversed form. Running a file containing the code through the program turns the input:

MRGOABABD
MTBIMPANETP
MLIABOAIAQC
ITTMTSAMSTGAB

into

NITLZYZYW
NGYRNKZMVGK
NORZYLZRZJX
RGGNGHZNHGTZY

As can be seen, there can be no meaning deciphered from the alphabet-reversed text and thus we can rule out the Alphabet Reversal Cipher as being used in encrypting the Somerton Man's code.
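The test above can be reproduced with a few lines of code. The following is a minimal Python sketch, not the 2011 group's Java program; the names `atbash` and `CODE_LINES` are chosen here for illustration, and the code lines are the transcription used on this page.

```python
import string

ALPHABET = string.ascii_uppercase
# Translation table mapping A->Z, B->Y, ..., Z->A.
REVERSAL = str.maketrans(ALPHABET, ALPHABET[::-1])

# Transcription of the code as given on this page.
CODE_LINES = [
    "MRGOABABD",
    "MTBIMPANETP",
    "MLIABOAIAQC",
    "ITTMTSAMSTGAB",
]


def atbash(text: str) -> str:
    """Apply the alphabet reversal; non-letters pass through unchanged."""
    return text.upper().translate(REVERSAL)


if __name__ == "__main__":
    for line in CODE_LINES:
        print(line, "->", atbash(line))
```

Because the reversal is its own inverse, the same function both encodes and decodes.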

ADFGVX Cipher

The ADFGVX Cipher is a cipher combining substitution and transposition that was used by the German Army during World War I (more information on the cipher is available here). We can trivially rule out this cipher by looking at which letters appear in the Tamam Shud code. The ADFGVX cipher produces ciphertext using only the letters A, D, F, G, V and X. The Tamam Shud code contains at least 16 different letters of the alphabet, so we can rule out the ADFGVX cipher and any similar cipher that uses fewer than 16 letters.
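The letter count can be checked directly. This is a small Python sketch using the transcription of the code given on this page; the variable names are illustrative only.

```python
# Transcription of the code as given on this page.
CODE_LINES = [
    "MRGOABABD",
    "MTBIMPANETP",
    "MLIABOAIAQC",
    "ITTMTSAMSTGAB",
]

ADFGVX_LETTERS = set("ADFGVX")

# Distinct letters that appear anywhere in the code.
distinct_letters = set("".join(CODE_LINES))

# Letters in the code that an ADFGVX ciphertext could never contain.
outside_adfgvx = distinct_letters - ADFGVX_LETTERS

if __name__ == "__main__":
    print("Distinct letters:", len(distinct_letters))
    print("Not producible by ADFGVX:", "".join(sorted(outside_adfgvx)))
```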


Affine Cipher

The Affine Cipher is a substitution cipher where each letter is enciphered using the linear formula (ax + b) mod 26, where x is a numeric representation of the letter. More information on it can be found here.

The Affine Cipher was tested by the 2011 students (15/05/2011). Code was written that cycles through each of the 312 possible key combinations for the Affine Cipher (the 12 values of a that are coprime to 26, each paired with the 26 values of b), testing each case. The results have been uploaded and can be viewed here. The results do not show any understandable text, and as such, the Affine Cipher has been ruled out.
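An exhaustive search of this kind can be sketched as follows. This is an illustrative Python version, not the 2011 students' code; the function names are hypothetical, A = 0 through Z = 25 is assumed, and `pow(a, -1, 26)` for the modular inverse needs Python 3.8 or later.

```python
from math import gcd

M = 26  # alphabet size


def affine_keys():
    """All valid (a, b) pairs: a must be coprime to 26 so it can be inverted."""
    return [(a, b) for a in range(1, M) if gcd(a, M) == 1 for b in range(M)]


def affine_encrypt(plaintext: str, a: int, b: int) -> str:
    """Encipher uppercase letters with y = (a*x + b) mod 26."""
    return "".join(
        chr((a * (ord(ch) - ord("A")) + b) % M + ord("A")) for ch in plaintext
    )


def affine_decrypt(ciphertext: str, a: int, b: int) -> str:
    """Decipher uppercase letters with x = a^-1 * (y - b) mod 26."""
    a_inv = pow(a, -1, M)  # modular inverse of a (Python 3.8+)
    return "".join(
        chr((a_inv * (ord(ch) - ord("A") - b)) % M + ord("A")) for ch in ciphertext
    )


if __name__ == "__main__":
    first_line = "MRGOABABD"  # first line of the code as given on this page
    for a, b in affine_keys():
        print(a, b, affine_decrypt(first_line, a, b))
```

Each candidate decryption still has to be inspected by eye (or scored against a word list) to decide whether it is readable.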

Two-square Cipher

The Two-square Cipher is similar to the Playfair Cipher in that it is a digraph cipher: it encrypts letters in pairs. This means that the ciphertext should contain an even number of letters. In the case of the Somerton Man's code, the lines consist of 9, 11, 11 and 13 letters, none of which is even. This indicates that a simple digraph encryption technique such as the Two-square Cipher was not applied to each line.
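The line lengths quoted above can be confirmed with a short check. This Python sketch uses the transcription of the code given on this page and treats each line as a separate message, which is the assumption behind the argument.

```python
# Transcription of the code as given on this page.
CODE_LINES = [
    "MRGOABABD",
    "MTBIMPANETP",
    "MLIABOAIAQC",
    "ITTMTSAMSTGAB",
]

# Number of letters on each line.
line_lengths = [len(line) for line in CODE_LINES]

# A digraph cipher applied line by line would need every length to be even.
all_lines_odd = all(n % 2 == 1 for n in line_lengths)

if __name__ == "__main__":
    print(line_lengths, "all odd:", all_lines_odd)
```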

Rail Fence Cipher

The Rail Fence Cipher (or zigzag cipher) was identified as a possibility because of the four distinct lines indicating it could be a 4-rail cipher. A Rail Fence Cipher involves writing out the unencrypted message in a zigzag and then reading it in rows to form the encrypted version. For example, take "Rail Fence Cipher" in a 3-rail cipher:


R   F   E   H    
 A L E C C P E   
  I   N   I   R  

The encrypted form is therefore: RFEH ALECCPE INIR.
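The zigzag procedure can be written as a short function. This is a minimal Python sketch; the name `rail_fence_rows` is chosen here for illustration, and spaces are assumed to be removed from the message before encryption.

```python
def rail_fence_rows(plaintext: str, rails: int) -> list:
    """Write the message in a zigzag over `rails` rows and return each row."""
    rows = [[] for _ in range(rails)]
    rail, step = 0, 1
    for ch in plaintext:
        rows[rail].append(ch)
        if rails > 1:
            # Reverse direction at the top and bottom rails.
            if rail == 0:
                step = 1
            elif rail == rails - 1:
                step = -1
            rail += step
    return ["".join(row) for row in rows]


if __name__ == "__main__":
    # Reading the rows in order gives the ciphertext.
    print(" ".join(rail_fence_rows("RAILFENCECIPHER", 3)))
```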

We can discount the Rail Fence Cipher as being used in the Tamam Shud code for several reasons. Firstly, it is simply a transposition cipher, and previous studies have shown that the letter frequency plot is not consistent with a transposition. Secondly, the presence of a 'Q' in the code without a 'U' also indicates it is unlikely to be a transposition cipher. The final indicator comes from arranging the code itself as a 4-rail zigzag:

   M     R     G     O     A     B     A     B     D 
  M T   B I   M P   A N   E T   P     
 M   L I   A B   O A   I A   Q C      
I     T     T     M     T     S     A     M     S     T     G     A     B

As we can see, the zigzags do not form recognisable words, and there are extra letters overflowing from the top and bottom lines.

VIC Cipher

The VIC Cipher was a cipher scheme used by the Soviet Union. The version examined in the following investigation was the one adapted to the English language, to match the Somerton Man code. Further information about the implementation of the cipher is available here. Use of the VIC Cipher to generate the Somerton Man code can be trivially disproved, as it outputs ciphertext consisting only of numerical blocks of length five, while the Somerton Man code contains only letters.

For completeness, a conversion between numbers and letters was considered. Two cases were examined: the first assumed a two-digit number representing each letter in the code, and the second used the conventional representation of Z26 with A = 0, B = 1, etc. Neither case produced a numerical representation whose length was a multiple of five, which would be inherent in the use of the VIC Cipher system. The possibility that dummy characters had been used to pad the length was dismissed as too remote, and the original conclusion that the VIC Cipher was not used was left unchanged.
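The two conversions can be checked by counting digits. The Python sketch below is one reading of the test, assuming that the requirement is for the total number of digits to divide evenly into blocks of five; the function names are illustrative, and the code lines are the transcription used on this page.

```python
# Transcription of the code as given on this page.
CODE_LINES = [
    "MRGOABABD",
    "MTBIMPANETP",
    "MLIABOAIAQC",
    "ITTMTSAMSTGAB",
]
CODE = "".join(CODE_LINES)


def two_digit_length(text: str) -> int:
    """Case 1: every letter is written as a two-digit number."""
    return 2 * len(text)


def z26_length(text: str) -> int:
    """Case 2: A = 0, B = 1, ..., Z = 25, written without padding."""
    return sum(len(str(ord(ch) - ord("A"))) for ch in text)


if __name__ == "__main__":
    for name, digits in [
        ("two-digit", two_digit_length(CODE)),
        ("Z26", z26_length(CODE)),
    ]:
        print(name, digits, "digits; remainder mod 5:", digits % 5)
```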

As mentioned above, the VIC Cipher scheme that was investigated was the version adapted to the English language, so there is an opportunity for future exploration of alternative languages.

Shift Cipher

Auto-Key Cipher

Bifid Cipher

The Bifid Cipher was first published in 1901. It combines a Polybius square with transposition to achieve fractionation. As a result, each ciphertext character depends on two plaintext characters, as in the Playfair Cipher assessed in 2009. Further information about the Bifid Cipher methodology can be found here.
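The mechanism can be illustrated with a short sketch. The Python below is not the code used in the project; it assumes an unkeyed 5x5 Polybius square with "J" merged into "I" and treats the whole message as a single block, and the name `bifid_encrypt` is chosen here for illustration.

```python
# Unkeyed 5x5 Polybius square read row by row; "J" is merged into "I".
SQUARE = "ABCDEFGHIKLMNOPQRSTUVWXYZ"


def bifid_encrypt(plaintext: str, square: str = SQUARE) -> str:
    """Bifid encryption over the whole message as one block."""
    letters = [ch for ch in plaintext.upper().replace("J", "I") if ch in square]
    # Fractionate: split each letter into its row and column coordinates.
    rows = [square.index(ch) // 5 for ch in letters]
    cols = [square.index(ch) % 5 for ch in letters]
    # Transpose: all row coordinates first, then all column coordinates.
    stream = rows + cols
    # Recombine consecutive coordinate pairs into ciphertext letters.
    return "".join(
        square[stream[i] * 5 + stream[i + 1]] for i in range(0, len(stream), 2)
    )


if __name__ == "__main__":
    print(bifid_encrypt("HELLO"))
```

Since the square has no "J", this form of the cipher can never output that letter.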

To test the Bifid Cipher mechanism a known plain text was encoded and the resultant ciphertext was letter frequency analysed and compared to the Somerton Man code. A graph of the relative frequency of each English alphabet letter is shown below. The absence of the letter J in the case of the Bifid Cipher results is in accordance with the encryption methodology where the letters “I” and “J” are represented by the former only.

Bifid Cipher Frequency Analysis

Comparison of the Bifid encryption results with the Somerton Man code shows only a weak correlation. The Bifid Cipher results are spread across all possible ciphertext letters, with a deviation significantly smaller than that of the Somerton Man code. These results were considered sufficient to conclude that the Bifid Cipher mechanism had not been used to generate the Somerton Man code, although the conclusion is not definitive given the small sample size of the Somerton Man code. An interesting observation is that the letter “J” is absent in both results.
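A comparison of this kind can be sketched as follows. This Python example only computes the relative letter frequencies of the Somerton Man code (using the transcription on this page) and one possible measure of spread; using the population standard deviation is an assumption made here, not necessarily the measure the project used.

```python
import string
from collections import Counter
from statistics import pstdev

# Transcription of the code as given on this page.
CODE_LINES = [
    "MRGOABABD",
    "MTBIMPANETP",
    "MLIABOAIAQC",
    "ITTMTSAMSTGAB",
]


def relative_frequencies(text: str) -> dict:
    """Relative frequency of each letter A-Z in the text (0.0 if absent)."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    counts = Counter(letters)
    return {ch: counts[ch] / len(letters) for ch in string.ascii_uppercase}


def spread(frequencies: dict) -> float:
    """Population standard deviation of the per-letter frequencies."""
    return pstdev(frequencies.values())


if __name__ == "__main__":
    freqs = relative_frequencies("".join(CODE_LINES))
    for letter, freq in freqs.items():
        print(letter, round(freq, 3))
    print("spread:", round(spread(freqs), 4))
```

Running the same two functions on a ciphertext produced by a candidate cipher gives a second set of frequencies to compare against.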

Trifid Cipher

The Trifid Cipher was invented in 1901 following publication of the Bifid Cipher. It extends the Bifid Cipher into a third dimension, achieving fractionation in which each ciphertext character depends on three plaintext characters. Further information about the Trifid Cipher and an example of the encryption methodology can be found here. As the Trifid Cipher requires 27 ciphertext characters, the full stop was used for the additional character, as in the reference material.
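The extension to three coordinates can be sketched in the same way as the Bifid Cipher. The Python below is illustrative only; it assumes an unkeyed 27-character alphabet (A to Z followed by the full stop) arranged as a 3x3x3 cube and treats the whole message as a single block, and the name `trifid_encrypt` is chosen here.

```python
# Unkeyed 27-character alphabet: index = 9*layer + 3*row + column.
ALPHABET27 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ."


def trifid_encrypt(plaintext: str, alphabet: str = ALPHABET27) -> str:
    """Trifid encryption over the whole message as one block."""
    indices = [alphabet.index(ch) for ch in plaintext.upper() if ch in alphabet]
    # Fractionate: split each character into three base-3 coordinates.
    layers = [i // 9 for i in indices]
    rows = [(i // 3) % 3 for i in indices]
    cols = [i % 3 for i in indices]
    # Transpose: all layers, then all rows, then all columns.
    stream = layers + rows + cols
    # Recombine consecutive coordinate triples into ciphertext characters.
    return "".join(
        alphabet[9 * stream[i] + 3 * stream[i + 1] + stream[i + 2]]
        for i in range(0, len(stream), 3)
    )


if __name__ == "__main__":
    print(trifid_encrypt("HELLO"))
```

Even a short message with no full stop in it can produce one in the ciphertext, which is why the 27th character has to be included in the frequency analysis.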

Since the Somerton Man code did not contain any characters beyond the traditional English Alphabet, the Trifid Cipher mechanism could not be trivially discounted. Testing therefore followed the same procedure as the Bifid Cipher; a known plaintext was encoded and the resultant ciphertext was letter frequency analysed and compared to the Somerton Man code. The relative frequency of each English Alphabet letter is shown in the graph below, with the “Dot” letter representing the 27th ciphertext character.

Trifid Cipher Frequency Analysis

The Trifid Cipher shows an approximately even distribution across all ciphertext letters. The Somerton Man code, in comparison, is uneven, with the proportions of the letters “A”, “B”, “M” and “T” much larger than the rest. From these results it was decided that the Trifid Cipher had not been used to generate the Somerton Man code. However, as was the case with the Bifid Cipher, the small sample size of the Somerton Man code prevents a definitive conclusion from being reached.
