Final Report 2012
===Statistical Analysis of Letters===
[[Image:OED.png|thumb|300px|right|Initial Letters Frequency in Oxford Dictionary.]]
The first step in this project was to review the work of previous projects and to set up tests confirming that their results were consistent. Verification focused on the analysis performed on the code and on the statistical analysis that led to those results. The results from previous years suggested that the code was most likely in English and represented the initial letters of words.

To test this theory and to create some baseline results, the online Oxford English Dictionary was searched and the number of words beginning with each letter of the alphabet was extracted. From this data, the frequency of each letter as an initial letter in English was calculated. The results showed many inconsistencies with the Somerton Code, but they can still be used as a baseline for comparison with other results. The main issue is that most of the words included are not commonly used, so they are a poor representation of everyday English. The number of dictionary entries for a letter is therefore not related to how likely that letter is to be used.

To produce more useful results for comparison, a source text was found which had been translated into over 100 different languages [8]. The text was the Tower of Babel passage from the Bible, which consists of approximately 100 words and 1000 characters, making it a suitable size for testing. <ref name="Tower of Babel">Ager, S. "Translations of the Tower of Babel"; http://www.omniglot.com/babel/index.htm</ref> To create the frequency representation for these results, a Java program was written which takes in the text for each language and outputs the number of occurrences of each letter.
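A minimal sketch of such a letter-counting program is given here for illustration. It is not the original project code: the class name <code>LetterFrequency</code>, the method names and the sample sentence are hypothetical, and the sketch assumes that only the 26 unaccented Latin letters are counted, with case ignored and every other character skipped.

```java
class LetterFrequency {

    /** Counts occurrences of the letters a-z, ignoring case and all other characters. */
    static int[] countLetters(String text) {
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            char ch = Character.toLowerCase(text.charAt(i));
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
            }
        }
        return counts;
    }

    /** Converts raw counts to relative frequencies (all zeros if there are no letters). */
    static double[] toFrequencies(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double[] freq = new double[counts.length];
        if (total == 0) {
            return freq;
        }
        for (int i = 0; i < counts.length; i++) {
            freq[i] = (double) counts[i] / total;
        }
        return freq;
    }

    public static void main(String[] args) {
        // Placeholder sentence, not the actual passage used in the project.
        String sample = "Now the whole earth had one language";
        int[] counts = countLetters(sample);
        double[] freq = toFrequencies(counts);
        for (int i = 0; i < 26; i++) {
            System.out.printf("%c %d %.3f%n", (char) ('a' + i), counts[i], freq[i]);
        }
    }
}
```

Languages written with accented letters or non-Latin scripts would need a wider alphabet or a transliteration step; how the original program handled these is not stated.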
This process was repeated for 85 of the most common languages available, and from this data the standard deviation and the sum of differences relative to the Somerton Man Code were determined. The results were quite inconsistent with previous results, as well as with each other. The top four results for both the standard deviation and the sum of differences were Sami North, Ilocano, White Hmong and Wolof, although in different orders. These languages are geographically unrelated, as they are spoken in Northern Europe, South-East Asia and West Africa. This suggests that they aren't likely to represent the language used for the Code.

Previous studies suggested that the Code represents the initial letters of words, so this theory was tested as well. The process was repeated with a modified Java program which records the first letter of each word within the text. From these results, the frequency of each letter occurring as an initial letter was calculated.
[[Image:Top20SD.png|thumb|300px|left|Top 20 Standard Deviation for Initial Letters.]] [[Image:Top20SoD.png|thumb|300px|right|Top 20 Sum of Difference for Initial Letters.]]
The results shown in the figures above are more consistent than before. The top three languages for both the sum of differences and the standard deviation were, in order: Ilocano, Tagalog and English. The first two languages are from the Philippines, and since all the other information about the Somerton Man suggests he was of "Britisher" appearance, it is unlikely that he spoke either of these languages. This leaves English as the most likely candidate, as it is the most consistent with the information known about both the code and the Somerton Man. This also reinforces the conclusion that the code most likely represents the first letters of an unknown sentence, and is consistent with results from previous years.
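The initial-letter comparison can be sketched as follows. This is an illustration only, under several assumptions that the report does not confirm: the "sum of difference" is taken to be the sum of absolute per-letter frequency differences, the "standard deviation" is taken to be the population standard deviation of those per-letter differences, and only the letters a-z are considered. The class name, method names and both sample strings are hypothetical; in particular, the letter string in <code>main</code> is a placeholder and not the Somerton Man Code.

```java
class InitialLetterComparison {

    /** Relative frequency of a-z as the first letter of each whitespace-separated word. */
    static double[] initialLetterFrequencies(String text) {
        int[] counts = new int[26];
        int total = 0;
        for (String word : text.trim().split("\\s+")) {
            // Skip leading punctuation and use the first a-z letter of the word, if any.
            for (int i = 0; i < word.length(); i++) {
                char ch = Character.toLowerCase(word.charAt(i));
                if (ch >= 'a' && ch <= 'z') {
                    counts[ch - 'a']++;
                    total++;
                    break;
                }
            }
        }
        double[] freq = new double[26];
        for (int i = 0; i < 26 && total > 0; i++) {
            freq[i] = (double) counts[i] / total;
        }
        return freq;
    }

    /** Assumed metric: sum of absolute per-letter differences between two distributions. */
    static double sumOfDifferences(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Assumed metric: population standard deviation of the per-letter differences. */
    static double stdDevOfDifferences(double[] a, double[] b) {
        int n = a.length;
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += a[i] - b[i];
        }
        mean /= n;
        double variance = 0.0;
        for (int i = 0; i < n; i++) {
            double d = (a[i] - b[i]) - mean;
            variance += d * d;
        }
        return Math.sqrt(variance / n);
    }

    public static void main(String[] args) {
        // Both strings are placeholders chosen for illustration.
        double[] language = initialLetterFrequencies("Now the whole earth had one language");
        double[] code = initialLetterFrequencies("N T W E H O L");
        System.out.printf("sum of differences: %.4f%n", sumOfDifferences(language, code));
        System.out.printf("standard deviation: %.4f%n", stdDevOfDifferences(language, code));
    }
}
```

Under these assumptions, a smaller value of either measure indicates a language whose initial-letter distribution lies closer to that of the code, so ranking the languages by each measure would give tables like the top-20 lists shown in the figures.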