Editing
Markov models
Jump to navigation
Jump to search
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
In a first order Markov model, you compute the letter transition probabilities, <math>P(a | b)</math>, etc. from the frequencies, so for example for every pair of letters in abbcab the counts are * <math>C(ab) = 2</math> * <math>C(bb) = 1</math> * <math>C(bc) = 1</math> * <math>C(ca) = 1</math> * <math>C(a) = 2</math> * <math>C(b) = 3</math> * <math>C(c) = 1</math> Then compute the probabilities: * <math>P(b|a) = 2 / (2+1+1+1)</math> * <math>P(b|b) = 1 / (2+1+1+1)</math> * <math>P(b|c) = 1 / (2+1+1+1)</math> * <math>P(c|a) = 1 / (2+1+1+1)</math> * <math>P(\textrm{any other combinations}) = 0</math> * <math>P(a) = 2/(2+3+1)</math> * <math>P(b) = 3/(2+3+1)</math> * <math>P(c) = 1/(2+3+1)</math> * <math>P(d)..P(z) = 0</math> Then for some sequence of letters <math>x_1,\ldots ,x_n</math> (the code) The likelihood that that code came from a language (or initial letters of language) that our sequence abbcab came from is <math>P(x_1)P(x_2 | x_1)\ldots P(x_n | x_{n-1} ) =</math> a product of probabilities, assuming independence (since you are treating it as a first Note that you should first take logs (using Java doubles), then compute the sum of the log(probabilities), then raise back as a power of the base of the logarithm, otherwise you'll get too many numerical errors. If the probability is zero, then treat it as -9999 (in the log transformed version). The first term is why you need the single letter frequencies as well. Doing this for the case where you have fifty samples of a text gives you a confidence level in your final p-value, but it makes the estimates of the probabilities wrong. 50 may be too many for short texts. You can extend this to higher order Markov models, eg. for a 2nd order one you estimate <math>P(b|bc)</math> from counts of bcb, etc. == See also == [http://en.wikipedia.org/wiki/Markov_chain Markov chain (Wikipedia)]
Summary:
Please note that all contributions to Derek may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Derek:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Navigation menu
Personal tools
Not logged in
Talk
Contributions
Create account
Log in
Namespaces
Page
Discussion
English
Views
Read
Edit
View history
More
Search
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Tools
What links here
Related changes
Special pages
Page information