Saturday, September 08 2007 @ 09:20 AM CEST Contributed by: ColdT Views: 6594
This is a small, and by no means complete, digestion of some crypto technics.
(All sources that are mentioned in the tutorial below, can be found here)
One method of
encrypting a text is transposition.
One simple transpose is to write out the message on two rows.
Letters 1,3, 5 and so on, you write on the first row.
Letters 2, 4, 6 etc. you write on the second row.
Your secret is your prisoner, let it go and you become its prisoner.
is substitution where you substitute one letter with another letter.
A code substitute a word or group of letter.
Mono alphabetic substitution
One variant is the Caesar roll. This is based on crypto alphabet that's
shifted a number
of steps relative the clear text alphabet. It's common in cryptography
to write out the
clear text in lowercase letter and the crypted text in uppercase letter.
DEFGHIJKLMNOPQRSTUVWXYZABC (here with a three letter right rotation)
veni, vidi vici
YHQL, YLGL, YLFL
To crypt a text, you write out the crypto alphabet under the clear alphabet.
Then you take the letter from the text, here starting with "v",
and go to the
column where v is in the clear alphabet.
Look in the same column in the crypto alphabet to get the letter to substitute
Here v is substituted with Y. Repeat this for all letters in the text.
As you may see, this is not a very strong protection. With a computer
it's easy to
try all possible shifts, where each shift can be regarded as a key to
Mono alphabetic substitution with key
If you choose a more general algorithm where the clear alphabet can be
reordered in any possible way,
you get 400 000 000 000 000 000 000 000 000 possible keys to choose from.
If the enemy gets hold of
the crypted text and he knows the algorithm, he still have to test all
keys to decrypt the message.
If he test one key per second, he approximately would need the time equal
the age of
universe multiplied with 10e9.
It can be shown with mathematical proof, that if you use a random key
as long as the text you want to crypt,
the crypted text CAN NOT be decrypted if you don't know the key. In any
crypto a weak spot is repetitions
that occur if you use a to short key. Another thing to remember is that
you can't use the same key twice,
and you must use a really random key. If you use the key more then once,
there's methods to break the crypto.
A variant to this, but with a less number of keys, is to use a key or
key phrase. Instead of a random alphabet
you take a key or a key phrase, for example "Julius Ceasar".
Remove all spaces and duplicate letters and use
the rest, JULISCAER, as beginning of the crypto alphabet. The rest of
the alphabet is just a shift beginning
where the keyword ends and with the letters in the keyword removed from
This can also be rotated a number of steps to.
A good thing about this is that it is easy to remember the key word or
key phrase. The simplicity in combination
with the strength made this the common method to crypt a text the first
thousand years a.c.
Anyone interested in cracking a monoalphabetic text can look at the mono
crackme in the crackme's.
@Detten add link
But there is ways to break this crypto, the Arabs was the first ones
The oldest known
description, by an Arab, to break a mono alphabetic crypto is from 800 a.c.
The trick is that in any given language there's letters that's used more
If you know the letter frequency for a language, you take the letters frequency
in the crypted
text and substitute it with the one that have about the same frequency for
It's relatively easy to make a qualified guess what letter it is.
The same rules can be applied on one, two, three or more, letter words.
As you are a cracker, it's easy to write a small app to analyse some
text files on your computer.
What kind of text you analyse of course affects the result of the letter
frequency. If you for example
analyze a sorce code file, you get a funny letter and word distribution.
If it's a *.asm file you get a
lot of three letter words like: eax,ebx,ecx...:-).
For many centuries
the mono alphabetic substitution crypto was the prevailing method to write
a secret text.
But after the Arabs invention of the frequency analyze, this method was
no longer a safe one to use.
The original idea for this new crypto came from a man named Leon Battista
Alberti, born 1404.
His idea was use two or more crypto alphabet and switch between them.
Crypto alphabet 1:
Crypto alphabet 2:
If we crypt the word "hello" we use the first alphabet for
h, that become A.
For the second letter e we use the second alphabet, e becomes F, and so
With this method hello becomes AFPAD.
A man named Blaise de Vigenére, born 1523, developed this idea
Instead of using two or three alphabet, he used 26 (for a-z for English).
Although he based his work on ideas of previous thinkers, this method
is known as the Vigenére crypto.
The use of more than one crypto alphabet gives this type of crypto the
name polyalphabetic crypto.
The first step to use this is to write out a Vigenére table. This
is done by writing down a clear alphabet
followed by 26 crypto alphabets, each one rotated on step in relation
to the previous one.
The first row is a crypto alphabet with a Caesar roll, a rotation of one
The second one have a two step rotation and so on.
To use it, you use a new row for each letter you crypt.
To decrypt it the user must know which row to use for which letter.
You maybe use row 5 for the first letter, row 14 for the second, row 21
for the third and so on.
To decrypt the text the receiver must know which row shifts to use.
One way to do this is to use a key word.
To crypt the message "Begin attack at sundown" and the key GREEN
we do like this.
Write out the key word above the message over and over, so each letter
in the message
is linked to a letter in the key word.
Then you create the crypto text like this:
Crypt the letter b by taking the letter in the key word above, here it is
This letter is the index to a row in the Vigenére table.
The row beginning with G, row 6, is the crypto alphabet to use for
the first letter b. In the row for the clear alphabet, go to the column
for the letter b. To find the substitution letter, follow this column down
to row 6, here you find the letter H to substitute for b.
Repeat this procedure for the letter e. The key letter above e is R which
us the row that start with R, row number 17. In the clear alphabet, go
to the column
for the letter e in the clear alphabet, follow this column down to row
17 to get the substitution letter V.
Repeat this for each letter in the message.
Each letter in the keyword gives us a different crypto alphabet to use.
The number of letters in the key word gives us the number of rows to use
Vigenére table. In this example we use four rows, G, R, E and N
(E is doubled here).
The message "beginattackatsundown" crypted with the keyword
GREEN comes out as:
With a longer keyword, or a key sentence, you add more rows from the
table you make the crypto more
difficult to break. The frequency analysis, described above, has no effect
on this kind of crypto.
If you have a word with at double letter, like attack above, the first
t becomes K and the second becomes X.
If you do a frequency analysis of this, there's no way to tell that this
is the letter t.
Another strength is the enormous number of keys you can use. Any word
in the book can be used,
you even can make up you own words. It's just not possible to try all
The work by Vigenéres was published in Tracicté des Chiffres
1586 but did not come to common use
until 200 years later.
Babbage and Vigenére
earlier, the strength of the Vigenére crypto is that a letter can
be crypted on a number of
different ways. If the keyword is KING is used, there's four different ways
to crypt a given letter.
The same apply to words that's crypted. The word "the" can be
crypted as, DQR, BUK, GNO or ZRM depending on
the words relation to the keyword KING. You get no clues from the frequency
This making the decrypting much harder, but not impossible.
A man named Charles Babbage, born 1791, was the first one to break this
kind of crypto.
Among many things, he was devoted to statistics. He was one of the first
one making statistic on born/death relations,
today commonly used by insurance companies.
He thought that if there's only four ways to crypt the word "the",
and the word is used many times in the text,
the possibility that the same crypto variant is repeated on a number of
places in the text is rather significant.
The longer the text, the greater the chance. It was this kind of repetition
that lead Babbage to the decryption
of the Vigenére crypto.
Babbage stated some fairly simple steps to break the Vigenére
Look for letter sequences that repeat more that once in the text. There's
a possibility that it's the same word
crypted with the same part of the keyword. The distance between the repetitions
and the distance from the beginning
of the text, gives you some clues of the length of the keyword. When you
know the length of the keyword, you know
how many crypto alphabet that was used to crypt the text. If you use just
one alphabet, it's just a
mono alphabetic crypto, and that you know how to decrypt.
If you found out that the keyword is five letters, you use the first
alphabet for letter 1, 6 ,11 and so on.
For letter 2, 7, 12... you use the second alphabet and so on. You only
have to use as many alphabet as there
are letters in the keyword. But how do we know what alphabet to use?
You already know the answer: Frequency analysis.
When you know the length of the keyword, you just have to use frequency
Remember that the crypto alphabet is just a clear alphabet rotated one
step in relation to the previous one.
Implementing Vigenére in ASM
To implement Vigenére crypto in asm we need a lot of Vigenére
tables in memory. To this we need a lot of
index and key char pointers. NOT
This was my first approach, but if we stop and think a bit we can reduce
it to this.
Crypting: add al,ah ;al is clear char and ah is key char sub al,"A"+"A" cmp al,25 jng @F sub al,26 ;Overflow, wrap around @@: add al,"A" ;al is now crypted char
sub al,ah ;al is crypt char and ah is key char
add al,26 ;Underflow, wrap around.
add al,"A" ;al is now clear char
To understand the code above, remember that the Vigenére table
is just 26 alphabets where each one is
rotated one step in relation to the previous. If we simplify it a bit
and just use one alphabet,
a-z, we can look at it with a new approach. In the above example with
the word "the" and the key
"KING" we get "t" as the first letter to crypt with
key char "K". As you remember,
we go to the "t" column in the clear alphabet, and from there
down to the line starting with k.
In the point of intersection we find "D".
Another way to look at it is, go to the "K" column of the alphabet
starting with "A", column 10 (or "A" - "K",
where the "A"-column is column 0). Add the number of the "T"
column, column 19, to the "K" column and we get 29.
As the English alphabet only have 26 letters we get an overflow, or wrap
around of 4 (remember "A" column is 0).
If we add 4 to 0 we end up in the "D" column, which is the substitution
letter for "T" with the key char "K".
We apply the same rules for the decryption, but here we subtract the
key character from the crypted character.
Instead of overflow we look for underflow and if this happen, we shift
it up 26 letters.
@Detten add link vigenere.asm
The complete source code can be found here, if Detten is so kind and add
a link to it ;-).
And last, an exercise in Vigenére:
BBLM RS VRJ XTYOETOSWP UNTYOJH XBLHCOQ DLVTSQX FHO T PRQMJLJ UJG?
QXJ CD FJDG YK JWTBTKM FHO BB DCXLYCHDS HYW WSBUDTOS NZ IUAA GNNS,
MQE QDMYC BB UUOI NZ VJRTI LLZVNRKOX.
QSTC IU DMY OBOFGBJHNX KEVGJYY XAOVSH UYW TIPUD?
YCHCIE SX ODBWG C PJUEANR....MSSEJ BB UUSSA EAN WJYQY NARCMOS.
The strength of the Vigenére crypto also makes this kind of crypto
harder to use.
One candidate to a crypto stronger than the monoalphabetic crypto, but
easier to use than the
Vigenére crypto is the homophonic substitution crypto.
In this crypto you substitute a letter with many letters, and the number
of substitute characters is
proportional to the frequency of the letter. If for example the letter
a have a frequency of 9%,
we assign 8 characters to substitute for a. For every occurence of the
letter a in the clear text,
we substitute it with any of the 8 characters we assigned to a, not important
After crypting the text, every character substituted for a then have about
1% frequency in the crypted text.
If b, or any letter, have 1% frequency, we only have to substitute it
with 2 characters.
When all letters in the clear text is crypted, every character then have
about 1% frequency in the crypted text.
An example of homophonic substitution crypto. Here number is used. a b c d e f g h i j k l m n o p q r s t u v w x y z 09 48 13 01 14 10 06 23 32 15 04 26 22 18 00 38 94 29 11 17 08 34 60 28 21 02 12 81 41 03 16 31 25 39 70 37 27 58 05 95 35 19 20 61 89 52 33 62 45 24 50 73 51 59 07 40 36 30 63 47 79 44 56 83 84 66 54 42 76 43 53 46 65 88 71 72 77 86 49 67 55 68 93 91 90 80 96 69 78 57 99 75 92 64 85 74 97 82 87 98
With a frequency of 1% for every character, there's no way to use
frequency analysis on the crypted text. Is this unbreakable? NOT
There's a number of discrete clues for a clever decrypter.
Every letter in any language have its own quality and relation to other
This can be distinguished even if homophonic substitution crypto is used.
In english the letter q is always followed by the letter u. No other
letter can follow q.
If we want to decrypt an english text crypted with homophonic substitution
we now that q is a rare
letter probably only substituted with one character (or number).
We also know that the letter u have e frequency of about 3% of all letters
in an english text and therefore
probably substituted with 3 characters.
If we find a character in the crypted text that is always followed by
the same three characters,
we have reason to believe that these characters stands for the letter
u, and the first stand for q.
Other letter are harder to pick, but their internal relation reveals which
one it is.
It's possible to break this kind of crypto, but it's a lot safer than
a simple monoalphabetic crypto.
At first glance, the homophobic substitution crypto seems to be some
kind of polyalphabetic crypto.
Every clear text letter can be substituted with a number of characters,
but there's a significant difference.
In the example above, the letter a is represented by 8 different numbers.
These numbers represent the letter a and only the letter a. In a polyalphabetic
crypto a clear text letter
can be substituted with different letters, but here the substitution letters
can represent different clear
text letters. Because of this, the homophonic substitution crypto is said
to be a monoalphabetic crypto.
By the time the crypto alphabet is made, it's used through out the whole
It makes no difference that there's a number of substitute for a letter.
If you use a polyalphabetic crypto, you constantly change between different
@Detten add link homo.asm homo.inc
My way of implementing homophonic substitution crypto in asm can be found
The playfair crypto
was created by Lyon Playfair. This crypto substitutes every two letters
in the clear
text with another letter pair. To crypt and decrypt a text, the sender and
receiver must first agree on a keyword to use.
The crypto is used like this.
Write out the letters in an 5 x 5 square (a-z) and let I and J share the
same place. Start with the keyword .
If we use the keyword CHARLES we get:
C H A R L E S B D F G I/J K M N O P Q T U V W X Y Z
Then you divide the message into letter pairs, called bigram. Every bigram
must consist of two different letters.
Therefore you put in an x for double letters that otherwise would end
up in the same bigram.
Also put an x at the end if the text end with a single letter.
We meet at hammersmith bridge at seven.
Text in bigram:
Then the crypting start.
Every bigram fall into one of the following categories.
1 Both letters is on the same line
2 Both letter is in the same column
3 Neither 1 or 2.
1 If both letter is in the same row, substitute them with the letter
to the right if each one.
MI becomes NK. If the one letter is the last in the row, substitute it
with the first letter in the row.
2 If both letters is in the same column, substitute them with the letter
in the column below each one. GE becomes OG. If one letter is in the last
row, substitute it with the letter from the first row. YR becomes RD.
3 If neither 1 or 2 apply, we do like this. To crypt the first letter,
follow the row it's on until you
reach the column where the second letter is. The letter where the two
intersect is used as a substitute for
letter one. When you crypt the second letter, follow the row it's on until
you reach the column of the first
letter. The letter where the two intersect is used as a substitute for
letter 2. VI becomes WG and SV becomes EW.
If you look at the clear text letters as the corner of a diagonal, the
substitute letters is in the opposite corners.
The bigram text:
we me et at ha mx me rs mi th br id ge at se ve nx
The crypted text:
In ADFGVX crypto, both substitution and transposition is used. The crypto
is used like this.
In a 6 x 6 square, 36 positions, write out the letters a-z and the numbers
0-9 in a random order.
Both rows and columns in the square is named with the letters A, D, F,
G, V and X.
The order of the letters in the square becomes a part of the key and therefore
the receiver must have a
copy of the square.
A D F G V X A 8 p 3 d 1 n D l t 4 o a h F 7 k b c 5 z G j u 6 w g m V x s v i r 2 X 9 e y 0 f q
The first step is to look at in which row and column each letter in the
Write down the two letters that refer to this position in the square.
In this example 8 is substituted with AA and p with AD.
Attack at 2230
Crypt step 1: A t t a c k a t 2 2 3 0 DV DD DD DV FG FD DV DD VX VX AF XG
So far it is a simple monoalphabetic substitution crypto, breakable with
The second step in the crypto is the traspose. The traspose is based on
a keyword, and in this example we use MARK.
This keyword must also be know by the receiver.
The transposition is done like this:
Write out the letters of the keyword as the first row of a new square.
Then write out the text from the first step of the crypto below the keyword
using the same row length as the keyword.
Then arrange the columns so that the letters in the keyword is sorted
from a to z.
The other columns follows the keyword columns.
M A R K -> A K M R D V D D V D D D D D D V D V D D F G F D G D F F D V D D V D D D A V X G V G A X A D G X D X A G
Final crypto text: VDDDDVDDGDFFVDDDVGAXDXAG
The reason for using A, D, F, G, V and X is that they are the letters
in the Morse alphabet that
differs the most from each other. This was done to reduce the number of
errors when the text was sent.
If you are interested in the subject and want to read more, I can recommend
The Code Book by Simon Singh 1999. In this book you can read more about
the cryptos I have covered and more.