December 20 2022
Home | Cryptography by Hand | Prev | Next · |
Cryptography is the discipline of scrambling secret messages so that unintended (and presumably malicious) recipients cannot read or modify them. It has an impressive pedigree that stretches back to early recorded history. Rulers, government officials, generals, spies, and even lovers have used it to communicate secret information. Our computers use it to secure data and Internet communications. Internet commerce would be impossible without cryptography: How could we safely send sellers sensitive personal information and credit card numbers if we couldn't hide them from thieves listening to our messages?
Given that computers and the Digital Age aren't even a century old, the majority of cryptography's history involved ciphers (message scrambling and de-scrambling techniques) that were calculated or operated by hand or with simple mechanical aids. Nowadays, computers and sophisticated statistical procedures can break (render useless) most of these "classical" ciphers. But the one you'll learn here, the one-time pad, is unbreakable if used correctly: No amount of time, mathematical genius, or computing power can defeat it.
The one-time pad gets its name from little pads of paper that some Cold War spies carried in order to use the cipher. A pad was a collection of sheets, each covered in random numbers, usually in groups of five digits:
Key 21956 94809 08073 46992 97968 50891 04507 18697 90662 68118 29582 09502 02598 57719 04014 31783 43527 78015 38275 48739 53685 18989 94226 87427 30332 34767 02247 40442 85884 84779 78675 22582 62749 68854 28269 32996 15942 43445 87924 42588 14951 01202 43987 58983 71268 19516 64844 22101 07391 68238 65637
Each sheet is called a key and has a numeric key ID. We'll give keys random 5-digit key IDs for reasons I'll explain later.
Each key has exactly two copies: one for the sender and one for the receiver. Both sender and receiver must keep their keys absolutely secret: They can't reveal them or even hint at their existence to anyone. And when they finish using their keys, they destroy them, usually by burning or ingesting them. We'll investigate why this is necessary below.
To use the one-time pad, you'll need to create keys. You'll need:
Here's my recommended procedure. It assumes you're using only one 10-sided die labeled 0 to 9 (if it's labeled 1-10, subtract 1 from each die roll):
Create many keys and deliver their copies to your partner in bulk. This reduces how often you have to create and deliver new ones.
Why should you create random 5-digit key IDs? Why not use sequential key IDs (1, 2, 3, and so on)? Encrypted messages include key IDs, so if you were to use sequential key IDs, anyone who intercepts your messages would discover how many you've already sent. Random key IDs hide how many messages you've written.
Before you encrypt a message, you and your partner need to decide:
For an example alphabet and a way to translate its symbols to and from decimal digits, see the CT-37c translation table on this website. This table is excellent for English messages, but you can create any scheme you want. Make sure your partner knows the scheme. (Note: Neither the alphabet nor the translation scheme needs to be secret.)
To encrypt a message:
For example, suppose your secret message is:
meet at 5pm
Suppose you and your partner agreed to use the CT-37c translation table to convert messages to and from decimal digits. Using this table, encode your message as:
m e e t _ a t _ 5 p m 79 6 6 3 99 2 3 99 9055590 80 79
Suppose you choose the following unused key:
Key 21956 94809 08073 46992 97968 50891 04507 18697 90662 68118 29582
Write your encoded message's digits above the key's digits (ignoring the key ID's digits):
Key 21956 79663 99239 99055 59080 79 94809 08073 46992 97968 50891 04507 18697 90662 68118 29582
For each digit you wrote, subtract the key digit below it, modulo 10, and write the result underneath the key's digits:
Key 21956 79663 99239 99055 59080 79 94809 08073 46992 97968 50891 85864 91266 53163 62122 29 04507 18697 90662 68118 29582
Those digits are your encrypted message. Send your partner the key's ID and the encrypted message:
21956 85864 91266 53163 62122 29
Finally, burn the key and any scratch paper you used.
Decryption is the reverse of encryption. To decrypt a message:
For example, suppose you receive this message:
21956 85864 91266 53163 62122 29
Suppose the sender and you agreed to use 5-digit key IDs. Then 21956 is the key's ID. You find that key, which looks like this:
Key 21956 94809 08073 46992 97968 50891 04507 18697 90662 68118 29582
Remove 21956 from the message and write the remaining ciphertext digits on top of the key's digits (ignoring the key ID's digits):
Key 21956 85864 91266 53163 62122 29 94809 08073 46992 97968 50891 04507 18697 90662 68118 29582
For each digit you wrote, add the key digit below it, modulo 10, and write the result underneath the key's digits:
Key 21956 85864 91266 53163 62122 29 94809 08073 46992 97968 50891 79663 99239 99055 59080 79 04507 18697 90662 68118 29582
Those digits are the decrypted (but still encoded) message. If you and your partner agreed to use the CT-37c translation table to convert messages to and from decimal digits, use it to decode the message:
79 6 6 3 99 2 3 99 9055590 80 79 m e e t _ a t _ 5 p m
The sender's secret message is "meet at 5pm".
Finally, burn the key and any scratch paper you used.
Each digit of a key obscures a single digit of a secret message. If the secret message's digit is mi and the key digit is ki, the resulting encrypted message (ciphertext) digit ci is:
Since all three values are single decimal digits 0 to 9, for any given ci and mi, there is exactly one ki for which the above is true.
Attackers who see ci but know neither mi nor ki must solve one equation with two unknown variables. This is impossible because there are many mi and ki combinations that produce ci.
Furthermore, ki was chosen uniformly at random independently of the key's other digits. From an attacker's perspective, all mi are equally likely because all ki are equally likely. This means the encrypted message:
c = 89762 00253 23110
could be any 15-digit secret message. One key k1 would decrypt it to:
c = 89762 00253 23110 +k1 = 90901 23738 35969 ----------------------- m1 = 79663 23981 58079 ("meetat5pm")
But an equally-likely key (from the attacker's perspective) k2 would decrypt the message to:
c = 89762 00253 23110 +k2 = 07551 23509 56581 ----------------------- m2 = 86213 23754 79691 ("waitathome.")
Both k1 and k2 are equally likely from an attacker's perspective, so m1 and m2 are also equally likely. In fact, all keys are equally likely, so given the encrypted message c alone, attackers cannot guess what the secret message is because all secret messages are equally likely!
Used correctly, one-time pads guarantee the following:
You only get the one-time pad's guarantees if you do all of the following:
If you neglect any of these requirements, your messages will be vulnerable. There are many examples throughout modern history of people who forgot these requirements or took shortcuts and subsequently left their messages open to attack. For example:
Despite its perfect secrecy guarantee, one-time pads have serious weaknesses that deter people from using them:
These Python functions encrypt and decrypt a plaintext string of digits using a one-time pad, respectively:
import re digits = re.compile(r'^[0-9]+$') def digit(x: str) -> int: return ord(x) - 48 # ASCII code for '0' def otp_encrypt(plaintext: str, otp_key: str) -> str: if not digits.match(plaintext): raise Exception("plaintext has non-digit characters") elif not digits.match(otp_key): raise Exception("key has non-digit characters") elif len(plaintext) > len(otp_key): raise Exception("key is too short") return ''.join(chr((ord(v[0]) - ord(v[1])) % 10 + 48) for v in zip(plaintext, otp_key)) def otp_decrypt(ciphertext: str, otp_key: str) -> str: if not digits.match(ciphertext): raise Exception("ciphertext has non-digit characters") elif not digits.match(otp_key): raise Exception("key has non-digit characters") elif len(ciphertext) > len(otp_key): raise Exception("key is too short") return ''.join(chr((digit(v[0]) + digit(v[1])) % 10 + 48) for v in zip(ciphertext, otp_key))