Decoding Cryptography: Understanding the Attacker (and the 4 Attack Models)

The Love Letter Problem

Remember when you were a kid and wanted to write a love letter to your sweetheart, but a nosy neighbour kept peeking while you were trying to write?

This has been a problem faced by people ever since we learnt how to communicate.

We can't always stop people from peeking, but what if, even if they read the message, only the one for whom the message is intended can understand it? Enter the concept of Cryptography .

From Art to Science

Cryptography started as an art. An art of making puzzles that could only be solved by people in the know. An example would be "crypto" turned to "dtbqvr" using Vigenère cipher. But you couldn't be sure with a certain probability that no one would be able to crack the cipher (other than the people meant to, of course). So how do you actually know your puzzle is hard to crack?

As our computing powers progressed, cryptography turned from an art form to a solid mathematical field. Read on to find why that matters.

Symmetric Encryption — Alice and Bob

Now, let's get back to your problem of having a Peeping Tom over your shoulder but wanting to convey your love in writing. For a moment, let's forget this distracting analogy and consider the case of two famous friends Alice and Bob.

Alice and Bob want to correspond secretly, but they don't trust the postman to not open their letters. So they come up with a key — a tool they would use to lock their letters so that only those with the key can open the letters. For the sake of avoiding any future confusion, I want to clarify that whenever I mention key, I mean a number (could be any random number really:P).

Alice (the sender) writes the letter (message), locks the letter with the key (this process is called encryption). The locked letter (ciphertext) is sent through the post office (the channel). Bob (the receiver) obtains the letter and uses the same key to open it (a process termed decryption).

Symmetric Encryption Flow

Whenever the sender and receiver share a secret key to encrypt and decrypt their messages, the process is known as "symmetric encryption". Encryption protects confidentiality, but by itself does not necessarily protect against tampering. Some examples of Symmetric Encryption schemes are AES, 3DES, ChaCha20 etc.


Kerckhoffs' Principle — Why We Need Math

Now here's the thing. Pre-modern cryptography relied on clever looking algorithms and assumed that the attacker (our peeping Tom) won't know the algorithm being used. The security came from keeping the method, i.e., the algorithm, secret. That changed in 1883.

That year, Kerckhoffs' Principle was coined. It states that "a cryptographic system should be secure even if everything about the system, except the key, is public knowledge".

In our analogy, we simply assume the postman already knows exactly how Alice and Bob are locking their letters. The only thing he doesn't know is the key. And yet, the system should still be secure.

Why is this a good idea? A few reasons:

  • A key is short (say, 256 bits). An algorithm is thousands of lines of code. Keeping a key secret is far easier than keeping an entire algorithm secret.
  • If a key leaks, you just pick a new one. If an algorithm leaks, you need to invent a new algorithm — that's much harder.
  • Published algorithms go through public scrutiny. An algorithm that has survived 30+ years of attacks from the world's best cryptographers is far more trustworthy than a proprietary black box.

But this also means we can no longer just wave our hands and say "trust me, it's secure". If the algorithm is public, we need mathematical proof that knowing the algorithm doesn't help the attacker. This is what turned cryptography from an art into a science.


The Key Sharing Problem

As the reader, now you wonder, how did Alice and Bob actually share the key in the first place? If they could share the key, couldn't they share the secret message the same way? These are the questions we answer next, just be patient.

Asymmetric Encryption — No Shared Key Needed

It is easy to calculate 1013 * 9973 by hand. If you have a calculator, you won't even be afraid of 10 times longer multiplications. But what if I ask you to factorize "15972623"? It is very difficult to do so by hand.

And what if instead of an 8 digit number, it was a 500 digit number with just two prime factors? Just the number of 250 digit numbers is too huge to fathom and finding the factors is believed to be computationally infeasible for sufficiently large numbers.

The best known attacks against RSA, an asymmetric encryption algorithm, involve factoring a very large integer into its prime factors. No efficient algorithm for this problem is known.

Asymmetric encryption involves sharing a message over an unsecure channel without a shared key. Instead, there are two keys: a public key (which everyone can see) and a private key (which only the receiver holds). Anyone can encrypt a message using the public key, but only the holder of the private key can decrypt it.

RSA, ECC, ElGamal are a few commonly used Asymmetric Encryption schemes.

Asymmetric Encryption Flow

Why Both? Symmetric + Asymmetric

If I can share messages without a shared key, why do I need the key in the first place? Turns out using a symmetric key is much faster, and as the message gets longer, the difference becomes dramatic. So in practice, we use asymmetric encryption to share a short key (e.g. 256 bits), and then use symmetric encryption with that key for the actual messages. Best of both worlds.


Note: readers beware that this blog has gone wild and will be a bit technical hereon.

You might find my use of a few terms (like negligible, difficult, short, long etc) vague. From now on, we'll start being more precise about what these things actually mean.

Deterministic vs Randomized Algorithms

Before we formalize encryption, we need to understand a distinction that is fundamental to making encryption secure.

A deterministic algorithm is what you'd normally expect: give it the same input, and it always produces the same output. Run it a hundred times with input x, and you get the same output y a hundred times. No surprises.

A randomized algorithm is different. Like flipping a coin, running the same algorithm with the same input twice might give you two different outputs. The output depends not just on the input, but also on chance.

Why does this matter for encryption? Imagine Alice encrypts "I love you" with the same key twice. If the encryption algorithm is deterministic, the postman sees the same ciphertext appear twice and exclaims: "Aha, she's sending the same message again!" He doesn't know what the message is, but he knows it's being repeated — and that alone can be valuable information.

So, encryption must use fresh randomness (or an equivalent unique nonce) so that encrypting the same message twice does not produce the same ciphertext.

The Formal Syntax: Gen, Enc, Dec

Now that you understand randomness, let's be precise about what a symmetric encryption scheme actually is. It consists of 3 algorithms:

Gen() → k (Key Generation)

The key generation algorithm takes no input. Internally, it generates random bits and outputs a key k. Since it's randomized, every time you run Gen, you could get a different key. The set of all possible keys it could output is called the key space K.

For example, if Gen outputs 256-bit keys, then the key space is the set of all 256-bit strings — that's 2²⁵⁶ possible keys.
Enc(m, k) → c (Encryption)

The encryption algorithm takes the plaintext m and a key k, uses fresh randomness (or a nonce), and outputs a ciphertext c. Since it uses randomness, encrypting the same message with the same key multiple times will produce different ciphertexts — exactly what we want.
Dec(c, k) → m (Decryption)

The decryption algorithm takes a ciphertext c and the key k, and outputs the original plaintext m. This one is deterministic — and it has to be! If Bob decrypts the same message as "I love you" once and "I hate you" the second time around, that would be a tragedy.

Notice the asymmetry: Gen and Enc are randomized, but Dec is deterministic.

Correctness

The first property we demand from any encryption scheme is straightforward: it should actually work.

Formally:

Dec(Enc(m, k), k) = m

(for every key k that Gen could output and every message m)

Think of it like a physical lock. If Alice locks a box with a key and sends it to Bob, and Bob has a copy of the same key, he should be able to open the box. If the key doesn't open the lock it created — well, that's a pretty useless lock.

That was simple right?

Privacy — What Does "Secure" Even Mean?

Intuitively, we want the ciphertext to not reveal anything about the message to someone who doesn't have the key. But formalizing this intuition, It’s Tricky. Let me walk you through a series of reasonable looking attempts.

Attempt 1: "An encryption scheme is secure if the ciphertext doesn't reveal the key."

Makes sense — if the attacker gets the key, they can decrypt everything. So at minimum, the ciphertext shouldn't leak the key.

But consider an encryption algorithm that just outputs the plaintext as the ciphertext: Enc(m, k) = m. The ciphertext reveals nothing about the key. Obviously, this fails.

Attempt 2: "An encryption scheme is secure if the ciphertext doesn't reveal the plaintext."

Alice just writes "I l$ve you" and sends the letter. How can you say that this reveals the plaintext? We need to be more precise.

Attempt 3: "An encryption scheme is secure if the ciphertext doesn't reveal any individual character of the plaintext."

This plugs the hole from attempt 2. But again, our example of Vigenère cipher in the beginning should be a hint to why this attempt is still not final.

Attempt 4: "An encryption scheme is secure if the ciphertext doesn't reveal any meaningful information about the plaintext."

Sounds good! But what does meaningful mean? Maths might have the answer.

Attempt 5: "An encryption scheme is secure if the ciphertext doesn't help the attacker compute any function of the plaintext."

Now we're getting somewhere. The ciphertext should be useless for computing anything about the message, whether that's the message itself, a single character, a range, or any other property.

But two questions remain:

  1. How do we mathematically formalize whether the ciphertext "helped" the attacker?
  2. What exactly can the attacker do? Just stare at the ciphertext? Or something more?

The first question leads to deep probability theory (which I'll save for another blog). The second question is what we tackle next.

Attack Models — What Can the Attacker Do?

In all the models below, the setup is the same: Alice and Bob have agreed on a key k (unknown to the postman), Alice has encrypted a message, and the postman has intercepted the ciphertext c. The postman knows the encryption and decryption algorithms (thanks, Kerckhoffs). The goal of the postman is to compute some function of the underlying plaintext.

What differs across models is what additional powers the postman has.

COA — Ciphertext Only Attack

The simplest scenario. The postman has only the ciphertext. That's it. The postman reads the envelope and tries to figure out the message. No tricks, no prior knowledge — just the scrambled text.

This is a passive, eavesdropping-only attack. Every encryption scheme should at least be secure against this.

Mathematically, we write this as:

Adversary sees: c₁, c₂, ..., cₜ (intercepted ciphertexts)
Adversary knows: Enc, Dec algorithms (Kerckhoffs' Principle)
Adversary DOES NOT know: key k
Goal: Learn something about the plaintexts m₁, m₂, ...

KPA — Known Plaintext Attack

The postman now has something extra: a collection of (message, ciphertext) pairs encrypted under the same unknown key k. Think of it this way — the postman knows that every letter starts with "Dear Bob", so he already knows what the encryption of "Dear Bob" looks like under this key.

Armed with this database, a new letter arrives, and the postman tries to crack it. Notice: if encryption were deterministic and the new message happened to be one the postman already has, the ciphertext would match and the game would be over. Here lies another reason encryption must be randomized.

Mathematically:

Adversary has: database of (mᵢ, cᵢ) pairs under key k
These came from: previous messages that lost confidentiality
Goal: Decrypt a NEW ciphertext c (not in the database)

CPA — Chosen Plaintext Attack

Now things get serious. The postman can now choose messages and get their encryptions under the unknown key k. It's as if the postman can trick Alice into encrypting messages of the postman's choice — without Alice realizing it.

"Wait, how is that possible in real life?" you ask. I'll show some concrete examples in later blogs. For now, Go with the Flow.

The postman builds up a dictionary of (chosen message, ciphertext) pairs, and then tries to learn something about a fresh encrypted message. The postman is now actively interacting with the system, not just listening.

Mathematically:

Adversary can: CHOOSE plaintexts m₁, m₂, ... and get Enc_k(mᵢ)
Modeled as: adversary has an "encryption oracle" — a black box that encrypts anything it asks, under the unknown key k
Goal: Decrypt a NEW challenge ciphertext c (that they did not previously ask the oracle to encrypt)

CCA — Chosen Ciphertext Attack

The nuclear option. The postman can:

  1. Get encryptions of any chosen messages (like CPA), AND
  2. Get decryptions of any chosen ciphertexts (except the challenge ciphertext itself) — by tricking Bob into decrypting ciphertexts of the postman's choice.

The postman can mix and match these services in any order, building up databases of both (message → ciphertext) and (ciphertext → message) pairs under the unknown key.

Then a fresh ciphertext arrives, and the postman tries to learn something about a fresh encrypted message using everything they've accumulated.

When we say "an encryption scheme is CCA-secure", we mean that even with all this power, the postman still can't compute any function of the fresh plaintext. That's a very strong guarantee.

Formally:

Adversary can: Get encryption oracle (like CPA) PLUS get decryption oracle — submit ciphertexts of its choice (except the challenge ciphertext itself) and receive the plaintexts
Goal: Decrypt a FRESH new challenge ciphertext c

Attack Models Hierarchy

Quick Reference

Attack Model What the Postman Has
COA (Ciphertext-Only) Only the intercepted ciphertext.
KPA (Known-Plaintext) A database of historical plaintexts and their matching ciphertexts.
CPA (Chosen-Plaintext) An "encryption oracle" to get the ciphertext for any message they choose.
CCA (Chosen-Ciphertext) Both an encryption oracle and a decryption oracle to test any ciphertext.

The security hierarchy:

CCA security implies CPA security, which implies KPA security, which implies COA security.

Each successive model gives the adversary strictly more capabilities than the previous one. An encryption scheme that is secure under CCA is secure under all the weaker models too.


Beyond Secure Communication

Cryptography goes far beyond Alice and Bob sending secret letters. Here's a taste of what else it enables:

  • Cryptocurrency (e.g. Bitcoin) — digital cash that prevents double-spending and preserves anonymity, without any central bank.
  • Zero-Knowledge Proofs — proving you know something (like the factors of a large number) without revealing what you know.
  • Secret Sharing — splitting a secret across multiple parties so that no single party can reconstruct it alone.
  • Secure Multi-Party Computation — multiple parties jointly compute a function over their private inputs, without revealing those inputs to each other.
  • Homomorphic Encryption — performing computations on encrypted data without ever decrypting it.

Each of these deserves its own blog post. For now, just know that the foundations we've discussed — symmetric encryption, asymmetric encryption, formal security definitions, and attack models — are the building blocks for all of these.


Acknowledgment

A lot of the technical rigor and structure in this blog is inspired by my reading of "Introduction to Modern Cryptography (Second Edition)" by Jonathan Katz and Yehuda Lindell. It's a fantastic textbook if you want to dive deeper into the math!


So the next time you want to write that love letter, remember: cryptography has your back. Just make sure you pick the right key, keep it to yourself, and for the love of security — don't use a proprietary cipher.


Note: The diagrams in this blog are AI-generated and meant to be illustrative, not precise technical specifications.