I created to challenges in TPCTF 2025: Misc - ipvm and Rev - superbooru. Personally speaking, I would rate them as of medium difficulty, with superbooru being a bit harder. The overall idea was to introduce some novel components in CTFs, enriching the experience and providing inspiration. Feel free to reach out if you have any thoughts or feedback.
Result: superbooru has one solve from 0ops, and ipvm has no solves.
superbooru
This challenge took inspiration from the booru image hosting service. Booru is a type of image hosting service that allows users to upload images and tag them. This challenge designed a simple booru with static tag rules, allowing to add/remove tags automatically based on rules. The rules are in the form of condition -> consequence, and their EBNF expression is as follows:
1 2 3 4 5 6 7 8 9 10
TAG = /\\w+/ ATOM = TAG | GROUP | NEG GROUP = "(" CONDITION ")" NEG = "-" ATOM
OR_TERM = ATOM ("/" ATOM)+ AND_TERM = OR_TERM ("," OR_TERM)+
dog, male -> male_with_dog, -pet_only: If the tags contain dog and male, then automatically add the tag male_with_dog and remove the tag pet_only.
(dog / cat), -male -> pet_only, animal_only: If the tags contain dog or cat and do not contain male, then automatically add the tags pet_only and animal_only.
Well it looks good! Let’s check the handout:
Note that here i is different from Ukrainian і.
If consequence does not contain -, the challenge would be simple. We can use z3 to define the tags as the disjunction of their implications and solve the constraint flag_correct. The annoying part is the -. How are we even supposed to add a tag and then remove it?
According to the code, the rules are applied round by round. In each round, rules that will be applied will be collected first, and be applied at the end of the round. By running flag check multiple times, we can notice an interesting insight: the number of rounds is almost fixed (around 2476 rounds). This hints that the rules might have a fixed set of logic. Additionally, the author kindly mentioned in the code:
1 2
# It's guaranteed that the same implication applied # multiple times will not change the result
This means that the same rule can be applied at most once, and even if it is applied multiple times, it will not change the result. In fact, if we record when these rules are applied, we will find that they are basically fixed. Then we can have a rough guess that these rules are divided into many “layers”, and within each layer, the rules either do not apply or apply in the same round (as the other rules in this layer).
But how EXACTLY are these rules divided into layers? As the hint released in the second half of the competition says, starting from the initial check_flag tag, there will be a chain of tags like check_flag -> -check_flag, new_flag1, new_flag1 -> -new_flag1, new_flag2, and so on. For the nth layer of rules, we add the condition of new_flag{n} to achieve this layered structure.
Given this information, we can write a script to simplify the rules and remove the unnecessary tags used for obfuscation.
withopen("implications.txt") as f: imps = [] for i, line inenumerate(f): line = line.strip() ifnot line: continue
imps.append(parse_implication(line))
imps = imps[6:]
for imp in tqdm(imps): imp.condition = imp.condition.simplify()
who_implies = {} who_implies_neg = {} for i, imp inenumerate(imps): for tag in imp.consequence: ifisinstance(tag, Tag): who_implies.setdefault(tag.tag, []).append(i) elifisinstance(tag, Neg): who_implies_neg.setdefault(tag.query.tag, []).append(i)
used = set() queue = ["flag_correct"] head = 0 while head < len(queue): cur = queue[head] head += 1 for i in who_implies.get(cur, []): imp = imps[i] for tag in imp.condition.tags(): if tag notin used: used.add(tag) queue.append(tag)
all_tags = set() for imp in imps: all_tags.update(imp.condition.tags()) all_tags.update(Group("and", imp.consequence).tags())
unused = all_tags - used - {"hooray", "flag_correct"} for imp in imps: imp.consequence = [ tag for tag in imp.consequence ifnot (isinstance(tag, Tag) and tag.tag in unused) andnot (isinstance(tag, Neg) and tag.query.tag in unused) ]
withopen("mapping.txt", "w") as f: for name in used: ifnot name.startswith("flag") and name != "check_flag": name_map[name] = f"t{len(name_map)}" print(f"{name_map[name]} = {name}", file=f)
withopen("implications_new.txt", "w") as f: for imp in imps: print(imp, file=f)
Then we extract the tag chain starting from check_flag (denoted as PC chain). After that, we can attach the layer information to the consequences. For example, for the following rules:
1 2 3 4 5 6 7 8 9 10 11
check_flag -> -check_flag, new_flag1 new_flag1, a -> c new_flag1, -a -> -c
new_flag1 -> -new_flag1, new_flag2 new_flag2, b -> a new_flag2, -b -> -a
new_flag2 -> -new_flag2, new_flag3 new_flag3, c -> b new_flag3, -c -> -b
They’re actually equivalent to executing c = a, a = b, b = c in order. After attaching the layer information, we have:
So basically we can convert the problem into SMT then. To avoid the model from being way too complex, we only preserve the tags that have changed in each round (e.g., a1, b1, b2, c2, a3, c3 in the example above will be discarded). Finally, we can directly use z3 to solve it. On this basis, we can also use z3 to verify that the solution is unique.
classTag(Query): def__init__(self, tag: str): self.tag = tag
def__str__(self): return self.tag
def__eq__(self, other): returnisinstance(other, Tag) and self.tag == other.tag
def__hash__(self): returnhash(self.tag)
defsimplify(self): return self
deftags(self): yield self.tag
defto_z3(self): assert cur_pc isnotNone if self.tag in pc_order: return BoolVal(True) if self.tag.startswith("flag_bin_") or self.tag == "check_flag": return Bool(self.tag)
for i inreversed(tag_pcs.get(self.tag, [])): if i < cur_pc: return format_z3(i, self.tag)
withopen("implications_new.txt") as f: imps = [] for i, line inenumerate(f): line = line.strip() ifnot line: continue
imps.append(parse_implication(line))
for imp in tqdm(imps): imp.condition = imp.condition.simplify()
who_implies = {} who_implies_neg = {} for i, imp inenumerate(imps): for tag in imp.consequence: ifisinstance(tag, Tag): who_implies.setdefault(tag.tag, []).append(i) elifisinstance(tag, Neg): who_implies_neg.setdefault(tag.query.tag, []).append(i)
pcs = ["check_flag"] whileTrue: pc = pcs[-1] if pc notin who_implies_neg: break assertlen(who_implies_neg[pc]) == 1 imp = imps[who_implies_neg[pc][0]] assertlen(imp.consequence) == 2 other = ( imp.consequence[0] ifisinstance(imp.consequence[1], Neg) else imp.consequence[0] ) assertisinstance(other, Tag) pcs.append(other.tag)
print(len(pcs))
pc_order = {pc: i for i, pc inenumerate(pcs)}
important_imps = [] tag_pcs = {} for imp in tqdm(imps): ifisinstance(imp.condition, Tag) and imp.condition.tag in pcs: continue assertlen(imp.consequence) == 1 tag = imp.consequence[0] ifisinstance(tag, Neg): continue
tag = tag.tag if tag == "hooray": continue
pc = None for t in imp.condition.tags(): if t in pc_order: assert pc isNone pc = t
assert pc pc = pc_order[pc] imp.pc = pc important_imps.append(imp) tag_pcs.setdefault(tag, []).append(pc)
for pcs in tag_pcs.values(): pcs.sort()
defs = {}
solver = Solver() for imp in tqdm(important_imps): tag = imp.consequence[0].tag pc = imp.pc
cur_pc = pc
key = (pc, tag) val = defs.setdefault(key, BoolVal(False)) defs[key] = val | imp.condition.to_z3()
for (pc, tag), val in defs.items(): solver.add(format_z3(pc, tag) == val)
chs = [] for i inrange(32): bs = bits[i * 8 : (i + 1) * 8] chs.append(chr(int("".join(reversed(bs)), 2)))
print("".join(chs))
The code seems verbose, but most of it is just repeated expression parsing. It takes about a minute to run the entire exp, which is still acceptable.
ipvm
This challenge is based on IPFS, a decentralized file storage protocol. IPFS essentially splits a piece of data into many blocks, each uniquely identified by its hash value. The entire data is also identified by a hash value (the hash of all its sub-blocks concatenated and hashed again, see Merkle Tree), called CID. These blocks are then distributed across a P2P network. Ideally, you only need the CID of a data or file to recursively download all its sub-blocks from the network. Sounds cool! But in reality, P2P is far from being ideal; moreover, IPFS has various design flaws, see How IPFS is broken. There’re still a batch of people using IPFS, so yeah, it’s up to you to decide.
Back to this challenge. This challenge provides a WASM runtime platform based on IPFS. You can:
build: Upload a folder containing wat / wasm files (yes, IPFS supports folders), and the server will optimize and compile it, sign it, and return the compiled package CID.
run: Upload a package CID, and the server will verify the signature and run it.
That’s it, really simple. Where can the vulnerability even be?
We know that the running process of wasm generally involves AOT compilation followed by local execution. Here build and run basically reproduce this process, which could lead to RCE because the output of build is essentially native code. If we can control the input to run, we can do whatever we want. However, the signature verification is quite annoying. If the output is not generated by a normal build, it won’t pass the signature verification. Wasmtime, the runtime environment, is known for its security, making it difficult to exploit a 0day RCE from wasm. What should we do then?
If you are carefully enough, you can spot an inconsistency in the way the server reads files from the folder. The first method is ipfs_read, which directly calls ipfs cat <path> to print the file content. Here, path can be either the CID itself (of course, this requires that the CID corresponds to a file rather than a folder) or a sub-file path of the CID (e.g., CID/config.json). The second method is to create a temporary folder and use ipfs get <CID> to download all files from the CID into this temporary folder.
This inconsistency turns out to be the key to the vulnerability. After some digging, we know that IPFS uses DAG-PB to store directories. The protobuf definition is as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
message PBLink { // binary CID (with no multibase prefix) of the target object optional bytes Hash = 1;
message PBNode { // refs to other objects repeated PBLink Links = 2;
// opaque user data optional bytes Data = 1; }
An idea emerges: what if we have multiple PBLink with the same name in a PBNode (corresponding to multiple files with the same name in a folder)? It turns out that ipfs cat will return the content of the first file, while ipfs get will write all files sequentially, resulting in the last file’s content being the final output. This inconsistency allows us to maliciously append a main.cwasm to the package generated by build. During signature verification, ipfs cat will work fine, but when we download and execute it using ipfs get, it will execute our malicious main.cwasm. This is how we achieve RCE.
As for constructing malicious payload main.cwasm, I patched a compiled main.cwasm using IDA, injecting a segment of shellcode into the function execution part.
P.S. Use protoc dag.proto --python_out . to compile protobuf