written by squiddy
Another problem :sob: Can you help?
nc chal.amt.rs 1411
Don’t forget to orz the omniscient larry!
The Problem
As with all the algo problems in this CTF, the challenge is to analyze their given solution in lib.rs
and optimize it.
We’re also given the testcase generator, gen.rs
, and the nc
server code, in main.rs
.
lib.rs
, their solution
1 | pub const MOD: u32 = 1e9 as u32 + 9; |
gen.rs
, testcase generator
1 | use rand::prelude::*; |
main.rs
, nc
server
1 | use omniscient_larry::{gen, omniscient_god}; |
I can’t compile Rust, so I’ll translate their solution and the testcase generator into Python:
1 | MOD = 1_000_000_009 |
Much better.
Analysing the given solution
So, what does the solution do?
Basically, we’re provided a string $S$ of length $1 \le N \le 100000$ (from the nc
code), containing only characters o
, z
, l
or y
. The provided code tries to reach any permutation of $S$ by following these steps:
- Start from a string $S’$, either
"o"
,"l"
,"z"
or"y"
. - Replace a character in $S’$ with one of its possible expansions.
o
->lo
oroy
z
->yz
orzl
l
->ll
oroz
y
->yy
orzo
- Repeat step 2 until $S’$ is of length $N$.
- Count how many unique permutations of $S$ that $S’$ can be.
As someone with algorithm experience, this is quite inefficient. The big O time complexity is exponential, on the order of $O(2^N)$! This means it’s only good for $N \le 30$ or so; $N = 100000$ would take $~10^{30,000}$ years! We can do much better.
This algorithm's time complexity
We start with 4 states,"o"
, "z"
, "l"
or "y"
.At each state, we can expand it to 2 states, and we do this for each character in the string.
So, if we let $s(i)$ be the number of states with length $i$, then $s(i + 1) = 2^i \cdot s(i)$ for $i \ge 1$. $s(1) = 4$.
However, because we use a visited array, $s(i)$ is actually bounded by $4^N$ (which is how many possible strings of length $N$ there are containing only
olzy
).The code explores all states until length $N$, so the time complexity is (roughly)
$\begin{aligned} \sum_{i=1}^{N}s(i) &= 4 + 2 * (4) + 4 * (2 * 4) + 4^4 + 4^5 + \ldots\\ &\approx (4^0 + 4^1 + 4^2 + 4^3 + \ldots)\\ &= O(4^N) \end{aligned} $
Practically, though, many of these states are not reachable. Printing the length of the visited set for $N = 15$, we get $\sum_{i=1}^{15}s(i) = 131,068$, much smaller than $4^{15} = 1,073,741,824$.
Empirically, printing out $\sum_{i=1}^{n}s(i)$ for $1 \le n \le 17$, we see it's more on the order of $O(2^N)$.
$n$ | $\sum_{i=1}^{n}s(i)$ | Ratio |
---|---|---|
1 | 4 | - |
2 | 12 | $\frac{12}{4}=3$ |
3 | 28 | $\frac{28}{12}=2.333$ |
4 | 60 | $\frac{60}{28}=2.143$ |
5 | 124 | $\frac{124}{60}=2.067$ |
6 | 252 | $\frac{252}{124}=2.032$ |
7 | 508 | $\frac{508}{252}=2.016$ |
8 | 1020 | $\frac{1020}{508}=2.008$ |
9 | 2044 | $\frac{2044}{1020}=2.004$ |
10 | 4092 | $\frac{4092}{2044}=2.002$ |
11 | 8188 | $\frac{8188}{4092}=2.001$ |
12 | 16380 | $\frac{16380}{8188}=2.0004$ |
13 | 32764 | $\frac{32764}{16380}=2.0002$ |
14 | 65532 | $\frac{65532}{32764}=2.0001$ |
15 | 131068 | $\frac{131068}{65532}=2.00006$ |
16 | 262140 | $\frac{262140}{131068}=2.00003$ |
17 | 524284 | $\frac{524284}{262140}=2.00001$ |
The reason for this has to do with the solution to this challenge itself, presented later, which allows us to get a better estimate of the upper bound of $s(i)$.
This analysis also ignores the time complexity of set insertion, which in the case of Python, is $O(1)$, but may be $O(\log N)$ in other languages, giving us a final time complexity of $O(2^N\log N)$ in those cases.
Space complexity is just the size of the visited set, which is $O(2^N)$, plus some negligible $O(N)$ space for the depth first search.
How long $N=100000$ would take
o
and z
First, notice that the number of o
s and z
s differs by at most 1 only. This is because l
and y
both create both o
and z
in equal numbers; starting from l
or y
leads only to strings with the same number of o
s and z
s. The only way to get a string with a different number of o
s and z
s is to start from o
or z
. o
and z
are thus specially constrained.
Secondly, as we are working with permutations of $S$, we can simply count how many of each character there are in $S$. A string $S’$ is only a valid permutation of $S$ if it has the same number of o
s, z
s, l
s and y
s as $S$. From now, I will refer to these counts as $o$, $z$, $l$ and $y$.
Let’s work backwards instead of forwards. Given a permutation of $S$, is it possible to reach either "o"
, "l"
, "z"
or "y"
by performing the operations in reverse order?
Our operations are now:
lo
oroy
->o
yz
orzl
->z
ll
oroz
->l
yy
orzo
->y
Let’s explore some permutations of only o
and z
, and see if they can be decomposed.
ozoz
->ll
->l
zozo
->yy
->y
oozz
->olz
zzoo
->zyo
ozzo
->ly
It seems that the only valid permutations of only o
and z
must follow an alternating pattern ozozoz...
or zozozo...
. Any oo
or zz
causes the permutation to not be decomposable.
If $o>z$, it follows that only ozozozo...
is possible, and vice versa. If $o=z$, both are possible.
l
and y
Let’s then explore how to insert l
and y
into this “backbone” of o
and z
.
We can insert l
in front of an o
or behind a z
, and y
in front of a z
or behind an o
. We can, of course, insert multiple l
s or y
s in a row, as they will simply be reduced to single characters via ll
-> l
and yy
-> y
.
For example, for the ozo
permutation, we can place an l
in 2 places, and a y
in 2 places, like so:
loyzloy |
In fact, it turns out that for every case where $o \neq z$, there are $\max{(o,z)}$ places to put both l
and y
.
For the $o=z$ case, it’s based on which permutation we use:
ozozoz...
: $o+1$ places forl
, $o$ places fory
zozozo...
: $o$ places forl
, $o+1$ places fory
Stars and bars
If we have $l$ l
s, and $p_l$ places for l
s, as determined from above, how many ways are there to insert the l
s?
Turns out, this is a standard combinations problem, solved with the stars and bars method. The answer is
Stars and bars derivation
Let's represent thel
s with *
s.We want to partition these
*
s into $p_l$ groups, so let us create $p_l-1$ dividers, |
s.We can then represent any combination of
l
s in these groups as a string of $l$ *
s and $p_l-1$ |
s, like this permutation:
**|***|**
l
s in the first group, 3 in the second, and 2 in the third.Importantly, (i) any permutation of this string is a valid arrangement, and (ii) no two permutations of this string correspond to the same arrangement.
Thus, the number of ways to partition the
l
s is just the number of permutations of this string.The number of permutations of a string of $l$ identical
*
s and $p_l-1$ identical |
s is $\frac{(l+p_l-1)!}{l!(p_l-1)!}$, which is the formula above.
Similarly, for $y$ y
s, and $p_y$ places for y
s, the number of ways to insert the y
s is $\binom{y+p_y-1}{y}$.
Putting it together
Now, we have $l$ l
s, and $y$ y
s, and we have to find the number of ways to insert them into the oz
backbone.
We simply combine our previous results:
- If $o>z$ or $o<z$, we have $p_l$ = $p_y$ = $\max{(o,z)}$. We can only use 1 of the 2 backbones,
oz...zo
only for $o>z$, andzo...oz
only for $z>o$. Thus, the number of permutations of $S$ that we can make is $\binom{l+p_l-1}{l} \cdot \binom{y+p_y-1}{y}$. - If $o=z$, we have 2 cases:
- If we use
oz...oz
, then $p_l = o+1$, $p_y = o$. The number of permutations is $\binom{l+p_l-1}{l} \cdot \binom{y+p_y-1}{y} = \binom{l+o}{l} \cdot \binom{y+o-1}{y}$. - If we use
zo...zo
, then $p_l = o$, $p_y = o+1$. The number of permutations is $\binom{l+p_l-1}{l} \cdot \binom{y+p_y-1}{y} = \binom{l+o-1}{l} \cdot \binom{y+o}{y}$. - The total number of permutations we can make is the sum of these two cases.
- If we use
We can even see this from the output of the provided solution. With the testcase $S$=oozllyy
, and printing the moves just before they are placed in the visited array, these are the permutations we get:
oyyzllo oyzlloy ozlloyy loyyzlo loyzloy lozloyy lloyyzo lloyzoy llozoyy |
The ozo
backbone has been highlighted, and note how the l
s and the y
s “drift” through their respective allowed boxes.
We then code a (slightly sus) solution in Python:
1 | MOD = 1_000_000_009 |
Yay! Good job, here’s your flag and remember to orz larry: amateursCTF{orz-larry-how-is-larry-so-orz-5318bfae97e201a66dc12069058e1b11d971ac7b24a8c87b2aec826dd39098d4}
The time complexity of this solution is $O(N)$, to read in the string $S$, and to compute $N!$. The space complexity is $O(N)$ because we store the factorials (and the string $S$, technically).
Bonus: Re-examining the time complexity (warning: math)
Let's recalculate $s(n)$ more accurately, using the combinatoric solution we've found.First, let's write $s(n)$ = $\sum_{i=1}^{n}bb(n, i)$, where $bb(n, i)$ represents the number of states of total length $n$ with an
oz
backbone of length $i$.$bb(n, i)$ can be split into 2 behaviours, odd $i$ and even $i$.
For odd $i$, either $o>z$ or $o \lt z$. Either way, $p_l=p_y=\max{(o,z)}=\frac{i+1}{2}$. Let $l$ be the number of
l
s in the state. $i+l+y=n$, so $y=n-i-l$.\[\begin{aligned} bb(n,i) &= 2\cdot\sum_{l=0}^{n-i}\binom{l+\frac{i+1}{2}-1}{l}\binom{n-i-l+\frac{i+1}{2}-1}{n-i-l}\\ &= 2\cdot\sum_{l=0}^{n-i}\binom{l+\frac{i-1}{2} }{\frac{i-1}{2} }\binom{n-i-l+\frac{i-1}{2} }{\frac{i-1}{2} }\\ &= 2\cdot\sum_{l=\frac{i-1}{2} }^{n-\frac{i+1}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\\ &= 2\cdot\left(\sum_{l=0 }^{n-1 }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} } - \sum_{l=0 }^{\frac{i-3}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} } - \sum_{l=n-\frac{i-1}{2} }^{n-1}\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\right)\\ &= 2\cdot\left(\sum_{l=0 }^{n-1 }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} } - 2 \cdot \sum_{l=0 }^{\frac{i-3}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\right)\\ &= 2\cdot\left(\binom{n}{i} - 2 \cdot \sum_{l=0 }^{\frac{i-3}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\right) \text{[Chu-Vandermonde Identity]}\\ \end{aligned}\] Note, however, that from $l=0$ to $\frac{i-3}{2}$, the summand is 0, as $\binom{l}{\frac{i-1}{2} }=0$ for $l<\frac{i-1}{2}$.
Thus, precisely, $bb(n,i) = 2\cdot\binom{n}{i}$ for odd $i$.
For even $i$, $o=z$. Let $l$ be the number of
l
s in the state. There are 2 cases, one where $p_l=\frac{i}{2}$ and $p_y=\frac{i}{2}+1$, and where $p_l=\frac{i}{2}+1$ and $p_y=\frac{i}{2}$. $y=n-i-l$ still.\[\begin{aligned} bb(n,i) &= \sum_{l=0}^{n-i}\binom{l+\frac{i}{2} }{l}\binom{n-i-l+\frac{i}{2}-1}{n-i-l} + \sum_{l=0}^{n-i}\binom{l+\frac{i}{2}-1 }{l}\binom{n-i-l+\frac{i}{2} }{n-i-l}\\ &= \sum_{l=0}^{n-i}\binom{l+\frac{i}{2} }{\frac{i}{2} }\binom{n-i-l+\frac{i}{2}-1}{\frac{i}{2}-1} + \sum_{l=0}^{n-i}\binom{l+\frac{i}{2}-1 }{\frac{i}{2}-1}\binom{n-i-l+\frac{i}{2} }{\frac{i}{2} }\\ &= 2\cdot\binom{n}{i} - \sum_{l=0}^{\frac{i}{2}-1}\binom{l}{\frac{i}{2} }\binom{n-1-l}{\frac{i}{2}-1} - \sum_{l=n-\frac{i}{2}+1}^{n-1}\binom{l}{\frac{i}{2} }\binom{n-1-l}{\frac{i}{2}-1} - \sum_{l=0}^{\frac{i}{2}-3}\binom{l}{\frac{i}{2}-1 }\binom{n-1-l}{\frac{i}{2} } - \sum_{l=n-\frac{i}{2} }^{n-1}\binom{l}{\frac{i}{2}-1 }\binom{n-1-l}{\frac{i}{2} }\\ &= 2\cdot\binom{n}{i} \end{aligned}\] So, as it turns out, $bb(n,i) = 2\cdot\binom{n}{i}$ for all $i$.
Then, \[\begin{aligned} s(n) &= \sum_{i=0}^{n}bb(n, i)\\ &= \sum_{i=0}^{n}2\cdot\binom{n}{i}\\ &= 2 \cdot \sum_{i=0}^{n}\binom{n}{i}\\ &= 2 \cdot 2^n\\ &= 2^{n+1} \end{aligned}\] Thus, the total states $S(N)$ for some string length $N$ is then \[\begin{aligned} S(N) &= \sum_{n=1}^{N}s(n)\\ &= \sum_{n=2}^{N+1}2^n\\ &= 2^{N+2}-4\\ &= O(2^N) \end{aligned}\] which is exactly what we got from the empirical analysis (try it yourself!)
$n$ | $\sum_{i=1}^{n}s(i)$ | Ratio |
---|---|---|
17 | 524284 | $\frac{524284}{262140}=2.00001$ |
Thus, the time complexity is indeed $O(2^N)$!