written by
squiddy
Another problem :sob: Can you help?
nc chal.amt.rs 1411
Don’t forget to orz the omniscient larry!
The Problem
As with all the algo problems in this CTF, the challenge is to analyze their given solution in lib.rs and optimize it.
We’re also given the testcase generator, gen.rs, and the nc server code, in main.rs.
lib.rs, their solution
1 | pub const MOD: u32 = 1e9 as u32 + 9; |
gen.rs, testcase generator
1 | use rand::prelude::*; |
main.rs, nc server
1 | use omniscient_larry::{gen, omniscient_god}; |
I can’t compile Rust, so I’ll translate their solution and the testcase generator into Python:
1 | MOD = 1_000_000_009 |
Much better.
Analysing the given solution
So, what does the solution do?
Basically, we’re provided a string $S$ of length $1 \le N \le 100000$ (from the nc code), containing only characters o, z, l or y. The provided code tries to reach any permutation of $S$ by following these steps:
- Start from a string $S’$, either
"o","l","z"or"y". - Replace a character in $S’$ with one of its possible expansions.
o->looroyz->yzorzll->llorozy->yyorzo
- Repeat step 2 until $S’$ is of length $N$.
- Count how many unique permutations of $S$ that $S’$ can be.
As someone with algorithm experience, this is quite inefficient. The big O time complexity is exponential, on the order of $O(2^N)$! This means it’s only good for $N \le 30$ or so; $N = 100000$ would take $~10^{30,000}$ years! We can do much better.
This algorithm's time complexity
We start with 4 states,"o", "z", "l" or "y".At each state, we can expand it to 2 states, and we do this for each character in the string.
So, if we let $s(i)$ be the number of states with length $i$, then $s(i + 1) = 2^i \cdot s(i)$ for $i \ge 1$. $s(1) = 4$.
However, because we use a visited array, $s(i)$ is actually bounded by $4^N$ (which is how many possible strings of length $N$ there are containing only
olzy).The code explores all states until length $N$, so the time complexity is (roughly)
$\begin{aligned} \sum_{i=1}^{N}s(i) &= 4 + 2 * (4) + 4 * (2 * 4) + 4^4 + 4^5 + \ldots\\ &\approx (4^0 + 4^1 + 4^2 + 4^3 + \ldots)\\ &= O(4^N) \end{aligned} $
Practically, though, many of these states are not reachable. Printing the length of the visited set for $N = 15$, we get $\sum_{i=1}^{15}s(i) = 131,068$, much smaller than $4^{15} = 1,073,741,824$.
Empirically, printing out $\sum_{i=1}^{n}s(i)$ for $1 \le n \le 17$, we see it's more on the order of $O(2^N)$.
| $n$ | $\sum_{i=1}^{n}s(i)$ | Ratio |
|---|---|---|
| 1 | 4 | - |
| 2 | 12 | $\frac{12}{4}=3$ |
| 3 | 28 | $\frac{28}{12}=2.333$ |
| 4 | 60 | $\frac{60}{28}=2.143$ |
| 5 | 124 | $\frac{124}{60}=2.067$ |
| 6 | 252 | $\frac{252}{124}=2.032$ |
| 7 | 508 | $\frac{508}{252}=2.016$ |
| 8 | 1020 | $\frac{1020}{508}=2.008$ |
| 9 | 2044 | $\frac{2044}{1020}=2.004$ |
| 10 | 4092 | $\frac{4092}{2044}=2.002$ |
| 11 | 8188 | $\frac{8188}{4092}=2.001$ |
| 12 | 16380 | $\frac{16380}{8188}=2.0004$ |
| 13 | 32764 | $\frac{32764}{16380}=2.0002$ |
| 14 | 65532 | $\frac{65532}{32764}=2.0001$ |
| 15 | 131068 | $\frac{131068}{65532}=2.00006$ |
| 16 | 262140 | $\frac{262140}{131068}=2.00003$ |
| 17 | 524284 | $\frac{524284}{262140}=2.00001$ |
The reason for this has to do with the solution to this challenge itself, presented later, which allows us to get a better estimate of the upper bound of $s(i)$.
This analysis also ignores the time complexity of set insertion, which in the case of Python, is $O(1)$, but may be $O(\log N)$ in other languages, giving us a final time complexity of $O(2^N\log N)$ in those cases.
Space complexity is just the size of the visited set, which is $O(2^N)$, plus some negligible $O(N)$ space for the depth first search.
How long $N=100000$ would take
o and z
First, notice that the number of os and zs differs by at most 1 only. This is because l and y both create both o and z in equal numbers; starting from l or y leads only to strings with the same number of os and zs. The only way to get a string with a different number of os and zs is to start from o or z. o and z are thus specially constrained.
Secondly, as we are working with permutations of $S$, we can simply count how many of each character there are in $S$. A string $S’$ is only a valid permutation of $S$ if it has the same number of os, zs, ls and ys as $S$. From now, I will refer to these counts as $o$, $z$, $l$ and $y$.
Let’s work backwards instead of forwards. Given a permutation of $S$, is it possible to reach either "o", "l", "z" or "y" by performing the operations in reverse order?
Our operations are now:
looroy->oyzorzl->zlloroz->lyyorzo->y
Let’s explore some permutations of only o and z, and see if they can be decomposed.
ozoz->ll->lzozo->yy->yoozz->olzzzoo->zyoozzo->ly
It seems that the only valid permutations of only o and z must follow an alternating pattern ozozoz... or zozozo.... Any oo or zz causes the permutation to not be decomposable.
If $o>z$, it follows that only ozozozo... is possible, and vice versa. If $o=z$, both are possible.
l and y
Let’s then explore how to insert l and y into this “backbone” of o and z.
We can insert l in front of an o or behind a z, and y in front of a z or behind an o. We can, of course, insert multiple ls or ys in a row, as they will simply be reduced to single characters via ll -> l and yy -> y.
For example, for the ozo permutation, we can place an l in 2 places, and a y in 2 places, like so:
loyzloy |
In fact, it turns out that for every case where $o \neq z$, there are $\max{(o,z)}$ places to put both l and y.
For the $o=z$ case, it’s based on which permutation we use:
ozozoz...: $o+1$ places forl, $o$ places foryzozozo...: $o$ places forl, $o+1$ places fory
Stars and bars
If we have $l$ ls, and $p_l$ places for ls, as determined from above, how many ways are there to insert the ls?
Turns out, this is a standard combinations problem, solved with the stars and bars method. The answer is
Stars and bars derivation
Let's represent thels with *s.We want to partition these
*s into $p_l$ groups, so let us create $p_l-1$ dividers, |s.We can then represent any combination of
ls in these groups as a string of $l$ *s and $p_l-1$ |s, like this permutation:
**|***|**
ls in the first group, 3 in the second, and 2 in the third.Importantly, (i) any permutation of this string is a valid arrangement, and (ii) no two permutations of this string correspond to the same arrangement.
Thus, the number of ways to partition the
ls is just the number of permutations of this string.The number of permutations of a string of $l$ identical
*s and $p_l-1$ identical |s is $\frac{(l+p_l-1)!}{l!(p_l-1)!}$, which is the formula above.
Similarly, for $y$ ys, and $p_y$ places for ys, the number of ways to insert the ys is $\binom{y+p_y-1}{y}$.
Putting it together
Now, we have $l$ ls, and $y$ ys, and we have to find the number of ways to insert them into the oz backbone.
We simply combine our previous results:
- If $o>z$ or $o<z$, we have $p_l$ = $p_y$ = $\max{(o,z)}$. We can only use 1 of the 2 backbones,
oz...zoonly for $o>z$, andzo...ozonly for $z>o$. Thus, the number of permutations of $S$ that we can make is $\binom{l+p_l-1}{l} \cdot \binom{y+p_y-1}{y}$. - If $o=z$, we have 2 cases:
- If we use
oz...oz, then $p_l = o+1$, $p_y = o$. The number of permutations is $\binom{l+p_l-1}{l} \cdot \binom{y+p_y-1}{y} = \binom{l+o}{l} \cdot \binom{y+o-1}{y}$. - If we use
zo...zo, then $p_l = o$, $p_y = o+1$. The number of permutations is $\binom{l+p_l-1}{l} \cdot \binom{y+p_y-1}{y} = \binom{l+o-1}{l} \cdot \binom{y+o}{y}$. - The total number of permutations we can make is the sum of these two cases.
- If we use
We can even see this from the output of the provided solution. With the testcase $S$=oozllyy, and printing the moves just before they are placed in the visited array, these are the permutations we get:
oyyzllo oyzlloy ozlloyy loyyzlo loyzloy lozloyy lloyyzo lloyzoy llozoyy |
The ozo backbone has been highlighted, and note how the ls and the ys “drift” through their respective allowed boxes.
We then code a (slightly sus) solution in Python:
1 | MOD = 1_000_000_009 |
Yay! Good job, here’s your flag and remember to orz larry: amateursCTF{orz-larry-how-is-larry-so-orz-5318bfae97e201a66dc12069058e1b11d971ac7b24a8c87b2aec826dd39098d4}
The time complexity of this solution is $O(N)$, to read in the string $S$, and to compute $N!$. The space complexity is $O(N)$ because we store the factorials (and the string $S$, technically).
Bonus: Re-examining the time complexity (warning: math)
Let's recalculate $s(n)$ more accurately, using the combinatoric solution we've found.First, let's write $s(n)$ = $\sum_{i=1}^{n}bb(n, i)$, where $bb(n, i)$ represents the number of states of total length $n$ with an
oz backbone of length $i$.$bb(n, i)$ can be split into 2 behaviours, odd $i$ and even $i$.
For odd $i$, either $o>z$ or $o \lt z$. Either way, $p_l=p_y=\max{(o,z)}=\frac{i+1}{2}$. Let $l$ be the number of
ls in the state. $i+l+y=n$, so $y=n-i-l$.\[\begin{aligned} bb(n,i) &= 2\cdot\sum_{l=0}^{n-i}\binom{l+\frac{i+1}{2}-1}{l}\binom{n-i-l+\frac{i+1}{2}-1}{n-i-l}\\ &= 2\cdot\sum_{l=0}^{n-i}\binom{l+\frac{i-1}{2} }{\frac{i-1}{2} }\binom{n-i-l+\frac{i-1}{2} }{\frac{i-1}{2} }\\ &= 2\cdot\sum_{l=\frac{i-1}{2} }^{n-\frac{i+1}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\\ &= 2\cdot\left(\sum_{l=0 }^{n-1 }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} } - \sum_{l=0 }^{\frac{i-3}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} } - \sum_{l=n-\frac{i-1}{2} }^{n-1}\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\right)\\ &= 2\cdot\left(\sum_{l=0 }^{n-1 }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} } - 2 \cdot \sum_{l=0 }^{\frac{i-3}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\right)\\ &= 2\cdot\left(\binom{n}{i} - 2 \cdot \sum_{l=0 }^{\frac{i-3}{2} }\binom{l}{\frac{i-1}{2} }\binom{n-1-l}{\frac{i-1}{2} }\right) \text{[Chu-Vandermonde Identity]}\\ \end{aligned}\] Note, however, that from $l=0$ to $\frac{i-3}{2}$, the summand is 0, as $\binom{l}{\frac{i-1}{2} }=0$ for $l<\frac{i-1}{2}$.
Thus, precisely, $bb(n,i) = 2\cdot\binom{n}{i}$ for odd $i$.
For even $i$, $o=z$. Let $l$ be the number of
ls in the state. There are 2 cases, one where $p_l=\frac{i}{2}$ and $p_y=\frac{i}{2}+1$, and where $p_l=\frac{i}{2}+1$ and $p_y=\frac{i}{2}$. $y=n-i-l$ still.\[\begin{aligned} bb(n,i) &= \sum_{l=0}^{n-i}\binom{l+\frac{i}{2} }{l}\binom{n-i-l+\frac{i}{2}-1}{n-i-l} + \sum_{l=0}^{n-i}\binom{l+\frac{i}{2}-1 }{l}\binom{n-i-l+\frac{i}{2} }{n-i-l}\\ &= \sum_{l=0}^{n-i}\binom{l+\frac{i}{2} }{\frac{i}{2} }\binom{n-i-l+\frac{i}{2}-1}{\frac{i}{2}-1} + \sum_{l=0}^{n-i}\binom{l+\frac{i}{2}-1 }{\frac{i}{2}-1}\binom{n-i-l+\frac{i}{2} }{\frac{i}{2} }\\ &= 2\cdot\binom{n}{i} - \sum_{l=0}^{\frac{i}{2}-1}\binom{l}{\frac{i}{2} }\binom{n-1-l}{\frac{i}{2}-1} - \sum_{l=n-\frac{i}{2}+1}^{n-1}\binom{l}{\frac{i}{2} }\binom{n-1-l}{\frac{i}{2}-1} - \sum_{l=0}^{\frac{i}{2}-3}\binom{l}{\frac{i}{2}-1 }\binom{n-1-l}{\frac{i}{2} } - \sum_{l=n-\frac{i}{2} }^{n-1}\binom{l}{\frac{i}{2}-1 }\binom{n-1-l}{\frac{i}{2} }\\ &= 2\cdot\binom{n}{i} \end{aligned}\] So, as it turns out, $bb(n,i) = 2\cdot\binom{n}{i}$ for all $i$.
Then, \[\begin{aligned} s(n) &= \sum_{i=0}^{n}bb(n, i)\\ &= \sum_{i=0}^{n}2\cdot\binom{n}{i}\\ &= 2 \cdot \sum_{i=0}^{n}\binom{n}{i}\\ &= 2 \cdot 2^n\\ &= 2^{n+1} \end{aligned}\] Thus, the total states $S(N)$ for some string length $N$ is then \[\begin{aligned} S(N) &= \sum_{n=1}^{N}s(n)\\ &= \sum_{n=2}^{N+1}2^n\\ &= 2^{N+2}-4\\ &= O(2^N) \end{aligned}\] which is exactly what we got from the empirical analysis (try it yourself!)
| $n$ | $\sum_{i=1}^{n}s(i)$ | Ratio |
|---|---|---|
| 17 | 524284 | $\frac{524284}{262140}=2.00001$ |
Thus, the time complexity is indeed $O(2^N)$!