Project Euler #219

Every item in this size \(n\) collection must be a leaf node to the prefix-free tree. An internal node cannot be included because, by construction, it acts as a prefix to its children. Letting the current cost be \(C\), every split destroys that node and creates two new nodes with costs \(C+1\) and \(C + 4\). To minimize the skewed pricing, we must greedily split the cheapest available node.

A priority queue (PQ) simulating this until the tree size reaches \(10^9\) would work, but it will take around 20 to 30 minutes. ~~That’s good enough for me.~~

Alright, here is the optimization: instead of splitting every node individually, we track the count of each cost. For instance, if there are currently ten nodes of cost 4 at the top of the PQ, we do not need to process ten separate splits. We simply add 10 to the frequencies of cost 5 and cost 8, bounding the bulk split by the exact number of operations remaining to reach \(n\). With this optimization, we bypass simulating individual strings and can find the solution instantly, even for \(Cost(10^{4000})\).