In the last of our many online rounds, contestants faced four problems. Teaching Assistant presents a somewhat complex situation, but the problem has a simple greedy solution. Forest University poses the challenge of uniformly sampling tree traversals, and can be solved via simulation. Rebel Against The Empire is a 3D geometry / graph problem with a potentially tricky implementation. Finally, Go++ is a "wild card" problem about concurrency; it has a surprisingly simple answer, although it is tough to see it!
52 minutes into the contest, every dataset had been solved at least once; contestants raced to solve the three toughest datasets (the Larges for B, C, and D). apiad was first to 100 points on the scoreboard at 1:50:30, but had C-large wrong. xyz111 had the first complete set of correct submissions, but with enough penalties to push the total time to 2:42:09; nobody else registered a perfect score, though, so that record stood. Go++ ultimately had more correct Large attempts than Rebel (perhaps in part because its point value was higher), and so it turned out to be necessary to solve it to make the top 25, although the cutoff line was very close right up until the end.
Congratulations to everyone who made it this far and participated in the round. 25 contestants plus our defending champion (Gennady.Korotkevich) will head to New York to face off in this year's Finals on August 5.
Cast
Problem A (Teaching Assistant): Written by Devendra Agarwal. Prepared by Karol Pokorski.
Problem B (Forest University): Written and prepared by Petr Mitrichev.
Problem C (Rebel Against The Empire): Written by Onufry Wojtaszczyk. Prepared by Yerzhan Utkelbayev.
Problem D (Go++): Written by David Arthur. Prepared by Igor Naverniouk.
Solutions and other problem preparation and review by Shane Carr, John Dethridge, Minh Doan, Jackson Gatenby, Pablo Heiber, Andy Huang, Taman (Muhammed) Islam, Dominik Schmid, and Ian Tullis.Analysis authors:
You are taking a programming course which is graded using problem sets of different types. The course goes for a positive even number of days. You start the course with no problem sets. On each day of the course, you must do exactly one of the following:
All problem sets are different. There is no requirement on how many sets of each type must be submitted. Once you submit a set, you no longer have that set. Any problem sets that you have not submitted before the end of the course get you no points.
The problem sets are requested from and submitted to an artificially-intelligent teaching assistant. Strangely, the assistant has different moods — on each day it is in the mood for either "Coding" or "Jamming".
For example:
Thanks to some help from a senior colleague who understands the assistant very well, you know what sort of mood the assistant will be in on each day of the course. What is the maximum total score that you will be able to obtain?
The first line of the input gives the number of test cases, T; T
test cases follow. Each test case consists of one line with a string S
of C
and/or J
characters. The i-th character of
S denotes the assistant's mood on the i-th day of the course. If it is
C
, it is in the mood for "Coding"; if it is J
, it is
in the mood for "Jamming".
For each test case, output one line containing Case #x: y
, where
x
is the test case number (starting from 1) and y
is
the maximum number of points you can obtain for that case.
Time limit: 20 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 100.
The length of S is even.
2 ≤ the length of S ≤ 50.
2 ≤ the length of S ≤ 20000.
The sum of lengths of all S in the dataset is at most 150000.
5 CCJJ CJCJ CJJC CJJJ CCCCCC
Case #1: 20 Case #2: 10 Case #3: 20 Case #4: 15 Case #5: 30
This strategy is optimal for sample case #1:
Day 1: Request a "Coding" problem set (call it C1).
Day 2: Submit C1.
Day 3: Request a "Jamming" problem set (call it J1).
Day 4: Submit J1.
The following strategy is optimal for sample cases #2, #3, and #4: request C1, request J1, submit J1, submit C1.
For case #2, for example, note that you could not request C1, request J1, and then submit C1. Only the most recently requested problem set can be submitted.
In sample case #5, you can alternate between requesting a "Coding" problem set on one day, and submitting it on the next day.
The Forest University offers its students N courses, which must all be taken to obtain the degree. The courses can only be taken one at a time - you must complete a course before starting another. Each course is either basic, which means one can take it without any prior knowledge, or advanced, in which case exactly one other course is its prerequisite.
A student must take the prerequisite for a course before taking the course, although they do not need to be taken immediately one after the other. A course might be the prerequisite for multiple other courses. There are no prerequisite cycles. Any sequence of the N courses that meets the rules for prerequisites is valid for obtaining the degree.
When you graduate, the university commemorates the sequence of courses you have
taken by printing an abbreviated version of it on your graduation hat. More
precisely, this abbreviated version is a string consisting of the first letter
of the name of each course you have taken, in the order you have taken them.
For example, if you have taken a Coding course and a Jamming course, in that
order, your graduation hat will say CJ
. It is considered trendy to
have certain cool words as a substring of the string on one's graduation
hat.
Consider all possible valid sequences in which the courses can be taken. For each cool word, you need to find the fraction of those sequences that have the cool word as a substring (at least once) of the string on the corresponding graduation hat. Note that we're interested in the fraction of possible course sequences, not the fraction of possible different graduation hat strings. (Since multiple courses may start with the same letter, there may be fewer possible strings than course sequences.)
Somewhat unusually for Code Jam, we are only looking for an approximate answer to this problem; pay careful attention to the output format.
This problem has only 1 Small input and no Large input. You will be able to retry the input (with a time penalty).
The first line of the input gives the number of test cases, T. T test cases follow. Each test case consists of five lines, in this order, which contain the following:
For each test case, output one line containing
Case #x: y1 y2 ... yM
,
where x
is the test case number (starting from 1) and
yi
is the fraction of valid course sequences that will
have the i-th cool word as a substring of the string on the graduation hat.
yi
will be considered correct if it is within an
absolute error of 0.03 of the correct answer. See the
FAQ for an
explanation of what that means, and what formats of real numbers we accept.
Time limit: 300 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 100.
1 ≤ N ≤ 100.
1 ≤ M ≤ 5.
The length of each cool word is between 1 and 20.
Each cool word consists of uppercase English letters only.
There are no cycles formed by the prerequisites.
2 2 0 1 CJ 4 CJ C D JC 3 0 1 0 BAA 3 AA AAB ABA
Case #1: 1.0 1.0 0.0 0.0 Case #2: 0.67 0.0 0.33
The sample output displays one set of acceptable answers to the sample cases. Other answers are possible within the allowed precision.
In sample case #1, course 1 (C
) is a basic course that is a
prerequisite for the advanced course 2 (J
). The only way to
complete the courses is to take course 1 and then course 2. This creates the
string CJ
. So the cool words CJ
, C
,
D
, and JC
are present as substrings in 1, 1, 0, and 0
out of 1 possible cases, respectively.
In sample case #2, the basic course 1 (B
) is a prerequisite for
the advanced course 2 (A
), and course 3 (A
) is
another basic course. There are three possible ways of completing the courses:
BAA
)BAA
)ABA
)
The cool words AA
, AAB
, and ABA
are
present as substrings in 2, 0, and 1 out of 3 possible cases, respectively.
You are a rebel against the evil Galactic Empire, and you are on the run!
You have sabotaged the Empire's Factory of Evil, and imperial security forces will be after you soon! The factory is located on asteroid 0 in a system of N numbered asteroids. Your getaway ship, the Century Quail, is located on asteroid 1, and if you can get there, you will be able to fly away safely.
Each asteroid is a single point in space with a velocity, and you move through space along with whichever asteroid you are currently on. Your Asteroid Jumper will allow you to instantaneously jump between any two asteroids in the system. Long jumps are scarier than short ones (and the vacuum of space is terrifying), so you want to minimize the maximum distance you need to jump. However, starting now, if you ever spend more than a continuous S seconds without jumping, the imperial security forces will catch you. That is, the interval from now until your first jump, and each interval between subsequent jumps, must be less than or equal to S. You may jump at any instant; it does not have to be after an integer number of seconds have elapsed. You escape the instant you jump to asteroid 1.
The i-th asteroid starts at position (xi, yi, zi) in space, and it will move a total distance of (Vxi, Vyi, Vzi) each second. This movement is continuous throughout time; it does not update discretely each second. (It is also possible for an asteroid to be stationary.) Nothing happens if asteroids occupy the same point in space at the same time. You can only travel between two asteroids by jumping, even if they happen to occupy the same point at the instant of your jump.
In the escape plan that minimizes the maximum jump distance, what is that maximum jump distance?
The first line of the input gives the number of test cases, T. T test cases follow. The first line of each test case contains two integers: N (the number of asteroids) and S (the limit on how long you can go without jumping). Next, there are N lines describing the asteroids. The i-th of these lines (counting starting from 0) contains six integers: the initial (xi, yi, zi) position of the i-th asteroid in space, and the distance ( Vxi, Vyi, Vzi) it moves in a single second.
For each test case, output one line containing Case #x: y
, where
x
is the test case number (starting from 1) and y
is
a floating-point number: the distance of the longest jump you will have to make
in order to get away. y
will be considered correct if it is within
an absolute or relative error of 10-4 of the correct answer. See the
FAQ for an
explanation of what that means, and what formats of real numbers we accept.
Time limit: 20 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 20.
2 ≤ N ≤ 1000.
1 ≤ S ≤ 100.
-500 ≤ xi ≤ 500.
-500 ≤ yi ≤ 500.
-500 ≤ zi ≤ 500.
Vxi = 0.
Vyi = 0.
Vzi = 0.
3 3 7 0 0 0 0 0 0 1 2 2 0 0 0 1 1 1 0 0 0 5 10 0 0 0 0 0 0 35 0 0 -1 0 0 1 54 0 0 -2 0 2 -150 0 0 10 0 4 0 0 -1 0 0 3 1 -10 2 0 1 0 0 0 0 10 0 0 -1 -10 -2 0 1 0 0
Case #1: 1.7320508 Case #2: 2.0000000 Case #3: 4.0000000
Sample case #1 is the only sample case that could appear in the Small dataset. Any of the sample cases could appear in the Large dataset.
In sample case #1, we start on a stationary asteroid at (0, 0, 0), and our ship is on an asteroid at (1, 2, 2). There is another asteroid at (1, 1, 1). One option is to jump directly to our ship, which is a distance of 3 away. Another option is to jump to the other asteroid, which is a distance of sqrt(3) away, and then jump to the ship from there, which is a distance of sqrt(2) away. The maximum jump distance is 3 for the first option and sqrt(3) for the second option, so the second option is preferable.
Note that the value of S does not matter in the Small cases. Since all of the asteroids are stationary, there is no reason to wait around; we can make all jumps instantaneously.
In sample case #2, we start on a stationary asteroid at (0, 0, 0). We can wait there for 4 seconds for asteroid 4 to come very close, jump onto it, fly for 1 second on it, and then jump back at time 5 back to asteroid 0 (the distance between the two asteroids is 1 at this moment). There we wait 10 seconds, cutting it very close to being caught, and then jump to the speeding asteroid 3 at time 15. Two seconds later, asteroid 3 flies by asteroid 2, and we jump to asteroid 2. At time 27, we can jump from asteroid 2 to asteroid 0. There we patiently wait until time 35 when asteroid 1 reaches us, then we can jump onto it and escape. The longest jump we made was from asteroid 0 to asteroid 3 at time 15, and the distance we jumped was 2.
In sample case #3, the security forces are really active! You could, of course, wait one second and jump directly to asteroid 1, but a better choice - that allows you to make jumps no longer than 4 - is to jump back and forth between asteroids 0 and 2; while waiting for asteroid 1 to get close, and only then jump to it.
The Go language was designed to have a simple API and to support multi-threading. The Code Jam team wants to push these goals to the limit, so we are proposing a new language called Go++.
The Go++ language uses one register, which stores one boolean value (0 or 1). This register is initialized to 0. The language has three instructions:
0
, which sets the register to 0.
1
, which sets the register to 1.
?
, which prints the current register value.
For example, here are the only six ways in which the two programs
1?
and ?0
could be executed together. (The
underline on the second program is just to distinguish its instructions from
the instructions in the first program.)
?01?
, which will print 01
. (Remember that
the register is initialized to 0.)?10?
, which will print 00
.?1?0
, which will print 01
.1?0?
, which will print 10
.1??0
, which will print 11
.1??0
, which will print 11
.
Note that the output string always consists of 0
s and
1
s, and never ?
s, since ?
is not a state
the register can be in.
Usually, programmers write programs to produce a desired output, but your task
will be to write two programs that won't produce an undesired
output! Specifically, you will be given a "bad" string B of length
L, and a set G of N "good" strings, all of
length L. You must produce two Go++ programs (not necessarily of the
same length), which, when run in the way described here, could produce
all of the strings in G, but could not produce the string
B. It is fine if the programs could also produce other strings that are
not B and not in G. Note that there must be a combined total of
exactly L ?
instructions in the two programs. The combined
number of instructions in the two programs must not exceed 200.
For example, for B = 11
and G = { 10
,
00
}, the programs ?
and 10?1
would be
one valid answer. They can produce every string in G, but they cannot
produce B, no matter how they are interleaved. (They can also produce
the string 01
, which is not B and is not in G, but
that is fine.) However, the programs 1?
and ?0
would
not be a valid answer, since (as we saw above) they can produce B. The
programs 00
and ??
would not be a valid answer, since
they cannot produce every string in G.
Can you produce two programs that satisfy the conditions, or determine that the
task is IMPOSSIBLE
?
The first line of the input gives the number of test cases, T. T
test cases follow; each consists of three lines. The first line of each test
case has two integers N and L: the number of strings in G,
and the length of the B string and the strings in G. The second
line has N different strings of length L: the strings in
G. The third line has one string of length L: the bad string
B. B and all of the strings in G are made up of only
0
s and/or 1
s.
For each test case, output one line containing Case #x: IMPOSSIBLE
,
if no programs will satisfy the conditions; otherwise, output
Case #x: y z
, where x
is the test case number
(starting from 1) and y
and z
are your two programs
that satisfy the conditions. The combined number of instructions in your
programs must not exceed 200. Each program must contain at least one
instruction. There must be a combined total of exactly L ?
instructions in the two programs.
Time limit: 20 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 100.
1 ≤ N ≤ 100.
1 ≤ L ≤ 50.
All strings in G are different.
B consists entirely of 1
s.
B may be any string consisting of 0
s and/or
1
s.
3 2 2 10 00 11 3 2 11 10 00 01 4 2 00 01 10 11 11
Case #1: ? 10?1 Case #2: 1?? 0 Case #3: IMPOSSIBLE
The sample output displays one set of answers to the sample cases. Other answers may be possible.
Sample case #1 is the one described in the problem statement.
Sample case #2 would not appear in the Small dataset.
Sample case #3 is obviously IMPOSSIBLE
because B is
in G.
Let's define N as the number of days in the test. Notice that since a student is not allowed to do nothing on a test day, the student must request at least N/2 problem sets. Further, there is no reason to request more than that, since the extras cannot be submitted.
We'll begin with a dynamic programming solution that is sufficient for the Small dataset, and then discuss a mathematical formula that solves the Large dataset.
Because we can only submit the problem we requested most recently, we can view the problems that we hold as a stack, where requesting corresponds to pushing and submitting the most recently requested problem corresponds to popping.
Consider what happens on day 1. Clearly, we must request a problem set, and we now have to pick a day to submit it. Now, between the request and the submission, we can push to and pop from the stack, but after we are finished, the stack must contain the problem we requested on day 1, and nothing else. That means we need to have an even number of days between our request on day 1 and the corresponding submission.
We can think of the period of time between our request and our submission as a subproblem, and the time after the submission as another subproblem. Each of those subproblems must leave the stack the way they found it, so we can use the same set of rules that we used above for every subproblem.
Now, we'll express how to solve the problem in terms of recursion. At each stage, we'll be looking at some range of days, from day i to day j. So the subproblem (i,j) is to find the best possible score we can get over the days from i to j. We'll also say we can't touch the existing entries on the stack (corresponding to any days before i), and must leave the stack as we found it. We can then solve the subproblem as we did the original problem: request a problem set on day i, then pick where in the (i,j) interval to submit it such that there's an even number of days between the request and the submission.
At this point, each of our choices divides the subproblem into two additional, possibly-empty subproblems. For a subproblem of length 6, for example, here are the ways to place the request and submission:
_ _ _ _ _ _
↓
R S _ _ _ _
R _ _ S _ _
R _ _ _ _ S
Given the pairs of request and submission days, how do we decide which type of problem set to request? If we request the type that doesn't match the assistant's mood on request day, the best score we can get is 5 points. But we always get at least that much by requesting the type that matches the assistant's mood, so we should always do that. Then, the score we get when we submit just depends on whether the assistant's mood on submit day matches the type of problem we requested: we get 10 points total if they match, and 5 if they don't.
The empty subproblems are the base cases of our recursion. For larger problems, we find, for each possible submission point, the score we get from requesting and submitting that problem set (10 if the assistant's mood on request day matches its mood on submit day, 5 otherwise), plus the optimal scores of the subproblems. We can use this to recursively find the solution to the original problem; memoizing the solutions for the subproblems yields a DP solution. Because the memoization table is size O(N2) and it takes O(N) iterations to fill each cell, this algorithm is O(N3), which is sufficient to solve the Small problem.
It's also possible to solve the problem with a simple mathematical formula based on the input.
Count the number of even- and odd-numbered days on which the assistant is in the mood for Coding and Jamming. That gives us four numbers, which we'll call CE, CO, JE, and JO (for instance, CE is the number of even-numbered days on which the assistant is in the mood for Coding). We will show that the maximum possible score is:
S = 10 * (min(CE, CO) + min(JE, JO)) + 5 * abs(CE - CO).
We'll first show that this is an upper bound (we can't get a score higher than S), then show that it is tight (we can get a score equal to S).
First, note that any request/submit pair for the same problem (which we'll just call a pair) must have one even and one odd day; that is, for any problem set, it was either requested on an odd day and submitted on an even day, or vice versa. That must be true because we need an even number of days between the request and the submission. Now, since we need an even Coding day and an odd Coding day to make a pair where both days are Coding, the amount of those pairs that we can make is equal to the minimum of CE and CO. We can apply the same logic to Jamming. Then, we can pair up the leftovers; the amount of leftover pairs is the amount by which CE and CO differ (you can show that JE and JO differ by the same amount). No matter how we pair up the leftovers, there is no way to form more matched pairs.
To show that this bound is tight, we will construct a solution that achieves that upper bound. We make a list of the days in order. Then, whenever we see two adjacent days with the same mood, mark the first one as a request and the second as a submission, then remove those two days from our list. Observe that we can continue making pairs on this list, and they'll be valid on the real list (it's impossible for a new pair to be halfway inside and halfway outside any pair that we've removed, since we only remove pairs when there's nothing inside them). Continue this process until we can't anymore. At this point, since there aren't any adjacent days with the same mood, the mood must alternate between days. But that means that all even-numbered days have the same mood, and all odd-numbered days have the other mood. So we can pair all of those up, completing the proof.
If you're curious, it's also possible to use a greedy stack-based algorithm to solve the Large problem. Our algorithm proceeds from left to right, and at each point decides whether to request or submit. We maintain a stack of problem sets that we hold, where a request is a push and a submit is a pop, then select our action each day as follows:
It's fairly difficult to see why this is valid on its own, but we can show that this also obeys the formula that we give above.
Both algorithms (the stack algorithm and the adjacent pair algorithm) find an adjacent pair with the same mood some number of times; we'll call these hits. Pairs with a different mood are misses. It remains to prove that the stack algorithm gets the optimal number of hits.
In most cases, both algorithms hit and miss in the same places. Intuitively, steps 3 and 4 of the stack algorithm correspond to finding and removing adjacent pairs with the same mood. Then, steps 1 and 2 clean up the remaining unmatched pairs.
There are some corner cases, however, where the algorithms don't precisely match. Take the input CJCJJC, for example: the stack algorithm produces RRRSSS, while the adjacent-pairs algorithm results in RRSRSS.
All of these corner cases occur when there's an unmatched pair that the stack algorithm doesn't match, failing to see that there's a matched pair later on at the same level. In other words, they all look like the following. (You can swap the Js and Cs, but we'll use the case that we show below without loss of generality.)
... J ... C ... J ... J ...
Then, the stack algorithm requests for the first two and submits for the last two, while the adjacent pairs algorithm matches the second pair, then the first pair. Notice that these have the same result either way: there's always one unmatched pair and one matched pair. Hence, the algorithms both produce the optimal score, even though they may attain it in different ways.
Small-only problems are relatively rare in Code Jam. Sometimes, as with 2012's Proper Shuffle from Round 1A 2014, this is because we have a randomized solution in mind and want contestants to be able to try again if necessary, even if the probability of a contestant's optimal solution actually failing due to chance is very small. The Forest University problem is Small-only for both of these reasons. We could have added another Small dataset small enough to solve precisely via dynamic programming, but we didn't think it would add useful resolution to help us pick our 26 advancers.
Clearly, with up to 100 courses, there are far too many course sequences to enumerate. The problem has an unusually relaxed tolerance for less precise answers, so it is a natural candidate for a simulation-based approach: we can generate course schedules and check whether they contain the cool substrings. Unusually for a simulation problem, though, the challenge here is to figure out how to do the simulation correctly even once! How do we sample the set of all course sequences uniformly at random?
Some pen and paper analysis of a small forest can reveal a simple rule that is sufficient to guide our simulations. Let's consider a case with five courses A, B, C, D, and E; A and E are basic, A is the prerequisite of both B and D, and B is the prerequisite of C.
Then there are fifteen possible orders in which to take the courses: ABCDE, ABCED, ABDCE, ABDEC, ABECD, ABEDC, ADBCE, ADBEC, ADEBC, AEBCD, AEBDC, AEDBC, EABCD, EABDC, EADBC. Notice that four-fifths of these begin with A and one-fifth begin with E. A is a root node with a total of 4 descendants (including itself); E is a root node with a total of 1 descendant (including itself). This is a promising pattern, and it is not coincidental! It suggests a method for building up a course string while sampling uniformly: keep choosing one of the available courses, with probability proportional to the size of that course's subtree (itself plus all its descendants).
To apply this method to the example above: we start with an empty course string, and at first, we can only choose either A or E. A has 4 descendants including itself, and E has 1, so we choose A with probability 4/5 and E with probability 1/5. Suppose that we choose A. Now we have three choices for the next course: B, D, and E. B has 2 descendants including itself; D has 1; E has 1. So, we choose B with probability 2/4, D with probability 1/4, and E with probability 1/4. Suppose that we choose D. Now we have two choices: B and E; we choose these with probability 2/3 and 1/3, respectively. We continue in this way until we've built a full sequence.
Why does this work? Speaking more generally: let's add a single root node to be the parent of all nodes with no prerequisites. We can now recursively compute, for each subtree, a schedule of just the classes in that subtree. Suppose that the subtree has root node V and size S. First, we compute a schedule for each subtree whose root is one of the children of V. Then, we put V first in the new schedule we are creating, and we choose an assignment (uniformly at random) of the remaining S-1 positions to each of the subtrees, such that each subtree is assigned as many positions as it has nodes. Then, we copy the ordering for each subtree into that subtree's positions. This results in a uniformly randomly chosen schedule.
Since we are picking a uniformly random assignment for each subtree, then interleaving them together uniformly randomly, the fraction of possible orders in which a given top-level course (i.e. the root of a certain subtree) appears first can be computed using a multinomial coefficient. For example, if we have to interleave A elements from one subtree and B elements from another subtree, the total number of ways to do so is (A+B)! / (A! × B!). The number of these ways that start with an element from subtree A is (A+B-1)! / ((A-1)! × B!), and the number of these ways that start with an element from subtree B is (A+B-1)! / (A! × (B-1)!). The ratio between these is (A! × (B-1)!) / ((A-1)! × B!), and this reduces to A / B. This explains why it suffices to sample proportionately to the size of each subtree.
Generating a random sequence is equivalent to assigning distinct numbers from 1 to N to nodes in our forest, in such a way that the number of a node is smaller than the number of all its descendants. Let's start with any of the N! possible assignments, chosen uniformly. Now let's go through nodes from top to bottom, and if a node is not the smallest in its subtree, we swap its number with the smallest. This generates sequences uniformly, since each sequence can be obtained from P different permutations, where P is the product of sizes of subtrees of all nodes. The probability that a given node X is the smallest (once it's available to be chosen) is proportional to the size of the tree it is rooting, because any of those nodes that get the smallest number will "give" it to X.
If we check K uniformly generated sequences, and the true probability to find a given substring is p, then the fraction of those K sequences that contain this substring follows a binomial distribution with parameters K, p, which we can approximate with a normal distribution with mean p and standard deviation sqrt(p × (1 - p) / K). (This Wikipedia article has some guidelines on when this approximation is valid.) If we run 10000 iterations, this is at most 0.5 / 100=5e-3, so the required precision of 3e-2 is 6 standard deviations. So, the probability of having one particular answer incorrect is roughly 1 in 500 million, which means that the probability of having at least one of the 500 required answers incorrect is at most 1 in a million.
The error bounds are generous enough that, with a fast enough implementation, the problem is solvable in slower languages such as Python.
To make the analysis a tiny bit easier to follow, we'll typeset strings (B, G, and output from the Go++ programs) in monospace font, instructions from the first program in normal font, and instructions from the second program with an underline.
First, let's try to figure out when the problem is impossible to solve. It's clearly impossible when B is in G; that case is even in the sample input. What about otherwise?
Surprisingly, if B is not in G, the problem is always solvable.
To prove this, I'll construct two Go++ programs that, when run as described in the problem statement, can produce any L-length string except B. That sounds harder than making one that just produces the strings in G, but sometimes, making the problem more difficult is a great way to gain the insight you need to solve the original problem. For this problem in particular, the set G is a red herring!
Given an input string B, construct the first program as follows:
0
↦ 1, 1
↦ 0).111
, which could appear on the Small
input, the first program is 0?0?0?. If B = 010
, which could
only appear on the Large input, the first program is 1?0?1?.
Why does this work? Notice that by default, this program generates the opposite
of B. In order to generate a different string, we need to override one
or more of the ? instructions; that is, the programs need to be interleaved
such that right before the ? from the first program, some other instruction
from the second program changes what it prints. For instance, given the first
program 1?0?1?, we can print 011
instead of 101
using
the second program 01, if it's interleaved as follows:
10?01?1?. In order to print B, we need to override all
three ? instructions.
So how do we prevent that from happening? We write a second program that cannot override all three ? instructions. Our solution differs for the Small and Large inputs.
For the Small input, B is always a string of 1
s. That means
the first program is always 0?, repeated L times. For instance, if
B = 111
, the first program is 0?0?0?.
Our second program needs to be able to override any two of those ?
instructions; that way, we can print any output with up to two 1
s.
However, we can't override all three ? instructions, since that means our
program might print B. So we need a second program that contains
L-1 1s, but not L 1s. The solution to this is
fairly simple: our second program simply consists of 1, repeated
L-1 times. For instance, if B = 111
, the second
program is 11.
To solve the problem for arbitrary B, we need a second program that, when taken as a string, has every (L-1)-length string as a subsequence (so we can print any string except B), but doesn't have B as a subsequence. (Here, we use "subsequence" to mean any set of elements, ordered as they appear in the original sequence; they need not be adjacent.) We'll discuss how to produce such a program shortly.
Now, with our first program and this hypothetical second program, we can produce any L-length string except B. That's because the second program has every possible subsequence of length L-1, so we can override up to L-1 of the ? instructions of the first program. However, we cannot override all L of them to produce B, because the second program doesn't have B as a subsequence.
There are many ways to generate the second program. This one won't produce the shortest program, but it is perhaps the easiest to explain:
0
↦ 10, 1
↦ 01.010
, the second program is
1001. The first 0
becomes 10, the 1
becomes 01, and we ignore the last character of B.
We outlined the two requirements above for the second program to be valid:
To get any (L-1)-length subsequence, split the second program into pairs of adjacent instructions. Each pair contains a 0 and a 1, since that's how we built the program. So we just pick one character from each pair, and we have any (L-1)-length string!
Now, let's try to get B as a subsequence of our second program. Say
B = 010
and the second program is 1001. In order to
get 010 as a subsequence, we start from finding the first 0 in 1001,
which occurs at the second character. Next we find the first 1 after that,
which is the fourth character. The pattern continues: because of how we've
built the second program, we always go through two characters whenever we try
to find B as a subsequence. And because the second program only has
2L-2 characters, we can't find all of B as a subsequence.
Therefore, our second program is valid.