Google Code Jam Archive — Round 3 2016 problems

In the last of our many online rounds, contestants faced four problems. Teaching Assistant presents a somewhat complex situation, but the problem has a simple greedy solution. Forest University poses the challenge of uniformly sampling tree traversals, and can be solved via simulation. Rebel Against The Empire is a 3D geometry / graph problem with a potentially tricky implementation. Finally, Go++ is a "wild card" problem about concurrency; it has a surprisingly simple answer, although it is tough to see it!

52 minutes into the contest, every dataset had been solved at least once; contestants raced to solve the three toughest datasets (the Larges for B, C, and D). apiad was first to 100 points on the scoreboard at 1:50:30, but had C-large wrong. xyz111 had the first complete set of correct submissions, but with enough penalties to push the total time to 2:42:09; nobody else registered a perfect score, though, so that record stood. Go++ ultimately had more correct Large attempts than Rebel (perhaps in part because its point value was higher), and so it turned out to be necessary to solve it to make the top 25, although the cutoff line was very close right up until the end.

Congratulations to everyone who made it this far and participated in the round. 25 contestants plus our defending champion (Gennady.Korotkevich) will head to New York to face off in this year's Finals on August 5.

Cast

Problem A (Teaching Assistant): Written by Devendra Agarwal. Prepared by Karol Pokorski.

Problem B (Forest University): Written and prepared by Petr Mitrichev.

Problem C (Rebel Against The Empire): Written by Onufry Wojtaszczyk. Prepared by Yerzhan Utkelbayev.

Problem D (Go++): Written by David Arthur. Prepared by Igor Naverniouk.

Solutions and other problem preparation and review by Shane Carr, John Dethridge, Minh Doan, Jackson Gatenby, Pablo Heiber, Andy Huang, Taman (Muhammed) Islam, Dominik Schmid, and Ian Tullis.

Analysis authors:

Teaching Assistant: Saurav Keshari Aryal, Alex Meed, and Steve Thomas
Forest University: John Dethridge, Petr Mitrichev, and Ian Tullis
Rebel Against The Empire: Pablo Heiber, Ian Tullis, and Onufry Wojtaszczyk
Go++: Alex Meed

Problem

You are taking a programming course which is graded using problem sets of different types. The course goes for a positive even number of days. You start the course with no problem sets. On each day of the course, you must do exactly one of the following:

Request a "Coding" problem set.
Request a "Jamming" problem set.
Submit a problem set for grading. You must have at least one problem set to choose this option. If you have multiple problem sets, you must submit the one among those that was requested most recently, regardless of its type.

All problem sets are different. There is no requirement on how many sets of each type must be submitted. Once you submit a set, you no longer have that set. Any problem sets that you have not submitted before the end of the course get you no points.

The problem sets are requested from and submitted to an artificially-intelligent teaching assistant. Strangely, the assistant has different moods — on each day it is in the mood for either "Coding" or "Jamming".

When you request a problem set:
- If the requested topic matches the assistant's mood, it assigns a problem set worth a maximum of 10 points.
- If the requested topic does not match its mood, it assigns a problem set worth a maximum of 5 points.
When you submit a problem set:
- If the topic of the submitted set matches the assistant's mood that day, it gives you the maximum number of points for that set.
- If the topic of the submitted set does not match its mood that day, it gives you 5 points fewer than the maximum number of points for that set.

For example:

f you request a "Coding" problem set on a day in which the assistant is in a "Coding" mood, and submit it on a day in which it is in a "Jamming" mood, you will earn 5 points: the problem set is worth a maximum of 10, but the assistant gives 5 points fewer than that.
If you request a "Jamming" problem set on a day in which the assistant is in a "Coding" mood, and submit it on a day in which it is in a "Jamming" mood, you will earn 5 points: the set is worth a maximum of 5, and the assistant gives you the maximum number of points.

Thanks to some help from a senior colleague who understands the assistant very well, you know what sort of mood the assistant will be in on each day of the course. What is the maximum total score that you will be able to obtain?

Input

The first line of the input gives the number of test cases, T; T test cases follow. Each test case consists of one line with a string S of C and/or J characters. The i-th character of S denotes the assistant's mood on the i-th day of the course. If it is C, it is in the mood for "Coding"; if it is J, it is in the mood for "Jamming".

Output

For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is the maximum number of points you can obtain for that case.

Limits

Time limit: 20 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 100.
The length of S is even.

Small dataset (Test Set 1 - Visible)

2 ≤ the length of S ≤ 50.

Large dataset (Test Set 2 - Hidden)

2 ≤ the length of S ≤ 20000.
The sum of lengths of all S in the dataset is at most 150000.

Sample

Sample Input

5
CCJJ
CJCJ
CJJC
CJJJ
CCCCCC

Sample Output

Case #1: 20
Case #2: 10
Case #3: 20
Case #4: 15
Case #5: 30

This strategy is optimal for sample case #1:
Day 1: Request a "Coding" problem set (call it C1).
Day 2: Submit C1.
Day 3: Request a "Jamming" problem set (call it J1).
Day 4: Submit J1.

The following strategy is optimal for sample cases #2, #3, and #4: request C1, request J1, submit J1, submit C1.

For case #2, for example, note that you could not request C1, request J1, and then submit C1. Only the most recently requested problem set can be submitted.

In sample case #5, you can alternate between requesting a "Coding" problem set on one day, and submitting it on the next day.

Problem

The Forest University offers its students N courses, which must all be taken to obtain the degree. The courses can only be taken one at a time - you must complete a course before starting another. Each course is either basic, which means one can take it without any prior knowledge, or advanced, in which case exactly one other course is its prerequisite.

A student must take the prerequisite for a course before taking the course, although they do not need to be taken immediately one after the other. A course might be the prerequisite for multiple other courses. There are no prerequisite cycles. Any sequence of the N courses that meets the rules for prerequisites is valid for obtaining the degree.

When you graduate, the university commemorates the sequence of courses you have taken by printing an abbreviated version of it on your graduation hat. More precisely, this abbreviated version is a string consisting of the first letter of the name of each course you have taken, in the order you have taken them. For example, if you have taken a Coding course and a Jamming course, in that order, your graduation hat will say CJ. It is considered trendy to have certain cool words as a substring of the string on one's graduation hat.

Consider all possible valid sequences in which the courses can be taken. For each cool word, you need to find the fraction of those sequences that have the cool word as a substring (at least once) of the string on the corresponding graduation hat. Note that we're interested in the fraction of possible course sequences, not the fraction of possible different graduation hat strings. (Since multiple courses may start with the same letter, there may be fewer possible strings than course sequences.)

Somewhat unusually for Code Jam, we are only looking for an approximate answer to this problem; pay careful attention to the output format.

Solving this problem

This problem has only 1 Small input and no Large input. You will be able to retry the input (with a time penalty).

Input

The first line of the input gives the number of test cases, T. T test cases follow. Each test case consists of five lines, in this order, which contain the following:

the number N of courses.
N integers; the i-th of these integers gives the number of the prerequisite course for the i-th course, or 0 if the i-th course is basic. The courses are numbered from 1 to N.
N uppercase English letters (without whitespace in between), with the i-th character giving the first letter of the i-th course's name.
the number M of cool words.
M cool words, each of which consists only of uppercase English letters.

Output

For each test case, output one line containing Case #x: y₁ y₂ ... y_M, where x is the test case number (starting from 1) and y_i is the fraction of valid course sequences that will have the i-th cool word as a substring of the string on the graduation hat.

y_i will be considered correct if it is within an absolute error of 0.03 of the correct answer. See the FAQ for an explanation of what that means, and what formats of real numbers we accept.

Limits

Small dataset (Test Set 1 - Visible)

Time limit: 300 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 100.
1 ≤ N ≤ 100.
1 ≤ M ≤ 5.
The length of each cool word is between 1 and 20.
Each cool word consists of uppercase English letters only.
There are no cycles formed by the prerequisites.

Sample

Sample Input

2
2
0 1
CJ
4
CJ C D JC
3
0 1 0
BAA
3
AA AAB ABA

Sample Output

Case #1: 1.0 1.0 0.0 0.0
Case #2: 0.67 0.0 0.33

The sample output displays one set of acceptable answers to the sample cases. Other answers are possible within the allowed precision.

In sample case #1, course 1 (C) is a basic course that is a prerequisite for the advanced course 2 (J). The only way to complete the courses is to take course 1 and then course 2. This creates the string CJ. So the cool words CJ, C, D, and JC are present as substrings in 1, 1, 0, and 0 out of 1 possible cases, respectively.

In sample case #2, the basic course 1 (B) is a prerequisite for the advanced course 2 (A), and course 3 (A) is another basic course. There are three possible ways of completing the courses:

take course 1, then course 2, then course 3 (string: BAA)
take course 1, then course 3, then course 2 (string: BAA)
take course 3, then course 1, then course 2 (string: ABA)

The cool words AA, AAB, and ABA are present as substrings in 2, 0, and 1 out of 3 possible cases, respectively.

Problem

You are a rebel against the evil Galactic Empire, and you are on the run!

You have sabotaged the Empire's Factory of Evil, and imperial security forces will be after you soon! The factory is located on asteroid 0 in a system of N numbered asteroids. Your getaway ship, the Century Quail, is located on asteroid 1, and if you can get there, you will be able to fly away safely.

Each asteroid is a single point in space with a velocity, and you move through space along with whichever asteroid you are currently on. Your Asteroid Jumper will allow you to instantaneously jump between any two asteroids in the system. Long jumps are scarier than short ones (and the vacuum of space is terrifying), so you want to minimize the maximum distance you need to jump. However, starting now, if you ever spend more than a continuous S seconds without jumping, the imperial security forces will catch you. That is, the interval from now until your first jump, and each interval between subsequent jumps, must be less than or equal to S. You may jump at any instant; it does not have to be after an integer number of seconds have elapsed. You escape the instant you jump to asteroid 1.

The i-th asteroid starts at position (x_i, y_i, z_i) in space, and it will move a total distance of (V_xi, V_yi, V_zi) each second. This movement is continuous throughout time; it does not update discretely each second. (It is also possible for an asteroid to be stationary.) Nothing happens if asteroids occupy the same point in space at the same time. You can only travel between two asteroids by jumping, even if they happen to occupy the same point at the instant of your jump.

In the escape plan that minimizes the maximum jump distance, what is that maximum jump distance?

Input

The first line of the input gives the number of test cases, T. T test cases follow. The first line of each test case contains two integers: N (the number of asteroids) and S (the limit on how long you can go without jumping). Next, there are N lines describing the asteroids. The i-th of these lines (counting starting from 0) contains six integers: the initial (x_i, y_i, z_i) position of the i-th asteroid in space, and the distance ( V_xi, V_yi, V_zi) it moves in a single second.

Output

For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is a floating-point number: the distance of the longest jump you will have to make in order to get away. y will be considered correct if it is within an absolute or relative error of 10^-4 of the correct answer. See the FAQ for an explanation of what that means, and what formats of real numbers we accept.

Limits

Time limit: 20 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 20.
2 ≤ N ≤ 1000.
1 ≤ S ≤ 100.
-500 ≤ x_i ≤ 500.
-500 ≤ y_i ≤ 500.
-500 ≤ z_i ≤ 500.

Small dataset (Test Set 1 - Visible)

V_xi = 0.
V_yi = 0.
V_zi = 0.

Large dataset (Test Set 2 - Hidden)

-500 ≤ V_xi ≤ 500.
-500 ≤ V_yi ≤ 500.
-500 ≤ V_zi ≤ 500.

Sample

Sample Input

3
3 7
0 0 0 0 0 0
1 2 2 0 0 0
1 1 1 0 0 0
5 10
0 0 0 0 0 0
35 0 0 -1 0 0
1 54 0 0 -2 0
2 -150 0 0 10 0
4 0 0 -1 0 0
3 1
-10 2 0 1 0 0
0 0 10 0 0 -1
-10 -2 0 1 0 0

Sample Output

Case #1: 1.7320508
Case #2: 2.0000000
Case #3: 4.0000000

Sample case #1 is the only sample case that could appear in the Small dataset. Any of the sample cases could appear in the Large dataset.

In sample case #1, we start on a stationary asteroid at (0, 0, 0), and our ship is on an asteroid at (1, 2, 2). There is another asteroid at (1, 1, 1). One option is to jump directly to our ship, which is a distance of 3 away. Another option is to jump to the other asteroid, which is a distance of sqrt(3) away, and then jump to the ship from there, which is a distance of sqrt(2) away. The maximum jump distance is 3 for the first option and sqrt(3) for the second option, so the second option is preferable.

Note that the value of S does not matter in the Small cases. Since all of the asteroids are stationary, there is no reason to wait around; we can make all jumps instantaneously.

In sample case #2, we start on a stationary asteroid at (0, 0, 0). We can wait there for 4 seconds for asteroid 4 to come very close, jump onto it, fly for 1 second on it, and then jump back at time 5 back to asteroid 0 (the distance between the two asteroids is 1 at this moment). There we wait 10 seconds, cutting it very close to being caught, and then jump to the speeding asteroid 3 at time 15. Two seconds later, asteroid 3 flies by asteroid 2, and we jump to asteroid 2. At time 27, we can jump from asteroid 2 to asteroid 0. There we patiently wait until time 35 when asteroid 1 reaches us, then we can jump onto it and escape. The longest jump we made was from asteroid 0 to asteroid 3 at time 15, and the distance we jumped was 2.

In sample case #3, the security forces are really active! You could, of course, wait one second and jump directly to asteroid 1, but a better choice - that allows you to make jumps no longer than 4 - is to jump back and forth between asteroids 0 and 2; while waiting for asteroid 1 to get close, and only then jump to it.

Problem

The Go language was designed to have a simple API and to support multi-threading. The Code Jam team wants to push these goals to the limit, so we are proposing a new language called Go++.

The Go++ language uses one register, which stores one boolean value (0 or 1). This register is initialized to 0. The language has three instructions:

0, which sets the register to 0.
1, which sets the register to 1.
?, which prints the current register value.

Simple, right? To support multi-threading, we allow two different Go++ programs to run simultaneously while sharing the one register. Each instruction executes atomically — that is, one instruction must completely finish before the next instruction can start. However, the two programs may be interleaved in any way that preserves the relative order within each program.

For example, here are the only six ways in which the two programs 1? and ?0 could be executed together. (The underline on the second program is just to distinguish its instructions from the instructions in the first program.)

?01?, which will print 01. (Remember that the register is initialized to 0.)
?10?, which will print 00.
?1?0, which will print 01.
1?0?, which will print 10.
1??0, which will print 11.
1??0, which will print 11.

Note that the output string always consists of 0s and 1s, and never ?s, since ? is not a state the register can be in.

Usually, programmers write programs to produce a desired output, but your task will be to write two programs that won't produce an undesired output! Specifically, you will be given a "bad" string B of length L, and a set G of N "good" strings, all of length L. You must produce two Go++ programs (not necessarily of the same length), which, when run in the way described here, could produce all of the strings in G, but could not produce the string B. It is fine if the programs could also produce other strings that are not B and not in G. Note that there must be a combined total of exactly L ? instructions in the two programs. The combined number of instructions in the two programs must not exceed 200.

For example, for B = 11 and G = { 10, 00 }, the programs ? and 10?1 would be one valid answer. They can produce every string in G, but they cannot produce B, no matter how they are interleaved. (They can also produce the string 01, which is not B and is not in G, but that is fine.) However, the programs 1? and ?0 would not be a valid answer, since (as we saw above) they can produce B. The programs 00 and ?? would not be a valid answer, since they cannot produce every string in G.

Can you produce two programs that satisfy the conditions, or determine that the task is IMPOSSIBLE?

Input

The first line of the input gives the number of test cases, T. T test cases follow; each consists of three lines. The first line of each test case has two integers N and L: the number of strings in G, and the length of the B string and the strings in G. The second line has N different strings of length L: the strings in G. The third line has one string of length L: the bad string B. B and all of the strings in G are made up of only 0s and/or 1s.

Output

For each test case, output one line containing Case #x: IMPOSSIBLE, if no programs will satisfy the conditions; otherwise, output Case #x: y z, where x is the test case number (starting from 1) and y and z are your two programs that satisfy the conditions. The combined number of instructions in your programs must not exceed 200. Each program must contain at least one instruction. There must be a combined total of exactly L ? instructions in the two programs.

Limits

Time limit: 20 seconds per test set.
Memory limit: 1 GB.
1 ≤ T ≤ 100.
1 ≤ N ≤ 100.
1 ≤ L ≤ 50.
All strings in G are different.

Small dataset (Test Set 1 - Visible)

B consists entirely of 1s.

Large dataset (Test Set 2 - Hidden)

B may be any string consisting of 0s and/or 1s.

Sample

Sample Input

3
2 2
10 00
11
3 2
11 10 00
01
4 2
00 01 10 11
11

Sample Output

Case #1: ? 10?1
Case #2: 1?? 0
Case #3: IMPOSSIBLE

The sample output displays one set of answers to the sample cases. Other answers may be possible.

Sample case #1 is the one described in the problem statement.

Sample case #2 would not appear in the Small dataset.

Sample case #3 is obviously IMPOSSIBLE because B is in G.

Teaching Assistant: Analysis

Let's define N as the number of days in the test. Notice that since a student is not allowed to do nothing on a test day, the student must request at least N/2 problem sets. Further, there is no reason to request more than that, since the extras cannot be submitted.

We'll begin with a dynamic programming solution that is sufficient for the Small dataset, and then discuss a mathematical formula that solves the Large dataset.

A dynamic programming approach

Because we can only submit the problem we requested most recently, we can view the problems that we hold as a stack, where requesting corresponds to pushing and submitting the most recently requested problem corresponds to popping.

Consider what happens on day 1. Clearly, we must request a problem set, and we now have to pick a day to submit it. Now, between the request and the submission, we can push to and pop from the stack, but after we are finished, the stack must contain the problem we requested on day 1, and nothing else. That means we need to have an even number of days between our request on day 1 and the corresponding submission.

We can think of the period of time between our request and our submission as a subproblem, and the time after the submission as another subproblem. Each of those subproblems must leave the stack the way they found it, so we can use the same set of rules that we used above for every subproblem.

Now, we'll express how to solve the problem in terms of recursion. At each stage, we'll be looking at some range of days, from day i to day j. So the subproblem (i,j) is to find the best possible score we can get over the days from i to j. We'll also say we can't touch the existing entries on the stack (corresponding to any days before i), and must leave the stack as we found it. We can then solve the subproblem as we did the original problem: request a problem set on day i, then pick where in the (i,j) interval to submit it such that there's an even number of days between the request and the submission.

At this point, each of our choices divides the subproblem into two additional, possibly-empty subproblems. For a subproblem of length 6, for example, here are the ways to place the request and submission:

_ _ _ _ _ _
↓
R S _ _ _ _
R _ _ S _ _
R _ _ _ _ S

Given the pairs of request and submission days, how do we decide which type of problem set to request? If we request the type that doesn't match the assistant's mood on request day, the best score we can get is 5 points. But we always get at least that much by requesting the type that matches the assistant's mood, so we should always do that. Then, the score we get when we submit just depends on whether the assistant's mood on submit day matches the type of problem we requested: we get 10 points total if they match, and 5 if they don't.

The empty subproblems are the base cases of our recursion. For larger problems, we find, for each possible submission point, the score we get from requesting and submitting that problem set (10 if the assistant's mood on request day matches its mood on submit day, 5 otherwise), plus the optimal scores of the subproblems. We can use this to recursively find the solution to the original problem; memoizing the solutions for the subproblems yields a DP solution. Because the memoization table is size O(N²) and it takes O(N) iterations to fill each cell, this algorithm is O(N³), which is sufficient to solve the Small problem.

A formulaic approach

It's also possible to solve the problem with a simple mathematical formula based on the input.

Count the number of even- and odd-numbered days on which the assistant is in the mood for Coding and Jamming. That gives us four numbers, which we'll call CE, CO, JE, and JO (for instance, CE is the number of even-numbered days on which the assistant is in the mood for Coding). We will show that the maximum possible score is:

S = 10 * (min(CE, CO) + min(JE, JO)) + 5 * abs(CE - CO).

We'll first show that this is an upper bound (we can't get a score higher than S), then show that it is tight (we can get a score equal to S).

First, note that any request/submit pair for the same problem (which we'll just call a pair) must have one even and one odd day; that is, for any problem set, it was either requested on an odd day and submitted on an even day, or vice versa. That must be true because we need an even number of days between the request and the submission. Now, since we need an even Coding day and an odd Coding day to make a pair where both days are Coding, the amount of those pairs that we can make is equal to the minimum of CE and CO. We can apply the same logic to Jamming. Then, we can pair up the leftovers; the amount of leftover pairs is the amount by which CE and CO differ (you can show that JE and JO differ by the same amount). No matter how we pair up the leftovers, there is no way to form more matched pairs.

To show that this bound is tight, we will construct a solution that achieves that upper bound. We make a list of the days in order. Then, whenever we see two adjacent days with the same mood, mark the first one as a request and the second as a submission, then remove those two days from our list. Observe that we can continue making pairs on this list, and they'll be valid on the real list (it's impossible for a new pair to be halfway inside and halfway outside any pair that we've removed, since we only remove pairs when there's nothing inside them). Continue this process until we can't anymore. At this point, since there aren't any adjacent days with the same mood, the mood must alternate between days. But that means that all even-numbered days have the same mood, and all odd-numbered days have the other mood. So we can pair all of those up, completing the proof.

A greedy approach

If you're curious, it's also possible to use a greedy stack-based algorithm to solve the Large problem. Our algorithm proceeds from left to right, and at each point decides whether to request or submit. We maintain a stack of problem sets that we hold, where a request is a push and a submit is a pop, then select our action each day as follows:

If we have no unsubmitted problem sets, request.
If we've made N/2 requests thus far, where N is the number of days in the test, submit.
If the top of the stack contains a problem set that matches the assistant's mood, submit.
Otherwise, request.

It's fairly difficult to see why this is valid on its own, but we can show that this also obeys the formula that we give above.

Both algorithms (the stack algorithm and the adjacent pair algorithm) find an adjacent pair with the same mood some number of times; we'll call these hits. Pairs with a different mood are misses. It remains to prove that the stack algorithm gets the optimal number of hits.

In most cases, both algorithms hit and miss in the same places. Intuitively, steps 3 and 4 of the stack algorithm correspond to finding and removing adjacent pairs with the same mood. Then, steps 1 and 2 clean up the remaining unmatched pairs.

There are some corner cases, however, where the algorithms don't precisely match. Take the input CJCJJC, for example: the stack algorithm produces RRRSSS, while the adjacent-pairs algorithm results in RRSRSS.

All of these corner cases occur when there's an unmatched pair that the stack algorithm doesn't match, failing to see that there's a matched pair later on at the same level. In other words, they all look like the following. (You can swap the Js and Cs, but we'll use the case that we show below without loss of generality.)


... J ... C ... J ... J ...

Then, the stack algorithm requests for the first two and submits for the last two, while the adjacent pairs algorithm matches the second pair, then the first pair. Notice that these have the same result either way: there's always one unmatched pair and one matched pair. Hence, the algorithms both produce the optimal score, even though they may attain it in different ways.

Test Data

info We recommend that you practice debugging solutions without looking at the test data.

Forest University: Analysis

An unusual problem

Small-only problems are relatively rare in Code Jam. Sometimes, as with 2012's Proper Shuffle from Round 1A 2014, this is because we have a randomized solution in mind and want contestants to be able to try again if necessary, even if the probability of a contestant's optimal solution actually failing due to chance is very small. The Forest University problem is Small-only for both of these reasons. We could have added another Small dataset small enough to solve precisely via dynamic programming, but we didn't think it would add useful resolution to help us pick our 26 advancers.

Clearly, with up to 100 courses, there are far too many course sequences to enumerate. The problem has an unusually relaxed tolerance for less precise answers, so it is a natural candidate for a simulation-based approach: we can generate course schedules and check whether they contain the cool substrings. Unusually for a simulation problem, though, the challenge here is to figure out how to do the simulation correctly even once! How do we sample the set of all course sequences uniformly at random?

A weighted-sampling method

Some pen and paper analysis of a small forest can reveal a simple rule that is sufficient to guide our simulations. Let's consider a case with five courses A, B, C, D, and E; A and E are basic, A is the prerequisite of both B and D, and B is the prerequisite of C.

Then there are fifteen possible orders in which to take the courses: ABCDE, ABCED, ABDCE, ABDEC, ABECD, ABEDC, ADBCE, ADBEC, ADEBC, AEBCD, AEBDC, AEDBC, EABCD, EABDC, EADBC. Notice that four-fifths of these begin with A and one-fifth begin with E. A is a root node with a total of 4 descendants (including itself); E is a root node with a total of 1 descendant (including itself). This is a promising pattern, and it is not coincidental! It suggests a method for building up a course string while sampling uniformly: keep choosing one of the available courses, with probability proportional to the size of that course's subtree (itself plus all its descendants).

To apply this method to the example above: we start with an empty course string, and at first, we can only choose either A or E. A has 4 descendants including itself, and E has 1, so we choose A with probability 4/5 and E with probability 1/5. Suppose that we choose A. Now we have three choices for the next course: B, D, and E. B has 2 descendants including itself; D has 1; E has 1. So, we choose B with probability 2/4, D with probability 1/4, and E with probability 1/4. Suppose that we choose D. Now we have two choices: B and E; we choose these with probability 2/3 and 1/3, respectively. We continue in this way until we've built a full sequence.

Why does this work? Speaking more generally: let's add a single root node to be the parent of all nodes with no prerequisites. We can now recursively compute, for each subtree, a schedule of just the classes in that subtree. Suppose that the subtree has root node V and size S. First, we compute a schedule for each subtree whose root is one of the children of V. Then, we put V first in the new schedule we are creating, and we choose an assignment (uniformly at random) of the remaining S-1 positions to each of the subtrees, such that each subtree is assigned as many positions as it has nodes. Then, we copy the ordering for each subtree into that subtree's positions. This results in a uniformly randomly chosen schedule.

Since we are picking a uniformly random assignment for each subtree, then interleaving them together uniformly randomly, the fraction of possible orders in which a given top-level course (i.e. the root of a certain subtree) appears first can be computed using a multinomial coefficient. For example, if we have to interleave A elements from one subtree and B elements from another subtree, the total number of ways to do so is (A+B)! / (A! × B!). The number of these ways that start with an element from subtree A is (A+B-1)! / ((A-1)! × B!), and the number of these ways that start with an element from subtree B is (A+B-1)! / (A! × (B-1)!). The ratio between these is (A! × (B-1)!) / ((A-1)! × B!), and this reduces to A / B. This explains why it suffices to sample proportionately to the size of each subtree.

A surprisingly elegant method

Generating a random sequence is equivalent to assigning distinct numbers from 1 to N to nodes in our forest, in such a way that the number of a node is smaller than the number of all its descendants. Let's start with any of the N! possible assignments, chosen uniformly. Now let's go through nodes from top to bottom, and if a node is not the smallest in its subtree, we swap its number with the smallest. This generates sequences uniformly, since each sequence can be obtained from P different permutations, where P is the product of sizes of subtrees of all nodes. The probability that a given node X is the smallest (once it's available to be chosen) is proportional to the size of the tree it is rooting, because any of those nodes that get the smallest number will "give" it to X.

But will we get a precise enough answer?

If we check K uniformly generated sequences, and the true probability to find a given substring is p, then the fraction of those K sequences that contain this substring follows a binomial distribution with parameters K, p, which we can approximate with a normal distribution with mean p and standard deviation sqrt(p × (1 - p) / K). (This Wikipedia article has some guidelines on when this approximation is valid.) If we run 10000 iterations, this is at most 0.5 / 100=5e-3, so the required precision of 3e-2 is 6 standard deviations. So, the probability of having one particular answer incorrect is roughly 1 in 500 million, which means that the probability of having at least one of the 500 required answers incorrect is at most 1 in a million.

The error bounds are generous enough that, with a fast enough implementation, the problem is solvable in slower languages such as Python.

Test Data

info We recommend that you practice debugging solutions without looking at the test data.

The intended solution is roughly as follows:

Treat it at a graph problem. We have a graph with N nodes, and each edge is available in an interval of time.
For each node, order the edges by, say, availability start time. As long as we have at least one edge available, we can jump back and forth. And, obviously we can arrive on an asteroid only when there's at least one edge open. So, there are "gaps" such that you need to leave the asteroid between the gap, because there will be no edges in between. So, we can split each node into multiple nodes across these gaps (note that the number of edges doesn't grow, so the size of the whole graph is still O(N^2)).
Now we run a Dijkstra's on the new graph. When we arrive in a node, we can only take the outgoing edges which haven't been closed yet (and the time at which we take them is the max of the current time and the time in which they open).
Done.

Test Data

info We recommend that you practice debugging solutions without looking at the test data.

go++: Analysis

To make the analysis a tiny bit easier to follow, we'll typeset strings (B, G, and output from the Go++ programs) in monospace font, instructions from the first program in normal font, and instructions from the second program with an underline.

First, let's try to figure out when the problem is impossible to solve. It's clearly impossible when B is in G; that case is even in the sample input. What about otherwise?

Surprisingly, if B is not in G, the problem is always solvable.

To prove this, I'll construct two Go++ programs that, when run as described in the problem statement, can produce any L-length string except B. That sounds harder than making one that just produces the strings in G, but sometimes, making the problem more difficult is a great way to gain the insight you need to solve the original problem. For this problem in particular, the set G is a red herring!

The first program

Given an input string B, construct the first program as follows:

Start with the inverse of every character of B (0 ↦ 1, 1 ↦ 0).
Add a ? instruction after every 0 or 1 instruction.

For instance, if B = 111, which could appear on the Small input, the first program is 0?0?0?. If B = 010, which could only appear on the Large input, the first program is 1?0?1?.

Why does this work? Notice that by default, this program generates the opposite of B. In order to generate a different string, we need to override one or more of the ? instructions; that is, the programs need to be interleaved such that right before the ? from the first program, some other instruction from the second program changes what it prints. For instance, given the first program 1?0?1?, we can print 011 instead of 101 using the second program 01, if it's interleaved as follows: 10?01?1?. In order to print B, we need to override all three ? instructions.

So how do we prevent that from happening? We write a second program that cannot override all three ? instructions. Our solution differs for the Small and Large inputs.

The second program (Small)

For the Small input, B is always a string of 1s. That means the first program is always 0?, repeated L times. For instance, if B = 111, the first program is 0?0?0?.

Our second program needs to be able to override any two of those ? instructions; that way, we can print any output with up to two 1s. However, we can't override all three ? instructions, since that means our program might print B. So we need a second program that contains L-1 1s, but not L 1s. The solution to this is fairly simple: our second program simply consists of 1, repeated L-1 times. For instance, if B = 111, the second program is 11.

The second program (Large)

To solve the problem for arbitrary B, we need a second program that, when taken as a string, has every (L-1)-length string as a subsequence (so we can print any string except B), but doesn't have B as a subsequence. (Here, we use "subsequence" to mean any set of elements, ordered as they appear in the original sequence; they need not be adjacent.) We'll discuss how to produce such a program shortly.

Now, with our first program and this hypothetical second program, we can produce any L-length string except B. That's because the second program has every possible subsequence of length L-1, so we can override up to L-1 of the ? instructions of the first program. However, we cannot override all L of them to produce B, because the second program doesn't have B as a subsequence.

There are many ways to generate the second program. This one won't produce the shortest program, but it is perhaps the easiest to explain:

Start with a copy of B, excluding the last character.
Replace each character with a two-character sequence as follows: 0 ↦ 10, 1 ↦ 01.

For instance, if B = 010, the second program is 1001. The first 0 becomes 10, the 1 becomes 01, and we ignore the last character of B.

We outlined the two requirements above for the second program to be valid:

The second program must have every (L-1)-length string as a subsequence.
The second program must not contain B as a subsequence.

Now, we will show that the steps above give us a second program that satisfies these requirements.

To get any (L-1)-length subsequence, split the second program into pairs of adjacent instructions. Each pair contains a 0 and a 1, since that's how we built the program. So we just pick one character from each pair, and we have any (L-1)-length string!

Now, let's try to get B as a subsequence of our second program. Say B = 010 and the second program is 1001. In order to get 010 as a subsequence, we start from finding the first 0 in 1001, which occurs at the second character. Next we find the first 1 after that, which is the fourth character. The pattern continues: because of how we've built the second program, we always go through two characters whenever we try to find B as a subsequence. And because the second program only has 2L-2 characters, we can't find all of B as a subsequence. Therefore, our second program is valid.

Test Data

info We recommend that you practice debugging solutions without looking at the test data.

Test set 1: 366 correct solutions (98.4% solve rate)

First
Errichto	6:06
ksun48	C++, 6:21
piob	C++, 6:44
huzecong	C++, 6:52
nika	C++, 7:22

Shortest
grebnesieh	Python, 294 bytes
y0105w49	C++, 334 bytes
Blue.Mary	C++, 391 bytes
gog.gerard	C++, 411 bytes
Hezhu	C++, 439 bytes

Test set 2: 343 correct solutions (92.2% solve rate)

First
Errichto	6:06
ksun48	C++, 6:21
piob	C++, 6:44
huzecong	C++, 6:52
nika	C++, 7:22

Shortest
grebnesieh	Python, 294 bytes
y0105w49	C++, 334 bytes
Blue.Mary	C++, 391 bytes
gog.gerard	C++, 411 bytes
Hezhu	C++, 439 bytes

Test set 1: 153 correct solutions (41.1% solve rate)

First
iwiwi	C++, 25:06
Endagorion	C++, 29:41
nika	C++, 30:38
mnbvmar	C++, 31:58
scott_wu aka scottwu	C++, 36:15

Shortest
Nore	Python, 1048 bytes
wuzhengkai	C++, 1095 bytes
ZhouYuChen	C++, 1145 bytes
ksun48	C++, 1196 bytes
gs12117	C++, 1226 bytes

Test set 1: 294 correct solutions (79.0% solve rate)

First
microtony	C++, 13:53
LeBron	C++, 17:16
tourist aka Gennady.Korotkevich	C++, 17:41
pwild	C++, 18:59
Burunduk1	C++, 20:03

Shortest
Cypi	Python, 597 bytes
athin	C++, 820 bytes
grebnesieh	Python, 832 bytes
abyssmally	C++, 847 bytes
Nore	Python, 869 bytes

Test set 2: 19 correct solutions (5.1% solve rate)

First
xyz111	C++, 52:01
KAN aka KalininN	C++, 62:15
krijgertje	C++, 80:25
Egor	96:15
mk.al13n	C++, 105:26

Shortest
xyz111	C++, 2591 bytes
izrak	C++, 3044 bytes
krijgertje	C++, 3045 bytes
nika	C++, 3077 bytes
y0105w49	C++, 3136 bytes

Test set 1: 244 correct solutions (65.6% solve rate)

First
phire	Python, 14:57
Rafbill	C++, 22:18
grebnesieh	Python, 27:50
chaotic_iak	30:39
fastblood	C++, 30:45

Shortest
grebnesieh	Python, 258 bytes
Riatre	Python, 313 bytes
Cypi	Python, 372 bytes
misof	Python, 386 bytes
Nore	Python, 445 bytes

Test set 2: 36 correct solutions (9.7% solve rate)

First
yosupo aka yosupot	C++, 46:09
eatmore	Java, 64:05
kevinsogo	Python, 70:18
ksun48	C++, 83:50
MRain	C++, 86:34

Shortest
wuzhengkai	C++, 743 bytes
Arterm	C++, 762 bytes
KrK	C++, 770 bytes
Nore	Python, 808 bytes
ksun48	C++, 822 bytes

Google Code Jam Archive — Round 3 2016 problems

Overview

A. Teaching Assistant

Problem

Input

Output

Limits

Small dataset (Test Set 1 - Visible)

Large dataset (Test Set 2 - Hidden)

Sample

B. Forest University

Problem

Solving this problem

Input

Output

Limits

Small dataset (Test Set 1 - Visible)

Sample

C. Rebel Against The Empire

Problem

Input

Output

Limits

Small dataset (Test Set 1 - Visible)

Large dataset (Test Set 2 - Hidden)

Sample

D. Go++

Problem

Input

Output

Limits

Small dataset (Test Set 1 - Visible)

Large dataset (Test Set 2 - Hidden)

Sample

Analysis — A. Teaching Assistant

Teaching Assistant: Analysis

A dynamic programming approach

A formulaic approach

A greedy approach

Analysis — B. Forest University

Forest University: Analysis

An unusual problem

A weighted-sampling method

A surprisingly elegant method

But will we get a precise enough answer?

Analysis — C. Rebel Against The Empire

Analysis — D. Go++

go++: Analysis

The first program

The second program (Small)

The second program (Large)

Statistics — A. Teaching Assistant

Test set 1: 366 correct solutions (98.4% solve rate)

Test set 2: 343 correct solutions (92.2% solve rate)

Statistics — B. Forest University

Test set 1: 153 correct solutions (41.1% solve rate)

Statistics — C. Rebel Against The Empire

Test set 1: 294 correct solutions (79.0% solve rate)

Test set 2: 19 correct solutions (5.1% solve rate)

Statistics — D. Go++

Test set 1: 244 correct solutions (65.6% solve rate)

Test set 2: 36 correct solutions (9.7% solve rate)