In our fourth annual Code Jam to I/O for Women contest, we had an immediate burst of activity on the scoreboard. Within 16 minutes, at least one contestant had solved two full problems out of four. The others proved to be more difficult, though; there were no Large submissions for C (or any submissions for D) within the first hour of the contest. Then, solutions for both came trickling in, but the majority of D-Large attempts used an incorrect approach. The first and only completely correct set of submissions came from Taube, with a total penalty time of 2:23:44. Most contestants who reached the top 150 did so by solving all of A and B and C-Small; a combination of speed and accuracy was necessary.
Ticket Trouble was a warm-up problem that alluded to our 2017 Google I/O venue, Shoreline Amphitheatre. This was followed by Understudies, a probability problem with a greedy solution, and Word Search, in which contestants had to construct a letter grid under specific tight constraints. Rounding out this year's set, we had Where Ya Gonna Call?, a challenging graph problem. You can check out the analysis for each problem for more information.
Over 600 contestants made it onto the scoreboard with at least one correct submission! Thanks to everyone who participated, and we hope to see all of you again in next year's contest! Before that, though, there's Google Code Jam 2017; registration is open through the end of the Qualification Round on April 9.
Cast
Problem A (Ticket Trouble): Written and prepared by Ian Tullis.
Problem B (Understudies): Written and prepared by Ian Tullis.
Problem C (Word Search): Written and prepared by Ian Tullis.
Problem D (Where Ya Gonna Call?): Written and prepared by Pablo Heiber.
Solutions and other problem preparation and review by Liang Bai, Shane Carr, Yinfu Chen, John Dethridge, Lauren Mancino Gallagher, Jackson Gatenby, Lalit Kundu, Zhusong Li, Rohan Mukkamala, Igor Naverniouk, Chieu Nguyen, Trung Thanh Nguyen, and Erick Wong.
Problem analyses by Pablo Heiber and Ian Tullis.
A group of F friends is attending a conference held at an amphitheater, and they have bought tickets to see a concert there afterwards. The amphitheater is a grid of seats with S rows and S columns. For each seat, the amphitheater has sold a single ticket (although some of the tickets might not have been sold to this group of friends). Each ticket is normally labeled with a pair of integers giving the row and column numbers of one seat, in that order. For example, a ticket might normally say (2, 1), meaning row 2, column 1, or (2, 2), meaning row 2, column 2.
When the tickets were printed, there was a malfunction, and the two numbers in each pair always came out in sorted (that is, nondecreasing) order! So, for example, a ticket labeled (1, 2) might actually be for the seat in row 1, column 2, or it might actually be for the seat in row 2, column 1. If two friends have tickets labeled (1, 2), then one must actually be for row 1, column 2, and the other must actually be for row 2, column 1.
The friends will consult the box office on the day of the concert to find out what their actual seat numbers are, but for now, it is unclear! Given the printed pairs on the tickets, what is the largest possible number of the friends that could actually be seated all in the same-numbered row of seats? (The friends do not necessarily need to be seated in consecutive seats in that row.)
The first line of the input gives the number of test cases, T. T test cases follow. Each begins with one line with two integers F and S, representing the number of friends and the dimension of the grid of seats. Then, F more lines follow. The i-th of those lines has two integers Ai and Bi, representing the two numbers printed on the i-th friend's ticket.
For each test case, output one line containing Case #x: y
, where
x
is the test case number (starting from 1) and y
is largest possible number of the friends that could actually be seated all
in the same-numbered row of seats.
Time limit: 20 seconds per test set.
Memory limit: 1GB.
F ≤ S2.
1 ≤ Ai ≤ Bi ≤ S, for all
i.
No pair appears more than twice in a test case.
No pair containing the same number twice appears more than once in a test
case.
1 ≤ T ≤ 50.
2 ≤ F ≤ 3.
2 ≤ S ≤ 3.
1 ≤ T ≤ 100.
2 ≤ F ≤ 100.
2 ≤ S ≤ 100.
Input |
Output |
3 2 3 1 2 1 2 3 3 1 2 2 3 2 2 3 3 1 1 2 2 1 2 |
Case #1: 1 Case #2: 3 Case #3: 2 |
In sample case #1, one ticket must actually be for row 1, column 2, and the other must actually be for row 2, column 1, even though we do not know which is which. So we know that the friends are not seated in the same row, and the largest number of friends in any row is 1. Also note that the seats have a third row and column, but none of the tickets use the third row or column.
In sample case #2, one of the tickets is definitely for seat 2 in row 2, and it is possible that two of the other tickets could be for seats 1 and 3 in row 2. So there may be as many as 3 friends in the same row.
In sample case #3, either there are two friends in row 1 and one in row 2, or there are two friends in row 2 and one in row 1. In either case, the answer is 2.
You are a casting director for an upcoming musical. The musical has N roles, and for each role, you want to cast two performers: one primary performer and one understudy. A primary performer or understudy trains for only one particular role, and the job of the understudy is to play the role if the primary performer becomes unavailable. At least one of the two performers for each role must be available for the show to succeed.
You have selected 2N performers to be in the musical. They are all quite talented, and any of them can be cast as a primary performer or understudy for any of the roles. However, you are worried that some of them may be tempted to run away to join the cast of Hamiltonian!, the smash hit musical about quantum mechanics, before your show opens. Luckily, you are an excellent judge of character. You know that the i-th performer has a probability Pi of becoming unavailable. (These probabilities are all independent of each other, and a given performer has their probability regardless of their assigned role or whether they are a primary performer or understudy.)
You wish to assign one primary performer and one understudy for each role in a way that maximizes the probability that the show will succeed. That is, you want to minimize the probability that there will be at least one role for which the primary performer and understudy both become unavailable.
If you make optimal casting choices, what is the probability that your show will succeed?
The first line of the input gives the number of test cases, T. T test cases follow; each consists of two lines. The first line contains a single integer N: the number of roles. The second line contains 2N rational numbers Pi; the i-th of these gives the probability that the i-th performer will become unavailable for your show. All of these probabilities are specified to exactly four decimal places of precision.
For each test case, output one line containing Case #x: y
, where
x
is the test case number (starting from 1) and y
is the probability that your show will succeed. y
will be
considered correct if it is within an absolute or relative error of
10-6 of the correct answer. See the
FAQ for an explanation of what
that means, and what formats of real numbers we accept.
1 ≤ T ≤ 100.
Time limit: 20 seconds per test set.
Memory limit: 1GB.
0.0000 ≤ Pi ≤ 1.0000, for all i.
1 ≤ N ≤ 4.
1 ≤ N ≤ 40.
Input |
|
3 2 0.2500 0.5000 0.5000 0.2500 3 0.0000 0.0000 0.0000 0.0009 0.0013 0.1776 1 1.0000 0.1234 |
|
|
|
Output
|
|
Case #1: 0.765625 Case #2: 1.000000 Case #3: 0.876600 |
In sample case #1, one optimal casting choice is to make the two 0.5000 performers leads for the two roles, and the two 0.2500 performers understudies. For a given role, the probability that both performers will become unavailable is 0.5 × 0.25 = 0.125. So the probability that a role will be filled by at least one of its actors is 1 - 0.125 = 0.875. The probability that both roles will be filled (and thus that the show will succeed) is 0.875 × 0.875 = 0.765625.
If we instead cast the two 0.5000 performers for one role and the two 0.2500 performers for the other role, the probability of success would be (1 - 0.50 × 0.50) × (1 - 0.25 × 0.25) = 0.703125, which is lower.
In sample case #2, the show will succeed for sure as long as you cast exactly one of the 0.0000 performers (who will never become unavailable) in each role.
In sample case #3, the 1.0000 performer will always become unavailable, so the probability of success is equal to 1 minus the probability that the other performer will become unavailable.
In honor of Google I/O 2017, we would like to make an I/O-themed word search
grid. This will be a rectangular grid in which every cell contains one of
the three characters I
, /
, or O
. The
people solving our word search will look for all instances of the string
I/O
that appear contiguously forwards or backwards in a row,
column, or diagonal. For example, the following grid contains eight instances
of I/O
, representing all eight possible directions in which the
string can appear:
OOOOO
O///O
O/I/O
O///O
OOOOO
To control the difficulty level of our word search, we would like the string to appear exactly N times in the grid. Moreover, we do not want the grid to be too large; it cannot have more than D rows or more than D columns.
Can you help us design a grid that meets these specifications?
The first line of the input gives the number of test cases, T. T test cases follow. Each test case consists of one line with two integers D and N, as described above.
For each test case, first output one line containing Case #x:
.
Then output R lines of exactly C characters each, representing the
rectangular grid. Each of those characters must be either I
,
/
, or O
. You may choose any values of R and C as
long as both are at least 1 and neither exceeds D. Your grid must
contain exactly N instances of the string I/O
, per
the rules described in the statement.
If there are multiple valid answers, you may output any of them.
Time limit: 20 seconds per test set.
Memory limit: 1GB.
0 ≤ N ≤ 287.
It is guaranteed that at least one valid grid exists for each test case.
1 ≤ T ≤ 25.
D = 50.
1 ≤ T ≤ 100.
D = 15.
Input |
Output |
4 50 1 50 0 50 3 50 8 |
Case #1: O / I Case #2: IO Case #3: IIIOOO /I/O/O IIIOOO Case #4: OOOOO O///O O/I/O O///O OOOOO |
The sample output displays one set of answers to the sample cases. Other answers may be possible. Note that these cases would only appear in the Small dataset.
Gooli is a huge company that owns B buildings in a hilly area. The buildings are numbered from 1 to B.
Last year, they built a set of slides between buildings that are now the favorite form of transportation between buildings. Slides have been upgraded with suction technology to make them two-way, so a slide between two buildings can be used to travel between those buildings in either direction. Some slides were built with turns, so their lengths do not necessarily follow common sense; for instance, they do not necessarily comply with the triangle inequality. Also, there is at most one slide between any pair of buildings.
Gooli is going to choose a location to install a special super secure phone for the CEO to talk to other important people. They want to minimize the distance by slide from any building to the meeting location, so as to minimize the time that it would take the CEO to reach it from any building. Gooli does not have any more carbon kilotubes to build more slides, and the CEO refuses any other type of transportation, so Gooli's communication security team needs to find the best location that is reachable using only already existing slides. The location could be in a building or a point somewhere within a slide.
When traveling using the slides, the CEO may use a slide, arrive at a building, then use a slide that starts there, arrive at another building, and so on, until she arrives at the desired location. Slides used from end to end contribute their full length to the total distance. If the CEO enters a slide and stops inside it because she found the phone, on the other hand, only the used part of the slide contributes to the total distance. When measuring distance, only the slide distance is important. Distance traveled within buildings to connect to a new slide or reach the phone is considered to be zero.
Given the buildings and slides in existence, can you find any optimal location for the super secure phone and return the distance from a farthest building to it? Note that the distance is the same for any optimal location.
The first line of the input gives the number of test cases, T. T lines follow. Each test case starts with one line with a single integer B, the number of buildings on Gooli's campus. Then, B - 1 lines follow. For i = 2, 3, ..., B, the (i-1)-th of these lines contains (i-1) integers Di1, Di2, ..., Di(i-1). Dij is -1 if there is no slide between the i-th building and the j-th building, or the length of that slide otherwise. All buildings are reachable from any other building using only slides, possibly passing through intermediate buildings.
For each test case, output one line containing Case #x: y
, where
x
is the test case number (starting from 1) and y
is the distance from an optimal location for the phone to a building farthest
from it. y
will be considered correct if it is within an
absolute or relative error of 10-6 of the correct answer. See the
FAQ for an explanation of what
that means, and what formats of real numbers we accept.
1 ≤ T ≤ 100.
Time limit: 100 seconds per test set.
Memory limit: 1GB.
2 ≤ B ≤ 50.
All buildings are reachable from any other building using only slides,
possibly passing through intermediate buildings.
Dij ≠ 0, for all i, j.
-1 ≤ Dij ≤ 2, for all i, j.
-1 ≤ Dij ≤ 109, for all i, j.
Input |
Output |
4 3 -1 1 2 3 1 1 1 3 4 2 3 4 9 10 7 7 -1 5 |
Case #1: 1.500000 Case #2: 1.000000 Case #3: 2.500000 Case #4: 8.500000 |
Note that the last two cases would not appear in the Small dataset.
In Case #1, all buildings are in a line. The only optimal location is of course the middle point of the line, as any other location would make one of the buildings at the end of the line be farther away.
Case #2 depicts an equilateral triangle. Any of the three buildings would be an optimal location for the phone.
Case #3 is also a triangle, but with sides of different lengths. If we pick any building, the farthest building would be at distance at least 3 from it. On the other hand, if we choose a location inside the slide of size 3, at distance 0.5 from building 3, the distance to a farthest building is improved to 2.5.
In Case #4, the optimal location is inside the slide of length 10 between buildings 1 and 3, at distance 1.5 from building 3 and distance 8.5 from building 1.
There are two minor hurdles to deal with in this problem:
1 2
) can never
both be in the same row, so they cannot both contribute to the maximum
value that is the answer. Therefore, it is safe to eliminate or ignore one
ticket from each such pair. You can sort the tickets so that duplicates end
up next to each other, or use a data structure such as a set to eliminate
them.1 1
). This
can throw off an attempted solution that just counts how many times each
number appears anywhere in any ticket.A possible solution for the Small is to note that each of the F tickets might either be correct as is, or it might have its numbers reversed. You can try all 2F possible scenarios and find the largest number of friends that ever end up together in the same row. The only tricky part is that duplicate tickets can cause illegal scenarios that put two friends in the same seat, so you must either check for these scenarios, or remove the duplicates in advance as mentioned earlier.
An even simpler solution that works for the Large is to determine the number of tickets that could possibly be in each row. For each ticket (ignoring one ticket from each duplicate pair), check the two numbers. If they are the same, add 1 to the count for that row number. If they are different, add 1 to the count for each of those row numbers. The maximum among these numbers is the answer. Note that in this solution, we do not need to explicitly consider any particular arrangement of the friends, and the solution would also work if the problem had asked about columns instead of rows.
There can be at most four roles and eight performers in a Small test case. We
could write code to enumerate all of the different ways to pair up the
performers, but this can be tricky to get right; however, the limits allow an
even easier brute-force solution. It is simple to generate all possible
permutations of performers using, for example, Python's
itertools.permutations
. For each permutation, we can pair the
first performer in the list with the second, the third with the fourth, and
so on, and then check the overall success probability in the way described in
the sample case explanations. The maximum probability we encounter is the
answer. This method will check equivalent casting decisions multiple times,
but it doesn't matter; 8! is under 50 thousand, so your computer should not
even break a sweat.
Brute force will not cut it for a musical with up to 40 roles — we must be giving Cats and A Chorus Line a run for their money! — so we need a more thoughtful strategy. Is it better to pair up reliable performers with other reliable performers, making some roles secure and others risky? Or should we pair our most reliable performers with our least reliable ones, spreading out the risk more evenly across roles? Or is the solution something more complex? Intuition can easily fool us when it comes to probability problems, so it's best to make a mathematical argument.
Suppose that we have paired up our performers in some arbitrary way. Let's consider two of the roles and the four performers. Without loss of generality, let's label the four performers A, B, C, and D, such that their probabilities of becoming unavailable follow the order PA ≥ PB ≥ PC ≥ PD. Our key claim is that we will maximize these four performers' contribution to the overall probability of the show's success by pairing A with D and B with C, if they are not already paired in that way.
Let's prove that pairing A with D is better than pairing A with C. If we pair A with D, the probability that both of these roles will be successfully filled is as follows:
(1 - PAPD)(1 - PBPC) = 1 - PAPD - PBPC + PAPBPCPD.
If we instead pair A with C, the probability of success is:
(1 - PAPC)(1 - PBPD) = 1 - PAPC - PBPD + PAPBPCPD.
Subtracting the second quantity from the first, we get:
PAPC + PBPD - PAPD - PBPC = PA(PC - PD) + PB(PD - PC) = (PA - PB)(PC - PD).
Since we know that PA ≥ PB, and PC ≥ PD, both of the terms in the final expression above must be positive or zero, and so their product is also positive or zero. That means that we are at least as likely to succeed if we pair A with D as we are if we pair A with C. (The contributions from performers other than A, B, C, and D are identical across these two cases, so we don't need to consider them here.) A similar argument shows that pairing A with B cannot possibly be better than pairing A with D.
So we have proven that in any arrangement, for any pair of roles, we should reassign the four performers as necessary to pair up the most reliable and least reliable of the four in one role, and the other two performers in the other. This implies that the most and least reliable performers in the entire cast should be assigned to the same role. (If they are in two different roles, just apply the argument above to that pair of roles.) Similarly, the second-most and second-least reliable performers should be in the same role, and so on. Only the rank orders of the performers' probabilities turn out to matter; the actual values are not important!
This makes our solution very simple: sort the probabilities and then pair up the extremes, and then the remaining extremes, and so on. This is O(N log N) — it would be linear if not for the sorting step — and it would easily work for values of N much larger than 40.
One surprisingly viable approach for the Small dataset is to create random
grids of random sizes and count the numbers of I/O
s in them
until we happen to find a grid with the desired number. Even if we only try
square grids with a random size and random contents, this is easily fast
enough to pass. Other optimizations are possible — we can vary the
frequencies of I
, /
, and O
, or try to
edit grids that are close — but not necessary.
Another possibility is to directly construct a grid with the desired number
of I/O
s. Better yet, we can construct one grid with at least
287 I/O
s, and then use it in every test case, eliminating
I/O
s individually until we have the desired number. To ensure
that I/O
s do not interfere with each other, let's space them out
in a field of O
s like this:
I/OOI/OOI/OO
...
OOOOOOOOOOOO
...
I/OOI/OOI/OO
...
OOOOOOOOOOOO
...
I/OOI/OOI/OO
...
...
If we extend this pattern to fill a 50 by 50 grid, we will have 25 rows with
12 I/O
s each; 25 × 12 = 300, which is more than enough. We
can eliminate an individual I/O
by changing its /
into another O
. With this grid setup, that change cannot
possibly affect other existing I/O
s or create new ones. Other
similar strategies are possible.
The particular construction strategy above cannot produce enough
I/O
s when each of our grid dimensions is capped at 15. Nor will
random generation help; we would be lucky to get a random 15 x 15 grid for
N = 80, let alone 287. We need to pack as many I/O
s into
the grid as possible. Let's start with a single row; we can fit many
I/O
s in, overlapping forward ones with backward ones:
I/O/I/O/I
...
If we stack these rows on top of each other, we will also form many diagonal
I/O
s:
I/O/I/O/I
...
I/O/I/O/I
...
I/O/I/O/I
...
...
How many I/O
s are in the 15 by 15 version of this grid? Let's
count by looking only at the /
es. Any /
in the top
or bottom row is part of exactly one I/O
, running horizontally.
Any other I/O
is part of exactly three I/O
s: one
horizontal and two diagonal. There are 7 /
es in each row; the 14
in the top and bottom row contribute 14 I/O
s, and the 13 ×
7 = 91 in the other rows contribute 91 × 3 = 273. That's a total of
287, which is conveniently exactly the maximum number of I/O
s we
could be asked to produce. Note that we do not need to prove that this
arrangement packs in as many I/O
s as possible; we only had to
find a way to fit at least 287 in.
Now, in each test case, we only need to whittle the number of
I/O
s down from 287 to N. As we do this, we must be
careful not to create any new I/O
s or destroy more
I/O
s than we want to. One safe strategy is to change
/
es into O
s, as in the Small strategy above.
Changing a /
in the top or bottom row eliminates one
I/O
, and changing any other /
eliminates three. We
can reach any value of N by first removing as many "threes" as
possible (without going below N) and then removing as many "ones" as
possible (without going below N). For example, for N = 280, we
remove two "threes" and one "one". For N = 4, we remove everything
except for four of the "ones". Since there are only 288 possible test cases,
we can even precompute answers to all of them before downloading the dataset
(but this is not necessary).
We can model the input as a weighted undirected graph. The question to be answered is similar to finding the graph's radius. In fact, we are asked to minimize the input graph's radius by possibly adding a single node inside an existing edge. This view is not only a succinct and accurate way to describe the problem, but also a first step towards a solution.
One solution for the Small relies on a simple fact: the result is always an integer multiple of 1/2, and if the optimal location is inside an edge, it is always at a point that is at a distance from one of the incident nodes that is an integer multiple of 1/2. This property is not hard to prove: if the location is in a building, all distances to it are integers. If not, let L be a location and B be a farthest building at distance D. Let B' be the farthest building among those whose minimum path approaches L from the other side of the edge as B's minimum path, with distance D'. If there were no buildings whose minimal path approaches L from the other side, L would not be an optimal location, since we could move L towards B decreasing all distances. If D' < D, we could move the location slightly towards B and decrease the overall minimum, so D' ≤ D, which together with D' ≥ D by definition implies D' = D. Notice that the fractional parts of both D and D' only depend on where L is located on the edge, because the rest of each distance comes from a distance between buildings, which is an integer. Therefore, since the fractional parts of D and D' are equal, the fractional parts of the edge on both sides of L are equal, and thus an integer multiple of 1/2.
With this property in mind, and the limitation that the maximum edge length is 2 in the Small dataset, there are only 3 positions on each edge that can contain an optimal location (at distances 0.5, 1, and 1.5 from one end). For each of these, we can find the farthest building. To do that, we can use Dijkstra's algorithm. We also have to consider the radius of the original graph (which represents choosing a location in a building), which we can do by running the same algorithm starting at each building. Then, we just take the minimum farthest distance from all of those options. The running time of this solution is O(MB4) where M is the maximum length of an edge. This is because we need to run Dijkstra’s algorithm 2M-1 times per edge for up to O(B2) edges and once per node for O(B) nodes (a total of O(MB2) times), and each run takes O(B2) time.
The approach outlined for the Small dataset does not work for the Large, because the edges can be really long and thus the number of locations to try is too large, even when restricted to integer multiples of 1/2.
What we can do to simplify the problem is to use binary search. That is, write an algorithm to determine whether there is a location with farthest distance D or less. The statement is clearly false for some interval [0, X) of values for D and true for [X, infinity), so we can simply binary search for X.
The simplification that we obtain is that we can now check for a fixed distance. We start by finding the distance between all pairs of buildings. We can either run Dijkstra's algorithm once per building as mentioned above, or use something simpler for all pairs like Floyd-Warshall's. We can then iterate over each edge to see whether there is a viable location within it. For a fixed edge, we iterate over each building and see where in the edge a location could be at a distance D or less from that building. To go to a point inside edge (I,J) from building K, there are two options: go from K to I and then move inside the edge, or go from K to J and then move inside the edge. Since we now know the distance from K to I and J, we can calculate the interval of positions within the edge that can be reached with distance D from each end I and J. Then, if those intervals cover the entire edge (by overlapping or because one of them is big enough), then any location inside (I,J) is reachable from K with a distance D or less. Otherwise, there is some interval of unreachable locations. We record such intervals for each building and then check whether the union of all those intervals covers all of (I,J). If it does, no location inside (I,J) is viable. Otherwise, there is at least one that is.
To check if a given set of intervals fully covers another interval A there is a greedy algorithm: sort the intervals by starting point and process them while keeping a current covered upper bound U, which is initialized to the lower bound of A. While U is less than the upper bound of A, for each interval, if its lower bound L is greater than U, then there is an uncovered interval (U, L) and we are done. Otherwise, if the upper bound H of the current interval is greater than U, set U := H. If the iteration finishes due to U becoming larger than A's upper bound, all of A is covered. Otherwise we either found a hole or there is one between the last value of U and the upper bound of A.
For this solution we run Floyd-Warshall's which takes O(B3) time and then run a procedure for each of O(B2) edges. This procedure iterates all B nodes within a binary search, which takes O(log BM) to converge (remember M is the maximum length of an edge, so BM is an upper bound on the output), which makes it take time O(B3 log BM) overall.
Notice that since we are binary searching for X, precision is not an issue, as a bad decision due to precision would only give a slightly larger or slightly smaller result. Additionally, we could use the property of the result being a multiple of 1/2 to do all calculations on integers by doubling all edges in the input and dividing by 2 at the very end.
A number of people tried to use ternary search to solve this problem. Ternary search assumes a convex or concave function, and the function in this case (the farthest distance for each point within an edge) is neither. The distance from each point to a single fixed building is indeed a concave function. However, the maximum of many concave functions is not a concave nor a convex function. Some ternary search implementations may suceed in the Small because of the really small number of critical points, which may all be tried even under the flawed assumption.