Google Code Jam Archive — Round 1A 2011 problems

Overview

Round 1 was off to a rather brutal start. The scoreboard quickly started filling up with incorrect submissions on all three problems. It took almost 40 minutes for somebody to solve Problem B and over two hours for a correct solution to problem C to appear.

At the end, a significant number of large submissions turned out to be incorrect as well, which caused a lot of reshuffling at the top of the score board.

Some of the confusion in problem B was caused by a sentence that was not as clear as we would have liked. We are sorry that it took us almost two hours to understand the confusion, and we had to broadcast a clarification message.

krijgertje ended up being the first to solve all problems correctly with 27 minutes to go. Progbeat followed soon after and took second place, only to be overtaken 15 minutes later by Myth. No one else managed to solve all of the problems.

Congratulations to the top 1000 contestants. They have advanced to Round 2. Everyone else has two more chances in rounds 1B and 1C.



Cast

Problem A. FreeCell Statistics Written by Igor Naverniouk and prepared by Onufry Wojtaszczyk.

Problem B. The Killer Word Written by David Arthur and prepared by Tomek Czajka.

Problem C. Pseudominion Written by David Arthur and prepared by Luka Kalinovcic.

Contest analysis presented by Stephen Fulwider, David Arthur and Luka Kalinovcic.

Solutions and other problem preparation by Jorge Bernadas Saragoza, Sean Henderson, Petr Mitrichev, Greg Tener, Cosmin Negruseri and John Dethridge.

A. FreeCell Statistics

Problem

I played D (D > 0) games of FreeCell today. Each game of FreeCell ends in one of two ways -- I either win, or I lose. I've been playing for many years, and have so far played G games in total (obviously, GD).

At the end of the day, I look at the game statistics to see how well I have played. It turns out that I have won exactly PD percent of the D games today, and exactly PG percent of G total games I had ever played. Miraculously, there is no rounding necessary -- both percentages are exact! Unfortunately, I don't remember the exact number of games that I have played today (D), or the exact number of games that I have played in total (G). I do know that I could not have played more than N games today (DN).

Are the percentages displayed possible, or is the game statistics calculator broken?

Input

The first line of the input gives the number of test cases, T. T lines follow. Each line contains 3 integers -- N, PD and PG.

Output

For each test case, output one line containing "Case #x: y", where x is the case number (starting from 1) and y is either "Possible" or "Broken".

Limits

0 ≤ PD ≤ 100;
0 ≤ PG ≤ 100.
Memory limit: 1GB.

Small dataset (Test set 1 - Visible)

1 ≤ T ≤ 100;
1 ≤ N ≤ 10.
Time limit: 30 seconds.

Large dataset (Test set 2 - Hidden)

1 ≤ T ≤ 2000;
1 ≤ N ≤ 1015.
Time limit: 60 seconds.

Sample

Sample Input
content_copy Copied!
3
1 100 50
10 10 100
9 80 56
Sample Output
content_copy Copied!
Case #1: Possible
Case #2: Broken
Case #3: Possible

In Case #3, I could have played 5 games today (D = 5) and 25 games in total (G = 25), and won 4 games today (80% of 5) and 14 games in total (56% of 25).

B. The Killer Word

Problem

You are playing Hangman with your friend Sean. And while you have heard that Sean is very good at taking candy from a baby, he is not as good at this game. Can you take advantage of Sean's imperfect strategy, and make him lose as badly as possible?

 +--+
 |  O
 | /|\       Mystery word: _ a _ a _ a _
 | / \
 |
+-+---+

Hangman is played as follows:

  • There is a dictionary D of all valid words, which both you and Sean know. A word consists only of the characters a - z. In particular, there are no spaces.
  • You begin by choosing any word from D, and writing it down on a blackboard with each letter replaced by a blank: _.
  • On his turn, Sean can choose any letter and ask you if it is in the word. If it is, you reveal all locations of that letter. Otherwise, Sean loses a point.
  • Once all letters in the word have been revealed, the round ends.
  • The round never ends early, no matter how many points Sean loses.

Sean uses a very simple strategy. He makes a list L of the 26 letters in some order, and goes through the list one letter at a time. If there is at least one word in D that (a) has the letter he is thinking of, and (b) is consistent with what you have written down so far and the result of all of Sean's previous guesses, then Sean guesses that letter. Otherwise, he skips it. No matter what, Sean then moves on to the next letter in his list.

Given Sean's list, what word should you choose to make Sean lose as many as points as possible? If several choices are equally good, you should choose the one that appears first in D.

Example

Suppose Sean decides to guess the letters in alphabetical order (i.e., L = "abcdefghijklmnopqrstuvwxyz"), and D contains the words banana, caravan, and pajamas. If you choose pajamas, the game would play out as follows:

  • You begin by writing 7 blanks _ _ _ _ _ _ _ on the blackboard. Based on the number of blanks, Sean knows immediately that the word is either caravan or pajamas.
  • Sean begins by guessing a since it is first in L, and you reveal all locations of the letter a on the blackboard: _ a _ a _ a _.
  • Sean skips b even though it is used in banana. Sean already knows that is not your word.
  • He then guesses c because it appears in caravan. It does not appear in the word you actually chose though, so Sean loses a point and nothing more is revealed.
  • By process of elimination, Sean now knows your word has to be pajamas, so he proceeds to guess j, m, p, and s in order, without losing any more points.

So Sean loses one point if you choose pajamas. He would have gotten either of the other words without losing any points.

Input

The first line of the input gives the number of test cases, T. T test cases follow. Each test case begins with a line containing integers N and M, representing the number of words in the dictionary and the number of lists to consider.

The next N lines contain the words in the dictionary, one per line: D1, D2, ..., DN. Each word is an arbitrary sequence of characters a - z.

The final M lines contain all of the lists Sean will use, one per line: L1, L2, ..., LM. Each list is exactly 26 letters long, containing each letter exactly once. Sean will use these lists to guess letters as described above.

Output

For each test case, output one line containing "Case #x: w1 w2 ... wM", where x is the case number (starting from 1) and wi is the word you should choose if Sean guesses the letters in order Li. If multiple words cause Sean to lose the same number of points, you should choose the option that appears first in the dictionary.

Limits

1 ≤ T ≤ 10.
Each word in D will have between 1 and 10 characters inclusive.
No two words in D will be the same within a single test case.
Memory limit: 1GB.

Small dataset (Test set 1 - Visible)

1 ≤ N ≤ 100.
1 ≤ M ≤ 10.
Time limit: 30 seconds.

Large dataset (Test set 2 - Hidden)

1 ≤ N ≤ 10000.
1 ≤ M ≤ 100.
Time limit: 60 seconds.

Sample

Sample Input
content_copy Copied!
2
3 2
banana
caravan
pajamas
abcdefghijklmnopqrstuvwxyz
etaoisnhrdlcumwfgypbvkjxqz
4 1
potato
tomato
garlic
pepper
zyxwvutsrqponmlkjihgfedcba
Sample Output
content_copy Copied!
Case #1: pajamas caravan
Case #2: garlic

C. Pseudominion

Problem

You are playing a game with a fancy deck of cards. Each card has three bonus numbers: a card bonus c, a score bonus s, and a turn bonus t. Some of the cards start in your hand, while the rest are in a deck on the table. You start with one turn.

On each turn, you can choose any card from your hand and play it. If it has bonus numbers c, s, t, then the following happens:

  • The card is discarded from your hand, and it can never be used again.
  • You draw the first c cards from the deck into your hand. If the deck has fewer than c cards in it, you draw all of them.
  • Your total score increases by s.
  • Your number of remaining turns increases by t.

If you do not have any cards in your hand at the start of a turn, then nothing happens on that turn. Your goal is to get as high a score as possible before running out of turns.

For example, suppose your hand and deck contain the following cards:

         +---+---+---+            +---+---+---+
   HAND: | c | s | t |      DECK: | c | s | t |
         +---+---+---+            +---+---+---+
Card #1: | 0 | 0 | 2 |   Card #4: | 1 | 1 | 0 |
Card #2: | 0 | 5 | 0 |   Card #5: | 0 | 1 | 1 |
Card #3: | 2 | 1 | 1 |   Card #6: | 2 | 2 | 0 |
         +---+---+---+            +---+---+---+
The following table shows how you can get a score of 8 from these cards. The first three columns show your hand, the number of turns left, and your score before playing each card, and the final column shows which card to play.
+---------+------------+-------+------+
| Hand    | Turns left | Score | Play |
+---------+------------+-------+------+
| 1, 2, 3 |      1     |   0   |   1  |
| 2, 3    |      2     |   0   |   3  |
| 2, 4, 5 |      2     |   1   |   2  |
| 4, 5    |      1     |   6   |   5  |
| 4       |      1     |   7   |   4  |
| 6       |      0     |   8   |   -  |
+---------+------------+-------+------+

As you can see, the card bonuses and turn bonuses allow you to chain together a long sequence of cards before you have to stop.

Input

The first line of the input gives the number of test cases, T. T test cases follow.

Each test case begins with a single line containing N, the number of cards in your hand. The next N lines each contain three integers, c, s, and t, representing the bonus numbers for a single card in your hand.

This is followed by a single line containing M, the number of cards in the deck. The next M lines each contain three integers, c, s, and t, representing the bonus numbers for a single card in the deck. These cards are listed in the same order in which you draw them.

Output

For each test case, output one line containing "Case #x: S", where S is the largest score you can obtain before running out of turns.

Limits

1 ≤ T ≤ 100.
1 ≤ N.
0 ≤ M.
N + M ≤ 80.
Memory limit: 1GB.

Small dataset (Test set 1 - Visible)

0 ≤ c ≤ 1.
0 ≤ s ≤ 20.
0 ≤ t ≤ 20.
Time limit: 30 seconds.

Large dataset (Test set 2 - Hidden)

0 ≤ c ≤ 2.
0 ≤ s ≤ 50.
0 ≤ t ≤ 50.
Time limit: 60 seconds.

Sample

Sample Input
content_copy Copied!
2
4
1 0 0
1 1 1
0 5 0
1 2 0
0
2
1 1 1
0 6 0
1
0 1 3
Sample Output
content_copy Copied!
Case #1: 6
Case #2: 8

Analysis — A. FreeCell Statistics

The first thing to notice in this problem is that both PD and PG can range from 0 to 100, so let’s start by taking care of some easy cases. If our unnamed player has won 100% of his total games (PG = 100) but not won 100% of his games today (PD < 100), then something has clearly gone wrong with the calculator. Similarly, if PG = 0 but PD > 0, something is wonky.

The cases where PD = PG = 0 or PD = PG = 100 are also easy, and the answer for both of them is "Possible" — they mean that all the games ever ended with the same result (loss or win, respectively).

The trick now is that once we ruled out the case of PG being zero or a hundred, we do not need to worry about it any more! Indeed, assume we have a solution that gives the correct daily percentage of wins, and it consists of W games won and L games lost today. To get a global win percentage of PG, we can, for instance, assume we won a total of (W + L) * PG and lost a total of (W + L) * (100 - PG). As 1 ≤ PG ≤ 99, the numbers are greater than W and L, respectively, so they are possible to achieve.

The small data set can now be solved by brute force by simply trying all possible values of D from 1 to N and all possible number of games won from 0 to D, and seeing if any of them results in exactly PD percentage of games being won.

Solving the large date set can require some maths. One way to solve the problem is to directly solve for the minimum number of games we would need to play in order to get a win percentage of PD and simply verify that this number is ≤ N. If we let W be the number of games we have won today, then we want to solve W / D = PD / 100 for the minimum value of D such that W is integral.

From here, it is easy to see D = 100 * W / PD. Thus, if we want to minimize D, then we need to find the smallest value W such that the right hand side is integral. In order to do this, we divide 100 and PD by their greatest common divisor (let's call this value G) so that they are relatively prime and W is minimal when it is PD / G. Plugging this in and cancelling terms tells us that D = 100 / G is the fewest number of games we must be play.

A simpler way to solve this problem is by brute force. We can just try all possible values of D from 1 to N and check if any of them could result in exactly PD percentage of games being won by checking that D * PD = 0 (mod 100), stopping the loop the first time we find a candidate value of D or we exceed N games. While at first this solution appears to be O(N) and would time out for the large data set, this loop will in fact run at most 100 times, regardless of the value of N, so this solution will be plenty fast enough. Coders who noticed this simpler solution early were rewarded with very fast submission times.

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Analysis — B. The Killer Word

Every year, contestants come to us saying they implemented a program correctly but it ran too slowly. Unfortunately, being too slow is the same thing as being wrong. The Google Code Jam is an algorithms contest, and once you get past the early questions, coming up with a fast algorithm is often the whole point of the problem.

Why do we bring this up? Because this problem is a trap. On first glance, it looks just like the first two problems from the Qualification Round. We tell you an algorithm, and you implement it exactly as described:

  • Loop through every one of Sean's lists.
  • Loop through every word in the dictionary.
  • See how many mistakes Sean makes while guessing this word.

Since the last step requires iterating over every word again, the running time here is O(N2M).

We will always give you the hardest inputs we can fit within the stated limits, which means you will see N = 10000 and M = 100. Neither a fast computer nor a fast language will save you here - the whole approach is simply too slow. For example, our C++ implementation takes over 20 minutes for a single test case. To succeed on an algorithms contest, you need to see these issues in advance, either by looking at the Big O complexity or by trying a worst-case input of your own.

Once you do see the issue, it isn't too hard to dramatically speed things up. The main idea is to combine the last two steps:

  • Loop through every one of Sean's lists.
  • Divide the words into different classes by length. If you focus on a single class, Sean will make the same first guess for any word in that class. This is because he has exactly the same information in each case.
  • For each class, figure out what letter Sean will guess, and further divide the class based on each of the different responses you could give to Sean's guess.
  • Repeat for each of the sub-classes until you are down to a single word, at which point Sean will finish with no mistakes.

The running time here is O(NM), and it will go hundreds of times faster than the more straightforward algorithm.

This technique is similar to Dynamic Programming - the game begins in the same way for many different words, so we should not have to redo all that work every time.

Fun facts: this problem was originally called Bloodthirsty Executioner Man, but was renamed because the hangman is made out of paper, and has no blood. Sean himself answered several of the clarifications for this problem.

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Analysis — C. Pseudominion

Let's denote the bonus numbers of a card c by c.card_draw, c.score and c.turns. Let c.index be the (zero based) index of the card in the ordering from the input file.

After looking at the problem constraints, we can classify a card c as:

  • a T card, if c.turns > 0.
  • a C0 card, if c.turns = 0 and c.card_draw = 0
  • a C1 card, if c.turns = 0 and c.card_draw = 1
  • a C2 card, if c.turns = 0 and c.card_draw = 2

Let T[i] denote the i-th T card in the sequence of all T cards sorted by the index. We define C0[i], C1[i] and C2[i] analogously.

The first key observation is that it never hurts to play a T card whenever we have one in our hand. We never lose turns or score and we may draw new cards by doing so.

Let's think about some arbitrary sequence of played cards. There are two observations we need to make before we can proceed.

  1. Because C0 cards do not add turns or cards to our hand, we can postpone playing them until we have no other cards we want to play. So we can transform any valid sequence of played cards into another sequence that has all C0 cards at the very end. The total score, being the sum of individual scores of all played cards, will obviously remain the same.
  2. Let's say there are two C1 cards a and b such that we played a before b, but b.index < a.index. Because b.index < a.index, we already had both cards in our hand when we played card a. So we can instead play card b first and card a second. The resulting sequence with a and b swapped will remain valid because a and b both have the same card bonus (1 card) and the same turn bonus (0 turns). Furthermore, the score will remain the same. We can do the same for two C2 cards.

The first observation shows us that we don't care about C0 cards until the very end when we can use our remaining turns on whatever C0 cards we have already drawn. Obviously, we should sort the C0 cards in our hand by decreasing score bonus and then play as many as we can.

The second observation shows us that we can transform any optimal sequence of played cards into another one that has cards of the same type ordered by index. This means that we can play C1 (and C2) cards in increasing order by index.

So, whenever we have more than one C1 (or C2) cards in our hand we can always look at the one with the smallest index and choose whether we will play it right now or never at all.

These observations lead us to construct a directed acyclic weighted graph. A node represents a state of the game and is defined by these properties:

  • hand - the number of drawn cards.
  • turns - the number of turns left.
  • t - T[t] is the first T card that we haven't played yet.
  • c1 - C1[c1] is the first C1 card which we are not yet sure if we are going to play.
  • c2 - C2[c2] is the first C2 card which we are not yet sure if we are going to play.

An edge represents a valid transition from one game state to another and is usually associated with playing a card or deciding not to play a certain card at all. The weight of an edge indicates how much your score goes up when you switch between the two game states. For each node (hand, turns, t, c1, c2), we add edges according to these rules:

  1. If we have a T card in our hand, we can play it.
    • Condition: T[t].index < hand
    • Weight: T[t].score
    • Target node: (min(N, hand + T[t].card_draw), min(N, turns + T[t].turns - 1), t + 1, c1, c2)
  2. If we have a C1 card in our hand, we can play the first one.
    • Condition: C1[c1].index < hand
    • Weight: C1[c1].score
    • Target node: (min(N, hand + 1), turns - 1, t, c1 + 1, c2)
  3. If we have a C1 card in our hand, we can throw away the first one.
    • Condition: C1[c1].index < hand
    • Weight: 0
    • Target node: (hand, turns, t, c1 + 1, c2)
  4. If we have a C2 card in our hand, we can play the first one.
    • Condition: C2[c2].index < hand
    • Weight: C2[c2].score
    • Target node: (min(N, hand + 2), turns - 1, t, c1, c2 + 1)
  5. If we have a C2 card in our hand, we can throw away the first one.
    • Condition: C2[c2].index < hand
    • Weight: 0
    • Target node: (hand, turns, t, c1, c2 + 1)
  6. We can always decide to finish the game by spending the remaining turns on C0 cards. This edge leads to the special final node and the weight of the edge is defined by the greedy algorithm that spends all remaining turns picking the best C0 cards.

The answer to our problem is the length of the longest path in the graph described above. The graph is acyclic, so we can use dynamic programming or recursion with memoization to find the length of the longest path.

The number of nodes in the graph is O(n5), so the asymptotic time complexity of the algorithm is O(n5) too. In practice, the runtime of the program is really small because the vast majority of these states are unreachable. You can speed things up even more if you use the preceding observations to their full power. For example, if there is an edge corresponding to a T card, you should always follow it first.



Remark: This problem is inspired by the game Dominion, published by Rio Grande Games.
Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Statistics — A. FreeCell Statistics

Test set 1: 3079 correct solutions (98.3% solve rate)

First
Egor Java, 7:52
Mitchell C++, 8:19
ltaravilse C++, 9:06
xiaowuc2 9:06
omeometo C++, 9:14
Shortest
Nabb Golfscript, 100 bytes
surfeit Matlab, 225 bytes
ronnodas Python, 284 bytes
chuanren Python, 299 bytes
arti Python, 308 bytes

Test set 2: 2181 correct solutions (69.6% solve rate)

First
ltaravilse C++, 9:06
xiaowuc2 9:06
omeometo C++, 9:14
ACRush aka ACRushTC 9:29
zorrikon Python, 10:13
Shortest
Nabb Golfscript, 100 bytes
surfeit Matlab, 225 bytes
ronnodas Python, 284 bytes
chuanren Python, 299 bytes
arti Python, 308 bytes

Statistics — B. The Killer Word

Test set 1: 684 correct solutions (21.8% solve rate)

First
dbjorge Python, 35:18
tomconerly 37:11
madking Ruby, 38:09
Egor Java, 39:55
YiningWang C++, 40:12
Shortest
zibada Python, 752 bytes
Hooyoung Python, 908 bytes
chys Python, 998 bytes
uru Ruby, 1047 bytes
cky Ruby, 1095 bytes

Test set 2: 181 correct solutions (5.8% solve rate)

First
Egor Java, 39:55
iwiwi aka iwi C++, 43:49
peter50216 44:43
vlad89 C++, 45:01
ItsNear 47:55
Shortest
yuhang C++, 1179 bytes
tckwok C++, 1357 bytes
lwz C++, 1364 bytes
gilesg Python, 1394 bytes
Fumiya C++, 1395 bytes

Statistics — C. Pseudominion

Test set 1: 105 correct solutions (3.4% solve rate)

First
ACRush aka ACRushTC 37:59
angwuy C++, 62:00
ttim Java, 65:54
defrager C++, 69:08
mbhand Python, 74:36
Shortest
mbhand Python, 728 bytes
CoderCharts Python, 1104 bytes
JongMan Python, 1194 bytes
ali.assaf Python, 1259 bytes
itelichko Perl, 1275 bytes

Test set 2: 3 correct solutions (0.1% solve rate)

First
krijgertje C++, 123:22
Progbeat C++, 129:35
Myth C++, 144:05
Shortest
Progbeat C++, 2079 bytes
Myth C++, 2619 bytes
krijgertje C++, 3048 bytes