Google Code Jam Archive — Round 1B 2009 problems


Round 1B had easier problems, but it was not easy to qualify from. ACRush, the Code Jam 2008 Champion, finished all the problems in less than 45 minutes. Solving problem B alone was not enough to qualify.

Problem A was not the favorite choice of the contestants, probably because of the long statement, although it was mostly about parsing. Problem B was easier to implement, but it required a little more thinking. Problem C needed some optimizations and was the last one to be implemented by most of the contestants.

63 people got 100 points, just one less than Round 1A. 3 of them submitted in the last minute.
Congratulations to all the people that qualified for Round 2!


Problem A. Decision Tree Written by Pavel Nalivaiko. Prepared by Frank Chu and Igor Naverniouk.

Problem B. The Next Number Written by John Dethridge. Prepared by Frank Chu and Marius Andrei.

Problem C. Square Math Written by Junbin Teng. Prepared by Xiaomin Chen and Daniel Rocha.

Contest analysis presented by Marius Andrei, Igor Naverniouk, and Xiaomin Chen.

Solutions and other problem preparation provided by Tomek Czajka, Pablo Dal Lago, Ante Derek, Derek Kisman, Petr Mitrichev, and Cosmin Negruseri.

A. Decision Tree


Decision trees -- in particular, a type called classification trees -- are data structures that are used to classify items into categories using features of those items. For example, each animal is either "cute" or not. For any given animal, we can decide whether it is cute by looking at the animal's features and using the following decision tree.

(0.2 furry
  (0.81 fast
  (0.1 fishy
    (0.3 freshwater

A decision tree is defined recursively. It always has a root node and a weight. It also, optionally, has a feature name and two sub-trees, which are themselves decision trees.

More formally, a decision tree is defined using the following grammar.

tree ::= (weight [feature tree tree])
weight is a real number between 0 and 1, inclusive
feature is a string of 1 or more lower case English letters
The part inside the square brackets, [], is optional. The parentheses, (), weight and feature are tokens. There will be at least one whitespace character between any two tokens, except (possibly) after an open-bracket '(' or before a close-bracket ')'. Whitespace characters are space (' ') and endline ('\n').

To figure out how likely the animal is to be cute, we start at the root of the tree with probability p set to 1. At each node, we multiply p by the weight of the node. If the node is a leaf (has no sub-trees), then we stop, and the value of p is the probability that our animal is cute. Otherwise, we look at the feature associated with the node. If our animal has this feature, we move down into the first sub-tree and continue recursively. If it does not have this feature, then we move down into the second sub-tree and continue in the same way.

For example, a beaver is an animal that has two features: furry and freshwater. We start at the root with p equal to 1. We multiply p by 0.2, the weight of the root and move into the first sub-tree because the beaver has the furry feature. There, we multiply p by 0.81, which makes p equal to 0.162. From there we move further down into the second sub-tree because the beaver does not have the fast feature. Finally, we multiply p by 0.2 and end up with 0.0324 -- the probability that the beaver is cute.

You will be given a decision tree and a list of animals with their features. For each item, you need to return the probability that the animal is cute.


The first line of input contains a single integer, N, the number of test cases. N test cases follow.

Each test case description will start with a line that contains an integer L -- the number of lines that describe a decision tree. The next L lines will contain a decision tree in the format described above. The line after that will contain A -- the number of animals. The next A lines will each contain the description of one animal in the following format.

animal n feature1 feature2 ... featuren


For each test case, output one line containing "Case #x:" followed by exactly A lines, one per animal, in the same order as they appear in the input. Each line should contain the probability that the animal is cute. Answers that are precise to within an absolute or relative error of 10-6 will be considered correct.


Time limit: 30 seconds per test set.
Memory limit: 1 GB.
1 ≤ N ≤ 100
All weights will be between 0 and 1, inclusive.
All weights will consist of only digits with at most one decimal point.
The weights will not start or end with a decimal point.
The weights will not have more than one 0 before a decimal point.
All animals and features will consist of between 1 and 10 lower case English letters.
All animal names within a test case will be distinct.
All feature names for a single animal will be distinct.
Each of the L lines in a decision tree definition will have at most 80 characters, not including the endlines.

Small dataset

1 ≤ L ≤ 10
1 ≤ A ≤ 10
0 ≤ n ≤ 5

Large dataset

1 ≤ L ≤ 100
1 ≤ A ≤ 100
0 ≤ n ≤ 100


Sample Input
content_copy Copied!
(0.5 cool
  ( 1.000)
  (0.5 ))
anteater 1 cool
cockroach 0
Sample Output
content_copy Copied!
Case #1:

B. The Next Number


You are writing out a list of numbers. Your list contains all numbers with exactly Di digits in its decimal representation which are equal to i, for each i between 1 and 9, inclusive. You are writing them out in ascending order.

For example, you might be writing every number with two '1's and one '5'. Your list would begin 115, 151, 511, 1015, 1051.

Given N, the last number you wrote, compute what the next number in the list will be.


The first line of input contains an integer T, the number of test cases in the input. T lines follow, one for each test case, each containing a single integer N.


For each test case, output

Case #X: K
where X is the test case number, starting from 1, and K is the next integer in the list.


Memory limit: 1 GB.

Small dataset

Time limit: 20 seconds.
1 ≤ T ≤ 50
1 ≤ N ≤ 106

Large dataset

Time limit: 30 seconds.
1 ≤ T ≤ 500
1 ≤ N ≤ 1020


Sample Input
content_copy Copied!
Sample Output
content_copy Copied!
Case #1: 151
Case #2: 1105
Case #3: 6323

C. Square Math


Say we have a square that has W cells on each side and, therefore, W2 cells total. Let's go further and fill each cell with one of the following:

  • A digit from 0 to 9;
  • The addition sign (+);
  • The subtraction sign (-).
If, finally, we add a constraint that no 2 digits are horizontally or vertically adjacent and no 2 operators (+ or -) are horizontally or vertically adjacent, then our square can be called an "arithmetic square".

Square Math is the name of a puzzle where, given an arithmetic square, we start from any numeric cell and move either horizontally or vertically a cell at a time, finally ending in a numerical cell. The mathematical expression we get from the traversal is evaluated to get a single value. For example:


The above is a valid arithmetic square of size W = 3. If we start from "2", move horizontally right, then vertically down, we'll get "2+4", which gives a value of "6". If we further move horizontally right, then vertically up, we'll get "2+4-3", which is equal to "3".

In Square Math, there is no limit to how many times you can use a particular cell. It is perfectly legal to move from a cell to its neighbor, then back to the original cell. Given an arithmetic square and a list of queries, your task is to find a Square Math expression which evaluates to each query.


The first line of input contains a single integer, T. T test cases follow. The first line of each test case contains 2 integers, W and Q. W lines follow, each containing W characters, representing the arithmetic square. Don't worry, all arithmetic squares in the input are well-formed. The following line contains a space separated list of Q integers, representing the values which need to be computed by using Square Math (the queries). You can assume that all given values will have at least one possible Square Math solution.


For each test case, begin output with "Case #X:" on a line by itself, where X is the test case number, starting from 1. Then, for each query within the test case, print the Square Math expression which evaluates to the query on a line by itself.

In the case where there are multiple possible Square Math expressions, print the one that is shortest. If there is still a tie, print the lexicographically smallest expression. Remember that '+' is lexicographically smaller than '-'.


Memory limit: 1 GB.
1 ≤ T ≤ 60

Small dataset

Time limit: 30 seconds.
2 ≤ W ≤ 10
1 ≤ Q ≤ 20
1 ≤ each query ≤ 50

Large dataset

Time limit: 60 seconds.
2 ≤ W ≤ 20
1 ≤ Q ≤ 50
1 ≤ each query ≤ 250


Sample Input
content_copy Copied!
5 3
20 30 40
3 2
2 20
Sample Output
content_copy Copied!
Case #1:
Case #2:

Analysis — A. Decision Tree

The hard part here was parsing the tree. An easy way to do this is by using a technique called recursive descent. The tree grammar is defined recursively, so it makes sense to parse it recursively, too. Here is a solution in Python.

import re
import sys
inp = sys.stdin

tokens = None
ti = -1

def ReadInts():
  """Reads several space-separated integers on a line.
  return tuple(map(int, inp.readline().strip().split(" ")))

def NextToken():
  """Consumes the next token from 'tokens'.
  global ti
  assert 0 <= ti < len(tokens)
  ti += 1
  return tokens[ti - 1]

def ParseNode():
  """Parses from 'tokens' and returns a tree node.
  global ti
  assert NextToken() == "("
  node = {"weight": float(NextToken())}
  tok = NextToken()
  if tok == ")":
    return node
  node["feature"] = tok
  node[True] = ParseNode()
  node[False] = ParseNode()
  assert NextToken() == ")"
  return node

def ParseTree(s):
  """Initializes 'tokens' and 'ti' and parses a tree.
  global tokens
  global ti
  s = re.compile(r"\(").sub(" ( ", s)
  s = re.compile(r"\)").sub(" ) ", s)
  s = re.compile(r"[ \n]+").sub(" ", " %s " % s)
  tokens = s[1:-1].split(" ")
  ti = 0
  return ParseNode()

def Evaluate(tree, features):
  ans = tree["weight"]
  if "feature" in tree:
    ans *= Evaluate(tree[tree["feature"] in features], features)
  return ans

if __name__ == "__main__":
  N = ReadInts()[0]
  for prob in xrange(1, N + 1):
    n_lines = ReadInts()[0]
    lines = [inp.readline() for _ in xrange(n_lines)]
    tree = ParseTree(" ".join(lines))
    n_queries = ReadInts()[0]
    print "Case #%d:" % prob
    for _ in xrange(n_queries):
      features = set(inp.readline().strip().split(" ")[2:])
      print "%.7f" % Evaluate(tree, features)

For each test case, we read the 'n_lines' lines containing the tree definition, we glue them together using spaces and pass the resulting string to ParseTree().

In ParseTree(), we do some "massaging" to make the string easier to parse. First, we put spaces around each parenthesis by using two simple regular expressions. Next, we replace each sequence of whitespace characters by a single space and make sure there is always exactly one space character at the beginning and the end of the input. Finally, we strip off the leading and trailing spaces and split the rest into tokens.

The ParseNode() function does the rest. It uses the NextToken() function to read one token at a time from the 'tokens' list and returns a simple dictionary representation of a tree node.

Once we have the tree as a dictionary, we then use Evaluate() to do a tree traversal from the root to a leaf and compute the answer for each input animal.

Using the parsers built into dynamic languages

A number of contestants have noticed that there is an even easier way to parse the tree. Most dynamic, interpreted languages give you access to their built-in parser, and by manipulating the input a little bit, it is possible to use make the language's interpreter do the parsing for you! This means using "eval" in JavaScript or Python, or "read" in Lisp. Check out some of the shortest solutions at

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Analysis — B. The Next Number

Let x be our input. We want to find the next number y. We denote L(s) to be the length of s, i.e., the number of digits in s.

Case 1. If all the digits in x are non-increasing, for example x = 776432100, then x is already the biggest one among the numbers in the list with L(x) digits. The next number, y, must have one more digit, that is, one more 0. Actually y must be the smallest one with L(x)+1 digits. To get y, we put the smallest non-zero digit in front, and all the other digits are put in non-decreasing order.

Case 2. Otherwise, x can be written as the concatenation x = ab, where b is the longest non-increasing suffix of x, so d, the last digit of a is smaller than the first digit of b. Let d' be the smallest among all the digits in b that are bigger than d.
Because b is non-increasing, x is the biggest number among those who has L(x) digits and starts with prefix a. The first L(a) digits of y must be bigger than a. The smallest we can do is to replace d with d', and then for the rest digits, we arrange them in non-decreasing order.
Let us do another example. x = 134266530. Then a = 1342, b = 66530, d = 2, and d' = 3. The next number is y = 134302566.

In fact, we can unify the two cases above. Since the number of 0's is not restricted, we can just imagine there is one more 0 in the beginning of x, thus Case 1 is reduced to Case 2.

The above described is actually exactly the procedure to get the next permutation of a finite sequence in certain languages. Below is a solution that is essentially one line in C++. From the author:

  deque<char> f;
  next_permutation(f.begin(), f.end());
  if (f.front() == '0') f.pop_front();

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Analysis — C. Square Math

The problem would be a lot easier if there were only plus signs used. With the presence of the negative signs, a valid expression might get to a large value in the middle, then decrease back to the value we are looking for. Will that be the shortest answer for a certain query at all?

The best path cannot be long

First of all, let us denote a non-zero digit in the square pos if it has at least one neighbor that is a plus sign; call it neg if it has one minus sign as its neighbor. A digit can be both pos and neg. We assume there are both pos digits and neg digits in the square. Let q be the value we are looking for.
Let g be the gcd (greatest common divisor) of all the digits in the square. If g is not 1, in order to find a solution for q, it must be a multiple of g. Dividing everything by g, we may assume from now on that the gcd of all the numbers in the square is 1. The situation is further simplified when all our numbers are between 0 and 9 -- in this case, the above implies that there must be two digits, a and b, in the square, such that gcd(a, b) = 1.

Case 1. a is pos and b is neg. Take any shortest path P from a to b in the square. Let q' be the value of this path. q' is between [-200, 200]. Let t = q - q'. We prove the case t ≥ 0; the other case is similar and left to the readers.
Since gcd(a, b)=1, one of the numbers t, t+b, t+2b, ..., t+(a-1)b is a multiple of a. This means we can find non-negative numbers x and y such that ax - by = t, where y < a (therefore x < t/a + b). We start from a, since it is a positive digit, we use the plus sign to repeat at a x times, then follow P. After reaching b, we use the minus sign to repeat y times. The path just described evaluates to the query q.

Case 2. Both a and b are pos, there is a neg digit c. If c is co-prime with either a or b, we handle it as Case 1. Otherwise c has to be 6.
Pick any shortest path P that connects a, b, and c. Suppose it evaluates to q'. Pick a non-negative z such that q - q' + 6z ≥ ab - a - b + 1. We use a basic fact in number theory here

If positive integers a and b are co-prime, then for any t ≥ ab - a - b + 1, there exist non-negative integers x and y such that ax + by = t.
To get an answer for query q, we use the path P, repeat a x times with plus sign, b y times with plus sign, and c z times with minus sign.

Case 3. Both a and b are neg. This is similar to Case 2, and we leave it to the readers.

Thus we proved that for any solvable query, there is a path that is not long, as well as any partial sum on the path cannot be too big. The rough estimate shows that any of the paths cannot take more than 1000 steps. One can get a better bound by doing more careful analysis.

The algorithm

Our solution is a BFS (breadth first search). The search space consists all the tuples (r, c, v), where (r, c) is the position of a digit, and denote A(r, c, v) to be the best path that evaluates to v, and ends at the position (r, c). We know there are at most 200 such (r, c)'s (202/2), and at most (much less than) 20000 such v's from the bound we get.

The only difference with a standard BFS is that, because of the lexicographical requirement, we may need to update the answer on a node. But we never need to re-push it into the queue, since it is not yet expanded.

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Statistics — A. Decision Tree

Statistics — B. The Next Number

Statistics — C. Square Math