Google Code Jam Archive — Round 1C 2008 problems


In this last sub-round of Google Code Jam 2008 Round 1, 2240 contestants solved at least one dataset, 17 among them got the perfect score of 100 points.

The first problem was the easiest one among the 9 Round 1 problems. It is interesting to note that there were contestants from the first page of the scoreboard who solved all of the datasets, except for A-large.

In the second problem, dynamic programming, a common topic in programming contests, made its first appearance in this tournament. We intended this to be an easy dynamic programming problem with a little trick in number theory. From the scoreboard we can tell that this is not a trivial problem.

The last problem, with a nice story, can be regarded as an exercise in data structures. In fact, the small dataset can be solved using standard dynamic programming. For the large dataset, one needs a way to speed it up. In this analysis we introduce an implementation that uses a binary indexed tree.


Problem A. Text Messaging Outrage Written by Mohamed Eldawy. Prepared by Marius Andrei

Problem B. Ugly Numbers Written and prepared Xiaomin Chen.

Problem C. Increasing Speed Limits Written by Petr Mitrichev. Prepared by Mark Gordon.

Contest analysis presented by Cosmin Negruseri, Xiaomin Chen, and Mark Gordon.

A. Text Messaging Outrage

The story

Professor Loony, a dear friend of mine, stormed into my office. His face was red and he looked very angry. The first thing that came out of his mouth was "Damn those phone manufacturers. I was trying to send a text message, and it took me more than ten minutes to type a one-line message." I tried to calm him down. "But what is wrong? Why did it take you so long?" He continued, "Don't you see?! Their placement of the letters is so messed up? Why is 's' the 4th letter on its key? and 'e'? Why is it not the first letter on its key? I have to press '7' FOUR times to type an 's'? This is lunacy!"

"Calm down, my friend," I said, "This scheme has been in use for so long, even before text messaging was invented. They had to keep it that way."

"That's not an excuse," his face growing redder and redder. "It is time to change all this. It was a stupid idea to start with. And while we are at it, how come they only put letters on 8 keys? Why not use all 12? And why do they have to be consecutive?"

"Umm... I... don't... know," I replied.

"Ok, that's it. Those people are clearly incompetent. I am sure someone can come up with a better scheme."

He was one of those people, I could see. People who complain about the problem, but never actually try to solve it.

In this problem, you are required to come up with the best letter placement of keys to minimize the number of key presses required to type a message. You will be given the number of keys, the maximum number of letters we can put on every key, the total number of letters in the alphabet, and the frequency of every letter in the message. Letters can be placed anywhere on the keys and in any order. Each letter can only appear on one key. Also, the alphabet can have more than 26 letters (it is not English).

For reference, the current phone keypad looks like this

key 2: abc
key 3: def
key 4: ghi
key 5: jkl
key 6: mno
key 7: pqrs
key 8: tuv
key 9: wxyz

The first press of a key types the first letter. Each subsequent press advances to the next letter. For example, to type the word "snow", you need to press "7" four times, followed by "6" twice, followed by "6" three times, followed by "9" once. The total number of key presses is 10.


The first line in the input file contains the number of test cases N. This is followed by N cases. Each case consists of two lines. On the first line we have the maximum number of letters to place on a key (P), the number of keys available (K) and the number of letters in our alphabet (L) all separated by single spaces. The second line has L non-negative integers. Each number represents the frequency of a certain letter. The first number is how many times the first letter is used, the second number is how many times the second letter is used, and so on.


For each case, you should output the following

Case #x: [minimum number of keypad presses]

indicating the number of keypad presses to type the message for the optimal layout.


Time limit: 30 seconds per test set.
Memory limit: 1GB.
P * KL
0 ≤ The frequency of each letter ≤ 1000000

Small dataset (Test set 1 - Visible)

1 ≤ N ≤ 10
1 ≤ P ≤ 10
1 ≤ K ≤ 12
1 ≤ L ≤ 100

Large dataset (Test set 2 - Hidden)

1 ≤ N ≤ 100
1 ≤ P ≤ 1 000
1 ≤ K ≤ 1 000
1 ≤ L ≤ 1 000


Sample Input
content_copy Copied!
3 2 6
8 2 5 2 4 9
3 9 26
1 1 1 100 100 1 1 1 1 1 1 1 1 1 1 1 1 10 11 11 11 11 1 1 1 100
Sample Output
content_copy Copied!
Case #1: 47
Case #2: 397

B. Ugly Numbers


Once upon a time in a strange situation, people called a number ugly if it was divisible by any of the one-digit primes (2, 3, 5 or 7). Thus, 14 is ugly, but 13 is fine. 39 is ugly, but 121 is not. Note that 0 is ugly. Also note that negative numbers can also be ugly; -14 and -39 are examples of such numbers.

One day on your free time, you are gazing at a string of digits, something like:


You are amused by how many possibilities there are if you are allowed to insert plus or minus signs between the digits. For example you can make

1 + 234 - 5 + 6 = 236

which is ugly. Or

123 + 4 - 56 = 71

which is not ugly.

It is easy to count the number of different ways you can play with the digits: Between each two adjacent digits you may choose put a plus sign, a minus sign, or nothing. Therefore, if you start with D digits there are 3D-1 expressions you can make.

Note that it is fine to have leading zeros for a number. If the string is "01023", then "01023", "0+1-02+3" and "01-023" are legal expressions.

Your task is simple: Among the 3D-1 expressions, count how many of them evaluate to an ugly number.


The first line of the input file contains the number of cases, N. Each test case will be a single line containing a non-empty string of decimal digits.


For each test case, you should output a line

Case #X: Y

where X is the case number, starting from 1, and Y is the number of expressions that evaluate to an ugly number.


Time limit: 30 seconds per test set.
Memory limit: 1GB.
0 ≤ N ≤ 100.
The string in each test case will be non-empty and will contain only characters '0' through '9'.

Small dataset (Test set 1 - Visible)

Each string is no more than 13 characters long.

Large dataset (Test set 2 - Hidden)

Each string is no more than 40 characters long.


Sample Input
content_copy Copied!
Sample Output
content_copy Copied!
Case #1: 0
Case #2: 1
Case #3: 6
Case #4: 64

C. Increasing Speed Limits


You were driving along a highway when you got caught by the road police for speeding. It turns out that they've been following you, and they were amazed by the fact that you were accelerating the whole time without using the brakes! And now you desperately need an excuse to explain that.

You've decided that it would be reasonable to say "all the speed limit signs I saw were in increasing order, that's why I've been accelerating". The police officer laughs in reply, and tells you all the signs that are placed along the segment of highway you drove, and says that's unlikely that you were so lucky just to see some part of these signs that were in increasing order.

Now you need to estimate that likelihood, or, in other words, find out how many different subsequences of the given sequence are strictly increasing. The empty subsequence does not count since that would imply you didn't look at any speed limits signs at all!

For example, (1, 2, 5) is an increasing subsequence of (1, 4, 2, 3, 5, 5), and we count it twice because there are two ways to select (1, 2, 5) from the list.


The first line of input gives the number of cases, N. N test cases follow. The first line of each case contains n, m, X, Y and Z each separated by a space. n will be the length of the sequence of speed limits. m will be the length of the generating array A. The next m lines will contain the m elements of A, one integer per line (from A[0] to A[m-1]).

Using A, X, Y and Z, the following pseudocode will print the speed limit sequence in order. mod indicates the remainder operation.

for i = 0 to n-1
  print A[i mod m]
  A[i mod m] = (X * A[i mod m] + Y * (i + 1)) mod Z

Note: The way that the input is generated has nothing to do with the intended solution and exists solely to keep the size of the input files low.


For each test case you should output one line containing "Case #T: S" (quotes for clarity) where T is the number of the test case and S is the number of non-empty increasing subsequences mod 1 000 000 007.


Time limit: 60 seconds per test set.
Memory limit: 1GB.
1 ≤ N ≤ 20
1 ≤ m ≤ 100
0 ≤ X ≤ 109
0 ≤ Y ≤ 109
1 ≤ Z ≤ 109
0 ≤ A[i] < Z

Small dataset (Test set 1 - Visible)

1 ≤ mn ≤ 1000

Large dataset (Test set 2 - Hidden)

1 ≤ mn ≤ 500000


Sample Input
content_copy Copied!
5 5 0 0 5
6 2 2 1000000000 6
Sample Output
content_copy Copied!
Case #1: 15
Case #2: 13

The sequence of speed limit signs for case 2 should be 1, 2, 0, 0, 0, 4.

Analysis — A. Text Messaging Outrage

This was the one of the easiest problems of Round 1. We simply need to fill the cell phone keyboard greedily. We put the K most frequent letters in the first positions of the K keys, the next K most frequent letters in the second positions, and so on. Any optimal solution will have this structure because if it does not, then it can be improved by swapping a more frequent character from a higher indexed position of some key with a less frequent character from a lower indexed position, thus decreasing the number of key presses.

Here is code that implements this solution:

long long A[1000];

int main() {
  int N;
  cin >> N;
  for(int t = 1; t <= N; t++) {
    long long result = 0;
    int P, K, L;
    cin >> P >> K >> L;
    for(int i = 0; i < L; i++) cin >> A[i];
    sort(A, A+L);
    reverse(A, A+L);

    for(int i = 0; i < L; i++)
      result += (1 + i / K) * A[i];

    cout << "Case #" << t << ": " << result << endl;
Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Analysis — B. Ugly Numbers

As the problem clearly states, there are 3D-1 ways you need to consider. When D is no larger than 13, as it is in the small dataset, one may simply generate all the possible combinations and calculate each of them. However, with D as large as 40, brute force is clearly too slow.

As you might expect from a programming contest, the magic term is dynamic programming. This is the first problem in Google Code Jam 2008 that falls into the standard dynamic programming (DP) category; and one can be assured that there are more to come.

As in any DP problem, the first task is to structure the problem in a good way so there are not too many numbers to compute, and each number can be computed easily from previous ones. If you observe the solutions of the top scorers in this sub-round, you will see, although their algorithms vary to some degre, they all contain the following magic number

2 · 3 · 5 · 7 = 210.

Let us say we have two numbers, x and y, knowing the ugliness of x and y is not enough to decide whether x + y and x - y are ugly. On the other hand, we do not need the exact value of x and y. Knowing x % 210 and y % 210 is enough for us to decide if, say, x + y is ugly or not.

For those who prefer mathematical terms, we are using the Chinese remainder theorem. Our problem can be viewed as arithmetics on the cyclic group (Z210, +).

So, let us outline the most central step of our dynamic programming solution. We want to compute

dyn[i][x] := number of ways we get an expression evaluating 
          to x (mod 210) if we only consider the first i
          characters of the string. (*)

So, we have only 40·210 numbers to consider. For each dyn[i][x], we try all possible positions for inserting the last '+' or '-' sign. If the last sign was a '+' before position j (j<i), and the number formed by digits from position j to position i is d, then we want to know dyn[j-1][(x-d)%210]. On the other hand, if the sign inserted was a '-', then we want to look at dyn[j-1][(x+d)%210].

Here is a masterful implementation from our own Derek Kisman. His complete C++ solution follows the idea above, with a little twist and some nice programming tricks.

#define MOD (2*3*5*7)
string s;
long long dyn[41][MOD];

main() {
  int N, prob=1;
  for (cin >> N; N--;) {
    cin >> s;
    memset(dyn, 0, sizeof(dyn));
    dyn[0][0] = 1;
    for (int i = 0; i < s.size(); i++)
    for (int sgn = (i==0) ? 1 : -1; sgn <= 1; sgn += 2) {
      int cur = 0;
      for (int j = i; j < s.size(); j++) {
        cur = (cur*10 + s[j]-'0')%MOD;
        for (int x = 0; x < MOD; x++)
          dyn[j+1][(x+sgn*cur+MOD)%MOD] += dyn[i][x];
    long long ret = 0;
    for (int x = 0; x < MOD; x++)
      if (x%2 == 0 || x%3 == 0 || x%5 == 0 || x%7 == 0)
        ret += dyn[s.size()][x];
    cout << "Case #" << prob++ << ": " << ret << endl;

More information:

Dynamic programming - Chinese remainder theorem

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Analysis — C. Increasing Speed Limits

This was the hard problem for round 1C, with 398 people solving the small i/o set and 49 people solving the large i/o set during the contest.

The problem asks us to count the number of strictly increasing subsequences of the original sequence. This lends itself nicely to a dynamic programming solution for the small input.

Let f(x) be the number of strictly increasing subsequences that start with node x and S[x] be the value of the sequence at position x. At any node x we can either end the subsequence at x or connect it to any of the following nodes that are strictly greater than node x. Therefore, you could do something like this to find out how many strictly increasing subsequences start at each index.

f(x) = 1
for i = n to 1
  f(i) = 1
  for j = i + 1 to n
    if S[i] < S[j]
      f(i) = f(i) + f(j)

After doing that, solving the original problem is just a matter of summing the number of ways you make a strictly increasing subsequence starting at each position.

The key to solving the large i/o set is to realize we can make the inner loop run much faster. What we really want to do is sum all of the previous f(j)s for which S[j] > S[i]. This is a classic tree problem. There are a number of tree data structures you could use including a binary indexed tree, a segment tree, or a binary search tree, to name a few, that will allow you to solve the problem in O(n log n) time per case.

The first two types of trees are fairly easy to implement although they add a complication since they use an amount of memory proportional to the maximum value in the sequence. Fortunately this can be worked around by normalizing S to contain only values between 0 and n-1 without changing the the S[i] < S[j] property for any i and j. This can be done by transforming S[i] to be the first occurrence of S[i] in a sorted list of S.

Here is an implementation of these ideas in C++.

#define MAXN (1<<20)
int sum_bit[MAXN];

int sum_bit_get(int x)
  int ret = 0;
  for(int i = x | MAXN; i < 2 * MAXN; i += i & -i)
    ret = (ret + sum_bit[i ^ MAXN]) % 1000000007;
  return ret;

void sum_bit_add(int x, int v)
  for(int i = x | MAXN; i; i &= i - 1)
    sum_bit[i ^ MAXN] = (sum_bit[i ^ MAXN] + v) % 1000000007;

int S[1000000];
int S2[1000000];

int main()
  int T; cin >> T;
  long long X, Y, Z, A[100];
  for(int t = 1; t <= T; t++) {
    int n, m; cin >> n >> m >> X >> Y >> Z;
    for(int i = 0; i < m; i++) cin >> A[i];

    // Generate S
    for(int i = 0; i < n; i++) {
      S[i] = A[i % m];
      A[i % m] = (X * A[i % m] + Y * (i + 1)) % Z;

    // Normalize S
    memcpy(S2, S, sizeof(S));
    sort(S2, S2+n);
    for(int i = 0; i < n; i++)
      S[i] = lower_bound(S2, S2+n, S[i]) - S2;

    // Calculate f(i) and sum them in to result.
    int result = 0;
    memset(sum_bit, 0, sizeof(sum_bit));
    for(int i = n - 1; i >= 0; i--) {
      int add = 1 + sum_bit_get(S[i] + 1);
      sum_bit_add(S[i], add);
      result = (result + add) % 1000000007;
    cout << "Case #" << t << ": " << result << endl;
  return 0;

In addition to trees, there was a "hacker's" O(n √n) solution. The idea behind it is similar to the idea mentioned in solution 1 of the Mousetrap editorial (from Round 1B) where, in addition to maintaining the value at each position, we maintain sums for ranges of length √N. These tables can be maintained in constant time and take √n time to query.

More Information:
Binary Indexed Tree

Test Data
info We recommend that you practice debugging solutions without looking at the test data.

Statistics — A. Text Messaging Outrage

Test set 1: 2204 correct solutions (98.4% solve rate)

xhl_kogitsune aka xhl.kogitsune C++, 8:24
Graceful blue rhino 9:03
Jan_Kuipers aka JanKuipers C++, 9:07
Vasyl(alphacom) aka Vasyl C++, 9:53
nodchip C++, 10:09
mmsganesh -, 146 bytes
henryw aka HenryW C, 191 bytes
mtran Ruby, 242 bytes
pinc Ruby, 269 bytes
philipmatarese Ruby, 284 bytes

Test set 2: 1402 correct solutions (62.6% solve rate)

xhl_kogitsune aka xhl.kogitsune C++, 8:24
Graceful blue rhino 9:03
Jan_Kuipers aka JanKuipers C++, 9:07
Vasyl(alphacom) aka Vasyl C++, 9:53
nodchip C++, 10:09
mtran Ruby, 242 bytes
pinc Ruby, 269 bytes
philipmatarese Ruby, 284 bytes
sholden Ruby, 308 bytes
Primat Python, 340 bytes

Statistics — B. Ugly Numbers

Test set 1: 554 correct solutions (24.7% solve rate)

xhl_kogitsune aka xhl.kogitsune C++, 20:00
Digibomb C++, 23:44
tiagomt C, 24:42
anrieff C++, 25:08
elizarov Java, 27:24
MTH aka mth Python, 594 bytes
paolino Haskell, 595 bytes
RalfKistner Python, 608 bytes
Aidy Perl, 633 bytes
tiagomt C, 668 bytes

Test set 2: 82 correct solutions (3.7% solve rate)

xhl_kogitsune aka xhl.kogitsune C++, 20:00
elizarov Java, 27:24
Hackerham 28:36
JongMan C++, 31:32
Klinck Python, 33:55
Klinck Python, 766 bytes
gaosimeng C++, 886 bytes
Danko Python, 900 bytes
zalenski C++, 961 bytes
xhl_kogitsune aka xhl.kogitsune C++, 1024 bytes

Statistics — C. Increasing Speed Limits

Test set 1: 398 correct solutions (17.8% solve rate)

devilkind C++, 23:35
kyun C++, 26:45
Primat Python, 29:21
bjq Java, 29:25
LifeCoder C++, 31:53
leftleaf C++, 196 bytes
StepInto C++, 512 bytes
Primat Python, 523 bytes
MTH aka mth Python, 540 bytes
ravehanker Ruby, 548 bytes

Test set 2: 49 correct solutions (2.2% solve rate)

vepifanov Pascal, 44:48
xhl_kogitsune aka xhl.kogitsune C++, 51:04
Hackerham 52:11
anrieff C++, 57:54
Xiefeng C++, 59:07
Xiefeng C++, 1161 bytes
ulzha C++, 1180 bytes
nutki C, 1254 bytes
Progbeat C++, 1356 bytes
imos C++, 1359 bytes