LeetCode 818. Race Car

https://leetcode.com/problems/race-car/description/

Your car starts at position 0 and speed +1 on an infinite number line. (Your car can go into negative positions.)

Your car drives automatically according to a sequence of instructions A (accelerate) and R (reverse).

When you get an instruction "A", your car does the following: position += speed, speed *= 2.

When you get an instruction "R", your car does the following: if your speed is positive then speed = -1 , otherwise speed = 1. (Your position stays the same.)

For example, after commands "AAR", your car goes to positions 0->1->3->3, and your speed goes to 1->2->4->-1.
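
The two rules are easy to check by simulating them directly. The following is a minimal sketch; the class name RaceCarSimulator and the method run are hypothetical helpers introduced here for illustration, not part of the problem statement.

```java
// Hypothetical helper for illustration; not part of the original problem statement.
final class RaceCarSimulator {
    /** Runs the instructions from position 0 with speed +1 and returns {position, speed}. */
    static int[] run(String instructions) {
        int position = 0;
        int speed = 1;
        for (char c : instructions.toCharArray()) {
            if (c == 'A') {
                // Accelerate: move by the current speed, then double the speed.
                position += speed;
                speed *= 2;
            } else if (c == 'R') {
                // Reverse: position is unchanged, speed resets to magnitude 1 in the other direction.
                speed = speed > 0 ? -1 : 1;
            } else {
                throw new IllegalArgumentException("unknown instruction: " + c);
            }
        }
        return new int[] {position, speed};
    }

    public static void main(String[] args) {
        int[] state = run("AAR");
        System.out.println(state[0] + " " + state[1]); // prints "3 -1"
        System.out.println(run("AAARA")[0]);            // prints "6"
    }
}
```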

Now for some target position, say the length of the shortest sequence of instructions to get there.

Example 1:
Input: 
target = 3
Output: 2
Explanation: 
The shortest instruction sequence is "AA".
Your position goes from 0->1->3.

Example 2:
Input: 
target = 6
Output: 5
Explanation: 
The shortest instruction sequence is "AAARA".
Your position goes from 0->1->3->7->7->6.

1 <= target <= 10000.

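Approach

Analysis: https://leetcode.com/problems/race-car/discuss/124326/Summary-of-the-BFS-and-DP-solutions-with-intuitive-explanation

My first idea was a BFS over all possible instruction sequences, but a naive version exceeds the memory and time limits, so the repeated work has to be pruned. The other solution is the all-purpose DP.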

BFS

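There are two main optimizations. First, each (position, speed) pair reached during the traversal forms a state, and a state that has already been visited is not added to the queue again. Second, there is the check 0 < nxt[0] && nxt[0] < (target << 1). I did not fully understand this one; the explanation given in the discussion thread is: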

There are cases when it is desirable to go past the target and then come back. For example, target = 6, we could go 0 --> 1 --> 3 --> 7 --> 7 --> 6, which takes 5 instructions (AAARA). If you reverse before passing the target, it takes more than 5 instructions to get to the target. That said, we don't want to go too far past the target. The rule of thumb (I have not proved it rigorously though) is that we stay within some limit from the target, and this limit is set by the initial distance from the target, which is target. So from the point of view of the target, we want the car to stay in the range [0, 2 * target]. This is the primary optimization for the BFS solution.

public int racecar(int target) {
    Queue<int[]> queue = new LinkedList<>();
    queue.offer(new int[] {0, 1}); // starts from position 0 with speed 1
    
    Set<String> visited = new HashSet<>();
    visited.add(0 + " " + 1);
    
    for (int level = 0; !queue.isEmpty(); level++) {
        for(int k = queue.size(); k > 0; k--) {
            int[] cur = queue.poll();  // cur[0] is position; cur[1] is speed
            
            if (cur[0] == target) {
                return level;
            }
            
            int[] nxt = new int[] {cur[0] + cur[1], cur[1] << 1};  // accelerate instruction
            String key = (nxt[0] + " " + nxt[1]);
            
            if (!visited.contains(key) && 0 < nxt[0] && nxt[0] < (target << 1)) {
                queue.offer(nxt);
                visited.add(key);
            }
            
            nxt = new int[] {cur[0], cur[1] > 0 ? -1 : 1};  // reverse instruction
            key = (nxt[0] + " " + nxt[1]);
            
            if (!visited.contains(key) && 0 < nxt[0] && nxt[0] < (target << 1)) {
                queue.offer(nxt);
                visited.add(key);
            }
        }
    }
    
    return -1;
}
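
Since the [0, 2 * target] window is described as a rule of thumb rather than a proven bound, it can at least be spot-checked on small inputs. The sketch below runs the same BFS with a configurable window and compares the tight window against a much looser one. The class RaceCarBoundCheck, its methods, and the choice of (-8 * target, 8 * target) as the loose window are assumptions made for illustration. Agreement on a small range is empirical evidence only, not a proof.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical harness for illustration; the names are not from the original post.
final class RaceCarBoundCheck {
    /** BFS over (position, speed) states, enqueuing only positions with lo < position < hi. */
    static int bfs(int target, int lo, int hi) {
        Queue<int[]> queue = new ArrayDeque<>();
        Set<Long> visited = new HashSet<>();
        queue.offer(new int[] {0, 1});
        visited.add(key(0, 1));
        for (int level = 0; !queue.isEmpty(); level++) {
            for (int k = queue.size(); k > 0; k--) {
                int[] cur = queue.poll();
                if (cur[0] == target) return level;
                int[][] next = {
                    {cur[0] + cur[1], cur[1] * 2},  // accelerate
                    {cur[0], cur[1] > 0 ? -1 : 1}   // reverse
                };
                for (int[] nxt : next) {
                    // Set.add returns false when the state was already visited.
                    if (lo < nxt[0] && nxt[0] < hi && visited.add(key(nxt[0], nxt[1]))) {
                        queue.offer(nxt);
                    }
                }
            }
        }
        return -1;
    }

    /** Packs a (position, speed) pair into one long so it can be used as a set key. */
    static long key(int position, int speed) {
        return ((long) position << 32) | (speed & 0xFFFFFFFFL);
    }

    /** True if the tight and the loose window give the same answer for every target in 1..maxTarget. */
    static boolean windowsAgree(int maxTarget) {
        for (int t = 1; t <= maxTarget; t++) {
            if (bfs(t, 0, 2 * t) != bfs(t, -8 * t, 8 * t)) return false;
        }
        return true;
    }
}
```

Packing the state into a long instead of building a String key is a design choice of this sketch; it avoids string concatenation on every visited check but does not change which states are explored.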

DP

The hard part of DP is setting up the model: it has to describe the problem accurately and also make the recurrence easy to find.

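Reference: https://blog.csdn.net/magicbean2/article/details/80333734

Using dynamic programming, define dp[target] as the minimum number of instructions needed to travel a distance of target. I read several DP write-ups without really following them, and finally found this easier explanation: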

https://leetcode.com/problems/race-car/discuss/123834/C++JavaPython-DP-solution

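First, for any target we can find a positive integer n such that 2 ^ (n - 1) <= target < 2 ^ n; that is, n is the length of target written in binary. For example, 4 is 100 in binary, so n = 3. There are two strategies: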

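1. Go past the target, stop and turn back
We take n instructions of A.
1 + 2 + 4 + ... + 2 ^ (n-1) = 2 ^ n - 1
Then we turn back with one R instruction.
After these n + 1 instructions, the remaining distance is 2 ^ n - 1 - target.

2. Go as far as possible without passing the target, stop and turn back
We take n - 1 instructions of A and one R.
Then we take m instructions of A, where m < n - 1, and one more R to face the target again.
After these n + m + 1 instructions, the remaining distance is target - 2 ^ (n - 1) + 2 ^ m.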
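
Written as a recurrence, the two strategies look as follows. This is a restatement, assuming 0 <= m < n - 1 as in the loop bounds of the code, and assuming the second strategy ends with a second R so that the car faces the target again.

```latex
\[
dp[t] =
\begin{cases}
n, & t = 2^{n} - 1,\\[4pt]
\min\Bigl( dp[\,2^{n} - 1 - t\,] + n + 1,\;
\min\limits_{0 \le m < n-1} \bigl( dp[\,t - 2^{n-1} + 2^{m}\,] + n + m + 1 \bigr) \Bigr), & \text{otherwise,}
\end{cases}
\qquad 2^{n-1} \le t < 2^{n}.
\]
```

For target = 6 we have n = 3. Strategy 1 costs dp[1] + 4 = 1 + 4 = 5. Strategy 2 costs dp[3] + 4 = 2 + 4 = 6 for m = 0 and dp[4] + 5 = 5 + 5 = 10 for m = 1. The minimum is 5, which matches Example 2.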

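    int[] dp = new int[10001];
    public int racecar(int t) {
        if (dp[t] > 0) return dp[t]; // already computed: return the memoized value
        // bit length of t, so 2^(n-1) <= t < 2^n (integer arithmetic instead of Math.log avoids rounding issues)
        int n = 32 - Integer.numberOfLeadingZeros(t);
        if (1 << n == t + 1) dp[t] = n; // t == 2^n - 1: n straight A's land exactly on the target
        else {
            // n A's overshoot the target, then R; the leftover distance is solved recursively as a subproblem
            dp[t] = racecar((1 << n) - 1 - t) + n + 1;
            for (int m = 0; m < n - 1; ++m)
                dp[t] = Math.min(dp[t], racecar(t - (1 << (n - 1)) + (1 << m)) + n + m + 1);  // mark 1
        }
        return dp[t];
    }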

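My reading of the computation at mark 1 is as follows. First take n - 1 A's, which reaches position 2 ^ (n - 1) - 1, the farthest point that does not pass the target (since 2 ^ (n - 1) <= target < 2 ^ n). Then take one R, which leaves the car at that position. The car now faces backwards, that is, to the left, so we loop over 0 <= m < n - 1 and recursively evaluate every possibility. For example, with m = 0,

dp[t] = Math.min(dp[t], racecar(t - (1 << (n - 1)) + (1 << m)) + n + m + 1);

becomes

Math.min(dp[t], racecar(t - (1 << (n - 1)) + 1) + n + 1)

Here racecar(t - (1 << (n - 1)) + 1) is the optimal answer for a distance of t - (2 ^ (n - 1) - 1), the gap between position 2 ^ (n - 1) - 1 and the target; it is a subproblem. Once the car is at position 2 ^ (n - 1) - 1 there are many such subproblems, one per value of m, and the loop solves each of them recursively.

The loop extends the leftward move step by step: from position 2 ^ (n - 1) - 1 the car takes m A's to the left, where m < n - 1. That covers 2 ^ m - 1 and leaves a distance of t - (1 << (n - 1)) + (1 << m) to the target.

The n + 1 accounts for the n - 1 A's and one R, plus the one extra R needed after driving back m A's to face the target again. The m A's themselves add m, which gives n + m + 1 in total.

Only the state right after an R can be treated as a subproblem of the same kind, because R resets the speed to magnitude 1; the only thing to watch is the direction.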
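
Because every subproblem used for a distance t is strictly smaller than t, the same recurrence can also be filled in bottom-up without recursion. The following is a minimal sketch of that variant; the class name RaceCarBottomUp is hypothetical, and this is an illustration of the idea rather than the code from the linked posts.

```java
// Hypothetical bottom-up variant for illustration; the names are not from the original post.
final class RaceCarBottomUp {
    static int racecar(int target) {
        int[] dp = new int[target + 1]; // dp[t] = fewest instructions to cover distance t
        for (int t = 1; t <= target; t++) {
            int n = 32 - Integer.numberOfLeadingZeros(t); // bit length: 2^(n-1) <= t < 2^n
            if (t == (1 << n) - 1) {
                dp[t] = n; // n straight A's land exactly on t
                continue;
            }
            // Strategy 1: n A's overshoot to 2^n - 1, one R, then cover the leftover distance.
            dp[t] = dp[(1 << n) - 1 - t] + n + 1;
            // Strategy 2: n - 1 A's, R, m A's backwards, R, then cover the leftover distance.
            for (int m = 0; m < n - 1; m++) {
                int rest = t - (1 << (n - 1)) + (1 << m); // always in 1..t-1
                dp[t] = Math.min(dp[t], dp[rest] + n + m + 1);
            }
        }
        return dp[target];
    }

    public static void main(String[] args) {
        System.out.println(racecar(3)); // prints "2"
        System.out.println(racecar(6)); // prints "5"
    }
}
```

The table here is sized to the requested target, whereas the memoized version allocates 10001 entries once and reuses them across calls; which is preferable depends on how often the function is called.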