HDU 4050 wolf5x(动态规划-概率DP)

wolf5x


Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)

Total Submission(s): 402    Accepted Submission(s): 248

Special Judge


Problem Description
There are n grids in a row. The coordinates of grids are numbered from x=1 to x=n. Someone starts from x=0. You can step forward with your left leg or right leg alternatively in turn.Namely,if you step forward with your left leg, then you must step with your right leg next. As you can not jump , only one leg is allowed to use each step. Every step you take is in the range of [A,B], inclusively; namely, every step you take is at most B units and at least A units.
Before you start to move, the grids will be initialized randomly with 4 states(0,1,2,3), and p[i][j] means the probability of ith grid initialized with state j. After initialization, the state of the grids will not change.

State 0 means you can't step into the correspoding grid.
State 1 means you can just step into the grid with your left leg. 
State 2 means you can just step into the grid with your right leg.
State 3 means you can step into the grid with either of your legs,and the next step,you can use any legs; namely you don't need to follow the rules above.
If x>n, then the grid can be stepped in with arbitrary method.means you can step at the place after the nth grid.
For every step,you will choose the “step method” with the minimum step length. Namely, if you can take the step of S units and S+1 units, you will choose the step of S units.
Until you can't step in any grids in front of you,or you have been in a grid x>n, you will stop.
Can you calculate the expectation of the steps when you stop? 
 

Input
An integer T means the number of cases.T<=30
For each case,the first line is three integers n,A,B.
The next n lines,each line has 4 number p[i][0], p[i][1], p[i][2], p[i][3].
1 <= A <= B <= n<= 2000.
0 <= p[i][j] <= 1, p[i][0]+p[i][1]+p[i][2]+p[i][3] = 1.
 

Output
The expectation of the steps when you stop
you can assume that the relative epsilon is no more than 1e-6
 

Sample Input
9 2 1 1 0 0.5 0.5 0 0 0 1 0 2 1 1 0 0.5 0.5 0 0.5 0.5 0 0 2 1 2 0 0.5 0.5 0 0 0 1 0 2 1 2 0.2 0.3 0.4 0.1 0.15 0.2 0.25 0.4 3 1 10 0 0 0 1 0 0 0 1 0 0 0 1 3 1 1 0 0 0 1 0 0 0 1 0 0 0 1 3 2 2 0 0 0 1 0 0 0 1 0 0 0 1 3 3 3 0 0 0 1 0 0 0 1 0 0 0 1 3 1 2 0.0 0.3 0.6 0.1 0.1 0.2 0.3 0.4 0.5 0.4 0.1 0.0
 

Sample Output
2.00000000 1.50000000 2.50000000 2.46000000 4.00000000 4.00000000 2.00000000 2.00000000 2.80200000
 

Source
 

Recommend
lcy
 

题目大意:

这是一维的,一个人在0号格子,如今1~n号格子排成一排,上面有各种限制,一个人想从 0号格子走出n号格子。也就是走到 >n 处。

每一个格子是4种状态的当中一种,而且没告诉你是哪种状态,仅仅是告诉你概率,第i号格子4种状态的当中一种的概率记为p[i][0],p[i][1]。p[i][2],p[i][3]。

0 表示这个格子既不能左腿也不能右腿踏进去。

1 表示这个格子能够左腿踏进去。

2 表示这个格子能够右腿踏进去。

3 表示这个格子既能够左腿也能够右腿踏进去。

这个人刚開始能够选择迈出左腿也能够迈出右腿,可是下次必须迈出不同的腿,除非你走在3这个状态下。下次又能够左右腿选一个,就和走路一样。

这个人一次迈出的步子的范围是[A,B],还有就是假设他迈出的步子能够是A+1或者A+2或者别的。他会选择最少的步数也就是 A+1

问你,这个人走的步数的数学期望,这个人走到 >n 处停止或者不能再走了也停止。


解题思路:

看到了数学期望,我就想到了概率DP。而这题的解法也就是概率DP,这个赛区已经看到了三道DP题目了。

(1)首先。用DP(pos,state) 记录这个状态下你还须要走的步数的数学期望

当中 。pos记录你当前所在的格子,state记录你能够迈出的步子

0 表示你下一步不能走了,也就须要走的步数的数学期望0,是所以不会有这个状态,直接被剪纸剪掉了。

1 表示下一步要用左腿踏进去

2 表示下一步要用右腿踏进去

3 表示下一步既左腿也能够右腿踏进去

那么我么要求的就是DP(0,3)表示在0号格子,下一步既左腿也能够右腿踏进去,这个状态下你还须要走的步数的数学期望

(2)关于递推方程怎么求

DP(pos。state) =sum { p * ( DP(newpos,newstate) + 1 ) }

解释下我写的状态转移方程:

比方说,你当前在s位置,那么就枚举下一步走的位置 i, i>=s+A 且 i<=s+B

你走这个位置的概率,就是前面都不能走。也就是 s~i-1都同一时候为0状态,由于假设不是0状态你就得走,题目交代他会选择最少的步数。

所以得维护 s~i-1都同一时候为0状态的概率,也就是前缀相乘。

那么,如今讨论假设走了第 i 号位置

假设你的这一步要用右腿踏进去。下一步就得用左腿,假设这个位置的状态是3,你下个位置任意。

左腿同理。


解题代码:

#include <iostream>
#include <cstdio>
#include <cmath>
using namespace std;

const int maxn=2100;
double p[maxn][5],dp[maxn][5];
int a,b,n,vis[maxn][5],marked;

void input(){
	scanf("%d%d%d",&n,&a,&b);
	for(int i=1;i<=n;i++){
		for(int j=0;j<4;j++){
			scanf("%lf",&p[i][j]);
		}
	}
}

double DP(int k,int state){
	if(k>n) return 0;
	if(k==n) return 1;
	if(vis[k][state]==marked) return dp[k][state];
	double ans=0;
	if(state==3){
		double p0=1.0;
		for(int i=k+a;i<=k+b;i++){
			if(i>n){
				ans+=p0*1.0;
				break;
			}
			ans+=p0*p[i][3]*(DP(i,3)+1);
			ans+=p0*p[i][2]*(DP(i,1)+1);
			ans+=p0*p[i][1]*(DP(i,2)+1);
			p0*=p[i][0];
		}
	}else if(state==2){
		double p0=1.0;
		for(int i=k+a;i<=k+b;i++){
			if(i>n){
				ans+=p0*1.0;
				break;
			}
			ans+=p0*p[i][3]*(DP(i,3)+1);
			ans+=p0*p[i][2]*(DP(i,1)+1);
			p0*=(p[i][0]+p[i][1]);
		}
	}else{
		double p0=1.0;
		for(int i=k+a;i<=k+b;i++){
			if(i>n){
				ans+=p0*1.0;
				break;
			}
			ans+=p0*p[i][3]*(DP(i,3)+1);
			ans+=p0*p[i][1]*(DP(i,2)+1);
			p0*=(p[i][0]+p[i][2]);
		}
	}
	vis[k][state]=marked;
	return dp[k][state]=ans;
}

void solve(){
	marked++;
	printf("%.8lf
",DP(0,3));
}

int main(){
	int T;
	scanf("%d",&T);
	while(T-- >0){
		input();
		solve();
	}
	return 0;
}






原文地址:https://www.cnblogs.com/blfshiye/p/5227338.html