Some Problem Solutions

Problem source: 鲸歌's blog

Solution

This is an easy problem, but the naive algorithm is $\text{O}\left( n^2 \right)$ and passes only $70\%$ of the test cases. How can we speed it up?
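For reference, the naive approach might look like the sketch below. The post does not show the original brute force, so this is only an assumed version: the function name countEqualBrute is hypothetical, and the input is assumed to be a string of the letters A, B, C.

```cpp
#include <cstddef>
#include <string>

// Brute-force reference (hypothetical helper, not from the original post):
// for every start i, extend the end j one character at a time and count
// the substrings s[i..j] that contain equally many 'A', 'B' and 'C'.
long long countEqualBrute(const std::string& s) {
    long long result = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        int cnt[3] = {0, 0, 0};  // occurrences of A, B, C in s[i..j]
        for (std::size_t j = i; j < s.size(); ++j) {
            if (s[j] >= 'A' && s[j] <= 'C') ++cnt[s[j] - 'A'];
            if (cnt[0] == cnt[1] && cnt[1] == cnt[2]) ++result;
        }
    }
    return result;
}
```

The two nested loops are what make it quadratic. For example, countEqualBrute("ABCABC") returns 5: the four substrings of length 3 plus the whole string. A version like this is mainly useful for checking a faster solution on small inputs.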

Consider when a range contains equal numbers of A, B, and C. Let $totA\left[ i \right]$ be the number of A's among the first $i$ characters, and define $totB$ and $totC$ likewise.

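Suppose a balanced segment starts at $i$ and ends at $j$ (that is, it consists of characters $i+1$ through $j$). Clearly, the segment contains equal numbers of A, B, and C if and only if $totA[j]-totA[i]=totB[j]-totB[i]=totC[j]-totC[i]$.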
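The prefix counts and the balance test can be sketched as follows. The names prefixCounts and isBalanced are hypothetical, introduced only for this illustration, and the three arrays $totA$, $totB$, $totC$ are packed into one array of triples for brevity.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// tot[k] holds the counts of 'A', 'B', 'C' among the first k characters,
// i.e. tot[k][0] = totA[k], tot[k][1] = totB[k], tot[k][2] = totC[k].
std::vector<std::array<int, 3>> prefixCounts(const std::string& s) {
    std::vector<std::array<int, 3>> tot(s.size() + 1,
                                        std::array<int, 3>{{0, 0, 0}});
    for (std::size_t k = 0; k < s.size(); ++k) {
        tot[k + 1] = tot[k];  // start from the previous prefix
        if (s[k] >= 'A' && s[k] <= 'C') ++tot[k + 1][s[k] - 'A'];
    }
    return tot;
}

// True when characters i+1..j (1-based) contain equally many A, B and C.
// Requires i <= j <= s.size().
bool isBalanced(const std::vector<std::array<int, 3>>& tot,
                std::size_t i, std::size_t j) {
    int a = tot[j][0] - tot[i][0];
    int b = tot[j][1] - tot[i][1];
    int c = tot[j][2] - tot[i][2];
    return a == b && b == c;
}
```

With the table built once, each range test costs constant time, but testing every pair $(i, j)$ is still quadratic overall.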

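Rearranging gives $totA[j]-totB[j]=totA[i]-totB[i]$ and $totB[j]-totC[j]=totB[i]-totC[i]$; it is easy to show that this pair of equations is necessary and sufficient for the condition above.
These two differences have a rather interesting consequence. Write $ab[i]=totA[i]-totB[i]$ and $bc[i]=totB[i]-totC[i]$.
Whenever $i<j$, $ab[i]=ab[j]$, and $bc[i]=bc[j]$ all hold, the segment from $i$ to $j$ contains equal numbers of A, B, and C; conversely, if either equality fails, it cannot.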
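To spell out the "easy to show" step, here is a short derivation. The shorthand $a$, $b$, $c$ for the per-letter counts inside the segment is introduced only for this sketch and does not appear in the original.

$$
\begin{aligned}
a &= totA[j]-totA[i], \quad b = totB[j]-totB[i], \quad c = totC[j]-totC[i] \\
a = b &\iff totA[j]-totB[j] = totA[i]-totB[i] \iff ab[j] = ab[i] \\
b = c &\iff totB[j]-totC[j] = totB[i]-totC[i] \iff bc[j] = bc[i] \\
a = b = c &\iff (a = b) \land (b = c) \iff ab[j] = ab[i] \ \text{and}\ bc[j] = bc[i]
\end{aligned}
$$

Each step only moves terms from one side to the other, so every implication runs in both directions.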

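At this point the algorithm is still $\text{O}\left( n^2 \right)$. With a little combinatorics, however, we can group all positions that share the same pair $(ab[i], bc[i])$ into one class; the answer is the sum, over all classes, of the number of valid segments in each class. Within a class, every pair of members bounds a valid segment, so a class with $n$ members contributes $\frac{n\left( n-1 \right)}{2}$ segments (here $n$ is the class size, not the string length). That $n$ is simply a counter incremented each time the scan reaches a position belonging to the class.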
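The grouping idea can be sketched with a standard container before worrying about a hand-written hash table. The version below swaps in std::map (so it runs in $\text{O}\left( n \log n \right)$ rather than expected linear time) purely to keep the example short; countEqualGrouped is a hypothetical name, and C++11 is assumed.

```cpp
#include <map>
#include <string>
#include <utility>

// Group prefix positions by the key (ab, bc) and add up n*(n-1)/2 per class.
// Uses std::map instead of the post's hash table; hypothetical helper name.
long long countEqualGrouped(const std::string& s) {
    std::map<std::pair<int, int>, long long> classSize;
    int tot[3] = {0, 0, 0};
    classSize[{0, 0}] = 1;  // the empty prefix: ab[0] = bc[0] = 0
    for (char ch : s) {
        if (ch >= 'A' && ch <= 'C') ++tot[ch - 'A'];
        ++classSize[{tot[0] - tot[1], tot[1] - tot[2]}];
    }
    long long result = 0;
    for (const auto& kv : classSize) {
        result += kv.second * (kv.second - 1) / 2;  // pairs within one class
    }
    return result;
}
```

For "ABCABC" the seven prefix positions fall into classes of sizes 3, 2, and 2, giving $3 + 1 + 1 = 5$ segments.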

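One last question: how do we store the count for each class? Use a hash table; the average cost per operation is close to constant, though the hash function has to be reasonably good. (What about collisions? Remember the forward-star representation of adjacency lists? Handle them the same way, by chaining entries through an array.)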
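A chained hash table in this forward-star style might be sketched as below. This is an illustration rather than the post's actual code: PairCounter, bucketOf, and increment are hypothetical names, the constants are borrowed from the listings in this post, and the end of a chain is marked with -1 instead of index 0.

```cpp
#include <vector>

const int kBuckets = 1110007;   // bucket count (nxp in the post's code)
const int kMultiplier = 111029; // hash multiplier (par in the post's code)

// Counts occurrences of (ab, bc) pairs. Colliding entries are chained
// through the `next` index, the same way forward-star chains edges.
struct PairCounter {
    struct Node { int ab, bc, next; long long count; };
    std::vector<int> head;    // head[b]: first node of bucket b, or -1
    std::vector<Node> nodes;  // all entries, in insertion order

    PairCounter() : head(kBuckets, -1) {}

    static int bucketOf(int ab, int bc) {
        long long h = ((long long)ab * kMultiplier + bc) % kBuckets;
        return (int)(h < 0 ? h + kBuckets : h);  // keep the index non-negative
    }

    // Adds one occurrence of (ab, bc) and returns the count before the add.
    long long increment(int ab, int bc) {
        int b = bucketOf(ab, bc);
        for (int i = head[b]; i != -1; i = nodes[i].next) {
            if (nodes[i].ab == ab && nodes[i].bc == bc) return nodes[i].count++;
        }
        nodes.push_back(Node{ab, bc, head[b], 1});  // new entry heads the chain
        head[b] = (int)nodes.size() - 1;
        return 0;
    }
};
```

Summing the values returned by increment over all prefix positions (after first registering the pair (0, 0) for the empty prefix) gives the answer directly, because each return value is the number of earlier positions in the same class.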

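Finally, here is my sloppy code. It is still much faster than the std (the reference solution): the std was judged at 415 ms and my code at 180 ms. (An accepted solution written by a very strong classmate took 1500 ms in total...)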

#include <cstdio>
#define nxp 1110007
#define par 111029
int tot[3],hash[nxp],t,t2,t3,t4,ln;
int c;//int rather than char, so EOF stays distinct from every valid character
struct nnode{
	int ab,bc,cn,sz;
} nn[2000000];
inline int abs(int n){
	return (n<0?-n:n);
}
int hashF(int ab,int bc){
	return abs((int)(((long long)ab*par+bc)%nxp));//64-bit product: ab*par can overflow int
}
int find(int ab,int bc){
	t4=hashF(ab,bc);
	t=hash[t4];
	while(nn[t].ab!=ab||nn[t].bc!=bc){
		if(t==0) return ~t4;//-t4-1
		t=nn[t].cn;
	}
	return t;
}
long long sum;
int main(){
	ln=1;
	nn[ln].ab=0;
	nn[ln].bc=0;
	t3=~t3;
	nn[ln].cn=hash[0];
	nn[ln].sz=1;
	hash[0]=ln;
	++ln;
	while(c=getchar(),c!=EOF&&c!='\n'&&c!='\r'){
		t2=c-'A';
		if(t2>=0&&t2<3){
			++tot[t2];
		}
		t3=find(tot[1]-tot[0],tot[2]-tot[1]);
		if(t3<0){//sign bit set: not found, t3 holds ~bucket
			nn[ln].ab=tot[1]-tot[0];
			nn[ln].bc=tot[2]-tot[1];
			t3=~t3;
			nn[ln].cn=hash[t3];
			nn[ln].sz=1;
			hash[t3]=ln;
			++ln;
		}else{
			sum+=nn[t3].sz;
			++nn[t3].sz;
		}
	}
	printf("%lld\n",sum);
	return 0;
}

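One last note: be sure to account for $ab[0]=bc[0]=0$, or the answer will very likely be wrong. (That is the reason for the awkward-looking setup at the top of main in my code.) Sigh, all this explanation is still less concrete than the code... The code above replaces the multiplication with repeated addition (each time, sum grows by the number of valid starting points before the current position). Here is the multiplication version (though the optimization makes hardly any noticeable difference...)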

#include <cstdio>
#define nxp 1110007
#define par 111029
int tot[3],hash[nxp],t,t2,t3,t4,ln,i;
int c;//int rather than char, so EOF stays distinct from every valid character
struct nnode{
	int ab,bc,cn,sz;
} nn[2000000];
inline int abs(int n){
	return (n<0?-n:n);
}
int hashF(int ab,int bc){
	return abs((int)(((long long)ab*par+bc)%nxp));//64-bit product: ab*par can overflow int
}
int find(int ab,int bc){
	t4=hashF(ab,bc);
	t=hash[t4];
	while(nn[t].ab!=ab||nn[t].bc!=bc){
		if(t==0) return ~t4;//-t4-1
		t=nn[t].cn;
	}
	return t;
}
long long sum;
int main(){
	ln=1;
	nn[ln].ab=0;
	nn[ln].bc=0;
	t3=~t3;
	nn[ln].cn=hash[0];
	nn[ln].sz=1;
	hash[0]=ln;
	++ln;
	while(c=getchar(),c!=EOF&&c!='\n'&&c!='\r'){
		t2=c-'A';
		if(t2>=0&&t2<3){
			++tot[t2];
		}
		t3=find(tot[1]-tot[0],tot[2]-tot[1]);
		if(t3<0){//sign bit set: not found, t3 holds ~bucket
			nn[ln].ab=tot[1]-tot[0];
			nn[ln].bc=tot[2]-tot[1];
			t3=~t3;
			nn[ln].cn=hash[t3];
			nn[ln].sz=1;
			hash[t3]=ln;
			++ln;
		}else{
			++nn[t3].sz;
		}
	}
	for(i=1;i<ln;++i) sum+=(long long)nn[i].sz*(nn[i].sz-1)>>1;//widen first: sz*(sz-1) can overflow int
	printf("%lld\n",sum);
	return 0;
}

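The constants nxp and par in the code are the hash function's parameters; I picked two primes of suitable size. ~x is a handy substitute for -x-1. Why -x-1 rather than -x? Mainly because -0=0, so a miss in bucket 0 could not be told apart from a hit. (It is also cheap: in two's complement -x is ~x+1, so ~x simply skips the add.)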
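The encoding trick can be isolated in a few lines. The helper names below are hypothetical and exist only for this sketch; it also assumes two's complement integers, which C++20 guarantees and which earlier compilers use in practice.

```cpp
// A lookup returns either a node index (>= 1) or, on a miss, the bucket
// number encoded as ~bucket. Since ~b == -b-1, every encoded miss is
// negative, including the one for bucket 0.
int encodeMiss(int bucket) { return ~bucket; }

// A negative result means "not found".
bool isMiss(int result) { return result < 0; }

// Applying ~ again recovers the bucket number.
int decodeMiss(int result) { return ~result; }
```

Bucket 0 encodes to -1, whereas plain negation would have left it at 0, where it could be mistaken for a node index.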