HDU

Time limit 1000

msMemory limit 32768 kB

Given a string S and two integers L and M, we consider a substring of S as “recoverable” if and only if
(i) It is of length M*L;
(ii) It can be constructed by concatenating M “diversified” substrings of S, where each of these substrings has length L; two strings are considered as “diversified” if they don’t have the same character for every position.

Two substrings of S are considered as “different” if they are cut from different part of S. For example, string "aa" has 3 different substrings "aa", "a" and "a".

Your task is to calculate the number of different “recoverable” substrings of S.

InputThe input contains multiple test cases, proceeding to the End of File.

The first line of each test case has two space-separated integers M and L.

The second ine of each test case has a string S, which consists of only lowercase letters.

The length of S is not larger than 10^5, and 1 ≤ M * L ≤ the length of S.OutputFor each test case, output the answer in a single line.

Sample Input

3 3
abcabcbcaabc

Sample Output

2


本題大意：
　　給出M，L，和一個字符串。求該字符串的長度為M*L的子串中，有幾個是由M個長度為L的不同的小串組成的，如樣例：找長度3*3=9的子串，其中一個“bcabcbcaa"就由“bca" "bcb" "caa"
三個不相同的小子串組成，則子串“bcabcbcaa"是其中一個滿足條件的一個子串。

解題思想：
　　用求出整串的hash值，定一個長度m*l的範圍，在hash數組上跑，每次操作把相鄰的m個，長為l的字符串的hash值放到map容器里去，然後檢查map里元素的個數是否為m，若為m，則放進去的m段
字符串hash值不同，即字符串不同。則滿足條件。這段M*L長度的範圍在主串上移動，不斷重複以上的操作，直到把主串跑完，結果就出來了。根據這個想法，我寫了下面的第一段代碼。

#include<bits/stdc++.h>
#define N 100009
#define bas 27
using namespace std;
typedef unsigned long long ull;
char s[N];
int l,m;
ull has[N],base[N];
map<ull,int> mp;
int main()
{
    base[0]=1;
    for(int i=1;i<N;i++)
        base[i]=base[i-1]*bas;

    //freopen("a.txt","r",stdin);
    while(~scanf("%d%d",&m,&l))
    {

        scanf("
%s",s+1);
        int t=l*m;
        int len=strlen(s+1);
        has[0]=0;

        for(int i=1;i<=len;i++)
            has[i]=has[i-1]*bas+s[i]-'a'+1;

        int cnt=0;
        for(int i=1;i+t-1<=len;i++)//M*L的範圍在主串上移動
        {
            mp.clear();
            for(int j=l-1;j<t;j+=l)//在範圍中不斷計算相鄰的長度為L的子串的hash值
            {
                ull temp=has[i+j]-has[i+j-l]*base[l];
                mp[temp]++;
            }

            if(mp.size()==m)    //判斷條件
                cnt++;
        }
       printf("%d
",cnt);

    }
}

TLE代碼

　　為什麼會TLE呢？上面的代碼在L很小，而主串很長的時候，複雜度會達到O(n^2) 在1e5的數據上跑是會T掉的。無奈之下，我去參考了一些大神的AC代碼。咋一看，他們的代碼，也是在主串上跑一邊
同時也在一個範圍內判斷是否滿足條件，看似跟我的操作是一樣的，但卻能AC，當我再仔細查看的時候，才發現，他們在外層循環上只跑了L的距離。在內層則把M*L的範圍從左往右諾了一遍。複雜度似乎只有O(n)級別
　　具體如何實現呢？這就涉及到hash處理等長字段的一個技巧，我好好地學了一下：首先，我們是要M個長為L的小子串，在中間的某個地方，要找下一個M*L的時候，只要在左邊取出一個L，再右邊填上一個L，
就可以得到個新的判斷範圍，這樣就避免了重複地從map容器中取出太多的hash值。大大降低了複雜度。
　　看到這裡，其實問題並沒有完全的解決。因為這樣每次改變的是一個L的長度，而不是一個字符，那麼並沒有遍歷到所有的情況。所以怎麼辦呢？這就是為什麼在外層循環跑一個L的距離，內層每次移動L的長度，
如果把起點挪動一個字符的長度，那麼就是另一個不同的情況，但是當我們的起點逐個字符移動了L個長度后，是不是已經相當於從一開始的起點一次移動L的長度了，這種情況已經包含在前面的情況裡面了。所以我們
外層循環只需要跑個L字符的長度，這樣就已經把所有的情況遍歷了一遍。
　　這個方法的核心是如何每次只從map容器中取出一個，再放入一個小子段，從而達到另一個範圍的hash值。具體的文字難以描述清楚，請下邊的AC代碼，配合注釋意會理解。

#include<bits/stdc++.h>
#define N 100020
#define bas 27
using namespace std;
typedef unsigned long long ull;

char s[N];
ull base[N],has[N];
int m,l;
int main()
{
    //freopen("a.txt","r",stdin);
    map<ull,int> mp;
    base[0]=1;
    for(int i=1;i<N;i++)
        base[i]=base[i-1]*bas;

    while(~scanf("%d%d",&m,&l))
    {
        scanf("%s",s+1);
        int len=strlen(s+1);
        int ans=0,t=m*l;
        has[0]=0;
        for(int i=1;i<=len;i++)
            has[i]=has[i-1]*bas+s[i]-'a'+1;

        for(int i=1;i<=l&&i+t-1<=len;i++)//起點每次移動一字符
        {

            mp.clear();

            for(int j=i-1;j+l<=i-1+t;j+=l)     //從計算出從起點開始長度為M*L的子串的hash值
                mp[has[j+l]-has[j]*base[l]]++;
            

            if(mp.size()==m)     //判斷是否滿足條件
                ans++;

            for(int j=i-1+t;j+l<=len;j+=l)//每次移動L個字符的過程
            {
                int a=j-t;
                ull temp=has[a+l]-has[a]*base[l];//計算最左邊的L長度的子串hash值
                mp[temp]--;                     //把這個小段的哈希值取出

                if(mp[temp]==0)        
                    mp.erase(temp);

                mp[has[j+l]-has[j]*base[l]]++;//再從右邊放入長度為L的子串的hash值
                                                //這樣只用兩步操作，就把整個範圍向右移動了L的距離。
                if(mp.size()==m)              //移動后判斷是否滿足條件
                    ans++;
            }
        }                                      //至此一個起點的情況已經判斷完了，回到外層循環，起點前進一個字符
        printf("%d
",ans);

    }
}