String Classification

Link: https://www.nowcoder.com/acm/contest/141/E
Source: Nowcoder (牛客网)

Problem description

Eddy likes to play with string which is a sequence of characters. One day, Eddy has played with a string S for a long time and wonders how could make it more enjoyable. Eddy comes up with following procedure:

1. For each i in [0,|S|-1], let Si be the substring of S starting from i-th character to the end followed by the substring of first i characters of S. Index of string starts from 0.
2. Group up all the Si. Si and Sj will be the same group if and only if Si=Sj.
3. For each group, let Lj be the list of index i in non-decreasing order of Si in this group.
4. Sort all the Lj by lexicographical order.

Eddy can't find any efficient way to compute the final result. As one of his best friend, you come to help him compute the answer!

Input description:

Input contains only one line consisting of a string S.

1 ≤ |S| ≤ 10^6

S only contains lowercase English letters.

Output description:

First, output one line containing an integer K indicating the number of lists.
For each following K lines, output each list in lexicographical order.
For each list, output its length followed by the indexes in it separated by a single space.
Example 1

Input

abab

Output

2
2 0 2
2 1 3
Example 2

Input

deadbeef

Output

8
1 0
1 1
1 2
1 3
1 4
1 5
1 6
1 7

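Problem summary: given a string S, consider every cyclic rotation S_i (the suffix starting at position i followed by the first i characters). Group the start positions whose rotations are identical, then output the groups.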
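
To make the grouping rule concrete, here is a minimal brute-force sketch of the four steps in the statement. It is not the approach used in this post and is far too slow for |S| up to 10^6; it only serves as a reference for small inputs. The function name `groupRotations` is a hypothetical helper introduced for illustration.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Brute-force reference, roughly O(n^2 log n): only suitable for small inputs.
// groupRotations is a hypothetical name, not taken from the original code.
std::vector<std::vector<int>> groupRotations(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::map<std::string, std::vector<int>> groups;
    for (int i = 0; i < n; ++i) {
        // S_i = s[i..n-1] followed by s[0..i-1]
        groups[s.substr(i) + s.substr(0, i)].push_back(i);
    }
    std::vector<std::vector<int>> lists;
    for (const auto& kv : groups) {
        lists.push_back(kv.second);  // indices were appended in increasing order
    }
    // Order the lists lexicographically, as required by step 4.
    std::sort(lists.begin(), lists.end());
    return lists;
}
```

For the first sample, `groupRotations("abab")` yields the lists {0, 2} and {1, 3}, matching the expected output.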
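Approach:
  Append a copy of the string to itself and precompute prefix hashes in one pass. Every rotation is then a substring of length |S| of the doubled string, so its hash can be obtained in O(1). Finally, group the start positions whose hash values are equal and output the groups.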
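
The O(1) substring hash can be sketched as follows. This is a minimal illustration under some assumptions: the names `RollingHash` and `rotationHashes` are hypothetical, the base 19873 is the one used in the code of this post, and characters are mapped to 1..26 (a choice made here, not taken from the post). Arithmetic wraps modulo 2^64, so equal hashes indicate equal rotations only with high probability, not with certainty.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Polynomial prefix hash with natural overflow modulo 2^64.
// RollingHash and rotationHashes are hypothetical names for illustration.
struct RollingHash {
    std::vector<std::uint64_t> prefix;  // prefix[i] = hash of t[0, i)
    std::vector<std::uint64_t> power;   // power[i] = base^i

    explicit RollingHash(const std::string& t, std::uint64_t base = 19873)
        : prefix(t.size() + 1, 0), power(t.size() + 1, 1) {
        for (std::size_t i = 0; i < t.size(); ++i) {
            // Assumption: lowercase letters only, mapped to 1..26.
            prefix[i + 1] = prefix[i] * base + static_cast<std::uint64_t>(t[i] - 'a' + 1);
            power[i + 1] = power[i] * base;
        }
    }

    // Hash of the half-open range t[l, r), computed in O(1).
    std::uint64_t get(std::size_t l, std::size_t r) const {
        return prefix[r] - prefix[l] * power[r - l];
    }
};

// Hash of every rotation of s, read off the doubled string s + s.
std::vector<std::uint64_t> rotationHashes(const std::string& s) {
    RollingHash h(s + s);
    std::vector<std::uint64_t> out(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        out[i] = h.get(i, i + s.size());  // rotation starting at index i
    }
    return out;
}
```

For "abab", the hashes at indices 0 and 2 coincide, as do those at indices 1 and 3, which is exactly the grouping in the first sample. A second independent hash would reduce the collision risk if that matters.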
Code example:
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;
#define ll unsigned long long
const ll maxn = 1e6+5;
typedef pair<ll, ll>pa;

char s[maxn*2];
ll len, len2;
ll p = 19873;
ll hash_[maxn*2];

ll pp[maxn];
// Precompute powers of the hash base: pp[i] = p^i (overflow wraps modulo 2^64).
void init(){
    pp[0] = 1;
    for(ll i = 1; i <= len; i++){
        pp[i] = pp[i-1]*p;
    }
}

// Prefix hashes of the doubled string: hash_[i] covers s[1..i].
void gethash(){
    for(ll i = 1; i <= len2; i++){
        hash_[i] = hash_[i-1]*p+(s[i]-'a');
    }
}
vector<ll>ve[maxn];
pa pre[maxn];
pa arr[maxn];

int main() {
    //freopen("in.txt", "r", stdin);
    //freopen("out.txt", "w", stdout);
    
    scanf("%s", s+1);
    len = strlen(s+1);
    init();
    for(ll i = 1; i <= len; i++) s[i+len] = s[i];
    len2 = len*2;
    
    gethash();
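    ll k = 1;
    // The window s[i-len+1..i] of the doubled string is the rotation starting at 0-based index i-len.
    for(ll i = len; i < len2; i++){
        ll num = hash_[i]-hash_[i-len]*pp[len];
        pre[k++] = make_pair(num, i-len);
    }
    // Sorting by (hash, index) puts equal rotations next to each other with indices ascending.
    sort(pre+1, pre+1+len);
    pre[k] = make_pair(-1, 0);  // sentinel after the last entry
    k = 1;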
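    for(ll i = 1; i <= len; i++){
        ve[k].push_back(pre[i].second);
        // The first index of a group is its smallest, so it serves as the sort key.
        arr[k] = make_pair(ve[k][0], k);
        // Bound the scan explicitly rather than relying on the sentinel value.
        while(i < len && pre[i+1].first == pre[i].first){
            ve[k].push_back(pre[i+1].second);
            i++;
        }
        k++;
    }
    // Groups are disjoint, so ordering by first index gives the lexicographical order of the lists.
    sort(arr+1, arr+k);
    printf("%llu\n", k-1);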
    
    for(ll i = 1; i < k; i++){
        ll x = arr[i].second;
        ll cnt = ve[x].size();
        // Print the list length, then its indices, separated by single spaces.
        printf("%llu", cnt);
        for(ll j = 0; j < cnt; j++)
            printf(" %llu", ve[x][j]);
        printf("\n");
    }
    return 0;
}


Original article: https://www.cnblogs.com/ccut-ry/p/9381123.html