HDU 6096 String（字典树）

String

2017 Multi-University Training Contest - Team 6

Time Limit: 6000/3000 MS (Java/Others) Memory Limit: 524288/524288 K (Java/Others)
Total Submission(s): 834 Accepted Submission(s): 271

Problem Description
Bob has a dictionary with N words in it.
Now there is a list of words in which the middle part of the word has continuous letters disappeared. The middle part does not include the first and last character.
We only know the prefix and suffix of each word, and the number of characters missing is uncertain, it could be 0. But the prefix and suffix of each word can not overlap.
For each word in the list, Bob wants to determine which word is in the dictionary by prefix and suffix.
There are probably many answers. You just have to figure out how many words may be the answer.

Input
The first line of the input gives the number of test cases T; T test cases follow.
Each test case contains two integer N and Q, The number of words in the dictionary, and the number of words in the list.
Next N line, each line has a string Wi, represents the ith word in the dictionary ()
Next Q line, each line has two string Pi , Si, represents the prefix and suffix of the ith word in the list ()
All of the above characters are lowercase letters.
The dictionary does not contain the same words.

Limits

T≤5
0<N,Q≤100000
∑Si+Pi≤500000
∑Wi≤500000

Output
For each test case, output Q lines, an integer per line, represents the answer to each word in the list.

Sample Input
1
4 4
aba
cde
acdefa
cdef
a a
cd ef
ac a
ce f

Sample Output
2
1
1
0

Source
2017 Multi-University Training Contest - Team 6

题意：
给出n个字符串，q个查询，每个查询包含A、B两个字符串，问在给定的n个字符串中，有多少个字符串满足前缀是A，后缀是B且前缀后缀没有重叠部分
解题思路：
为了优化内存和查询时间，我们首先将n个字符串保存在一个字符串中，并用数组储存每个字符串的长度，方便后续操作。然后对输入的q次查询的字符串进行建立在字典树，在建立字典树的时间，需要注意一些问题，如：对于cd ef这个询问：我们转换成这样：cd{fe（字符‘{’等于’a’+26），并把 cd{fe 插入到字典树。
对于插入，查询例如：
字典树中有cd{fe 和 cda{fe，然后当前给出的cdabef
处理到cd的时候，d后面发现有通配符，然后去匹配后缀。匹配完后，我们扩展前缀，
即考虑cda发现a后面有通配符，然后去匹配后缀。这样cdabef对cd ef 和 cda ef这两个查询都贡献了一次。
有一点非常坑，题目给出的Q次查询中会有重复的出现，对于两个给出的完全相同的前缀和后缀的组合
它在插入字典树最后所到达的节点P必定是相同的。如果p->num发现不是0，说明之前已经有一个前缀后缀
的组合到达这个位置了，现在再次到达这个位置，p->num只统计最早到达p节点的前后缀组合就行，后到达
的和之前到达的一样，当然最后输出的时候答案也是一样的。我们可以用一个id[]数组，原来第i次询问的
id[i] = i就行了，如果第i个询问和第j个询问是相同的，他们最终答案一样，我们让id[j] = i;就可以了。
这样我们最后输出ans[id[i]]就是答案了。
具体看代码：

┭┮﹏┭┮ ┭┮﹏┭┮

#include <iostream>
#include <stdio.h>
#include <malloc.h>
#include <string.h>
const int maxn = 27;
const int maxstr = 1e7;
char str[maxstr],s1[maxstr],s2[maxstr];
int id[maxstr];
int len[maxstr];    ///len[i]代表第i个字符串的长度。
int ans[maxstr];    ///ans[i]是第i个最后输出的满足第i个询问的字符串数量
typedef struct TrieNode ///字典树节点
{
    int num;            ///该节点所代表的查询的编号。
    struct TrieNode *son[maxn];  ///26个字母加上一个通配符。
}Trie;
Trie* createNode()
{
    Trie *node;
    node = (Trie*)malloc(sizeof(Trie));
    for(int i = 0; i < maxn; i++)
        node->son[i] = NULL;
    node->num = 0;
    return node;
}
void insertWord(Trie *root,int index)  ///将第index个询问插入字典树
{
    Trie *p;    ///辅助指针
    p = root;
    int i = 0;
    while(s1[i] != '')
    {
        int lowercase = s1[i]-'a';   ///求小写字母对应的整数
        if(p->son[lowercase]==NULL)
        {
            p->son[lowercase] = createNode();
        }
        p = p->son[lowercase];
        i++;
    }
    if(p->num == 0)  ///还没有询问到达这里
        p->num = index;
    else id[index] = p->num;   ///遇到相同的数据只存最初出现的一个。
}
void query(Trie *root,int strbegin,int strend)
{
    Trie *p,*tmp;
    p = root;
    int i = strbegin;
    while(p!=NULL && i<=strend)
    {
        int lowercase = str[i]-'a';
        if(p->son[lowercase] == NULL) return;  ///前缀匹配的时候就失配了
        p = p->son[lowercase];
        if(p->son[26]!=NULL)
        {
            tmp = p->son[26];    ///现在tmp是通配符，我们继续往后匹配后缀
            for(int j=strend; j > i; j--)
            {
                lowercase = str[j]-'a';
                if(tmp->son[lowercase]==NULL)  ///失配了
                    break;
                tmp = tmp->son[lowercase];
                if(tmp->num)
                    ans[tmp->num]++;//统计满足前缀后缀字符串的个数
            }
        }
        i++;
    }
}
void free_Trie(Trie *root)//清空字典树
{
    if(root != NULL)
    {
        for(int i = 0; i < maxn; i++)
            if(root->son[i]!=NULL)
                free_Trie(root->son[i]);
    }
    free(root);
}
int main()
{
    int t;
    scanf("%d",&t);   ///t组测试数据
    while(t--)
    {
        int n,q;      ///n个字符，q次查询
        scanf("%d%d",&n,&q);
        Trie *root;
        root = createNode();
        int prelen = 0;    ///当前串长
        for(int i = 0; i < n; i++)
        {
            scanf("%s",str+prelen);
            len[i] = strlen(str+prelen);//存储每个字符串的长度，方便下面操作
            prelen += len[i];
        }
        for(int i = 1; i <= q; i++)
        {
            id[i] = i;
            ans[i] = 0;
        }
        for(int i = 1; i <= q; i++)
        {
            scanf("%s",s1);
            int len1 = strlen(s1);
            s1[len1++] = 'a'+26; ///通配符
            scanf("%s",s2);
            int len2 = strlen(s2);
            for(int j = len2-1; j >= 0; j--)
            {
                s1[len1++] = s2[j];
            }
            s1[len1] = '';
            insertWord(root,i);//建立字典树
        }
        prelen = 0;
        for(int i = 0; i < n; i++)
        {
            query(root,prelen,prelen+len[i]-1);
            prelen += len[i];
        }
        for(int i = 1; i <= q; i++)
            printf("%d
",ans[id[i]]);
        free_Trie(root);
    }
    return 0;
}