HDU 1686 Oulipo: count how many times a short string occurs in a long string (KMP)

http://acm.hdu.edu.cn/showproblem.php?pid=1686

Oulipo

Time Limit: 3000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 6098    Accepted Submission(s): 2448


Problem Description
The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
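
To make "occurrences may overlap" concrete, here is a minimal brute-force sketch. The helper name `countNaive` is hypothetical and is not part of the solution in this post; it tries every start position in T and compares W there.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original solution): brute-force count of
// the occurrences of w in t, overlaps allowed.
int countNaive(const std::string& w, const std::string& t)
{
    assert(!w.empty());  // the problem guarantees |W| >= 1
    int count = 0;
    // Try every start position at which w still fits inside t.
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start)
    {
        // compare() returns 0 when t[start, start + |W|) equals w.
        if (t.compare(start, w.size(), w) == 0) ++count;
    }
    return count;
}
```

For W = AZA and T = AZAZAZA this counts the matches starting at positions 0, 2 and 4, so the answer is 3. With |W| up to 10,000 and |T| up to 1,000,000 the worst case is roughly 10^10 character comparisons, which is why a linear-time method such as KMP is the usual choice here.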

 
Input
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
 
Output
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

 
Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
 
Sample Output
1
3
0
 

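#include <cstdio>
#include <cstring>

int n, m, nxt[10005];
char b[10005], a[1000005];  // b: the word W, a: the text T

// This problem adds repeated matching on top of basic KMP.
// After each complete match we have to jump to the best position and keep searching.
// The KMP idea still applies: positions that are already known to match are not compared again.
//        xxxxxxxabbaab*xxxxxx
//               abbaaba
// Where the pattern "abbaaba" lands after the jump is determined by the next array.
// In KMP the position in the text never moves backward; only the pattern index
// is moved, through the next array (my own first version moved the text index around; silly me...).

// Build the next table: nxt[0] = -1 is a sentinel, and for j >= 1 nxt[j] is the
// length of the longest proper prefix of b[0..j-1] that is also its suffix.
void buildnxt()
{
    int j = 0, k = -1;
    m = (int)strlen(b);
    nxt[0] = -1;
    while (j < m)
    {
        if (k == -1 || b[j] == b[k])
        {
            j++;
            k++;
            nxt[j] = k;
        }
        else k = nxt[k];
    }
}

// Count the occurrences of b in a, overlaps allowed.
int kmp()
{
    int k = 0, l = 0, cou = 0;
    n = (int)strlen(a);
    /* Earlier attempt, kept for reference:
    int ans=m,kk=nxt[m]; // ans is an index into the word; its distance from the start is ans+1
    while(1)
    {
        if(kk!=0&&kk!=-1) {ans=kk;kk=nxt[ans];}
        else break;
    } // look for the smallest jump point: walk back from the end of next to the first positive value
    The idea is not wrong, but it is still too slow.
    The smallest jump point is only one of the intermediate jump points, and when the
    string aligned there fails to match there was no need to try it.
    The simplest approach is to follow next by a single step, which may already match.
    */
    while (k < n)
    {
        if (l == -1 || a[k] == b[l])
        {
            k++;
            l++;
        }
        else l = nxt[l];
        if (l == m)
        {
            cou++;
            /* Earlier attempt, continued:
            if(kk==0) continue; // nxt[m] == 0: the matched word has no repeated prefix/suffix,
                                // so there is no jump point inside the matched part of the text
            if(k==n-1) break;   // k is already at the end of the text, so no further match is possible
            k=k-l+ans;          // k-l (start of the match) + ans
            l=0;*/
            // Fall back to the next-longest prefix that is also a suffix and continue
            // from there, skipping the part that is already known to match.
            l = nxt[l];
        }
    }
    return cou;
}

int main()
{
    int t;
    scanf("%d", &t);
    while (t--)
    {
        // gets() was removed in C++14. The input contains no spaces, so
        // scanf("%s") with a width limit reads one string per line safely.
        scanf("%10004s", b);
        scanf("%1000004s", a);
        buildnxt();  // fills nxt[0..m], so no memset is needed
        printf("%d\n", kmp());
    }
    return 0;
}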
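
The key step is what happens after a complete match: the pattern index falls back to nxt[m] instead of 0. The sketch below restates the same technique as standalone functions so that this choice can be tried in isolation. The names `buildNext` and `countKmp` and the `allowOverlap` flag are hypothetical additions for illustration, not part of the submitted solution.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same table as buildnxt() above: nxt[0] = -1, and for j >= 1 nxt[j] is the
// length of the longest proper prefix of w[0..j-1] that is also its suffix.
std::vector<int> buildNext(const std::string& w)
{
    const int m = static_cast<int>(w.size());
    std::vector<int> nxt(m + 1, 0);
    nxt[0] = -1;
    int j = 0, k = -1;
    while (j < m)
    {
        if (k == -1 || w[j] == w[k])
        {
            ++j;
            ++k;
            nxt[j] = k;
        }
        else k = nxt[k];
    }
    return nxt;
}

// Count occurrences of w in t. allowOverlap is a hypothetical switch:
// true  -> after a match fall back to nxt[m] (what the problem asks for);
// false -> restart the pattern at 0, which counts only non-overlapping
//          matches found greedily from left to right.
int countKmp(const std::string& w, const std::string& t, bool allowOverlap = true)
{
    assert(!w.empty());  // the problem guarantees |W| >= 1
    const std::vector<int> nxt = buildNext(w);
    const int m = static_cast<int>(w.size());
    const int n = static_cast<int>(t.size());
    int k = 0, l = 0, count = 0;  // k: index in the text, l: index in the word
    while (k < n)
    {
        if (l == -1 || t[k] == w[l]) { ++k; ++l; }
        else l = nxt[l];
        if (l == m)
        {
            ++count;
            l = allowOverlap ? nxt[m] : 0;
        }
    }
    return count;
}
```

For the word AZA the table is {-1, 0, 0, 1}, so after each match the pattern index falls back to 1 and the trailing A is reused as the start of the next match. On the text AZAZAZA, `countKmp("AZA", "AZAZAZA")` gives 3, while passing `false` for `allowOverlap` gives 2 because the overlapping middle match is lost.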
 
Original article: https://www.cnblogs.com/linxhsy/p/4449084.html