POJ 1743: Musical Theme

Description

A musical melody is represented as a sequence of N (1<=N<=20000) notes that are integers in the range 1..88, each representing a key on the piano. It is unfortunate but true that this representation of melodies ignores the notion of musical timing; but, this programming task is about notes and not timings.
Many composers structure their music around a repeating "theme", which, being a subsequence of an entire melody, is a sequence of integers in our representation. A subsequence of a melody is a theme if it:
is at least five notes long
appears (potentially transposed – see below) again somewhere else in the piece of music
is disjoint from (i.e., non-overlapping with) at least one of its other appearance(s)

Transposed means that a constant positive or negative value is added to every note value in the theme subsequence.
Given a melody, compute the length (number of notes) of the longest theme.
One second time limit for this problem’s solutions!

Input
The input contains several test cases. The first line of each test case contains the integer N. The following N integers represent the sequence of notes.
The last test case is followed by one zero.

Output
For each test case, the output file should contain a single line with a single integer that represents the length of the longest theme. If there are no themes, output 0.

Sample Input
30
25 27 30 34 39 45 52 60 69 79 69 60 52 45 39 34 30 26 22 18
82 78 74 70 66 67 64 60 65 80
0

Sample Output
5

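Summary:
A melody is a sequence of N (1 <= N <= 20000) notes, each an integer in the range 1 to 88. The task is to find a repeated theme. A "theme" is a substring (a contiguous run) of the note sequence that satisfies the following conditions:
1. It is at least 5 notes long.
2. It appears again in the melody, possibly transposed ("transposed" means the same integer has been added to or subtracted from every note of the theme).
3. The repeated occurrences of the theme must not share any note.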

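Analysis: suffix array + binary search on the answer
First deal with transposition. Replace the original data num with its adjacent differences, num[i]=num[i+1]-num[i]. Two occurrences of a theme that differ only by a constant offset then have identical difference sequences.
The problem becomes finding the longest repeated substring whose occurrences do not overlap, which a suffix array can solve. Note that a theme of L notes corresponds to L-1 differences.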
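The difference step can be sketched in isolation as follows. The helper name toDifferences is hypothetical and does not appear in the original program; the +89 shift mirrors the one used in the full code so that every value stays positive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original program): map a melody to
// its sequence of adjacent differences. Adding a constant to every note
// leaves this sequence unchanged.
std::vector<int> toDifferences(const std::vector<int>& notes) {
    assert(!notes.empty());
    std::vector<int> diff;
    diff.reserve(notes.size() - 1);
    for (std::size_t i = 0; i + 1 < notes.size(); ++i) {
        // Notes lie in 1..88, so the raw difference lies in -87..87.
        // Adding 89 shifts it to 2..176 and leaves 0 free for a sentinel.
        diff.push_back(notes[i + 1] - notes[i] + 89);
    }
    return diff;
}
```

For example, the melodies 25 27 30 34 39 and 35 37 40 44 49 differ by a constant 10, and both map to the same four differences.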
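So we binary search on the answer and turn the task into a decision problem: do two equal substrings of length k exist that do not overlap?
Walk the height array in suffix-array order and split it into groups, starting a new group wherever height[i] < k, so that all adjacent heights inside a group are at least k.
The reason: if some height[i] < k, then two suffixes whose ranks lie on opposite sides of i have a longest common prefix shorter than k, so they can never form a valid pair.
The answer therefore lies inside a single group, where every pair of suffixes shares a prefix of at least k.
Take the smallest and the largest starting position (sa value) in the group. If they differ by more than k, the two occurrences cannot overlap, and a non-overlapping repeated substring of length at least k exists.
The difference must be strictly greater than k, not just equal to k: k differences cover k+1 notes, so two occurrences exactly k apart would still share one note.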
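The grouping check can be tried on small inputs with the sketch below. It is an illustration only: the names naiveSuffixArray, naiveHeight and feasible are hypothetical, and the suffix array is built by plain sorting rather than by the doubling algorithm used in the full program, so it is far too slow for N = 20000.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Naive suffix array: sort the start positions by comparing suffixes directly.
std::vector<int> naiveSuffixArray(const std::vector<int>& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        return std::lexicographical_compare(s.begin() + a, s.end(),
                                            s.begin() + b, s.end());
    });
    return sa;
}

// height[i] = longest common prefix of suffixes sa[i-1] and sa[i]; height[0] = 0.
std::vector<int> naiveHeight(const std::vector<int>& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> height(sa.size(), 0);
    for (std::size_t i = 1; i < sa.size(); ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
        height[i] = k;
    }
    return height;
}

// True if some group of suffixes with pairwise LCP >= k contains two start
// positions that differ by more than k.
bool feasible(const std::vector<int>& sa, const std::vector<int>& height, int k) {
    assert(k >= 1 && sa.size() == height.size());
    std::size_t i = 1;
    while (i < sa.size()) {
        if (height[i] < k) { ++i; continue; }    // group boundary, keep scanning
        int lo = sa[i - 1], hi = sa[i - 1];      // a group starts at rank i-1
        while (i < sa.size() && height[i] >= k) {
            lo = std::min(lo, sa[i]);
            hi = std::max(hi, sa[i]);
            ++i;
        }
        if (hi - lo > k) return true;            // strictly more than k apart
    }
    return false;
}
```

On the difference sequence 1 2 3 4 1 2 3 4 the two copies of 1 2 3 4 start exactly 4 apart, so feasible returns false for k = 4 (the underlying themes would share a note) and true for k = 3.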

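Note: do not copy other people's code blindly, because it is easy to mix up variable names. (I read an expert's accepted solution, could not keep the two variables len and n apart, and got Wrong Answer for a whole day. Experts, please go easy on this beginner.)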

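#include<cstdio>
#include<cstring>
#include<algorithm>

// Import only max/min: "using namespace std" would make the global array
// rank[] ambiguous with std::rank on C++11 and later compilers.
using std::max;
using std::min;

const int N=20050;
int sa[N],rank[N],a[N],b[N],hei[N],cc[N];
int num[N],len;  // after main() converts the input, len is the number of differences (notes-1)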

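// Returns 1 if suffixes a and b have the same (first key, second key) rank pair.
// Note: len here is the global len, not the len parameter of make_sa.
int cmp(int *y,int a,int b,int k)
{
    int ra1=y[a];
    int rb1=y[b];
    int ra2= a+k>=len ? -1:y[a+k];
    int rb2= b+k>=len ? -1:y[b+k];
    return ra1==rb1&&ra2==rb2;
}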

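void make_sa(int *r,int len)  // this len includes the sentinel and shadows the global len
{
    int i,p,m,k,*x=a,*y=b,*t;
    m=200;  // differences are shifted by +89, so values lie in 2..176 and 200 buckets suffice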
    for (i=0;i<m;i++) cc[i]=0;
    for (i=0;i<len;i++) ++cc[x[i]=r[i]];
    for (i=1;i<m;i++) cc[i]+=cc[i-1];
    for (i=len-1;i>=0;i--) sa[--cc[x[i]]]=i;
    for (k=1;k<=len;k<<=1)
    {
        p=0;
        for (i=len-k;i<len;i++) y[p++]=i;
        for (i=0;i<len;i++)
            if (sa[i]>=k) y[p++]=sa[i]-k;
        for (i=0;i<m;i++) cc[i]=0;
        for (i=0;i<len;i++) ++cc[x[y[i]]];
        for (i=1;i<m;i++) cc[i]+=cc[i-1];
        for (i=len-1;i>=0;i--) sa[--cc[x[y[i]]]]=y[i];
        t=x;x=y;y=t;
        x[sa[0]]=0;
        p=1;
        for (i=1;i<len;i++)  // start at 1: each suffix is compared with its predecessor
        {
            x[sa[i]]=cmp(y,sa[i-1],sa[i],k) ? p-1:p++;  // argument order matters: sa[i-1] first, then sa[i]
        }
        if (p>=len) break;
        m=p;
    }
    return;
}

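void make_hei(int len)  // builds rank[] and hei[]; hei[i] is the LCP of suffixes sa[i-1] and sa[i]
{
    int i,k=0;
    for (i=0;i<=len;i++) rank[sa[i]]=i;
    hei[0]=0;
    for (i=0;i<len;i++)
    {
        if (!rank[i]) continue;
        int j=sa[rank[i]-1];  // the suffix ranked just before suffix i
        if (k) k--;
        while (num[j+k]==num[i+k]) k++;  // always stops at the sentinel num[len]=0
        hei[rank[i]]=k;
    }
    return;
}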

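bool pd(int le)  // is there a repeated run of le differences whose two copies share no note?
{
    int i=2,ma,mi;
    while (1)
    {
        while (i<=len&&hei[i]<le) i++;  // skip boundaries that cannot belong to a group
        if (i>len) return 0;
        ma=sa[i-1];
        mi=sa[i-1];
        while (i<=len&&hei[i]>=le)  // inside a group every adjacent height is at least le
        {
            ma=max(ma,sa[i]);
            mi=min(mi,sa[i]);
            i++;
        }
        // le equal differences cover le+1 notes, so the start positions must
        // differ by more than le for the two occurrences to be disjoint
        if (ma-mi>le) return 1;
    }
    return 0;
}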

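void solve()  // binary search on the number of matching differences
{
    int l,r,mid,ans;
    l=3;  // 3 means "no theme found": a theme needs at least 5 notes, i.e. 4 differences
    r=(len-1)/2;  // two disjoint copies of k differences need 2*k+1 <= len
    while (l<r)
    {
        mid=(l+r+1)>>1;
        if (pd(mid))
           l=mid;
        else 
           r=mid-1;
    }
    ans=l<4 ? 0:l+1;  // l differences correspond to l+1 notes
    printf("%d\n",ans);
    return;
}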

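int main()
{
    while (scanf("%d",&len)==1&&len!=0)
    {
        for (int i=0;i<len;i++) scanf("%d",&num[i]);
        if (len<10)  // two disjoint themes of 5 notes need at least 10 notes
        {
            printf("0\n");continue;
        }
        len--;  // from here on len counts differences, not notes
        for (int i=0;i<len;i++)
            num[i]=num[i+1]-num[i]+89;  // differences cancel transposition; +89 keeps them positive
        num[len]=0;  // sentinel smaller than every real value
        make_sa(num,len+1);
        make_hei(len);
        solve();
    }
    return 0;
}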
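
Off-by-one slips between notes and differences are easy to make in this problem, so it can help to compare the program above with a slow reference on small random inputs. The following is a minimal sketch of such a reference, assuming an O(N^2) dynamic programme over the difference array is acceptable for testing; the function name longestThemeBrute is hypothetical and not part of the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical reference solver: O(N^2) time, O(N) memory.
// Returns the length in notes of the longest theme, or 0 if there is none.
int longestThemeBrute(const std::vector<int>& notes) {
    assert(!notes.empty());
    const int m = static_cast<int>(notes.size()) - 1;   // number of differences
    std::vector<int> d(m);
    for (int i = 0; i < m; ++i) d[i] = notes[i + 1] - notes[i];

    // prev[j] holds the LCP of d[i+1..] and d[j..]; cur[j] is being filled
    // with the LCP of d[i..] and d[j..]. Index m acts as an always-zero border.
    std::vector<int> prev(m + 1, 0), cur(m + 1, 0);
    int best = 0;                                        // best count of matching differences
    for (int i = m - 1; i >= 0; --i) {
        for (int j = i + 1; j < m; ++j) {
            cur[j] = (d[i] == d[j]) ? prev[j + 1] + 1 : 0;
            // k differences at starts i < j are disjoint as notes only if j - i > k.
            best = std::max(best, std::min(cur[j], j - i - 1));
        }
        std::swap(prev, cur);
    }
    return best >= 4 ? best + 1 : 0;                     // at least 5 notes required
}
```

On the sample melody this returns 5. On 40 41 43 40 45 41 43 40 45 41 48 it returns 0, because the two copies of the five-note pattern share their middle note.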
Original post: https://www.cnblogs.com/wutongtong3117/p/7673642.html