Longest subsequence without string

algorithmsdynamic programmingstrings

Does there exist a dynamic programming algorithm to find the longest subsequence in a string X that does not contain Y as substring? Just that this problem seems so similar to other DP string algorithms such as longest common subsequence and string. It must be able to handle occurrences of Y that overlap.

It seems that this might be a 2-state DP problem, with the state [s_pos, t_pos] being the longest subsequence of string S starting at s_pos that does not have sting T[t_pos..M] as a substring. N is the length of string S and M is the length of string T. However, my transitions are not correct: it does not get the case where S=aaabc and T=aabc. The problem is in the else statement – I don't know how to transition if the characters are equal. Actually I feel that the if branch is wrong… anyone know what could be wrong?

It even fails the case S=aaab and T=aab. I can explain why it fails: assuming I call solve(0, 0). solve(0, 0) calls solve(1, 1). solve(1, 1) calls solve(2, 2). Since s[2] != t[2], it restarts the search from solve(3, 0). However, aab is a substring and it never checks this or considers this case…

int solve(int s_pos, int t_pos)
{
    if (s_pos >= N || t_pos >= M) return 0;
    if (t_pos == M - 1 && s[s_pos] == t[t_pos]) return 0;
    int ret = 0;
    if (s[s_pos] != t[t_pos])
    {
        int tmp = solve(s_pos + 1, 0);
        ret = max(ret, tmp + 1);
    }
    else
    {
        for (int i = s_pos + 1; i < N; i++)
        {
            int tmp = solve(i, t_pos + 1);
            if (tmp != 0)
            {
                ret = max(ret, 1 + tmp);
            }
        }
    }
    return ret;
}

Best Answer

This should do the trick. I wrote it in a meta code similar to basic.

LONGEST = ""
while N>0
  POS = find(T,S)
  if POS>0 then
    SUB = left(S,POS)
  else
    SUB = S
    POS = N
  endif
  if len(SUB) > len(LONGEST) then
    LONGEST = SUB
  endif
  N = N-POS
wend
Related Topic