R – c string multiple replacements within a character string

cstring

Lets say I have a string:

"(aaa and bbb or (aaa or aaa or bbb))"

**for simplicity sake, this will always be the format of the string, always 3 a's followed by a space or ')' or 3b's followed by a space or ')'.

what would be the best way to replace every occurence of 'aaa' with a '1' and everyoccurrence of 'bbb' with a '0' in C. ending string should look like:

"(1 and 0 or (1 or 1 or 0))"

EDIT Let me be more specific:

char* blah = (char *) malloc (8);
sprintf(blah, "%s", "aaa bbb");

blah = replace(blah);

how can I write replace so that it allocates space and stores a new string

"1 0"

Best Answer

The most efficient way is to use regex family which is POSIX. Some implementation will build a proper automaton for the patterns. Another way is to repeatedly use KMP or Boyer-Moore search, but you have to scan the string several times, which is less efficient. In addition, what results you want given such input: aa=1, ab=2, bb=3 on string "aabb"?

By the way, when you implement this function, a cleaner solution is to allocate a new dynamic C string and not to modify the original string while replacing. You could implement a in-place replacement, but that would be much more complicated.

regex_t r; regmatch_t match[2]; int last = 0;
regcomp(&r, "(aaa|bbb)", REG_EXTENDED);
insert(hashtable, "aaa", "0"); insert(hashtable, "bbb", "1");
while (regexec(&r, oristr, 1, match, 0) != REG_NOMATCH) {
  char *val;
  strncat(newstr, oristr + last, match->rm_so);
  lookup(hashtable, oristr + match->rm_so, match->rm_eo - match->rm_so, &val);
  last = match->rm_eo;
  strncat(newstr, val);
}
strcat(newstr, oristr + last);
oristr = realloc(oristr, strlen(newstr));
strcpy(oristr, newstr); free(newstr); regfree(&r);

In practical implementation, you should change the size of newstr dynamically. You should record the end of newstr rather than using strcat/strlen. The source code may be buggy as I have not really tried it. But the idea is there. This is the most efficient implementation I can think of.