I am reading this book, Programming Interviews Exposed by John Mongan et. al and in chapter 6 they are discussing removing all instances of characters in a src string using a removal string, e.g. removeChars(string str, string remove)
.
They suggest to having a boolean lookup array with all values initially set to false, then loop through each character in remove
setting the corresponding value in the lookup array to true. (note: this could also be a hash if the possible character set where huge like Unicode-16 or something like that or if str
and remove are both relatively small… < 100 characters I suppose). You then iterate through the str
with a source and destination index, copying each character only if its corresponding value in the lookup array is false…
I understand their explanation, but I don't understand their code. They have
for(src = 0; src < len; ++src){
flags[r[src]] == true;
}
which turns the flag value at the remove
string indexed at src to true…
so if you start out with PLEASE HELP
as your str and LEA
as your remove you will be setting in your flag table at 0,1,2... t|t|t
but after that you will get an out of bounds exception because r doesn't have have anything greater than 2 in it. Even using their example you get an out of bounds exception. Is their code example unworkable?
Entire Function:
string removeChars( string str, string remove ){
char[] s = str.toCharArray();
char[] r = remove.toCharArray();
bool[] flags = new bool[128]; // assumes ASCII!
int len = s.Length;
int src, dst;
// Set flags for characters to be removed
for( src = 0; src < len; ++src ){
flags[r[src]] = true;
}
src = 0;
dst = 0;
// Now loop through all the characters,
// copying only if they aren’t flagged
while( src < len ){
if( !flags[ (int)s[src] ] ){
s[dst++] = s[src];
}
++src;
}
return new string( s, 0, dst );
}
r
comes from the remove string. So in my example the remove string has only a size of 3 while my str
string has a size of 11. len
is equal to the length of the str
string, which would be 11. How can I loop through the r
string since it is only size 3? I haven't compiled the code so I can loop through it, but just looking at it I know it won't work. I am thinking they wanted to loop through the r
string… in other words they got the length of the wrong string here.
Am I right? Can someone explain this to me?
Best Answer
There's not a lot of code here to work with and it's unlikely everyone has a copy of this book, so this is a guess! It would help if you would post more code.
Using your
PLEASE HELP
&LEA
example, I'd wager theflags
array is a lookup table indexed by the ASCII value of a character, and at the end would look like:Then when doing the removed copy the routine would see that
'P' is the ASCII value of character P.