In Lua 5.1, you can iterate over the characters of a string in a couple of ways.
The basic loop would be:
    for i = 1, #str do
        local c = str:sub(i, i)
        -- do something with c
    end
But it may be more efficient to use a pattern with string.gmatch() to get an iterator over the characters:
    for c in str:gmatch(".") do
        -- do something with c
    end
Or even to use string.gsub() to call a function for each char:
    str:gsub(".", function(c)
        -- do something with c
    end)
In all of the above, I've taken advantage of the fact that all string values share a metatable whose __index is the string module, so its functions can be called as methods using the : notation. I've also used the # operator (new to 5.1, IIRC) to get the string length.
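As a quick check of that claim, here is a minimal sketch (the variable names are just for illustration) showing that the method-call form and the string module form are the same thing:

```lua
-- All strings share one metatable whose __index field is the string table,
-- so s:upper() is resolved as string.upper(s).
local s = "hello"
local mt = getmetatable(s)

print(mt.__index == string)            -- same table as the string module
print(s:upper() == string.upper(s))    -- method call and module call agree
print(#s == string.len(s))             -- # and string.len agree on length
```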
The best answer for your application depends on a lot of factors, and benchmarks are your friend if performance is going to matter.
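If you do want to measure it, a rough sketch of a timing harness might look like the following. The function names, the test string, and its size are all made up for illustration, and os.clock() only gives coarse CPU time, so treat the numbers as a hint rather than a verdict:

```lua
-- Hypothetical benchmark harness; names and sizes are illustrative only.
local str = string.rep("abcdefghij", 10000)  -- 100,000 characters

local function with_sub(s)
    local n = 0
    for i = 1, #s do
        local c = s:sub(i, i)
        n = n + 1
    end
    return n
end

local function with_gmatch(s)
    local n = 0
    for c in s:gmatch(".") do
        n = n + 1
    end
    return n
end

local function with_gsub(s)
    local n = 0
    -- The callback returns nothing, so gsub leaves each match unchanged.
    s:gsub(".", function(c) n = n + 1 end)
    return n
end

local function time_it(label, f)
    local start = os.clock()
    local count = f(str)
    local elapsed = os.clock() - start
    print(string.format("%-8s %d chars in %.4f s", label, count, elapsed))
    return count
end

time_it("sub", with_sub)
time_it("gmatch", with_gmatch)
time_it("gsub", with_gsub)
```

Run it a few times on the Lua build you actually deploy with, since the ranking can differ between interpreters.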
You might want to evaluate why you need to iterate over the characters at all, and look at one of the regular expression modules that have been bound to Lua, or, for a more modern approach, at Roberto's lpeg module, which implements Parsing Expression Grammars for Lua.
No, setting the key's value to nil is the accepted way of removing an item in the hashmap portion of a table. What you're doing is standard. However, I'd recommend not overriding table.remove(): for the array portion of a table, the default table.remove() functionality includes renumbering the indices, which your override would not do. If you do want to add your function to the table function set, then I'd probably name it something like table.removekey() or some such.
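A minimal sketch of what that could look like, assuming you're happy to add a function to the table library. table.removekey is not a standard function, and returning the removed value is just one possible design:

```lua
-- Hypothetical helper: table.removekey is NOT part of the standard library.
function table.removekey(t, key)
    local value = t[key]
    t[key] = nil      -- assigning nil removes the entry
    return value      -- hand back what was removed, like table.remove does
end

local ages = { alice = 30, bob = 25 }
local removed = table.removekey(ages, "bob")
-- removed is 25 and ages.bob is now nil

-- By contrast, the standard table.remove shifts later elements down.
local list = { "a", "b", "c" }
table.remove(list, 1)
-- list is now { "b", "c" }
```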
Best Answer
In Lua patterns, the character class %p represents all punctuation characters, the character class %c represents all control characters, and the character class %s represents all whitespace characters. So you can represent all punctuation characters, all control characters, and all whitespace characters with the set [%p%c%s].
To remove these characters from a string, you can use string.gsub. For a string str, the code would be str = str:gsub("[%p%c%s]", ""), which replaces every matching character with the empty string. (Note that this is essentially the same as Egor's code snippet above.)
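One thing to watch for: string.gsub returns two values, the new string and the number of substitutions. Here is a small sketch that wraps the removal in a helper (strip_noise is a made-up name) and keeps only the string:

```lua
-- Hypothetical helper name; the pattern is the set described above.
local function strip_noise(s)
    -- gsub returns the new string and the substitution count;
    -- the extra parentheses keep only the string.
    return (s:gsub("[%p%c%s]", ""))
end

print(strip_noise("Hello, world!\n\tLua 5.1"))  --> HelloworldLua51
```

The extra parentheses matter when the result is passed straight to another function or stored in a table constructor, where the count would otherwise come along as a second value.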