Word counter
Aogbogcog
27 Oct 2024, 21:25
?
ice4
03 Nov 2024, 16:14
?
daeun
03 Nov 2024, 16:19
In the room object, at "After entering the room", click the code view and type:
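get input {
  // 'result' holds the text the player typed
  text = result
  totalLength = LengthOf(text)
  // remove every space, then compare lengths to count the spaces
  textWithoutSpaces = Replace(text, " ", "")
  lengthWithoutSpaces = LengthOf(textWithoutSpaces)
  spaceCount = totalLength - lengthWithoutSpaces
  msg ("Number of spaces: " + spaceCount)
  // assumes words are separated by exactly one space
  numberofwords = spaceCount+1
  msg ("Number of words: " + numberofwords)
}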
Test the code: type in "test test test" and you will get
Number of spaces: 2
Number of words: 3
Obviously, the number of spaces is redundant, but I wanted to show you how the method works.
Create a function named wordcounter and type in the slimmed-down code, without the message to the player about the number of spaces:
get input {
  text = result
  totalLength = LengthOf(text)
  textWithoutSpaces = Replace(text, " ", "")
  lengthWithoutSpaces = LengthOf(textWithoutSpaces)
  spaceCount = totalLength - lengthWithoutSpaces
  numberofwords = spaceCount+1
  msg ("Number of words: " + numberofwords)
}
This gives you a more flexible word counter, so you can just call the function whenever you need it.
mrangel
04 Nov 2024, 09:50
Counting words isn't necessarily the same as counting spaces. This function will give odd counts if you feed it a string with multiple spaces between words, with spaces at the beginning or end, or with punctuation marks instead of spaces.
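To see those odd counts concretely, here is a small sketch of the space-counting method. It is written in Python rather than Quest script, purely so the logic can be run outside the game; the function name count_words_by_spaces is made up for this illustration.

```python
def count_words_by_spaces(text: str) -> int:
    """Mirror the earlier snippet: word count = number of spaces + 1."""
    total_length = len(text)
    length_without_spaces = len(text.replace(" ", ""))
    space_count = total_length - length_without_spaces
    return space_count + 1


# The clean case works as expected.
print(count_words_by_spaces("test test test"))  # 3
# Doubled, leading or trailing spaces each add a phantom word.
print(count_words_by_spaces("test  test"))      # 3, but there are 2 words
print(count_words_by_spaces(" test test "))     # 4, but there are 2 words
# Punctuation without a space hides a word; empty input still counts as one.
print(count_words_by_spaces("one,two"))         # 1, but there are 2 words
print(count_words_by_spaces(""))                # 1, but there are 0 words
```

So the method is fine as long as the player types single spaces between words and nothing else.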
Here's a (somewhat slower) way to count words in a string:
<function name="CountWords" parameters="input" type="int"><![CDATA[
  result = 0
  while (IsRegexMatch ("\\w", input)) {
    result = result + 1
    split = Populate ("^\\W*\\w++\\W*(?<remainder>.*)", input, "firstword")
    input = DictionaryItem (split, "remainder")
  }
  return (result)
]]></function>
This uses the regular expression patterns \w++ (matches any complete word) and \W* (matches a block of nonword characters, including spaces and punctuation). So the call to IsRegexMatch checks if there is a word character (\w, any letter, digit or underscore) in the string. If so, Populate removes the first word and any spaces/punctuation from either side of it, and stores the part of the string that still needs to be counted in the remainder subpattern.
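The same loop can be sketched in Python to check the logic outside Quest. This is only an illustration under a few assumptions: the name count_words is made up, plain \w+ is used instead of \w++ (possessive quantifiers need Python 3.11 or later), and Python's re module stands in for Quest's IsRegexMatch and Populate.

```python
import re


def count_words(text: str) -> int:
    """Count words by repeatedly stripping the first word off the string."""
    count = 0
    # Keep going while at least one word character is left.
    while re.search(r"\w", text):
        count += 1
        # Skip leading non-word characters, consume one word plus the
        # non-word characters after it, and capture whatever is left.
        match = re.match(r"^\W*\w+\W*(?P<remainder>.*)", text, re.DOTALL)
        text = match.group("remainder")
    return count


print(count_words("test test test"))       # 3
print(count_words("  hello,   world!  "))  # 2
print(count_words(""))                     # 0
```

One thing to be aware of: an apostrophe is a non-word character, so a pattern like this counts "don't" as two words.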
ice4
04 Nov 2024, 14:07
I changed mrangel's code into the following version, which can be copied and pasted into code view. I am not sure if I did it right:
- I renamed the 'input' and 'result' variables, as the Quest app uses 'result' for the text the player typed
- I wrapped it in a get input {} block
- The Quest app raised an error, so I changed \w++ to \w+
get input {
  text = result
  count = 0
  while (IsRegexMatch("\\w", text)) {
    count = count + 1
    split = Populate("^\\W*\\w+\\W*(?<remainder>.*)", text, "firstword")
    text = DictionaryItem(split, "remainder")
  }
  msg ("Number of words: " + count)
}
K.V.
04 Nov 2024, 14:13
You could also probably add his function, then just use it in your code like this:
get input {
  msg("Number of words: " + CountWords(result))
}
mrangel
04 Nov 2024, 19:17
Huh… an error? Does that mean the regex engine that Quest uses doesn't support possessive (sticky) quantifiers?
I don't think it'll make any difference in this case, \w+ should work just as well. But I thought that had been part of the regex standard for a very long time.
If anyone is wondering about the distinction:
- \w matches a single word character
- \w+ matches one or more word characters
- \w++ matches one or more word characters that are not followed by any more word characters
- (\W is the opposite of \w, matching non-word characters in the same way that \S matches non-space characters, and \D matches non-digits)
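The difference between \w+ and \w++ only shows up when something after the quantifier forces the regex engine to back up. Here is a rough Python sketch of that. Because possessive quantifiers are only available in Python 3.11 and later, the \w++ behaviour is emulated with a negative lookahead that follows the "not followed by any more word characters" description; that emulation is an assumption of this example, not something Quest does.

```python
import re

text = "abc1"

# Greedy \w+ first grabs "abc1", then gives back the "1" so \d can match.
greedy = re.search(r"\w+\d", text)
print(greedy.group(0))  # abc1

# Emulated \w++: the run of word characters must not be followed by
# another word character, so it can never give the "1" back to \d.
no_give_back = re.search(r"\w+(?!\w)\d", text)
print(no_give_back)     # None
```

In the word-counting pattern the word is followed by \W*, which cannot match a word character anyway, so either form counts the same words.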
K.V.
04 Nov 2024, 22:31
EDIT: I initially copied and pasted the wrong thing here.
This is nifty.
I made that one change and wrapped it in <![CDATA[[]]>, and it seems to work flawlessly.
Gng
05 Nov 2024, 07:53
\w matches a single word character
\w+ matches one or more word characters
\w++ matches one or more word characters that are not followed by any more word characters
(\W is the opposite of \w, matching non-word characters in the same way that \S matches non-space characters, and \D matches non-digits)
We need more tutorials on regular expressions ._.