There's a very Pythonic way to write that expression without explicitly writing a try-except block for a StopIteration
:
# some_iterable is some collection that can be iterated over
# e.g., a list, sequence, dict, set, itertools.combination(...)
for value in some_iterable:
print(value)
You can read up on the relevant PEPs 234 255 if you want to know more behind why StopIteration
was introduced and the logic behind iterators.
A general principle in python is to have one way to do something (see import this
), and preferably its beautiful, explicit, readable, and simple, which the pythonic method satisfies. Your equivalent code is only necessary as python doesn't give iterators a hasNext
member function; preferring people to just loop through the iterators directly (and if you need to do something else to just try reading it and catch an exception).
This automatic catching of an StopIteration
exception at the end of an iterator makes sense and is an analogue of the EOFError
raised if you read past an end of file.
One technique that can be used to perform clustering on multi-dimensional numeric data is the Kohonen self-organising feature map. It's a little too involved to describe here, but should be included in any beginner's level text on machine learning.
This just leaves the problem of how to convert your data to numeric form. To do this, I'd first run an an analysis to find a reasonable number (say 100) of words that appear in many of your strings, but not too many. You're looking for words in the middle of the frequency distribution, as these carry the most useful information. You can then use the presence or absence of these words as inputs to your feature map.
Best Answer
That visualizer is not showing the string data on the stack. It is showing the local references to the heap data as part of the call stack. This is very similar to Java where
String
references are local variables that point to actualString
objects on the heap.The visualizer is free to make any kind of representation simplification it cares to. It does not imply that "Python strings are allocated on the stack" in any given implementation of Python.