IMPORTANT NOTE: You have to sort your data first.
The part I didn't get is that in the example construction
groups = []
uniquekeys = []
for k, g in groupby(data, keyfunc):
groups.append(list(g)) # Store group iterator as a list
uniquekeys.append(k)
k
is the current grouping key, and g
is an iterator that you can use to iterate over the group defined by that grouping key. In other words, the groupby
iterator itself returns iterators.
Here's an example of that, using clearer variable names:
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"), ("vehicle", "speed boat"), ("vehicle", "school bus")]
for key, group in groupby(things, lambda x: x[0]):
for thing in group:
print("A %s is a %s." % (thing[1], key))
print("")
This will give you the output:
A bear is a animal.
A duck is a animal.
A cactus is a plant.
A speed boat is a vehicle.
A school bus is a vehicle.
In this example, things
is a list of tuples where the first item in each tuple is the group the second item belongs to.
The groupby()
function takes two arguments: (1) the data to group and (2) the function to group it with.
Here, lambda x: x[0]
tells groupby()
to use the first item in each tuple as the grouping key.
In the above for
statement, groupby
returns three (key, group iterator) pairs - once for each unique key. You can use the returned iterator to iterate over each individual item in that group.
Here's a slightly different example with the same data, using a list comprehension:
for key, group in groupby(things, lambda x: x[0]):
listOfThings = " and ".join([thing[1] for thing in group])
print(key + "s: " + listOfThings + ".")
This will give you the output:
animals: bear and duck.
plants: cactus.
vehicles: speed boat and school bus.
When you write [x]*3
you get, essentially, the list [x, x, x]
. That is, a list with 3 references to the same x
. When you then modify this single x
it is visible via all three references to it:
x = [1] * 4
l = [x] * 3
print(f"id(x): {id(x)}")
# id(x): 140560897920048
print(
f"id(l[0]): {id(l[0])}\n"
f"id(l[1]): {id(l[1])}\n"
f"id(l[2]): {id(l[2])}"
)
# id(l[0]): 140560897920048
# id(l[1]): 140560897920048
# id(l[2]): 140560897920048
x[0] = 42
print(f"x: {x}")
# x: [42, 1, 1, 1]
print(f"l: {l}")
# l: [[42, 1, 1, 1], [42, 1, 1, 1], [42, 1, 1, 1]]
To fix it, you need to make sure that you create a new list at each position. One way to do it is
[[1]*4 for _ in range(3)]
which will reevaluate [1]*4
each time instead of evaluating it once and making 3 references to 1 list.
You might wonder why *
can't make independent objects the way the list comprehension does. That's because the multiplication operator *
operates on objects, without seeing expressions. When you use *
to multiply [[1] * 4]
by 3, *
only sees the 1-element list [[1] * 4]
evaluates to, not the [[1] * 4
expression text. *
has no idea how to make copies of that element, no idea how to reevaluate [[1] * 4]
, and no idea you even want copies, and in general, there might not even be a way to copy the element.
The only option *
has is to make new references to the existing sublist instead of trying to make new sublists. Anything else would be inconsistent or require major redesigning of fundamental language design decisions.
In contrast, a list comprehension reevaluates the element expression on every iteration. [[1] * 4 for n in range(3)]
reevaluates [1] * 4
every time for the same reason [x**2 for x in range(3)]
reevaluates x**2
every time. Every evaluation of [1] * 4
generates a new list, so the list comprehension does what you wanted.
Incidentally, [1] * 4
also doesn't copy the elements of [1]
, but that doesn't matter, since integers are immutable. You can't do something like 1.value = 2
and turn a 1 into a 2.
Best Answer
Here A and B are names of nodes