To understand what yield
does, you must understand what generators are. And before you can understand generators, you must understand iterables.
Iterables
When you create a list, you can read its items one by one. Reading its items one by one is called iteration:
>>> mylist = [1, 2, 3]
>>> for i in mylist:
... print(i)
1
2
3
mylist
is an iterable. When you use a list comprehension, you create a list, and so an iterable:
>>> mylist = [x*x for x in range(3)]
>>> for i in mylist:
... print(i)
0
1
4
Everything you can use "for... in...
" on is an iterable; lists
, strings
, files...
These iterables are handy because you can read them as much as you wish, but you store all the values in memory and this is not always what you want when you have a lot of values.
Generators
Generators are iterators, a kind of iterable you can only iterate over once. Generators do not store all the values in memory, they generate the values on the fly:
>>> mygenerator = (x*x for x in range(3))
>>> for i in mygenerator:
... print(i)
0
1
4
It is just the same except you used ()
instead of []
. BUT, you cannot perform for i in mygenerator
a second time since generators can only be used once: they calculate 0, then forget about it and calculate 1, and end calculating 4, one by one.
Yield
yield
is a keyword that is used like return
, except the function will return a generator.
>>> def create_generator():
... mylist = range(3)
... for i in mylist:
... yield i*i
...
>>> mygenerator = create_generator() # create a generator
>>> print(mygenerator) # mygenerator is an object!
<generator object create_generator at 0xb7555c34>
>>> for i in mygenerator:
... print(i)
0
1
4
Here it's a useless example, but it's handy when you know your function will return a huge set of values that you will only need to read once.
To master yield
, you must understand that when you call the function, the code you have written in the function body does not run. The function only returns the generator object, this is a bit tricky.
Then, your code will continue from where it left off each time for
uses the generator.
Now the hard part:
The first time the for
calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield
, then it'll return the first value of the loop. Then, each subsequent call will run another iteration of the loop you have written in the function and return the next value. This will continue until the generator is considered empty, which happens when the function runs without hitting yield
. That can be because the loop has come to an end, or because you no longer satisfy an "if/else"
.
Your code explained
Generator:
# Here you create the method of the node object that will return the generator
def _get_child_candidates(self, distance, min_dist, max_dist):
# Here is the code that will be called each time you use the generator object:
# If there is still a child of the node object on its left
# AND if the distance is ok, return the next child
if self._leftchild and distance - max_dist < self._median:
yield self._leftchild
# If there is still a child of the node object on its right
# AND if the distance is ok, return the next child
if self._rightchild and distance + max_dist >= self._median:
yield self._rightchild
# If the function arrives here, the generator will be considered empty
# there is no more than two values: the left and the right children
Caller:
# Create an empty list and a list with the current object reference
result, candidates = list(), [self]
# Loop on candidates (they contain only one element at the beginning)
while candidates:
# Get the last candidate and remove it from the list
node = candidates.pop()
# Get the distance between obj and the candidate
distance = node._get_dist(obj)
# If distance is ok, then you can fill the result
if distance <= max_dist and distance >= min_dist:
result.extend(node._values)
# Add the children of the candidate in the candidate's list
# so the loop will keep running until it will have looked
# at all the children of the children of the children, etc. of the candidate
candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
return result
This code contains several smart parts:
The loop iterates on a list, but the list expands while the loop is being iterated. It's a concise way to go through all these nested data even if it's a bit dangerous since you can end up with an infinite loop. In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
exhaust all the values of the generator, but while
keeps creating new generator objects which will produce different values from the previous ones since it's not applied on the same node.
The extend()
method is a list object method that expects an iterable and adds its values to the list.
Usually we pass a list to it:
>>> a = [1, 2]
>>> b = [3, 4]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4]
But in your code, it gets a generator, which is good because:
- You don't need to read the values twice.
- You may have a lot of children and you don't want them all stored in memory.
And it works because Python does not care if the argument of a method is a list or not. Python expects iterables so it will work with strings, lists, tuples, and generators! This is called duck typing and is one of the reasons why Python is so cool. But this is another story, for another question...
You can stop here, or read a little bit to see an advanced use of a generator:
Controlling a generator exhaustion
>>> class Bank(): # Let's create a bank, building ATMs
... crisis = False
... def create_atm(self):
... while not self.crisis:
... yield "$100"
>>> hsbc = Bank() # When everything's ok the ATM gives you as much as you want
>>> corner_street_atm = hsbc.create_atm()
>>> print(corner_street_atm.next())
$100
>>> print(corner_street_atm.next())
$100
>>> print([corner_street_atm.next() for cash in range(5)])
['$100', '$100', '$100', '$100', '$100']
>>> hsbc.crisis = True # Crisis is coming, no more money!
>>> print(corner_street_atm.next())
<type 'exceptions.StopIteration'>
>>> wall_street_atm = hsbc.create_atm() # It's even true for new ATMs
>>> print(wall_street_atm.next())
<type 'exceptions.StopIteration'>
>>> hsbc.crisis = False # The trouble is, even post-crisis the ATM remains empty
>>> print(corner_street_atm.next())
<type 'exceptions.StopIteration'>
>>> brand_new_atm = hsbc.create_atm() # Build a new one to get back in business
>>> for cash in brand_new_atm:
... print cash
$100
$100
$100
$100
$100
$100
$100
$100
$100
...
Note: For Python 3, useprint(corner_street_atm.__next__())
or print(next(corner_street_atm))
It can be useful for various things like controlling access to a resource.
Itertools, your best friend
The itertools module contains special functions to manipulate iterables. Ever wish to duplicate a generator?
Chain two generators? Group values in a nested list with a one-liner? Map / Zip
without creating another list?
Then just import itertools
.
An example? Let's see the possible orders of arrival for a four-horse race:
>>> horses = [1, 2, 3, 4]
>>> races = itertools.permutations(horses)
>>> print(races)
<itertools.permutations object at 0xb754f1dc>
>>> print(list(itertools.permutations(horses)))
[(1, 2, 3, 4),
(1, 2, 4, 3),
(1, 3, 2, 4),
(1, 3, 4, 2),
(1, 4, 2, 3),
(1, 4, 3, 2),
(2, 1, 3, 4),
(2, 1, 4, 3),
(2, 3, 1, 4),
(2, 3, 4, 1),
(2, 4, 1, 3),
(2, 4, 3, 1),
(3, 1, 2, 4),
(3, 1, 4, 2),
(3, 2, 1, 4),
(3, 2, 4, 1),
(3, 4, 1, 2),
(3, 4, 2, 1),
(4, 1, 2, 3),
(4, 1, 3, 2),
(4, 2, 1, 3),
(4, 2, 3, 1),
(4, 3, 1, 2),
(4, 3, 2, 1)]
Understanding the inner mechanisms of iteration
Iteration is a process implying iterables (implementing the __iter__()
method) and iterators (implementing the __next__()
method).
Iterables are any objects you can get an iterator from. Iterators are objects that let you iterate on iterables.
There is more about it in this article about how for
loops work.
TLDR
JavaScript has lexical (also called static) scoping and closures. This means you can tell the scope of an identifier by looking at the source code.
The four scopes are:
- Global - visible by everything
- Function - visible within a function (and its sub-functions and blocks)
- Block - visible within a block (and its sub-blocks)
- Module - visible within a module
Outside of the special cases of global and module scope, variables are declared using var
(function scope), let
(block scope), and const
(block scope). Most other forms of identifier declaration have block scope in strict mode.
Overview
Scope is the region of the codebase over which an identifier is valid.
A lexical environment is a mapping between identifier names and the values associated with them.
Scope is formed of a linked nesting of lexical environments, with each level in the nesting corresponding to a lexical environment of an ancestor execution context.
These linked lexical environments form a scope "chain". Identifier resolution is the process of searching along this chain for a matching identifier.
Identifier resolution only occurs in one direction: outwards. In this way, outer lexical environments cannot "see" into inner lexical environments.
There are three pertinent factors in deciding the scope of an identifier in JavaScript:
- How an identifier was declared
- Where an identifier was declared
- Whether you are in strict mode or non-strict mode
Some of the ways identifiers can be declared:
var
, let
and const
- Function parameters
- Catch block parameter
- Function declarations
- Named function expressions
- Implicitly defined properties on the global object (i.e., missing out
var
in non-strict mode)
import
statements
eval
Some of the locations identifiers can be declared:
- Global context
- Function body
- Ordinary block
- The top of a control structure (e.g., loop, if, while, etc.)
- Control structure body
- Modules
Declaration Styles
var
Identifiers declared using var
have function scope, apart from when they are declared directly in the global context, in which case they are added as properties on the global object and have global scope. There are separate rules for their use in eval
functions.
let and const
Identifiers declared using let
and const
have block scope, apart from when they are declared directly in the global context, in which case they have global scope.
Note: let
, const
and var
are all hoisted. This means that their logical position of definition is the top of their enclosing scope (block or function). However, variables declared using let
and const
cannot be read or assigned to until control has passed the point of declaration in the source code. The interim period is known as the temporal dead zone.
function f() {
function g() {
console.log(x)
}
let x = 1
g()
}
f() // 1 because x is hoisted even though declared with `let`!
Function parameter names
Function parameter names are scoped to the function body. Note that there is a slight complexity to this. Functions declared as default arguments close over the parameter list, and not the body of the function.
Function declarations
Function declarations have block scope in strict mode and function scope in non-strict mode. Note: non-strict mode is a complicated set of emergent rules based on the quirky historical implementations of different browsers.
Named function expressions
Named function expressions are scoped to themselves (e.g., for the purpose of recursion).
Implicitly defined properties on the global object
In non-strict mode, implicitly defined properties on the global object have global scope, because the global object sits at the top of the scope chain. In strict mode, these are not permitted.
eval
In eval
strings, variables declared using var
will be placed in the current scope, or, if eval
is used indirectly, as properties on the global object.
Examples
The following will throw a ReferenceError because the namesx
, y
, and z
have no meaning outside of the function f
.
function f() {
var x = 1
let y = 1
const z = 1
}
console.log(typeof x) // undefined (because var has function scope!)
console.log(typeof y) // undefined (because the body of the function is a block)
console.log(typeof z) // undefined (because the body of the function is a block)
The following will throw a ReferenceError for y
and z
, but not for x
, because the visibility of x
is not constrained by the block. Blocks that define the bodies of control structures like if
, for
, and while
, behave similarly.
{
var x = 1
let y = 1
const z = 1
}
console.log(x) // 1
console.log(typeof y) // undefined because `y` has block scope
console.log(typeof z) // undefined because `z` has block scope
In the following, x
is visible outside of the loop because var
has function scope:
for(var x = 0; x < 5; ++x) {}
console.log(x) // 5 (note this is outside the loop!)
...because of this behavior, you need to be careful about closing over variables declared using var
in loops. There is only one instance of variable x
declared here, and it sits logically outside of the loop.
The following prints 5
, five times, and then prints 5
a sixth time for the console.log
outside the loop:
for(var x = 0; x < 5; ++x) {
setTimeout(() => console.log(x)) // closes over the `x` which is logically positioned at the top of the enclosing scope, above the loop
}
console.log(x) // note: visible outside the loop
The following prints undefined
because x
is block-scoped. The callbacks are run one by one asynchronously. New behavior for let
variables means that each anonymous function closed over a different variable named x
(unlike it would have done with var
), and so integers 0
through 4
are printed.:
for(let x = 0; x < 5; ++x) {
setTimeout(() => console.log(x)) // `let` declarations are re-declared on a per-iteration basis, so the closures capture different variables
}
console.log(typeof x) // undefined
The following will NOT throw a ReferenceError
because the visibility of x
is not constrained by the block; it will, however, print undefined
because the variable has not been initialised (because of the if
statement).
if(false) {
var x = 1
}
console.log(x) // here, `x` has been declared, but not initialised
A variable declared at the top of a for
loop using let
is scoped to the body of the loop:
for(let x = 0; x < 10; ++x) {}
console.log(typeof x) // undefined, because `x` is block-scoped
The following will throw a ReferenceError
because the visibility of x
is constrained by the block:
if(false) {
let x = 1
}
console.log(typeof x) // undefined, because `x` is block-scoped
Variables declared using var
, let
or const
are all scoped to modules:
// module1.js
var x = 0
export function f() {}
//module2.js
import f from 'module1.js'
console.log(x) // throws ReferenceError
The following will declare a property on the global object because variables declared using var
within the global context are added as properties to the global object:
var x = 1
console.log(window.hasOwnProperty('x')) // true
let
and const
in the global context do not add properties to the global object, but still have global scope:
let x = 1
console.log(window.hasOwnProperty('x')) // false
Function parameters can be considered to be declared in the function body:
function f(x) {}
console.log(typeof x) // undefined, because `x` is scoped to the function
Catch block parameters are scoped to the catch-block body:
try {} catch(e) {}
console.log(typeof e) // undefined, because `e` is scoped to the catch block
Named function expressions are scoped only to the expression itself:
(function foo() { console.log(foo) })()
console.log(typeof foo) // undefined, because `foo` is scoped to its own expression
In non-strict mode, implicitly defined properties on the global object are globally scoped. In strict mode, you get an error.
x = 1 // implicitly defined property on the global object (no "var"!)
console.log(x) // 1
console.log(window.hasOwnProperty('x')) // true
In non-strict mode, function declarations have function scope. In strict mode, they have block scope.
'use strict'
{
function foo() {}
}
console.log(typeof foo) // undefined, because `foo` is block-scoped
How it works under the hood
Scope is defined as the lexical region of code over which an identifier is valid.
In JavaScript, every function-object has a hidden [[Environment]]
reference that is a reference to the lexical environment of the execution context (stack frame) within which it was created.
When you invoke a function, the hidden [[Call]]
method is called. This method creates a new execution context and establishes a link between the new execution context and the lexical environment of the function-object. It does this by copying the [[Environment]]
value on the function-object, into an outer reference field on the lexical environment of the new execution context.
Note that this link between the new execution context and the lexical environment of the function object is called a closure.
Thus, in JavaScript, scope is implemented via lexical environments linked together in a "chain" by outer references. This chain of lexical environments is called the scope chain, and identifier resolution occurs by searching up the chain for a matching identifier.
Find out more.
Best Answer
Python variables are scoped to the innermost function, class, or module in which they're assigned. Control blocks like
if
andwhile
blocks don't count, so a variable assigned inside anif
is still scoped to a function, class, or module.(Implicit functions defined by a generator expression or list/set/dict comprehension do count, as do lambda expressions. You can't stuff an assignment statement into any of those, but lambda parameters and
for
clause targets are implicit assignment.)