- Iterators are a special kind of class instance. When you "call" a function/method that contains the "yield" keyword (a "generator function"), instead of running the function/method until a "return" keyword, it *instantly* returns an iterator instance (its exact type is "generator") without running any of the function body. Once you've stored the returned iterator in a variable, there are three ways you can interact with it. Note that iterators are *not* a list-like type, so you can't use the "[]" brackets to get the value at a particular index.
- The first way is to call its "__next__()" method. This runs the function you used to create the iterator, but not the whole thing: it runs until it hits a "yield" keyword, evaluates the expression following the "yield", and returns the result. The NEXT time you call "__next__()", it continues right after the "yield" where it stopped last time and runs until the NEXT "yield". One *important* thing: IF during any call it reaches the end of the function/method (or a "return") without hitting a "yield", a "StopIteration" exception is raised, which you'll have to handle with "try"/"except". Make sure to store the return value of "__next__()" in a different variable than the one holding the iterator instance, obviously, because if you overwrite that one, you lose your only way to obtain the iterator's remaining values.
- The second way is to use the global built-in function "next()" and pass the iterator instance to it as an argument. This does essentially the same as the first: the function just calls the "__next__()" method for you and returns the value.
- The third way is to use a for loop. The for loop will keep calling the iterator's "__next__()" method until it receives a "StopIteration" exception. And instead of crashing with the exception, it just exits the for loop. So this is the easiest way if you don't want to have to do the exception handling yourself.
- There's actually a fourth way too. If you pass an iterator into the "list()" function, it'll create an empty list, repeatedly call the "__next__()" method and append each return value to the list until it reaches "StopIteration", and then return the list. So basically this function gives you a list of all the values the iterator returns, in order from first to last. The same works for "tuple()", but obviously the result will be immutable.
- So here's an example iterator function (Btw, if you want proof that iterator functions aren't run when you call them, just add a 'print' before the for loop. You'll see that nothing happens when you run the function, and the message won't be printed until the first time you call '__next__()'. And it'll also only print the first time you call next, since after that the interpreter won't go back to the beginning of the function):
- >>> def foo():
- ...     for i in range(10):
- ...         yield i + 1
- >>>
- Now let's create an iterator instance:
- >>> a = foo()
- >>> type(a)
- <class 'generator'>
- >>>
- As you can see, instead of actually running the function, it returns a special class instance. Now we'll call "next()" on it, which calls its "__next__()" method for us.
- >>> print(next(a))
- 1
- >>>
- Now the interpreter starts running the function at the beginning, encounters the for loop, 'i' is set to 0, then it hits a yield keyword, so it evaluates 'i + 1' and returns it (and '0 + 1' equals '1' obviously). Now, the print function receives the integer class instance '1', so it calls its "__str__()" method to convert it to the string "1", then it prints it to stdout.
- >>> print(next(a))
- 2
- >>>
- Now the interpreter resumes right after the "yield" where it paused, which means it immediately hits the end of the first for loop iteration. So now 'i' is set to 1, and the "yield" is hit again. So 'i + 1' is evaluated to '2', and that's what the "__next__()" method returns. So now the integer instance gets converted to a string and printed to the screen.
- And obviously the same happens up to 10. Now let's skip to what happens when you call it again after it has already returned 10.
- ...
- >>> print(next(a))
- 10
- >>> print(next(a))
- Traceback (most recent call last):
-   File "<pyshell#15>", line 1, in <module>
-     print(next(a))
- StopIteration
- >>>
- As you can see, the last time you call it, it picks up where it left off inside the for loop where 'i = 9', right after the "yield". That's the end of this iteration of the for loop, so it tries to enter the next iteration. But then it discovers that it has already reached the end of the "range", so it exits the loop. Now it has also reached the end of the function, so the function returns 'None'. And since the function returned without executing another "yield", the "StopIteration" exception is raised. And as it is an unhandled exception not wrapped in a "try" block, the error is printed out to stderr.
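- If you don't want the traceback, wrap the call in "try"/"except". Here's a minimal sketch of that, reusing the 'foo' from above (the names 'value' and 'fallback' are just made up for this example). It also shows that the built-in "next()" accepts an optional second argument to return instead of raising:

```python
def foo():
    for i in range(10):
        yield i + 1

a = foo()
for _ in range(10):
    next(a)  # use up all ten values so the iterator is exhausted

# The eleventh call has nothing left to yield, so it raises StopIteration.
try:
    value = next(a)
except StopIteration:
    value = None
    print("iterator is exhausted")

# Alternative: give next() a default to return instead of raising.
fallback = next(a, "nothing left")
print(fallback)
```

- Either approach works; the default argument is just shorter when you only need a placeholder value.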
- And as for the first way, just do "a.__next__()" instead of "next(a)", so I don't think I have to explain further on that one.
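- Before moving on, here's a sketch of the 'print' proof suggested earlier, written as a script instead of a shell session. The function 'noisy' and the list 'log' are hypothetical names made up for this example; 'log' just records when the function body actually starts running:

```python
log = []  # records when the generator body really starts executing

def noisy():
    log.append("body started")
    print("body started")
    for i in range(3):
        yield i + 1

gen = noisy()                # nothing is printed here
started_before = list(log)   # snapshot taken before any value is requested

first = gen.__next__()       # the body runs now, up to the first yield
second = next(gen)           # resumes after the yield; no second message

print(started_before, log, first, second)
```

- The message appears only once, on the first "__next__()" call, because later calls resume in the middle of the function instead of starting over.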
- The third way is using a for loop as I said before:
- >>> for i in foo():
- ...     print(i)
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- >>>
- As you can see, I constructed an iterator instance in-line and started looping through it with a for loop. The for loop repeatedly calls the "__next__()" method of the iterator returned by "foo()" and runs the loop body for each value until it reaches "StopIteration", then it catches the exception and exits.
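- Roughly speaking, that for loop behaves like the hand-written while loop below. This is only a sketch of the idea, not how the interpreter literally implements it, and the names 'it' and 'seen' are made up for the example:

```python
def foo():
    for i in range(10):
        yield i + 1

seen = []
it = iter(foo())  # a for loop calls iter() first; a generator just returns itself
while True:
    try:
        i = next(it)        # ask for the next value
    except StopIteration:
        break               # the iterator is done, so leave the loop quietly
    seen.append(i)          # loop body
    print(i)
```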
- Now the last way:
- >>> list(foo())
- [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
- >>> tuple(foo())
- (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
- >>>
- As you can see, the 'list' or 'tuple' functions create a container and repeatedly collect the return value of each "__next__()" call in it, in order. Then when they hit the "StopIteration" exception, they return the container.
- And last thing of all, there are a few built-ins in Python that actually return iterators; 'enumerate' is one example:
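- To make that concrete, here's a toy re-implementation of the idea. 'my_list' is a hypothetical helper written for this example; the real 'list' is built in and accepts any iterable, not just iterators:

```python
def foo():
    for i in range(10):
        yield i + 1

def my_list(iterator):
    """Collect every value an iterator produces into a new list."""
    result = []
    while True:
        try:
            result.append(next(iterator))
        except StopIteration:
            return result  # nothing left, hand back what was collected

collected = my_list(foo())
print(collected)
```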
- >>> a = enumerate("Hello, world!")
- >>> print(a[4])
- Traceback (most recent call last):
-   File "<pyshell#2>", line 1, in <module>
-     print(a[4])
- TypeError: 'enumerate' object is not subscriptable
- >>>
- You cannot get the value of an iterator at a particular index because iterators don't support indexing at all: they only know how to produce their NEXT value. Like in this example, the interpreter has no way of knowing what 'a' will return on its fifth "__next__()" call without actually making the first four calls. So it raises an exception.
- And since 'enumerate' returns an iterator, anything that works on iterators also works on it, so if you want a simple way to convert it to a list, just do:
- >>> print(list(enumerate("Hello, world!")))
- [(0, 'H'), (1, 'e'), (2, 'l'), (3, 'l'), (4, 'o'), (5, ','), (6, ' '), (7, 'w'), (8, 'o'), (9, 'r'), (10, 'l'), (11, 'd'), (12, '!')]
- >>>
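- The other ways of interacting with an iterator work on 'enumerate' too. A small sketch (the variable names are made up for this example):

```python
a = enumerate("Hello, world!")

first = next(a)         # the built-in next() works
second = a.__next__()   # and so does calling the method directly
same = iter(a) is a     # an iterator returns itself from iter()
rest = list(a)          # list() only collects what is still left

print(first, second, same, len(rest))
```

- Note that 'rest' does not start from index 0 again: the two values already taken are gone from the iterator.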
- Btw, last thing. If you want your iterator to stop iterating earlier, just "return" earlier (writing 'return None' and a bare 'return' do the same thing here). For example:
- >>> def foo():
- ...     for i in range(10):
- ...         if i == 5:
- ...             return None
- ...         else:
- ...             yield i + 1
- >>> list(foo())
- [1, 2, 3, 4, 5]
- >>>
- As you can see, because the function returns None at 'i == 5', the iterator only returns values up to '4 + 1'. The sixth time "__next__()" is called, the iterator raises "StopIteration", causing 'list' to return.
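- Here's the same example as a script, as a sketch with a bare 'return' instead of 'return None' (the names 'values' and 'again' are made up). It also shows that once an iterator has finished, it stays finished:

```python
def foo():
    for i in range(10):
        if i == 5:
            return  # bare return: ends the generator just like 'return None'
        yield i + 1

a = foo()
values = list(a)  # collects everything up to the early return
again = list(a)   # the same iterator is already exhausted

print(values, again)
```

- If you need the values a second time, call 'foo()' again to get a fresh iterator.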
- I hope you understand how iterators work, and how to create user-defined iterator functions now :D