2022-02-04 19:23:09 +01:00
# Using `asyncio-taskpool`
2022-02-05 18:02:32 +01:00
## Minimal example for `SimpleTaskPool`
2022-02-04 19:23:09 +01:00
2022-02-25 19:57:54 +01:00
With a `SimpleTaskPool` the function to execute as well as the arguments with which to execute it must be defined during its initialization (and they cannot be changed later). The only control you have after initialization is how many of such tasks are being run.
2022-02-08 16:15:55 +01:00
The minimum required setup is a "worker" coroutine function that can do something asynchronously, and a main coroutine function that sets up the `SimpleTaskPool` , starts/stops the tasks as desired, and eventually awaits them all.
2022-02-04 19:23:09 +01:00
The following demo code enables full log output first for additional clarity. It is complete and should work as is.
### Code
2022-02-08 23:09:33 +01:00
2022-02-04 19:23:09 +01:00
```python
import logging
import asyncio
2022-02-09 23:14:42 +01:00
from asyncio_taskpool import SimpleTaskPool
2022-02-04 19:23:09 +01:00
logging.getLogger().setLevel(logging.NOTSET)
logging.getLogger('asyncio_taskpool').addHandler(logging.StreamHandler())
async def work(n: int) -> None:
"""
Pseudo-worker function.
Counts up to an integer with a second of sleep before each iteration.
In a real-world use case, a worker function should probably have access
2022-02-09 23:14:42 +01:00
to some synchronisation primitive (such as a queue) or shared resource
to distribute work between an arbitrary number of workers.
2022-02-04 19:23:09 +01:00
"""
for i in range(n):
await asyncio.sleep(1)
2022-02-24 19:16:24 +01:00
print("> did", i)
2022-02-04 19:23:09 +01:00
async def main() -> None:
2022-02-25 19:57:54 +01:00
pool = SimpleTaskPool(work, args=(5,)) # initializes the pool; no work is being done yet
2022-02-06 13:08:39 +01:00
await pool.start(3) # launches work tasks 0, 1, and 2
2022-02-04 19:23:09 +01:00
await asyncio.sleep(1.5) # lets the tasks work for a bit
2022-02-06 13:08:39 +01:00
await pool.start() # launches work task 3
2022-02-04 19:23:09 +01:00
await asyncio.sleep(1.5) # lets the tasks work for a bit
2022-02-25 19:57:54 +01:00
pool.stop(2) # cancels tasks 3 and 2 (LIFO order)
2022-02-08 23:09:33 +01:00
pool.lock() # required for the last line
2022-02-24 19:16:24 +01:00
await pool.gather_and_close() # awaits all tasks, then flushes the pool
2022-02-04 19:23:09 +01:00
if __name__ == '__main__':
asyncio.run(main())
```
### Output
```
2022-02-05 18:02:32 +01:00
SimpleTaskPool-0 initialized
Started SimpleTaskPool-0_Task-0
Started SimpleTaskPool-0_Task-1
Started SimpleTaskPool-0_Task-2
2022-02-24 19:16:24 +01:00
> did 0
> did 0
> did 0
2022-02-05 18:02:32 +01:00
Started SimpleTaskPool-0_Task-3
2022-02-24 19:16:24 +01:00
> did 1
> did 1
> did 1
> did 0
> did 2
> did 2
2022-02-08 23:09:33 +01:00
SimpleTaskPool-0 is locked!
2022-02-05 18:02:32 +01:00
Cancelling SimpleTaskPool-0_Task-2 ...
Cancelled SimpleTaskPool-0_Task-2
Ended SimpleTaskPool-0_Task-2
2022-02-24 19:16:24 +01:00
Cancelling SimpleTaskPool-0_Task-3 ...
Cancelled SimpleTaskPool-0_Task-3
Ended SimpleTaskPool-0_Task-3
> did 3
> did 3
2022-02-05 18:02:32 +01:00
Ended SimpleTaskPool-0_Task-0
Ended SimpleTaskPool-0_Task-1
2022-02-24 19:16:24 +01:00
> did 4
> did 4
2022-02-04 19:23:09 +01:00
```
2022-02-08 16:15:55 +01:00
## Advanced example for `TaskPool`
2022-02-04 19:23:09 +01:00
2022-02-08 16:15:55 +01:00
This time, we want to start tasks from _different_ coroutine functions **and** with _different_ arguments. For this we need an instance of the more generalized `TaskPool` class.
As with the simple example, we need "worker" coroutine functions that can do something asynchronously, as well as a main coroutine function that sets up the pool, starts the tasks, and eventually awaits them.
The following demo code enables full log output first for additional clarity. It is complete and should work as is.
### Code
2022-02-08 23:09:33 +01:00
2022-02-08 16:15:55 +01:00
```python
import logging
import asyncio
2022-02-09 23:14:42 +01:00
from asyncio_taskpool import TaskPool
2022-02-08 16:15:55 +01:00
logging.getLogger().setLevel(logging.NOTSET)
logging.getLogger('asyncio_taskpool').addHandler(logging.StreamHandler())
async def work(start: int, stop: int, step: int = 1) -> None:
"""Pseudo-worker function counting through a range with a second of sleep in between each iteration."""
for i in range(start, stop, step):
await asyncio.sleep(1)
2022-02-24 19:16:24 +01:00
print("> work with", i)
2022-02-08 16:15:55 +01:00
async def other_work(a: int, b: int) -> None:
"""Different pseudo-worker counting through a range with half a second of sleep in between each iteration."""
for i in range(a, b):
await asyncio.sleep(0.5)
2022-02-24 19:16:24 +01:00
print("> other_work with", i)
2022-02-08 16:15:55 +01:00
async def main() -> None:
# Initialize a new task pool instance and limit its size to 3 tasks.
pool = TaskPool(3)
2022-02-25 19:57:54 +01:00
# Queue up two tasks (IDs 0 and 1) to run concurrently (with the same keyword-arguments).
2022-02-24 19:16:24 +01:00
print("> Called `apply` ")
2022-02-08 16:15:55 +01:00
await pool.apply(work, kwargs={'start': 100, 'stop': 200, 'step': 10}, num=2)
# Let the tasks work for a bit.
await asyncio.sleep(1.5)
# Now, let us enqueue four more tasks (which will receive IDs 2, 3, 4, and 5), each created with different
2022-02-25 19:57:54 +01:00
# positional arguments by using `starmap` , but we want no more than two of those to run concurrently.
2022-02-08 16:15:55 +01:00
# Since we set our pool size to 3, and already have two tasks working within the pool,
# only the first one of these will start immediately (and receive ID 2).
# The second one will start (with ID 3), only once there is room in the pool,
2022-02-24 19:16:24 +01:00
# which -- in this example -- will be the case after ID 2 ends.
2022-02-08 16:24:37 +01:00
# Once there is room in the pool again, the third and fourth will each start (with IDs 4 and 5)
2022-02-25 19:57:54 +01:00
# only once there is room in the pool and no more than one other task of these new ones is running.
2022-02-08 16:15:55 +01:00
args_list = [(0, 10), (10, 20), (20, 30), (30, 40)]
2022-02-19 16:02:50 +01:00
await pool.starmap(other_work, args_list, group_size=2)
2022-02-24 19:16:24 +01:00
print("> Called `starmap` ")
2022-02-08 23:09:33 +01:00
# Now we lock the pool, so that we can safely await all our tasks.
pool.lock()
2022-02-08 16:15:55 +01:00
# Finally, we block, until all tasks have ended.
2022-02-24 19:16:24 +01:00
print("> Calling `gather_and_close` ...")
await pool.gather_and_close()
print("> Done.")
2022-02-08 16:15:55 +01:00
if __name__ == '__main__':
asyncio.run(main())
```
### Output
Additional comments for the output are provided with `<---` next to the output lines.
(Keep in mind that the logger and `print` asynchronously write to `stdout` .)
```
TaskPool-0 initialized
Started TaskPool-0_Task-0
Started TaskPool-0_Task-1
2022-02-24 19:16:24 +01:00
> Called `apply`
> work with 100
> work with 100
> Called `starmap` <--- notice that this immediately returns, even before Task-2 is started
> Calling `gather_and_close`... <--- this blocks `main()` until all tasks have ended
2022-02-08 23:09:33 +01:00
TaskPool-0 is locked!
2022-02-24 19:16:24 +01:00
Started TaskPool-0_Task-2 < --- at this point the pool is full
> work with 110
> work with 110
> other_work with 0
> other_work with 1
> work with 120
> work with 120
> other_work with 2
> other_work with 3
> work with 130
> work with 130
> other_work with 4
> other_work with 5
> work with 140
> work with 140
> other_work with 6
> other_work with 7
> work with 150
> work with 150
> other_work with 8
Ended TaskPool-0_Task-2 < --- this frees up room for one more task from `starmap`
2022-02-08 16:15:55 +01:00
Started TaskPool-0_Task-3
2022-02-24 19:16:24 +01:00
> other_work with 9
> work with 160
> work with 160
> other_work with 10
> other_work with 11
> work with 170
> work with 170
> other_work with 12
> other_work with 13
> work with 180
> work with 180
> other_work with 14
> other_work with 15
2022-02-08 16:15:55 +01:00
Ended TaskPool-0_Task-0
2022-02-24 19:16:24 +01:00
Ended TaskPool-0_Task-1 < --- these two end and free up two more slots in the pool
Started TaskPool-0_Task-4 < --- since the group size is set to 2 , Task-5 will not start
> work with 190
> work with 190
> other_work with 16
> other_work with 17
> other_work with 20
> other_work with 18
> other_work with 21
Ended TaskPool-0_Task-3 < --- now that only Task-4 of the group remains , Task-5 starts
2022-02-08 16:15:55 +01:00
Started TaskPool-0_Task-5
2022-02-24 19:16:24 +01:00
> other_work with 19
> other_work with 22
> other_work with 23
> other_work with 30
> other_work with 24
> other_work with 31
> other_work with 25
> other_work with 32
> other_work with 26
> other_work with 33
> other_work with 27
> other_work with 34
> other_work with 28
> other_work with 35
> other_work with 29
> other_work with 36
2022-02-08 16:15:55 +01:00
Ended TaskPool-0_Task-4
2022-02-24 19:16:24 +01:00
> other_work with 37
> other_work with 38
> other_work with 39
2022-02-08 16:15:55 +01:00
Ended TaskPool-0_Task-5
2022-02-24 19:16:24 +01:00
> Done.
2022-02-08 16:15:55 +01:00
```
2022-02-09 23:14:42 +01:00
© 2022 Daniil Fajnberg