Differences between revisions 23 and 29 (spanning 6 versions)
Revision 23 as of 2015-01-13 17:58:07
Size: 6921
Comment: added asyncio, shortened GIL mention (redundant with dedicated page)
Revision 29 as of 2020-07-26 19:54:21
Size: 8783
Editor: ChrisM
Comment: Rewrote (faulty) intro, defined concurrency for beginners & improved concurrency motivation
Deletions are marked like this. Additions are marked like this.
Line 2: Line 2:
Central clearing house for Concurency related issues and discussion Concurrency in programming means that multiple computations happen at the same time. For example, you may have multiple Python programs running on your computer. Or you may connect multiple computers via a network (e.g., Ethernet) that work together towards a common objective (e.g., distributed data analytics). As all concurrent programs can access shared resources (e.g., files) at the same time, this can lead to a variety of problems and challenges (e.g., race conditions).

The goal of this article is to clear concurrency-related questions, problems, or issues you may have. It also serves as a first starting point for further discussions on concurrency in Python.
Line 5: Line 7:
 * [[http://mail.python.org/mailman/listinfo/concurrency-sig|Concurrency SIG Mailing List]]
Line 11: Line 12:
== Talks ==
 * [[https://www.youtube.com/watch?v=MCs5OvhV9S4|Concurrency from the Ground Up - LIVE!]] DavidBeazley blows some minds at Pycon 2015
 * [[https://www.youtube.com/watch?v=9zinZmE3Ogk|Keynote on Concurrency]] RaymondHettinger at PyBay 2017
 * [[https://www.youtube.com/watch?v=31fXwpb0P9c|Implementing Concurrency & Parallelism from the Ground Up]] Pycon 2017
 * [[https://www.youtube.com/watch?v=Z_OAlIhXziw|A Curious Course on Coroutines and Concurrency]] A Beazley tutorial from 2009.

Line 12: Line 20:
 * [[http://docs.python.org/library/multiprocessing.html | multiprocessing ]] (available in CPython 2.6+ and 3, also in recent PyPy 1.5 alpha builds)
 * [[http://docs.python.org/library/threading.html | threading]] - make your GIL-burdened life easy and your code fast: use multiprocessing. ;)
 * [[http://docs.python.org/3/library/concurrent.futures.html | concurrent.futures]] (available in CPython 3.2 and up)
 * [[http://docs.python.org/3/library/asyncio.html | asyncio]] (available in CPython 3.4 and up) - single-threaded concurrency
 * [[http://docs.python.org/library/multiprocessing.html|multiprocessing]] (available in CPython 2.6+ and 3, also in recent PyPy 1.5 alpha builds)
 * [[http://docs.python.org/library/threading.html|threading]] - make your GIL-burdened life easy and your code fast: use multiprocessing. ;)
 * [[http://docs.python.org/3/library/concurrent.futures.html|concurrent.futures]] (available in CPython 3.2 and up)
 * [[http://docs.python.org/3/library/asyncio.html|asyncio]] (available in CPython 3.4 and up) - single-threaded concurrency
 * async/await simpler syntatic sugar for coroutines (Python 3.5+)
Line 17: Line 26:
== Frameworks ==
See also: ParallelProcessing, [[http://pypi.python.org/pypi?:action=browse&show=all&c=450 | Pypi Distributed Computing Trove]]
== Popular Frameworks ==
See also: ParallelProcessing, [[http://pypi.python.org/pypi?:action=browse&show=all&c=450|Pypi Distributed Computing Trove]]
Line 21: Line 30:
 * [[http://asyncoro.sourceforge.net | asyncoro]] - Framework for asynchronous, concurrent, distributed and network programming.
 * [[http://code.google.com/p/fibra/ | Fibra]] - Fibra provides advanced cooperatve concurrency using Python generators.
 * [[http://code.google.com/p/cogen/ | Cogen]] - cogen is a crossplatform library for network oriented, coroutine based programming using the enhanced generators. The project aims to provide a simple straightforward programming model similar to threads but without all the problems and costs.
 * [[http://pypi.python.org/pypi/greenlet | Greenlet]] - The "greenlet" package is a spin-off of StacklessPython, a version of CPython that supports micro-threads called "tasklets". Tasklets run pseudo-concurrently (typically in a single or a few OS-level threads) and are synchronized with data exchanges on "channels". [[http://codespeak.net/py/0.9.2/greenlet.html | Documentation]]
 * [[http://www.gevent.org/ | Gevent]] - coroutine-based networking library that uses greenlet to provide a high-level synchronous API on top of libevent event loop. It features standard library-inspired API and high performance.
 * [[http://wiki.secondlife.com/wiki/Eventlet | Eventlet]] - Eventlet is a networking library written in Python. It achieves high scalability by using non-blocking io while at the same time retaining high programmer usability by using coroutines to make the non-blocking io operations appear blocking at the source code level.
 * [[http://trac.softcircuit.com.au/circuits/ | Circuits]] - circuits is a Lightweight, Event driven Framework with a strong Component Architecture.
 * [[http://twistedmatrix.com | Twisted]] - Full-featured and well-tested asynchronous networking library. Concurrency model: Asynchronous code written around central objects known as Deferreds. Cooperative task scheduler via the [[http://twistedmatrix.com/trac/browser/trunk/twisted/internet/task.py#L182 | Cooperator]] object. Time based scheduling with [[http://twistedmatrix.com/trac/browser/trunk/twisted/internet/task.py#L23 | LoopingCall]] and reactor.callLater.
 * [[http://www.kamaelia.org/Home | Kamaelia]] - In Kamaelia you build systems from simple components that talk to each other. This speeds development, massively aids maintenance and also means you build naturally concurrent software. It's intended to be accessible by any developer, including novices. It also makes it fun.
 * [[http://opensource.hyves.org/concurrence/ | Concurrence]] - Concurrence is a framework for creating massively concurrent network applications in Python. It takes a Lightweight-tasks-with-message-passing approach to concurrency.
 * [[http://www.parallelpython.com/ | Parallel Python]] - Parallel Python is a python module which provides mechanism for parallel execution of python code on SMP (systems with multiple processors or cores) and clusters (computers connected via network).
 * [[http://www.boddie.org.uk/python/pprocess/tutorial.html | pprocess]] - The pprocess module provides elementary support for parallel programming in Python using a fork-based process creation model in conjunction with a channel-based communications model implemented using socketpair and poll.
 * [[http://code.google.com/p/pysage | pysage]] - lightweight high-level message passing library supporting actor based concurrency
 * [[http://www.pypes.org | pypes]] - Pypes is a framework for designing highly scalable concurrent Flow-Based applications. It uses a component model (Stackless tasklets) to achieve modularity (very loose coupling) along with the 2.6 multiprocessing module to achieve parallelism (multi-processing support). The goals are focused on performance (capable of processing millions of documents) and ease of use. Building custom components is simple and pypes offers Paste templates for generating boiler plate component code along with supporting build files (plugins/components are eggs). Pypes also provides a full [[https://sourceforge.net/project/screenshots.php?group_id=271766 | Web 2.0 WSGI user interface]] that allows users to create, destroy, configure, and connect components together using simple drag-n-drop functionality from any standards based web browser.
 * [[http://dieselweb.org/ | diesel]] - Diesel is a framework for writing network applications using asynchronous I/O in Python. It uses Python's generators to provide a friendly syntax for coroutines and continuations. It performs well and handles high concurrency with ease.
 * [[http://chiral.j4cbo.com/ | Chiral]] - Chiral is a lightweight coroutine-based networking framework for high-performance internet and Web services.
 * [[http://twistedmatrix.com|Twisted]] - Full-featured and well-tested asynchronous networking library. Concurrency model: Asynchronous code written around central objects known as Deferreds. Cooperative task scheduler via the [[http://twistedmatrix.com/trac/browser/trunk/twisted/internet/task.py#L182|Cooperator]] object. Time based scheduling with [[http://twistedmatrix.com/trac/browser/trunk/twisted/internet/task.py#L23|LoopingCall]]and reactor.callLater.
 * [[https://github.com/dabeaz/curio|curio]] the coroutine library you were warned about
 * [[https://github.com/python-trio/trio|trio]] Pythonic asynchronous I/O for humans and snakes
 * [[http://pypi.python.org/pypi/greenlet|Greenlet]] - The "greenlet" package is a spin-off of StacklessPython, a version of CPython that supports micro-threads called "tasklets". Tasklets run pseudo-concurrently (typically in a single or a few OS-level threads) and are synchronized with data exchanges on "channels". [[http://codespeak.net/py/0.9.2/greenlet.html|Documentation]]
 * [[http://www.gevent.org/|Gevent]] - coroutine-based networking library that uses greenlet to provide a high-level synchronous API on top of libevent event loop. It features standard library-inspired API and high performance.
 * [[http://wiki.secondlife.com/wiki/Eventlet|Eventlet]] - Eventlet is a networking library written in Python. It achieves high scalability by using non-blocking io while at the same time retaining high programmer usability by using coroutines to make the non-blocking io operations appear blocking at the source code level.
 * [[http://www.tornadoweb.org/en/stable/|Tornado]] high performance web framework and asynchronous networking library capable of scaling to tens of thouands of open connections
 * [[https://github.com/ray-project/ray|Ray]] - Parallel (and distributed) process-based execution framework which uses a lightweight API based on dynamic task graphs and actors to flexibly express a wide range of applications. Uses shared-memory and zero-copy serialization for efficient data handling. Supports low-latency and high-throughput task scheduling. Includes higher-level libraries for machine learning and AI applications.
Line 39: Line 40:
== Other Frameworks ==
 * [[http://asyncoro.sourceforge.net|asyncoro]] - Framework for asynchronous, concurrent, distributed and network programming.
 * [[http://code.google.com/p/fibra/|Fibra]] - Fibra provides advanced cooperative concurrency using Python generators.
 * [[http://code.google.com/p/cogen/|Cogen]] - cogen is a crossplatform library for network oriented, coroutine based programming using the enhanced generators. The project aims to provide a simple straightforward programming model similar to threads but without all the problems and costs.
 * [[http://trac.softcircuit.com.au/circuits/|Circuits]] - circuits is a Lightweight, Event-driven Framework with a strong Component Architecture.
 * [[http://www.kamaelia.org/Home|Kamaelia]] - In Kamaelia you build systems from simple components that talk to each other. This speeds development, massively aids maintenance and also means you build naturally concurrent software. It's intended to be accessible by any developer, including novices. It also makes it fun.
 * [[http://opensource.hyves.org/concurrence/|Concurrence]] - Concurrence is a framework for creating massively concurrent network applications in Python. It takes a Lightweight-tasks-with-message-passing approach to concurrency.
 * [[http://www.parallelpython.com/|Parallel Python]] - Parallel Python is a python module which provides a mechanism for parallel execution of python code on SMP (systems with multiple processors or cores) and clusters (computers connected via network).
 * [[http://www.boddie.org.uk/python/pprocess/tutorial.html|pprocess]] - The pprocess module provides elementary support for parallel programming in Python using a fork-based process creation model in conjunction with a channel-based communications model implemented using socketpair and poll.
 * [[http://code.google.com/p/pysage|pysage]] - lightweight high-level message passing library supporting actor based concurrency
 * [[http://www.pypes.org|pypes]] - Pypes is a framework for designing highly scalable concurrent Flow-Based applications. It uses a component model (Stackless tasklets) to achieve modularity (very loose coupling) along with the 2.6 multiprocessing module to achieve parallelism (multi-processing support). The goals are focused on performance (capable of processing millions of documents) and ease of use. Building custom components is simple and pypes offers Paste templates for generating boiler plate component code along with supporting build files (plugins/components are eggs). Pypes also provides a full [[https://sourceforge.net/project/screenshots.php?group_id=271766|Web 2.0 WSGI user interface]] that allows users to create, destroy, configure, and connect components together using simple drag-n-drop functionality from any standards based web browser.
 * [[http://dieselweb.org/|diesel]] - Diesel is a framework for writing network applications using asynchronous I/O in Python. It uses Python's generators to provide a friendly syntax for coroutines and continuations. It performs well and handles high concurrency with ease.
 * [[http://chiral.j4cbo.com/|Chiral]] - Chiral is a lightweight coroutine-based networking framework for high-performance internet and Web services.
Line 40: Line 55:
 * '''Using Coroutines to Create Efficient, High-Concurrency Web Applications''' - Matt Spitz, Pycon 2011 [[http://blip.tv/file/4883016 | Video]]
 * '''Concurrency and Distributed Systems''' - Jesse Noller, Pycon 2009 [[http://jessenoller.com//code/pycon_jnoller_distributed.pdf | Slides]] [[http://pycon.blip.tv/file/1947511/ | Video]]
 * '''Introduction to Multiprocessing''' - Jesse Noller, Pycon 2009 [[http://jessenoller.com/code/pycon_jnoller_multiprocessing.pdf | Slides]] [[http://pycon.blip.tv/file/1947354/ | Video]]
 * '''Inside the GIL''' - a detailed analysis of how the GIL is even worse than we thought by David Beazley, ChiPy June 2009 [[http://www.dabeaz.com/python/GIL.pdf | Slides]] [[http://blip.tv/file/2232410 | Video]]
 * '''Using Coroutines to Create Efficient, High-Concurrency Web Applications''' - Matt Spitz, Pycon 2011 [[http://blip.tv/file/4883016|Video]]
 * '''Concurrency and Distributed Systems''' - Jesse Noller, Pycon 2009 [[http://jessenoller.com//code/pycon_jnoller_distributed.pdf|Slides]] [[http://pycon.blip.tv/file/1947511/|Video]]
 * '''Introduction to Multiprocessing''' - Jesse Noller, Pycon 2009 [[http://jessenoller.com/code/pycon_jnoller_multiprocessing.pdf|Slides]] [[http://pycon.blip.tv/file/1947354/|Video]]
 * '''Inside the GIL''' - a detailed analysis of how the GIL is even worse than we thought by David Beazley, ChiPy June 2009 [[http://www.dabeaz.com/python/GIL.pdf|Slides]] [[http://blip.tv/file/2232410|Video]]

Concurrency

Concurrency in programming means that multiple computations happen at the same time. For example, you may have multiple Python programs running on your computer. Or you may connect multiple computers via a network (e.g., Ethernet) that work together towards a common objective (e.g., distributed data analytics). As all concurrent programs can access shared resources (e.g., files) at the same time, this can lead to a variety of problems and challenges (e.g., race conditions).

The goal of this article is to clear concurrency-related questions, problems, or issues you may have. It also serves as a first starting point for further discussions on concurrency in Python.

Talks

Concurrency support offered by the Standard Library

  • multiprocessing (available in CPython 2.6+ and 3, also in recent PyPy 1.5 alpha builds)

  • threading - make your GIL-burdened life easy and your code fast: use multiprocessing. ;)

  • concurrent.futures (available in CPython 3.2 and up)

  • asyncio (available in CPython 3.4 and up) - single-threaded concurrency

  • async/await simpler syntatic sugar for coroutines (Python 3.5+)

See also: ParallelProcessing, Pypi Distributed Computing Trove

  • StacklessPython - A Python Implementation That Does Not Use The C Stack. It purportedly can handle thousands of threads (tasklets) in the context of a very parallel game application, but remains pretty single-core.

  • Twisted - Full-featured and well-tested asynchronous networking library. Concurrency model: Asynchronous code written around central objects known as Deferreds. Cooperative task scheduler via the Cooperator object. Time based scheduling with LoopingCalland reactor.callLater.

  • curio the coroutine library you were warned about

  • trio Pythonic asynchronous I/O for humans and snakes

  • Greenlet - The "greenlet" package is a spin-off of StacklessPython, a version of CPython that supports micro-threads called "tasklets". Tasklets run pseudo-concurrently (typically in a single or a few OS-level threads) and are synchronized with data exchanges on "channels". Documentation

  • Gevent - coroutine-based networking library that uses greenlet to provide a high-level synchronous API on top of libevent event loop. It features standard library-inspired API and high performance.

  • Eventlet - Eventlet is a networking library written in Python. It achieves high scalability by using non-blocking io while at the same time retaining high programmer usability by using coroutines to make the non-blocking io operations appear blocking at the source code level.

  • Tornado high performance web framework and asynchronous networking library capable of scaling to tens of thouands of open connections

  • Ray - Parallel (and distributed) process-based execution framework which uses a lightweight API based on dynamic task graphs and actors to flexibly express a wide range of applications. Uses shared-memory and zero-copy serialization for efficient data handling. Supports low-latency and high-throughput task scheduling. Includes higher-level libraries for machine learning and AI applications.

Other Frameworks

  • asyncoro - Framework for asynchronous, concurrent, distributed and network programming.

  • Fibra - Fibra provides advanced cooperative concurrency using Python generators.

  • Cogen - cogen is a crossplatform library for network oriented, coroutine based programming using the enhanced generators. The project aims to provide a simple straightforward programming model similar to threads but without all the problems and costs.

  • Circuits - circuits is a Lightweight, Event-driven Framework with a strong Component Architecture.

  • Kamaelia - In Kamaelia you build systems from simple components that talk to each other. This speeds development, massively aids maintenance and also means you build naturally concurrent software. It's intended to be accessible by any developer, including novices. It also makes it fun.

  • Concurrence - Concurrence is a framework for creating massively concurrent network applications in Python. It takes a Lightweight-tasks-with-message-passing approach to concurrency.

  • Parallel Python - Parallel Python is a python module which provides a mechanism for parallel execution of python code on SMP (systems with multiple processors or cores) and clusters (computers connected via network).

  • pprocess - The pprocess module provides elementary support for parallel programming in Python using a fork-based process creation model in conjunction with a channel-based communications model implemented using socketpair and poll.

  • pysage - lightweight high-level message passing library supporting actor based concurrency

  • pypes - Pypes is a framework for designing highly scalable concurrent Flow-Based applications. It uses a component model (Stackless tasklets) to achieve modularity (very loose coupling) along with the 2.6 multiprocessing module to achieve parallelism (multi-processing support). The goals are focused on performance (capable of processing millions of documents) and ease of use. Building custom components is simple and pypes offers Paste templates for generating boiler plate component code along with supporting build files (plugins/components are eggs). Pypes also provides a full Web 2.0 WSGI user interface that allows users to create, destroy, configure, and connect components together using simple drag-n-drop functionality from any standards based web browser.

  • diesel - Diesel is a framework for writing network applications using asynchronous I/O in Python. It uses Python's generators to provide a friendly syntax for coroutines and continuations. It performs well and handles high concurrency with ease.

  • Chiral - Chiral is a lightweight coroutine-based networking framework for high-performance internet and Web services.

Presentations etc.

  • Using Coroutines to Create Efficient, High-Concurrency Web Applications - Matt Spitz, Pycon 2011 Video

  • Concurrency and Distributed Systems - Jesse Noller, Pycon 2009 Slides Video

  • Introduction to Multiprocessing - Jesse Noller, Pycon 2009 Slides Video

  • Inside the GIL - a detailed analysis of how the GIL is even worse than we thought by David Beazley, ChiPy June 2009 Slides Video

Concurrency (last edited 2020-07-26 19:54:21 by ChrisM)

Unable to edit the page? See the FrontPage for instructions.