Python is the lingua franca of data science and artificial intelligence.

Because of this, there has never been a better time to learn how to use the language's powerful built-in tools for concurrency and parallelism.

This article will cover the basics of concurrency and parallelism before diving into the modules that Python provides for multithreading, multiprocessing, and asynchronous programming.

NOTE: You can find all the code for this article in this GitHub repository I put together for you to follow along with.

I have always found that visual examples do far more to explain this concept than I could ever do with words, so here's a simple diagram of concurrency versus parallelism:

In simple terms, concurrency is about making progress on multiple tasks using the same CPU core. The core switches between tasks as needed, which is especially advantageous for tasks that are "I/O-bound" (meaning they spend most of their time waiting for input/output operations to complete). Network requests and opening/editing files are good examples of such tasks.
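To make the idea concrete, here's a minimal sketch of I/O-bound concurrency using the standard library's `concurrent.futures.ThreadPoolExecutor`. The `fake_download` function and its one-second `time.sleep` are stand-ins I've invented to simulate waiting on the network:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(name):
    # Simulate an I/O-bound task: the thread just waits,
    # leaving the CPU free to run other threads.
    time.sleep(1)
    return f"{name} done"

start = time.perf_counter()
# Three one-second waits overlap on the same core,
# so the total wall-clock time is roughly 1 second, not 3.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fake_download, ["a", "b", "c"]))
elapsed = time.perf_counter() - start
```

Run sequentially, the three calls would take about three seconds; interleaved, they finish in about one, because the time each task spends waiting is time another task can use.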

In contrast, parallelism is about doing multiple things at the same time on multiple CPU cores. This is excellent for tasks that are "CPU-bound" (meaning they spend most of their time doing CPU-intensive calculations). Math operations, sorting, and encryption are good examples of these sorts of operations.

With that out of the way, let's talk through the first option we have in Python for managing multiple tasks: the threading module.