Day 13: Multithreading and Multiprocessing
Key Concepts:
1. Multithreading
Multithreading runs multiple threads (smaller units of execution within a process) concurrently in the same program. Each thread is scheduled independently, but all threads share the same memory space.
2. Multiprocessing
Unlike multithreading, multiprocessing runs multiple processes concurrently. Each process has its own memory space and its own Python interpreter, so processes can run on separate CPU cores at the same time, which makes multiprocessing well suited to CPU-intensive tasks.
Multithreading:
Multithreading is useful when you have tasks that are I/O-bound, such as downloading files, reading from a database, or interacting with external services. It allows your program to continue executing other tasks while waiting for I/O operations to complete.
import threading
import time

# Define a function to be executed in a thread
def print_numbers():
    for i in range(1, 6):
        print(f"Number {i}")
        time.sleep(1)  # Simulate an I/O-bound task by pausing for a second

# Create a thread that runs the print_numbers function
thread1 = threading.Thread(target=print_numbers)

# Start the thread
thread1.start()

# Wait for the thread to finish
thread1.join()
print("Thread execution finished.")
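To see why threads help with waiting, here is a minimal sketch that starts three threads, each pausing for 0.2 seconds, and measures the total time. The `wait_for_io` and `run_in_threads` names and the 0.2-second delay are illustrative assumptions standing in for real I/O. Because the waits overlap, the total should be close to 0.2 seconds rather than 0.6.

```python
import threading
import time

def wait_for_io(delay):
    # Hypothetical stand-in for an I/O call such as a network request
    time.sleep(delay)

def run_in_threads(count, delay):
    """Start `count` threads that each wait `delay` seconds; return elapsed time."""
    start = time.perf_counter()
    threads = [
        threading.Thread(target=wait_for_io, args=(delay,))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start

elapsed = run_in_threads(count=3, delay=0.2)
print(f"3 threads waiting 0.2s each took {elapsed:.2f}s in total")
```

Running the same three waits one after another would take roughly the sum of the delays, which is the gap that multithreading closes for I/O-bound work.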
Multiprocessing:
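Multiprocessing is used for CPU-bound tasks. In standard CPython, threads share one interpreter, and the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time. Each process, by contrast, has its own interpreter and memory space, so processes can run in parallel on separate cores, which makes multiprocessing more suitable for computationally heavy tasks.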
import multiprocessing
import time

# Function that stands in for a CPU-bound task
def square_numbers():
    for i in range(1, 6):
        print(f"Square of {i}: {i ** 2}")
        time.sleep(1)  # Simulate a time-consuming task

# The __main__ guard is required on platforms that start child processes
# by launching a fresh interpreter (Windows, macOS)
if __name__ == "__main__":
    # Create a process
    process1 = multiprocessing.Process(target=square_numbers)

    # Start the process
    process1.start()

    # Wait for the process to finish
    process1.join()
    print("Process execution finished.")
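The example above uses `time.sleep` to stand in for work, which does not actually keep the CPU busy. A sketch with a genuinely CPU-bound function might look like the following. The `sum_of_squares` function and the input sizes are assumptions chosen for illustration, and `multiprocessing.Pool` is used here so that the results come back to the parent process.

```python
import multiprocessing

def sum_of_squares(n):
    # Pure computation: keeps one CPU core busy until it finishes
    return sum(i * i for i in range(1, n + 1))

if __name__ == "__main__":
    inputs = [1_000_000, 2_000_000, 3_000_000]  # illustrative sizes

    # Each input is handed to a worker process, so the work can spread across cores
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.map(sum_of_squares, inputs)

    for n, total in zip(inputs, results):
        print(f"Sum of squares up to {n}: {total}")
```

`pool.map` returns the results in the same order as the inputs, regardless of which worker finishes first.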
Key Differences Between Threading and Multiprocessing:
Feature | Multithreading | Multiprocessing |
---|---|---|
Best Suited For | I/O-bound tasks (time spent waiting) | CPU-bound tasks (time spent computing) |
Memory Usage | Shared memory space between threads | Each process has its own memory space |
Performance | In standard CPython, only one thread runs Python code at a time, so multiple cores are not used effectively | Utilizes multiple cores (better for heavy computations) |
Use Case | Web scraping, downloading files, reading files | Data processing, scientific computations, image processing |
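
The memory row of the table can be illustrated with a small sketch. It passes a list to a thread and then to a process; the `add_item` function and the list names are hypothetical, chosen only for this demonstration. The thread appends to the parent's own list, while the process works on a copy, so the parent's second list stays empty.

```python
import multiprocessing
import threading

def add_item(items):
    # Appends to whatever list object this thread or process can see
    items.append("added by worker")

if __name__ == "__main__":
    shared = []
    thread = threading.Thread(target=add_item, args=(shared,))
    thread.start()
    thread.join()
    print("After thread: ", shared)    # the thread changed the parent's list

    separate = []
    process = multiprocessing.Process(target=add_item, args=(separate,))
    process.start()
    process.join()
    print("After process:", separate)  # the child changed only its own copy
```

This is also why processes need an explicit channel, such as a return value from a pool or a queue, to send data back to the parent.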
Practical Exercises:
Exercise 1 - Download Multiple Files Concurrently (Multithreading)
import threading
import requests

def download_file(url):
    # A timeout keeps a stalled connection from blocking the thread forever
    response = requests.get(url, timeout=10)
    # Stop here on an HTTP error instead of saving an error page to disk
    response.raise_for_status()
    filename = url.split("/")[-1]
    with open(filename, 'wb') as file:
        file.write(response.content)
    print(f"Downloaded {filename}")

# List of URLs to download (placeholders; replace with real file URLs)
urls = [
    "https://example.com/file1.jpg",
    "https://example.com/file2.jpg",
    "https://example.com/file3.jpg"
]

# Create a list to hold the thread objects
threads = []

# Start a thread for each download task
for url in urls:
    thread = threading.Thread(target=download_file, args=(url,))
    thread.start()
    threads.append(thread)

# Wait for all threads to finish
for thread in threads:
    thread.join()
print("All downloads completed.")
Exercise 2 - Perform a CPU-Intensive Task Using Multiprocessing
import multiprocessing

def factorial(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    print(f"Factorial of {n} is {result}")

# The __main__ guard keeps child processes from re-running this section
# on platforms that start them by re-importing the module (Windows, macOS)
if __name__ == "__main__":
    # List of numbers to calculate factorial
    numbers = [5, 7, 10, 12]

    # Create a process for each factorial calculation
    processes = []
    for number in numbers:
        process = multiprocessing.Process(target=factorial, args=(number,))
        process.start()
        processes.append(process)

    # Wait for all processes to finish
    for process in processes:
        process.join()
    print("All factorials calculated.")
Recap of Key Points:
- Multithreading is best for I/O-bound tasks, allowing your program to stay responsive.
- Multiprocessing is best for CPU-bound tasks, making use of multiple processors to perform tasks in parallel.
- The `threading` module helps you manage threads and the `multiprocessing` module manages processes in Python.
Next Steps:
To go further, look into thread pools and process pools, which manage a group of threads or processes more efficiently. Python's `concurrent.futures` module provides both.
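As a starting point, here is a minimal sketch of both pool types from `concurrent.futures`. The `fake_download` function is a hypothetical stand-in for real I/O, and this `factorial` returns its result rather than printing it, so the pool can hand it back to the caller. The worker counts are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def fake_download(name):
    # Hypothetical I/O-bound task: the sleep stands in for network latency
    time.sleep(0.1)
    return f"{name} done"

def factorial(n):
    # CPU-bound task that returns its result instead of printing it
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

if __name__ == "__main__":
    # Thread pool for I/O-bound work; map() yields results in input order
    with ThreadPoolExecutor(max_workers=3) as executor:
        for message in executor.map(fake_download, ["file1", "file2", "file3"]):
            print(message)

    # Process pool for CPU-bound work, using the same interface
    numbers = [5, 7, 10, 12]
    with ProcessPoolExecutor(max_workers=2) as executor:
        for n, value in zip(numbers, executor.map(factorial, numbers)):
            print(f"Factorial of {n} is {value}")
```

Both executors share the same `map` and `submit` interface, so switching between threads and processes is mostly a matter of changing the class name.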