Sketch out a way to support multithreading #23

Draft. Wants to merge 5 commits into base: main
Commits on Feb 7, 2020

  1. Sketch out a way to support multithreading

    This implementation is kinda wonky, but is the best way I've come up with to support sessions/clients across multiple threads and pooling connections across multiple threads. It's based on the kind of hacky implementation in edgi-govdata-archiving/web-monitoring-processing#551.
    
    The basic idea here includes two pieces, and works around the fact that urllib3 is thread-safe, while requests is not:
    
    1. `WaybackSession` is renamed to `UnsafeWaybackSession` (denoting it should only be used on a single thread) and a new `WaybackSession` class just acts as a proxy to multiple UnsafeWaybackSessions, one per thread.
    
    2. A special subclass of requests's `HTTPAdapter` takes an instance of urllib3's `PoolManager` to wrap. `HTTPAdapter` itself is really just a wrapper around `PoolManager`, but it always creates a new one; this version just wraps whatever one is given to it. `UnsafeWaybackSession` now takes a `PoolManager` as an argument, which, if provided, is passed to its `HTTPAdapter`. `WaybackSession` creates one `PoolManager`, which it sets on all the actual `UnsafeWaybackSession` objects it creates and proxies access to. That way a single pool of connections is shared across many threads.
    
    This is super wonky! It definitely makes me feel like we might just be better off dropping requests and just using urllib3 directly (especially given #2 -- which means requests wouldn't be part of our public interface in any way). But this is a smaller change that *probably* carries less short-term risk.
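
A minimal sketch of the first piece described above (a proxy that hands each thread its own session while every session shares one pool). This is not the code from this PR: `FakePoolManager`, `UnsafeSession`, `ThreadProxySession`, and `collect_sessions` are hypothetical stand-ins, used in place of urllib3's `PoolManager` and the real session classes so the example runs with only the standard library.

```python
import threading


class FakePoolManager:
    """Hypothetical stand-in for urllib3's PoolManager."""


class UnsafeSession:
    """Hypothetical stand-in for a session that must stay on one thread."""

    def __init__(self, pool_manager):
        self.pool_manager = pool_manager

    def get(self, url):
        # A real session would send an HTTP request through the pool here.
        return f"GET {url}"


class ThreadProxySession:
    """Proxies to one UnsafeSession per thread; all share one pool manager."""

    def __init__(self):
        self._pool_manager = FakePoolManager()
        self._local = threading.local()

    def _session(self):
        # Lazily create the calling thread's private session on first use.
        session = getattr(self._local, "session", None)
        if session is None:
            session = UnsafeSession(self._pool_manager)
            self._local.session = session
        return session

    def __getattr__(self, name):
        # Only reached for attributes not defined on the proxy itself,
        # so calls like proxy.get(...) land on the per-thread session.
        return getattr(self._session(), name)


def collect_sessions(proxy, thread_count=3):
    """Return the session object each worker thread received."""
    seen = []
    lock = threading.Lock()

    def worker():
        session = proxy._session()
        with lock:
            seen.append(session)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen
```

Calling `collect_sessions(ThreadProxySession())` should yield three distinct session objects that all reference the same pool manager, which is the property the proxy design relies on. The second piece, an `HTTPAdapter` subclass that wraps an existing `PoolManager`, is not shown here.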
    Mr0grog committed Feb 7, 2020
    7ee343f
  2. 1adb555
  3. Make the linter happy

    Mr0grog committed Feb 7, 2020
    fcc8180
  4. 8d45972
  5. c397556