Hardware-wallet/usb-subscribe-refactor#7860
Conversation
|
It looks like @niklasad1 hasn't signed our Contributor License Agreement, yet.
You can read and sign our full Contributor License Agreement at the following URL: https://cla.parity.io Once you've signed, plesae reply to this thread with Many thanks, Parity Technologies CLA Bot |
|
[clabot:check]. |
|
It looks like @niklasad1 signed our Contributor License Agreement. 👍 Many thanks, Parity Technologies CLA Bot |
| const LEDGER_PIDS: [u16; 2] = [0x0000, 0x0001]; // Nano S and Blue | ||
| // Expose these to the callback configuration | ||
| pub const LEDGER_VID: u16 = 0x2c97; | ||
| pub const LEDGER_PIDS: [u16; 2] = [0x0000, 0x0001]; // Nano S and Blue |
There was a problem hiding this comment.
needs documentation for public items
| ::std::thread::sleep(Duration::from_millis(200)); | ||
| match f() { | ||
| Ok(handle) => Ok(handle), | ||
| Err(e) => Err(From::from(e)), |
| } | ||
| } | ||
|
|
||
| pub struct EventHandler { |
| } | ||
|
|
||
| impl EventHandler { | ||
| pub fn new(ledger: Weak<Manager>) -> Self { |
|
|
||
| const USB_DEVICE_CLASS_DEVICE: u8 = 0; | ||
|
|
||
|
|
| let usb_context = Arc::new(libusb::Context::new()?); | ||
| let usb_context_trezor = Arc::new(libusb::Context::new()?); | ||
| let usb_context_ledger = Arc::new(libusb::Context::new()?); | ||
| let mut active_threads: Vec<Option<thread::JoinHandle<()>>> = Vec::with_capacity(10); |
There was a problem hiding this comment.
why bother setting the capacity to 10? (vec internally uses initial of 4 AFAIK)
| .name("hw_wallet".to_string()) | ||
|
|
||
| // Ledger event thread | ||
| // FIXME: check if we can move it to ledger.rs (lifetime issue) |
| } | ||
| loop { | ||
| usb_context_ledger.handle_events(Some(Duration::from_millis(500))) | ||
| .unwrap_or_else(|e| debug!("Error processing USB events: {}", e)); |
There was a problem hiding this comment.
provide a target for log events
| } | ||
| } | ||
|
|
||
| pub struct EventHandler { |
| debug!("TREZOR Connect Error: {:?}", e); | ||
| } | ||
| println!("unlocked devices: {:?}", t.list_devices()); | ||
| println!("locked devices: {:?}", t.list_locked_devices()); |
There was a problem hiding this comment.
stray printlns in this file
| println!("trezor arrived"); | ||
| if let Some(t) = self.trezor.upgrade() { | ||
| // Wait for the device to boot-up | ||
| thread::sleep(Duration::from_millis(1000)); |
There was a problem hiding this comment.
this seems arbitrary and prone to race conditions. is there a better way?
There was a problem hiding this comment.
Hrm, only a physical device TREZOR device can generate this and the update_devices() iterates through the devices and read/write to them after acquiring a RwLock. Thus, the risk is pretty low that somebody plugs in two TREZOR keys at the same time
But still Weak::upgrade() -> Rc is not atomic, so it that what you refer to? That could probably be solved by another lock (before line 420)
Thoughts/suggestions?
| impl libusb::Hotplug for EventHandler { | ||
| fn device_arrived(&mut self, device: libusb::Device) { | ||
| println!("ledger arrived"); | ||
| Duration::from_millis(1000); |
There was a problem hiding this comment.
this line does nothing, stray println above
There was a problem hiding this comment.
Oops, should be thread::sleep(Duration::from_millis(1000)) ofc
| /// Hardware wallet management interface. | ||
| pub struct HardwareWalletManager { | ||
| update_thread: Option<thread::JoinHandle<()>>, | ||
| active_threads: Vec<Option<thread::JoinHandle<()>>>, |
There was a problem hiding this comment.
i'm curious about the motivation to have a separate thread for each HW wallet -- I am mildly worried about the overhead of waking these threads up fairly often to poll for exit and incurring context switches during regular import.
The other downside to joining the threads manually is that bugs in libusb or drivers can cause Parity's exit to completely hang -- this is something we want to avoid because the HardwareWalletManager cleanup might be blocking e.g. rocksdb cleanup and the inevitable forced shutdown could break things.
This is a separate issue, but we have a StopGuard utility that currently wraps the Arc<AtomicBool> logic, with the idea that there is a single one shared by all the utilities in the program (with only the main thread capable of writing to it). I don't think we do this quite yet, but it would be nice to have the main thread signal stop and wait for a few seconds or for the ref-count to go to zero (whichever comes first).
There was a problem hiding this comment.
@rphmeier
Great observations, the main motivation is to check whether ÙSB Hotplug callbacks are the reason of the crashes along with removing dummy for loops. By splitting it into two threads makes it possible the control the callbacks in fine-grained manner instead of subscribing to all devices and the code becomes easier to understand IMO.
It's possible to do it with one thread and some additional checks though, but let's wait for the test on the Windows machine take it from there (I would be very surprised if it still fails... 🔨 )
Ok, didn't know that I just did similar to the previous implementation
|
Approach generally looks good, major point of question is the shutdown logic. |
b720882 to
fd41215
Compare
The logs looks as follows: 2018-02-13 16:00:25 hw_wallet_trezor DEBUG hardware_wallet Trezor startup
2018-02-13 16:00:29 hw_wallet_trezor DEBUG hardware_wallet::trezor Trezor V1 left
2018-02-13 16:00:35 hw_wallet_trezor DEBUG hardware_wallet::trezor Trezor V1 arrived
2018-02-13 16:00:40 hw_wallet_trezor DEBUG hardware_wallet::trezor Trezor V1 leftBut I don't really understand your comment, can you clarify what you mean? EDIT: debug!(target: "hw", "some debug text"); |
fd41215 to
3317ba3
Compare
| if let Some(thread) = self.update_thread.take() { | ||
| thread.thread().unpark(); | ||
| thread.join().ok(); | ||
| for thread in self.active_threads.iter_mut() { |
There was a problem hiding this comment.
better to kill the source of a potential hang for now. what are the repercussions if the process exits without joining the threads?
There was a problem hiding this comment.
If the main thread just terminates the two other threads will run and listen for new events and try to connect/disconnect to trezors and ledgers. But I think that is ok since these threads as suppose to live as long as parity does!
Thoughts?
EDIT: But still AtomicBool is shared between the threads and should save us in this context and the threads should terminate once this propagated.
There was a problem hiding this comment.
I think signalling that Parity has gone down with an atomic bool or similar is fine. Just that I don't think blocking while waiting for them to join is necessarily a good idea.
* More fine-grained initilization of callbacks by vendorID, productID and usb class * Each device manufacturer gets a seperate handle thread each * Replaced "dummy for loop" with a delay to wait for the device to boot-up properly * Haven't been very carefully with checking dependencies cycles etc * Inline comments explaining where shortcuts have been taken * Need to test this on Windows machine and with Ledger (both models) Signed-off-by: niklasad1 <niklasadolfsson1@gmail.com>
* Remove thread joining in HardwareWalletManager * Remove thread handlers in HardwareWalletManager because this makes them unused
3317ba3 to
aaff934
Compare
|
FYI, I think the next step should be to remove HardwareWalletManager struct and replace it with a trait that all Wallets implement instead |
|
The |
* Hardware-wallet fix * More fine-grained initilization of callbacks by vendorID, productID and usb class * Each device manufacturer gets a seperate handle thread each * Replaced "dummy for loop" with a delay to wait for the device to boot-up properly * Haven't been very carefully with checking dependencies cycles etc * Inline comments explaining where shortcuts have been taken * Need to test this on Windows machine and with Ledger (both models) Signed-off-by: niklasad1 <niklasadolfsson1@gmail.com> * Validate product_id of detected ledger devices * closed_device => unlocked_device * address comments * add target in debug * Address feedback * Remove thread joining in HardwareWalletManager * Remove thread handlers in HardwareWalletManager because this makes them unused
|
@rphmeier Polling should be simple to introduce by something like: let start_time = Instant::now();
loop {
if let Ok(e) = self.update_devices() {
break;
}
if start_time.elapsed() >= Duration::from_millis(1000) {
// ooops failed with USB while trying for 1 second
debug!(".....");
break;
}
}Let me know this should be easy to fix |
* Hardware-wallet/usb-subscribe-refactor (#7860) * Hardware-wallet fix * More fine-grained initilization of callbacks by vendorID, productID and usb class * Each device manufacturer gets a seperate handle thread each * Replaced "dummy for loop" with a delay to wait for the device to boot-up properly * Haven't been very carefully with checking dependencies cycles etc * Inline comments explaining where shortcuts have been taken * Need to test this on Windows machine and with Ledger (both models) Signed-off-by: niklasad1 <niklasadolfsson1@gmail.com> * Validate product_id of detected ledger devices * closed_device => unlocked_device * address comments * add target in debug * Address feedback * Remove thread joining in HardwareWalletManager * Remove thread handlers in HardwareWalletManager because this makes them unused * fixed broken logs (#7934) * fixed broken logs * bring back old lock order * removed bloom groups from blockchain * revert unrelated changes * simplify blockchain_block_blooms * Bump WS (#7952) * Calculate proper keccak256/sha3 using parity. (#7953) * Increase max download limit to 128MB (#7965) * fetch: increase max download limit to 64MB * parity: increase download size limit for updater service * Detect too large packets in snapshot sync. (#7977) * fix traces, removed bloomchain crate, closes #7228, closes #7167 (#7979) * Remvoe generator.rs * Make block generator easier to use (#7888) * Make block generator easier to use * applied review suggestions * rename BlockMetadata -> BlockOptions * removed redundant uses of blockchain generator and genereator.next().unwrap() calls
Hi,
This PR changes how events from
libusb/Hotplugare subscribed in more fine-grained manner i.e, by manufacturer_id, product_id and USB class.By doing this it fixes #7516 & #7793 it seems to be Windows only issue because we tested it earlier on both Linux/Ubuntu and OSX