Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ag liveplot #315

Merged
merged 13 commits into from
Sep 30, 2016
Merged

Ag liveplot #315

merged 13 commits into from
Sep 30, 2016

Conversation

alan-geller
Copy link

Changes to Loop/ActiveLoop and DataSet to make live plotting work for foreground loops with local data sets.

@giulioungaretti
Copy link
Contributor

@alan-geller looks nice, smoother than before too!

One thing is that adding another plot seems broken.
That is having a figure with two live plots (done before with plot.add(foo), plot.add(bar) ).
Maybe I am missing something trivial but using the same API as before does not produce the same result.

Note that the mock model things are still using multiple processes, which I think we should avoid.

@giulioungaretti
Copy link
Contributor

@alan-geller I just noticed that this PR breaks all the tests :D Do you have acces to travis-ci ?

@alan-geller
Copy link
Author

@giulioungaretti No, I don't have access to travis-ci, as far as I know.

I'll see if I can figure out why adding a second plot doesn't work. Is the broken scenario two separate plots of the same DataSet, or a single plot with two different values getting graphed (i.e., two lines drawn)?

@alan-geller alan-geller self-assigned this Aug 29, 2016
@alan-geller
Copy link
Author

@giulioungaretti OK, I think I have access to travis-ci now. I'll see what broke all the tests.

@giulioungaretti
Copy link
Contributor

@alan-geller it seems like the plot.add() overwrites the data one had already added (?).
And it's the first, two separate plots of the same data_set.

Awesome if you have access else, let me know and then we can figure out a way to let everybody have access.

@alan-geller
Copy link
Author

@giulioungaretti I found and fixed one problem: if you use the same loop twice, it was using the same data set for the second time. This is bad if you have different parameters or labels. I've updated my pull request to fix this, and it now passes all of the tests -- locally, at least.

@alan-geller
Copy link
Author

@giulioungaretti The one failing test case is test_multiprocessing.TestQcodesProcess.test_qcodes_process, which succeeds consistently when I run it locally.
Any suggestions? I'm going to start walking through the test case to see if I can see anything.

@alan-geller
Copy link
Author

@giulioungaretti Looking at the failing test -- test_qcodes_process -- and knowing that it passes consistently on Windows and fails sporadically on Linux, I have a thought. In some versions of Linux, the minimum sleep() duration is one second. Sleep(0) is essentially a thread yield, not a sleep for any amount of time. (There is often, although not always, a nap() system call that provides a sub-second sleep). Depending on how the Python standard library implements time.sleep(), the time.sleep(0.01) on line 197 of test_multiprocessing.py might not actually cause any delay on Linux.

I have no idea how iOS treats sleep(), although since it's of Unix descent it may act similarly to Linux.

Changing the sleeps in test_multiprocessing.py to always sleep at least one second might make the test case pass more frequently.

@peendebak
Copy link
Contributor

@alan-geller @giulioungaretti The PR does not work outside of ipython notebook. To make it work outside I need to do something like:

    import pyqtgraph as pg
    app=pg.mkQApp()
    def fun():
        liveplotwindow.update()
        app.processEvents()
    data = loop.with_bg_task(fun, 0.001).run(location=location, data_manager=data_manager, background=background)

This has to do with the Qt event loop and probably not something to be fixed in this PR.

I need Qcodes to be useable without ipython notebook, so if it is possible please also test without the notebook and make the examples (e.g. Tutorial.ipynb available as plain python scripts). Also see #290, #278, #316.

@peendebak
Copy link
Contributor

The loop.get_data_set only works after a loop.each(...). The defaults measurement with station.set_measurement(...) does not work. I am not sure this is really an issue.

@peendebak
Copy link
Contributor

Besides the comments above the PR looks good, works fine with my code!

@peendebak
Copy link
Contributor

@alexcjohnson @giulioungaretti @alan-geller

One more thing. When I set the min_delay to 0.1 seconds (which seems like a good update rate), then the loop in the Tuturial.ipynb takes about 10 seconds on my system. Without the live plotting the loop takes 5 seconds. This means the live plotting has a serious influence on the measurement speed.

@alan-geller
Copy link
Author

@peendebak That get_data_set is only available after the .each is by design; until the measurements are set, you can't create the data set. I suppose it would be possible to work around this somehow, but I'm not sure how, since the ActiveLoop instance (which is where the data set lives) doesn't get created until the .each method is called. Loop.run and .run_temp are special-case methods that create and immediately use the ActiveLoop; there's not really a good place to hook in the data set fetch.

I'll work on creating a command-line way I can test this.

And yes, since it updates the plot every tenth of a second (with min_delay=0.1), it will have a significant impact on measurement speed. I'd suggest setting min_delay somewhat longer, say 1 second; this will result in jumpier plotting, but less impact on measurement perf.

@alan-geller
Copy link
Author

I'm not sure what my next step is here. The one test case that fails is unrelated to this check-in and has a comment that it sometimes fails randomly. Is there a way to resubmit the PR for another test pass, without doing another commit?

@alexcjohnson
Copy link
Contributor

Oh weird... when I looked at this PR yesterday it said there were conflicts with the base branch (and I was planning to play with it after you fixed those) but now there are none. Perhaps the conflicts were with the default parameter PR that was just reverted?

Anyway, I will try it out later tonight.

# TODO: compare timestamps to know if we need to read?
# Compare timestamps to avoid overwriting unsaved data
if self.last_store > self.last_write:
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! But this is a little different from the situation I was imagining where you'd hit this block (calling sync with a DataSet in LOCAL mode) which is if you started a measurement in one notebook/process, and wanted to watch it live from a totally independent process, so the only way to receive new data is to read it from disk.

What I was referring to with the TODO was comparing the file modification timestamp with the last time we read the file from disk. But this makes it clear that more generally we should be comparing the mod time with the last time the disk and memory copies were synced - ie instead of last_write we could have last_io_sync? With such a test and local data taking we would (correctly) never read the file back in due to a sync call.

Then of course we get into trouble if self.last_store > self.last_io_sync AND file_mod_time > self.last_io_sync... merge conflict! Two agents are writing to this file at the same time! At least to start I would think in that situation we do nothing, just emit a warning and the problem gets buried at the next write.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to distinguish between "LOCAL-IN-MEMORY" mode and "LOCAL_READ_FROM_DISK" mode (not the best names) for this: if I'm the one writing the data, then you should never read, but if you're the one reading, then you need to check the file update time.
This feels like a bigger change than should be in this PR, though.
I can either:

  1. Leave the code as I have it now, but add another TODO comment.
  2. Leave the code as I have it now, but add an issue.
  3. Add a new mode as part of this PR.

I think I'd vote for option 2, since I think in the full multi-processing, adaptive sweep future there might be more than one additional mode, and so thinking this through carefully is a worthwhile exercise. On the other hand, if you've already thought this through, then maybe options 1 or 3 make more sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alan-geller what do you mean by multi-processing, adaptive sweep future ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about suggesting splitting modes like that... but in the end I decided it would be better to just let the file speak for itself, not make assumptions about who else might be writing to the file.

Sure, lets make an issue for it - digging in a little more, it seems to me it will do the right thing right now (never trying to read the file back in if all actions were local) as long as nobody else edits the file mid-stream.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@giulioungaretti I mean after v1.
@alexcjohnson OK, I'll open an issue to track this, for post-v1.

@@ -595,6 +640,45 @@ def _check_signal(self):
else:
raise ValueError('unknown signal', signal_)

def get_data_set(self, data_manager=None, *args, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! We should check that if there already is a DataSet when we call this, that the mode (or data_manager value) matches what's already there. Otherwise run(data_manager=...) could silently fail to deliver the requested mode.

Also, there will be a conflict with #319 where @giulioungaretti changed data_manager to a simple boolean... just a heads up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps (if we merge #319 first) allow data_manager=None to propagate through without checking against the existing DataSet, or inserting the default (which will at that point be False) if the DataSet doesn't exist yet... That would avoid making you specify it twice in case you do want a non-default value, but would still alert you if you provide inconsistent settings (most likely this would happen if you use get_data_set to prepare the plot without specifying data_manager, then specify the non-default value in run thinking that this is where it should go)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd merge this first, as it adds a feature then fix the conflicts :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, so ignore this comment for now and I'll look for it to get cleaned up in #319 .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, ignored :-).

@@ -657,8 +741,7 @@ def run(self, background=True, use_threads=True, quiet=False,
else:
data_mode = DataMode.PUSH_TO_SERVER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block can go away now that it's in get_data_set, we have no other use of data_mode.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. This was in one of the commits that didn't make it through...

@@ -4,6 +4,7 @@
from IPython.display import display

from qcodes.widgets.widgets import HiddenUpdateWidget
from qcodes.utils.helpers import tprint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you use this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using it for debugging -- it was removed in one of the mysterious vanishing commits, which are now unvanished.

t = time.time()
if t - last_task >= self.bg_min_delay:
self.bg_task()
last_task = t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also run bg_task after the loop is complete - I'd say right before the then_actions section below.

I noticed this by increasing the delay, then I don't see the final points on the plot!

On my laptop, running the tutorial notebook as you have it with delay set to less than the per-point delay (so I guess it will always update between points) the loop takes 50 seconds, up from 6 without the live plotting! If I bump it to 1 sec the time penalty is minimal. So I'd change the example delay to maybe 0.5 seconds. On strong PCs a faster default may be fine, but that's about the best my laptop can handle, at least with matplotlib.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcjohnson The ~0.5 seconds update is the main reason I don't use matplotlib for live plotting or anything interactive in the notebook. My experience is that even when it runs in a separate process mpl is very slow and not built for interactivity.

To me those are the main arguments for using a fast plotting library built with interactivity in mind for this purpose such as pyqtgraph or plotly.

But bumping up the default delay makes sense IMO.

Copy link
Contributor

@giulioungaretti giulioungaretti Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AdriaanRol @alexcjohnson is it matplotlib or the widget stuff that kind of slows down ?
Note that we'll have one core mpl developer in house, so things will change.

For now supported things are only mpl and QT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@giulioungaretti , I have not tested this specific example but all my experience with MPL indicates that that is the slow component. I'm a big fan of pyqgraph/QT so no big issues for me here :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep - if I switch to QtPlot in the tutorial nb, it works fine and I can use 0.1 sec without too much overhead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing to consider: we probably don't want background tasks to be able to break the loop. In DataSet.complete which implements similar functionality but just for background loops, we catch any errors and if the same function errors twice in a row we stop trying to execute it. Would it be worth including that here too?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a task invocation after the loop finishes, and a catch so that the task can't break the loop, with the two-in-a-row check @alexcjohnson suggested.

Copy link
Contributor

@alexcjohnson alexcjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alan-geller looks good. Just a few things to change/consider before we go with it.

This is my first go with github's new review system. can't say I'm super excited about it, mainly because posting review comments one-by-one is really useful, it allows all those conversations to start while you're still doing the rest of the review - if anyone else is awake :) And already encountered a bug, where it posted my reply to my own comment before posting the comment itself! The more formalized approval could be good though, rather than our dancer convention (which I guess every project now does differently...)

@alan-geller
Copy link
Author

Looking at this, it seems that my last two commits (from back in August) got lost somehow. One of them was trivial -- it's removing the line in base.py that Alex pointed out isn't needed -- but the other fixed some bugs related to multiple plots from the same loop, and they need to be in there.
Rather foolishly in retrospect, I synched my branch with GitHub just a few minutes ago, which I thought would push my two commits up to GitHub, but which instead reverted them back to the last commit on GitHub.
Any suggestions, other than trying to recreate the bugs and so refixing them?

@alexcjohnson
Copy link
Contributor

@alan-geller that's annoying... did you just do a git pull or something else? I've never seen it delete commits that you made, so perhaps if it made a revert commit or something, you can still see the commits you want in git reflog and then maybe cherry-pick them back in. If they somehow got dropped instead, they should still be in git fsck --lost-found but for some reason on my machine that has 91 objects in it and only shows their hashes, not commit messages (nor are they in any particular order)

@giulioungaretti
Copy link
Contributor

@alexcjohnson from my side commenting behaves the same as before with some more shiny shiny roundy design frills. Git usually does not erase things, @alan-geller are you using the desktop client or CLI ?

@alan-geller
Copy link
Author

@giulioungaretti I generally use the desktop client.
@alexcjohnson When I do a git reflog I see all 4 commits:
997dc15 HEAD@{0}: rebase finished: returning to refs/heads/ag-liveplot
997dc15 HEAD@{1}: pull --progress --rebase --prune origin ag-liveplot: checkout 997dc158531b812973c479ae19fe1802a93e5275c858a9f
HEAD@{2}: commit: Final clean-up before resubmitting
5fbf934 HEAD@{3}: commit: Fixed a bug with re-using the same AciveLoop instance
997dc15 HEAD@{4}: commit: Live plot works with local data sets
9647808 HEAD@{5}: commit: Snapshot local commit

When I do a git log, though, I don't see the last two commits on this branch.

I assume the problem is the pull/rebase, which I assume the desktop client generated when I clicked on the "sync" button. That doesn't explain how the two commits got lost on GitHub, though; I wonder/assume this is connected to the dropped commits @giulioungaretti saw earlier this week.

I'll try to figure out how to recover them.

Alan

@giulioungaretti
Copy link
Contributor

@alan-geller if you click on the links you get the commits, I have no clue what happened tbh :D
Could be that somebody messed up with your branch :D

@alan-geller
Copy link
Author

@giulioungaretti : @alexcjohnson walked me through fixing this, and now all 4 commits show on-line. Now working on the merge conflicts.

@@ -710,7 +788,9 @@ def run(self, background=True, use_threads=True, quiet=False,
if not quiet:
print(repr(self.data_set))
print(datetime.now().strftime('started at %Y-%m-%d %H:%M:%S'))
return self.data_set
ds = self.data_set
self.data_set = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch :)

Which makes me wonder, do we want to do the same with bg_task? I'm not sure how to handle this. On the one hand, if you're reusing the same plot, it's convenient to not have to attach plot.update again. On the other hand, if you're using a different plot or something, currently you wouldn't be able to use the same with_bg_task to change or remove it, you'd have to just set the bg_task attribute.


# run the background task one last time to catch the last setpoint(s)
if self.bg_task is not None:
self.bg_task()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh dear, hard tabs... 😦

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, notepad++ instead of VS... I assumed that in Python mode it would switch to spaces by default, but it doesn't.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, just pushed a fix for this.

@giulioungaretti giulioungaretti mentioned this pull request Sep 16, 2016
2 tasks
@alexcjohnson
Copy link
Contributor

@alan-geller great! The only one left is how bg_task should behave when re-running the ActiveLoop (https://github.com/qdev-dk/Qcodes/pull/315/files#r79203169) - which is enough of an edge case that I'd say lets just let people use this and see what makes most sense, rather than trying to address it now. 💃

try:
self.bg_task()
last_task_failed = False
except:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to specify which exception here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notion is to catch all possible exceptions and prevent them from interrupting the measurement loop.
Should I add a comment to that effect?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I would change except: to except Exception:, this makes it works with everything that inherits from Exception but not things like a keyboard interrupt. Also I'd add a comment to that, I'd agree that you don't want your plotting to crash your measurement but I'm also not a big fan of except all statements.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, this allows to abort a measurement while plotting.

@giulioungaretti
Copy link
Contributor

@alan-geller is it me or the 2d plots are not live with this patch ?

In the example notebook, the part named " Example: multiple 2D measurements " blocks untill done, then it's displayed. Does it work on your end ?

@alan-geller
Copy link
Author

@giulioungaretti I only wired up the first plot in the example notebook, so none of the others are live "by design".

@alan-geller
Copy link
Author

I've added a 2d live plot to the Tutorial notebook. It works fine for me locally.

@alan-geller
Copy link
Author

It looks like the Travis failure is in an unrelated multiprocessing test.

@giulioungaretti
Copy link
Contributor

@alan-geller yes, it's something neither me or @alexcjohnson figured out, likely a timing issue with the old multiprocessing, can sometime happen locally too.

@alan-geller
Copy link
Author

Giulio, you have to do the merge, I'm afraid; I'm not authorized (which is fine).

@giulioungaretti
Copy link
Contributor

All seems fine, will just polish the notebook a bit and then merge !

@giulioungaretti
Copy link
Contributor

And in the mean time make sure the default to no multiprocessing merge goes smooth.

@giulioungaretti giulioungaretti merged commit a476f61 into master Sep 30, 2016
giulioungaretti pushed a commit that referenced this pull request Sep 30, 2016
Merge: 35515fa d9a02ad
Author: Giulio Ungaretti <[email protected]>

    Merge pull request #315 from qdev-dk/ag-liveplot
@giulioungaretti giulioungaretti deleted the ag-liveplot branch October 31, 2016 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants