-
Notifications
You must be signed in to change notification settings - Fork 57
Conversation
@philipcmonk I've retargeted this at |
They can produce both effects and values. In this case, for example, we produce an updated dome (desk state) and rang (global object store). The code that evaluates the monad (
Wireless computing is the future. Well, at least when you're inside the monad (the monad interpreter uses wires, of course). Although, I think I might add a path field to the notes we can produce, which will be appended to the wire for convenient |
@jtobin whoops, I'll merge from there. |
a425322
to
c9c371d
Compare
c9c371d
to
8cd2614
Compare
"Produces an effect which is evaluated by clay itself" is what I was gesturing at wrt indirect state changes, nod
My concern is not upgradable state, but upgradable code: it used to be possible to add a printf in the middle of a transaction(or expand a step's input type according to an interface change), now it is not. I agree that the |
Adding printfs without restarting a transaction is an interesting use I hadn't thought of, but in general upgrading code isn't always possible in the middle of a transaction, and our usual approach has been "ignore any ships in the middle of a transaction and hope they don't sink", which is worse. My guess is that in practice restarting the transaction on upgrade will be fine for debugging. |
I think I introduced the unix sync bug when I redid the mime cache a few months ago. Sorry! I didn't realize it broke syncing to Unix. The reason I messed with the mime cache was to reduce restart time. When you restart Urbit (at least on master), I believe it rebuilds every mounted file in the filesystem. This meant restarting a ship could take in the tens of seconds, if I recall correctly. I feel lile there should be a way to unbork unix syncs without losing the caching. |
It should be noted that this PR is now also a proposal to merge |
Ah, you're correct. It looks like sync portion of restarts take about 5-20 seconds without the mime cache, and about 2 seconds with it. I re-added it, updating it on every As an aside, I dislike how ford's build times seem to be extremely variable. Maybe it's missing caches for spurious reasons? |
Is |
|
There seems to be a fairly frequent oscillation between master being
stable+deployed and having a "devel" branch, and deploys being tied to
stable releases with master containing whatever code is desired for deploy
sometime soon. Is there a collection anywhere of arguments for the
respective development models?
…On Wednesday, 15 May 2019, Jared Tobin ***@***.***> wrote:
It should be noted that this PR is now also a proposal to merge next into
master, as this work was done on top of next.
Is next deprecated? Should everything branch off of and PR to master?
next is officially deprecated. I'll actually go ahead and delete the
branch soon enough.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#1169?email_source=notifications&email_token=AAOFPBU3BB7IN3TUNQR6KHLPVSUJRA5CNFSM4HMW3DB2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODVQKLFY#issuecomment-492873111>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAOFPBRKKH4VPTUPBUJR43LPVSUJRANCNFSM4HMW3DBQ>
.
|
Some of Clay's Ford builds have around 1500 sub-builds, which has a tendency to flush the Ford cache. This is a common problem in caching, and systems that can handle this sort of thing well are called scan-resistant. Ford's cache replacement is a basic clock algorithm and is not scan-resistant. |
Not to derail the conversation around the actual contribution here too much, but the idea is to move to the latter scheme you've mentioned -- i.e. to scrap the non-master branches that have semantic operational meaning ascribed to them (so: (Hotfixes, which are really just "unscheduled releases," can be accomplished by applying commits to the relevant release tag in a distinct branch or what have you.) This generally just feels simpler than the "dev branch" model, which has felt superfluous to me in all projects I've worked on that use it, but it arguably requires a technical release manager of sorts to make sure things go smoothly. I've volunteered for that job, so hopefully things go smoothly.. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Sorry for the delayed review. FWIW, I have no objections to merging next
into master
now. The only issue I can see is that %crud
error printing in %dill now assumes the tweaked event-replacement log from cc-release, and that's a minor inconvenience at most.
A tangent on restart and caching: it should be generally knowledge that all the vere caches are completely cleared on restart. This is done out of an abundance of caution after the livenet memory issues from earlier this year.
These isolated clay transactions seem like a big improvement. From a style standpoint, it feels like more of the data transformation could be factored out of the bind invocations, which would bring them closer together and make the I/O and state transitions even clearer. And it would be nice to bundle up +clad
with its interpreter(s) (or to factor out some kind of generic interpreter). But these are just nits ... Nice work!
Addressed to some extent in #1174. |
This rather a large PR; I'm happy to walk reviewers through it in person if that's easier. I think most people have some context for these changes.
A few changes here, most of which are due to a single issue. There was a 1-line fix or a 2000-line fix. I chose the latter, of course. The most important changes are these:
%drip
(only the changes in sys/vane/behn.hoon)
While debugging an indivdual breach bug, it turned out the issue was that clay was getting the sunk notification before ames, so clay's subscribers were trying to restart syncs before ames cleared its state -- which clears the attempted restarts. One solution would be to reverse the order of the sunk notifications (the 1-line fix), but the real issue is that when clay gives a subscription update, it shouldn't do that in the same event.
This is a textbook case for what we've informally called "toss". In reality, there are two related things we've called "toss". I've named one of them "drip", and that's the one I've implemented. "toss" means to "pass" in a later event (often the next event, but not necessarily if other stuff is in the queue). "drip" means to "give" in a later event.
Eventually, these should be implemented in arvo alongside give and pass. Then, in places where you want to give in the next event, you change:
For
For now, I've implemented it in behn, with a "slip" calling convention, so it becomes:
Note that the sign will come into the receiving vane from behn now. If all instances of that gift will be dripped through behn, the the receiving vane can switch the associated sign from the sending vane to behn. If not all will be dripped, then it must include a sign entry for both vanes. This wrinkle will go away when drip is implemented in arvo.
Clay drip
All subscription updates from clay (
%writ
and%wris
gifts) are now dripped through behn. Thus, a commit can't fail because a subscriber crashed. Additionally, this make it much more likely that a commit or a merge will take more than one arvo event (since both involve several round-trips to ford, involving drips). It has always been the case that a commit could take more than one event, but it was pretty rare so some code assumed it didn't.Relevant to individual breaches, this makes it so that any clay changes resulting from a sunk notification trigger kiln in the next event, so Ames has already had time to handle the sunk notification.
Transactions in clay
Several parts of clay assumed that a commit would take exactly one arvo event. Since that's no longer the case, we needed a way to keep an in-progress clay commit and maintain a queue of pending commits or merges which will be started after the current one finishes. In other words, we needed "macrotransactions" -- transactions which take more than one arvo event.
We use the "clad" monad, which is very similar to the "ph" monad, to provide this transaction layer. All effects are stored until the transaction is completed, when they are all emitted at once. In case of upgrade, if you don't write any special upgrade logic in
+load
, the current write transaction will continue un-upgraded until it finishes (so a copy of the old kernel will stay until then). If you changed enough that you prefer to kill and restart the current transaction, you may do so with confidence that you won't leave it in an inconsistent state.Overall, I believe this makes clay drastically less fragile. In particular, the data flow in the code is much more readable, and I'm no longer afraid that small changes will introduces subtle bugs in far-away sections of the code. Also, I was able to deleted hundreds of lines of comments describing the invariants required in the code because they were no longer necessary.
This is the vast majority of the changes in clay.hoon.
Individual breaches for clay
This implements the "forget they ever existed" interpretation of breaches for clay. In particular, we cancel all clay subscriptions to that ship, wipe our clay memory of it, and clear the ford cache entirely. This code is in clay.hoon in
+handle-task
under case%sunk
(you can just grep for "sunk"). There are some related changes inlib/hood/kiln.hoon
to properly restart syncs when the other ship sinks.The main thing I've tested is that when a parent breaches, the child stays synced to their new instance.
Limitations:
This doesn't fix the ames/jael bootstrapping issues. In practice, this means that to breach a parent, you must either (1) make sure the old parent keeps running until it pushes the jael breach notification to all its children or (2) have all its children switch to another authority for jael, eg with
|eth-manage %look
. The first is not possible if you destroyed your ship too badly, and second isn't 100-year-compatible. This will be my next task.