Skip to content

Conversation

@nakabonne
Copy link
Member

What this PR does / why we need it:
This PR adds a new apistore for Events according to @nghialv 's suggestion. It allows us to reduce communication number by caching the latest events.

This will be used by #1411 at first.

Which issue(s) this PR fixes:

Ref #1381

Does this PR introduce a user-facing change?:

NONE

@pipecd-bot
Copy link
Collaborator

TODO

The following ISSUES will be created once got merged. If you want me to skip creating the issue, you can use /todo skip command.

Details

1. Use UTC and let control-plane convert it into the control-plane's Local time

https://github.com/pipe-cd/pipe/blob/ab31e9a63e0bab6591e8c572bc06f982dc34ea32/pkg/app/piped/apistore/eventstore/store.go#L104-L107

This was created by todo plugin since "TODO:" was found in ab31e9a when #1429 was merged. cc: @nakabonne.

Comment on lines 104 to 105
// TODO: Use UTC and let control-plane convert it into the control-plane's Local time
// Unexpected behavior can be happened if Piped uses different timezone from the control-plane.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look into this concern when working on another PR.

Copy link
Member

@nghialv nghialv Jan 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the timezone difference is not a problem, because we are using Unix time (the number of seconds elapsed since January 1, 1970 UTC.).
But the clock drift may cause the problem. But that is rare.

Btw, maybe we can leave the to parameter as zero to use time.Now of the server.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the timezone difference is not a problem, because we are using Unix time (the number of seconds elapsed since January 1, 1970 UTC.).

I put off thinking about it but after thinking, it's a basic thing and no problem as you mentiond.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, maybe we can leave the to parameter as zero to use time.Now of the server.

Cool, I'll try it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, maybe we can leave the to parameter as zero to use time.Now of the server.

After thinking, it makes derivating the next's from tougher. As you mentioned, clock drifts is reraly happened. I guess it's enough to be as-is.

Copy link
Member Author

@nakabonne nakabonne Jan 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not tougher, I came up with a better solution just now. I'm going to make From and To optional.

@pipecd-bot
Copy link
Collaborator

COVERAGE

Code coverage for golang is 33.89%. This pull request decreases coverage by -0.07%.

File Function Base Head Diff
pkg/app/piped/apistore/eventstore/store.go NewStore -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Run -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.sync -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Getter -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.GetLatest -- 0.00% +0.00%
pkg/model/event.go EventDefinitionID -- 100.00% +100.00%
pkg/app/api/grpcapi/piped_api.go PipedAPI.ListEvents 0.00% 0.00% +0.00%

Comment on lines 23 to 25
// EventDefinitionID builds a unique identifier based on the given name and labels.
// It returns the exact same string as long as both are the same.
func EventDefinitionID(name string, labels map[string]string) string {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about what we should call something whose uniqueness is consists of its Name and Labels. I just settled on calling it Event Definition.

An Event is like an Instance In Object-oriented world, and an event definition is like a Class. Each Event has its own ID and at the same time, each Event definition has its own ID. The event definition ID must be the exact same string as long as Name and Labels are the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just figured out it would be nice if we build this when saving and keep it as a field of Event.


// EventDefinitionID builds a unique identifier based on the given name and labels.
// It returns the exact same string as long as both are the same.
func EventDefinitionID(name string, labels map[string]string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be used only by EventStore right. So I think it should be moved to that package as an internal function.
So because of an internal function, we will not have to care about its name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it will be used by #1406. It could be increased the need to build a unique identity made of Name and Labels, that's why I decided to make a helper function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Now we are having our cache/store for the event. So maybe we don't need to merge them for simplicity.

The user can merge by changing their configuration if needed.
We implicitly merge them and combine their changes into a single commit that may cause the behavior they were not expected.
(I am not sure that I understood the merging correctly.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I am not sure that I understood the merging correctly.)

Your understanding is correct. I attempted to merge it to combine their changes into a single commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I agree with all of them. So let's this function private and think it if needed in the future!

return nil, false
}

// NOTE: Don't update the cache to prevent it from being overwritten by slightly older ones.
Copy link
Member

@nghialv nghialv Jan 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't update the in-memory map?
The event has CreatedAt field, so we can use that to check whether the event should be updated or not.

And I think in the microservices list of our users, there are always some applications that have not been actively developed compared to others. Their events will be very sparse. So without saving into the list, piped will always make the call to control-plane to get the same one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sync function periodically updates the cache, and it works in another goroutine. What I'm most worried about is, while processing GetLatest, sync can update the cache. I mean:

	// NOTE: Don't update the cache to prevent it from being overwritten by slightly older ones.
	eventMap = s.latestEventMap.Load().(map[string]*model.Event)
	eventMap[id] = resp.Event        // I'm most worried about the "sync()" function updates the cache at this point.
	s.latestEventMap.Store(eventMap) // Then the latest cache is overwritten by the old one here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Piped was run on the multi-core processor, I think it can happen, but let me know if I'm wrong.

Copy link
Member

@nghialv nghialv Jan 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data race can be resolved in some ways:

  • change from using atomic to using a mutex. (fetch the list from control-plane, lock, and update). I think it is fine enough for our case: 1 call/m and EventWatcher.
  • use 2 separate maps, one for LIST API, one for GET API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, not fancy 😄 The first one looks enough! Only two goroutines can access this and it rarely happens to wait for unlocking. Thank you for telling me!

NONE = 0;
FROM_OLDEST = 1;
FROM_NEWEST = 2;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about ListOrder enum that contains NONE, DESC, ASC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, that's more versatile. I initially thought about making it but I wasn't sure which should be general, CreatedAt or UpdatedAt, that's why I made a special enum for now. but what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to make ListOrder. It can be used by other RPCs too.
Let's do it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, I'll do so!

func (s *store) GetLatest(ctx context.Context, name string, labels map[string]string) (*model.Event, bool) {
eventMap := s.latestEventMap.Load().(map[string]*model.Event)
id := model.EventDefinitionID(name, labels)
event, ok := eventMap[id]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I want to ask about our event matching, in case we have an event

name = foo
labels: { label-1: value-1, label-2: value-2}

That event will match the following case or not?

piVersion: pipecd.dev/v1beta1
kind: EventWatcher
spec:
  events:
    - name: foo
      labels:
           label-1: value-1
      replacements:
        - file: dev/app1/deployment.yaml
          yamlField: $.spec.template.spec.containers[0].image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or all labels must be specified to be matched.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally assume exact matches are needed, but on second thought, seems like that case should be treated as a matching case. Let me think, just a moment please...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nghialv I'm still debating but which one do you prefer?

Copy link
Member

@nghialv nghialv Jan 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, the event should be matched when the name is equal AND the event's labels contain the specified ones.
(Because they are labels (not ID) and users will have more options.)

So I think we can change the map to a new data structure like map[event-name]ListOfEventsHaveSameName.

But one thing we have to consider is what happens when multiple events match the specified label list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But one thing we have to consider is what happens when multiple events match the specified label list.

So in order to avoid this complexity and avoid adding the dynamic label by users (e.g. version number),
I think we should not allow subset matching.

Name and all labels must be matched.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My feeling is exactly the same as yours. Some people can be helped by subset matching, but too complicated to handle for us. Thank you for a great opportunity to re-think it and letting me know your cool thought!

pipecd-bot pushed a commit that referenced this pull request Jan 14, 2021
**What this PR does / why we need it**:
Just removed a TODO comment. 

We just settled that we don't have to merge events when committing, on this discussion: #1429 (comment)

**Which issue(s) this PR fixes**:

Fixes #1406

**Does this PR introduce a user-facing change?**:
<!--
If no, just write "NONE" in the release-note block below.
-->
```release-note
NONE
```

This PR was merged by Kapetanios.
@nakabonne
Copy link
Member Author

/hold

@pipecd-bot
Copy link
Collaborator

COVERAGE

Code coverage for golang is 33.87%. This pull request decreases coverage by -0.09%.

File Function Base Head Diff
pkg/app/piped/apistore/eventstore/store.go NewStore -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Run -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.sync -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Getter -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.GetLatest -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go eventDefinitionID -- 100.00% +100.00%
pkg/app/api/grpcapi/piped_api.go PipedAPI.ListEvents 0.00% 0.00% +0.00%

@nakabonne
Copy link
Member Author

/hold cancel

@pipecd-bot
Copy link
Collaborator

COVERAGE

Code coverage for golang is 33.87%. This pull request decreases coverage by -0.09%.

File Function Base Head Diff
pkg/app/piped/apistore/eventstore/store.go NewStore -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Run -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.sync -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Getter -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.GetLatest -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go eventDefinitionID -- 100.00% +100.00%
pkg/app/api/grpcapi/piped_api.go PipedAPI.ListEvents 0.00% 0.00% +0.00%

// Make the cache up-to-date by traversing events sorted by oldest first.
var latestTime int64
s.mu.Lock()
for _, event := range resp.Events {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resp.Events may contain multiple events for the same key/definition-id, so I think we should build their key/definition-id and filter duplicates before locking the mutex to update the list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I initially attempted to do so, but I didn't. That's necessary to keep the time to lock mutex as short as possible. But don't we really have to care about that performance? I guess we should prioritize to keep simplicity than performance at this time, that's why I settled on just sorting by oldest first and traversing all events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But don't we really have to care about that performance?

Actually, I didn't want to say that we don't really care about the performance.
What I wanted to say before is that the complexity of the concurrency model is not worth the performance it brings.

But calculating the ID and filtering the duplicates before locking can be done easily. So it's worth it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're quite right. Adding filtering is not complicated enough to worry about complexity.

s.mu.Lock()
for _, event := range resp.Events {
id := eventDefinitionID(event.Name, event.Labels)
s.latestEventMap[id] = event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are doing asynchronous operations so this event can be older than the one stored in the map.
Let's check the CreatedAt before overriding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the beginning of the loop, it is more likely to happen as you mentioned. But eventually, all Events on the cache will be up-to-date because the list of events is sorted by oldest first. I'd say it's enough but what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I could get what you're most worried about. Okay, let me fix them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The list is updated by GetLatest too.

}

s.mu.Lock()
s.latestEventMap[id] = resp.Event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, check CreatedAt too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this case, I totally agree with you 👍

@nakabonne
Copy link
Member Author

Thank you for your polite review. I've fixed all of them.

@pipecd-bot
Copy link
Collaborator

COVERAGE

Code coverage for golang is 33.83%. This pull request decreases coverage by -0.13%.

File Function Base Head Diff
pkg/app/piped/apistore/eventstore/store.go NewStore -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Run -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.sync -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Getter -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.GetLatest -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go makeEventKey -- 100.00% +100.00%
pkg/app/api/grpcapi/piped_api.go PipedAPI.ListEvents 0.00% 0.00% +0.00%

if err != nil {
return fmt.Errorf("failed to list events: %w", err)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Return fast if no events.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the previous one could override the milestone to zero. I appreciate you.

}

// Eliminate events that have duplicated key.
var latestTime int64
Copy link
Member

@nghialv nghialv Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: latestTime := resp.Events[len(resp.Events)-1].CreatedAt

s.mu.Unlock()

// Set the latest one within the result as the next time's "from".
s.milestone = latestTime + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s.milestone := resp.Events[len(resp.Events)-1].CreatedAt + 1

@nghialv
Copy link
Member

nghialv commented Jan 15, 2021

Nice. Thank you.
Looks good.

@nakabonne
Copy link
Member Author

Fixed done.

@pipecd-bot
Copy link
Collaborator

COVERAGE

Code coverage for golang is 33.83%. This pull request decreases coverage by -0.13%.

File Function Base Head Diff
pkg/app/piped/apistore/eventstore/store.go NewStore -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Run -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.sync -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.Getter -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go store.GetLatest -- 0.00% +0.00%
pkg/app/piped/apistore/eventstore/store.go makeEventKey -- 100.00% +100.00%
pkg/app/api/grpcapi/piped_api.go PipedAPI.ListEvents 0.00% 0.00% +0.00%

@nghialv
Copy link
Member

nghialv commented Jan 15, 2021

Nice.
/lgtm

@khanhtc1202
Copy link
Member

🚀
/approve

@pipecd-bot
Copy link
Collaborator

APPROVE

This pull request is APPROVED by khanhtc1202.

Approvers can cancel the approval by writing /approve cancel in a comment. Any additional commits also will change this pull request to be not-approved.

@pipecd-bot pipecd-bot merged commit 4814c5d into master Jan 15, 2021
@pipecd-bot pipecd-bot deleted the eventstore branch January 15, 2021 05:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants