Implement list interface for filedb #3329
Conversation
pipecd-bot left a comment
```go
	data []interface{}
}

func (it *Iterator) Next(dst interface{}) error {
```
`dst` is unused in `Next`.
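For illustration, a minimal sketch of how `Next` could consume `dst`, assuming the iterator keeps decoded entities in `data` and tracks its position with a hypothetical `index` field (the sentinel error here is an assumption, not the real datastore API):

```go
package filedb

import (
	"errors"
	"reflect"
)

// ErrIteratorDone is a hypothetical sentinel for this sketch.
var ErrIteratorDone = errors.New("iterator: no more elements")

type Iterator struct {
	data  []interface{} // decoded entities, assumed to be pointers to models
	index int           // hypothetical cursor over data
}

// Next copies the current element into dst and advances the cursor.
// It assumes dst and the stored elements are pointers to the same model type.
func (it *Iterator) Next(dst interface{}) error {
	if it.index >= len(it.data) {
		return ErrIteratorDone
	}
	dv := reflect.ValueOf(dst)
	if dv.Kind() != reflect.Ptr || dv.IsNil() {
		return errors.New("dst must be a non-nil pointer")
	}
	dv.Elem().Set(reflect.ValueOf(it.data[it.index]).Elem())
	it.index++
	return nil
}
```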
```go
parts, err := f.backend.List(ctx, dpath)
if err != nil {
	f.logger.Error("failed to find entities",
		zap.String("kind", kind),
		zap.Error(err),
	)
	return nil, err
}

if objects == nil {
	objects = make(map[string][][]byte, len(parts))
}
for _, obj := range parts {
	id := filepath.Base(obj.Path)

	data, err := f.fetch(ctx, obj.Path)
```
Oops. I was thinking that the List function would also return the data of the file objects, so that we could have their contents without any extra requests.
This way, a lot of requests (number_of_shards * (1 + number_of_objects)) will be made in a short time, and I don't think that is realistic.
Do we have any better idea for that problem?
I feel you, the (n+1) problem, right. Tbh, I thought the same at first, but when I read the place where we already use this filestore List interface (ref: planpreview cleaner) I got the same surprise as you have now 😄
I looked into the list APIs of the filestores we support (GCS, S3 and MinIO), and it looks like getting an object by key is the only way to fetch its raw data; the list APIs only return the attributes and metadata, which we can then use to fetch the content. I will investigate more, but in the worst case I think we have 2 points to rely on:
- the number of objects in the hot storage of each kind is expected to be small enough; if necessary, we could add some kind of middleware that fetches the object parts in parallel (see the sketch below)
- cache storage in the API layer will reduce the number of times we have to list directly.
Wdyt?
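For illustration, a rough sketch of that parallel-fetch middleware using errgroup to cap in-flight requests so we don't burst the file store. The `FileDB` and `ObjectAttrs` shapes and the `fetch` signature follow the diff above but are assumptions:

```go
package filedb

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Assumed minimal shapes for this sketch, mirroring the diff above.
type ObjectAttrs struct{ Path string }

type FileDB struct {
	fetch func(ctx context.Context, path string) ([]byte, error)
}

// fetchParts fetches the content of each listed object part with at most
// maxInFlight concurrent requests against the file store.
func (f *FileDB) fetchParts(ctx context.Context, parts []ObjectAttrs, maxInFlight int) ([][]byte, error) {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(maxInFlight) // cap concurrency to stay under provider rate limits

	data := make([][]byte, len(parts))
	for i, obj := range parts {
		i, obj := i, obj // capture loop variables for the goroutine
		g.Go(func() error {
			d, err := f.fetch(ctx, obj.Path)
			if err != nil {
				return err
			}
			data[i] = d
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return data, nil
}
```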
Thanks for your explanation.

> the number of objects in the hot storage of each kind is expected to be small enough; if necessary, we could add some kind of middleware that fetches the object parts in parallel

I'm afraid this way is not appropriate, since we would make a burst of requests against external services and could hit their rate limits.

> cache storage in the API layer will reduce the number of times we have to list directly.

Yes, that could be a workable approach. I think it is time to think about that before continuing the implementation.
What cache solution do you have in mind?
@nghialv thank you so much for your comment 🙏 I added logic that checks whether the raw data of an object part has been updated, based on the etag value returned by the List objects request. For now, we only fetch the object part stored under the given path when no version of it is found in filedb.cache. PTAL when you have time 🙌
Thank you. Let me take a look.
/hold
/hold cancel
pkg/datastore/filedb/filedb.go (Outdated)
```go
cdata, err := f.cache.Get(obj.Etag)
if err == nil {
	objects[id] = append(objects[id], cdata.([]byte))
	continue
}
```
How about grouping and storing all models of a kind using HashCache (https://github.com/pipe-cd/pipecd/blob/master/pkg/cache/rediscache/hashcache.go)?
Then we can fetch them all at once to check, instead of calling the cache for every object like this.
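For illustration, a rough sketch of the grouping idea; the hash-cache method names below are assumptions for this sketch, not the actual rediscache API (see the linked hashcache.go for the real one):

```go
// hashCache is a hypothetical stand-in for the rediscache HashCache.
type hashCache interface {
	GetAll(key string) (map[string]string, error)
	Put(key, field, value string) error
}

// listCached returns every cached model of a kind in a single round trip,
// keyed by object ID, instead of issuing one cache Get per object.
func listCached(hc hashCache, kind string) (map[string][]byte, error) {
	fields, err := hc.GetAll("HASHKEY:" + kind) // one request for the whole kind
	if err != nil {
		return nil, err
	}
	out := make(map[string][]byte, len(fields))
	for id, v := range fields {
		out[id] = []byte(v)
	}
	return out, nil
}
```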
And I think we need to design the key name carefully instead of directly using the etag value, to avoid conflicts with other keys in Redis. In other places we prefix the name.
I feel you. In that case, it's possible that some objects are updated while all the others are not, meaning we still have to loop to check which ones are updated. There is also a minor point to care about: since I store the whole object content as the value for the etag key, grouping everything into a single value and using GetAll is a bit dangerous when there are a lot of objects. Of course, the downside of the current implementation is that we may send a bunch of requests to the cache just to check whether each object is updated based on its etag. Wdyt about this trade-off 🤔
> And I think we need to design the key name carefully instead of directly using the etag value, to avoid conflicts with other keys in Redis. In other places we prefix the name.

Nice catch, let me address it 🙆♂️ Tbh, I made the key id_shard_etag at first but felt it was overdone; let me add the etag_ prefix to this key, as in other places.
> it's possible that some objects are updated while all the others are not

Yes, it is. For the outdated ones, we call the cache HSET to update them after fetching directly from the file store.

> grouping everything into a single value and using GetAll is a bit dangerous when there are a lot of objects

I see. By storing with HashCache, I mean the field key is the object ID instead of the etag; the etag is included in the value together with the object content. That way the number of entries will not increase when an object is updated.
When storing by etag, the number of entries in the cache will grow quickly whenever an object is updated. In that case we would need a TTL or something similar to deal with that problem.
Wdyt?
How about this:
- a normal cache (not HashCache)
- key: `entity_id_shard`
- value: `{etag: etag_value, data: data}`

HashCache only helps us reduce the number of cache fetch requests, but the downside is that we need two loops to decide what should be updated, and the value under the hash key can become too large since we store the whole object content.
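For illustration, a minimal sketch of this scheme, assuming a plain key/value cache with `Get`/`Put` methods and JSON-encoded entries (the `cachedObject` layout and the `getOrFetch` helper are hypothetical):

```go
package filedb

import (
	"context"
	"encoding/json"
)

// Assumed minimal shape for this sketch.
type FileDB struct {
	cache interface {
		Get(key string) (interface{}, error)
		Put(key string, value interface{}) error
	}
	fetch func(ctx context.Context, path string) ([]byte, error)
}

// cachedObject is the value stored per entity: the etag observed when the
// entry was written, plus the raw object content.
type cachedObject struct {
	Etag string `json:"etag"`
	Data []byte `json:"data"`
}

// getOrFetch reuses the cached copy while its etag still matches the one
// returned by the file store's List call, and refreshes it otherwise.
func (f *FileDB) getOrFetch(ctx context.Context, key, etag, path string) ([]byte, error) {
	if raw, err := f.cache.Get(key); err == nil {
		if b, ok := raw.([]byte); ok {
			var co cachedObject
			if json.Unmarshal(b, &co) == nil && co.Etag == etag {
				return co.Data, nil // still fresh, no file store request needed
			}
		}
	}
	// Cache miss or outdated etag: fetch from the file store and refresh.
	data, err := f.fetch(ctx, path)
	if err != nil {
		return nil, err
	}
	if b, err := json.Marshal(cachedObject{Etag: etag, Data: data}); err == nil {
		f.cache.Put(key, b) // best effort; a failed cache write is not fatal
	}
	return data, nil
}
```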
What I am concerned about with that way is that it still requires N requests to our Redis.
But it is definitely simpler, so let's apply it for now and see how it goes. 👍
> What I am concerned about with that way is that it still requires N requests to our Redis.

I feel you, then why not both 🤔
I mean we can make our cache a bit more complicated to handle this case. We would have:

a HashCache as you drafted, with:
- hash key: `List_{kind}`
- field key: `entityId_shard`
- field value: `etagValue`

a cache to store the object data, with:
- key: `etag_entityId` (I mean `etag_` is the prefix, not the value of the etag)
- value: `{etag: etagValue, data: data}`

And whenever we update the etagValue in the HashCache, we update the cache storing the object content as well. Wdyt 👀
Oops, I forgot that we would still need a separate cache request to get the content. Please forget the above suggestion 🙏
Updated, PTAL when you have time 😉
```go
}

func makeKey(shard datastore.Shard, id string) string {
	return fmt.Sprintf("filedb_object_%s_%s", id, shard)
```
Let's follow our key name convention.
https://github.com/pipe-cd/pipecd/blob/master/pkg/app/server/unregisteredappstore/store.go#L103
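For illustration, the fix might look something like the following sketch; the exact prefix format is an assumption here, and the linked store.go is the authoritative example of the convention:

```go
// Sketch only: shard and ID are joined under an explicit namespace prefix
// so filedb entries cannot collide with other keys in Redis (prefix format
// assumed, not the repo's actual convention).
func makeKey(shard datastore.Shard, id string) string {
	return fmt.Sprintf("FILEDB:OBJECT:%s:%s", shard, id)
}
```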
Sure 👍 Addressed by 2b6d569 🙏
Co-authored-by: Le Van Nghia <[email protected]>
Here you go.
The following issues will be created once this gets merged. If you want me to skip creating the issue, you can use …

1. Implement filterable interface for each collection. (pipecd/pkg/datastore/filedb/filter.go, lines 21 to 24 in 8a13375)

This was created by the todo plugin since "TODO:" was found in 8a13375 when #3329 was merged. cc: @khanhtc1202.
Here you go!
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: