Creating new table in peewee database #68

Open
nicolae-stroncea opened this issue Aug 25, 2018 · 8 comments

Comments

@nicolae-stroncea
Member

I've created a new table in the peewee.py file. I've tried both make build and make install in aw-core, yet it doesn't seem like the peewee file ever runs the code. I've looked through the source code and couldn't find where PeeweeStorage is initialized. Any idea how to get the file to run manually?

Code below:

if not BucketModel.table_exists():
    BucketModel.create_table()
if not EventModel.table_exists():
    EventModel.create_table()
if not TestModel.table_exists():
    TestModel.create_table()
self.update_bucket_keys()

@ErikBjare
Member

ErikBjare commented Aug 25, 2018

What you did should be enough. If you put a print statement in PeeweeStorage.__init__, does it print?
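
Another way to check whether the table was actually created, without going through peewee at all, is to list the tables in the database file. The sketch below assumes the peewee storage is backed by a SQLite file; the `list_tables` helper and the `testmodel` table name are hypothetical, and the demo runs on a throwaway database rather than the real one.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demo on a throwaway database; point db_path at the real file instead.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(demo_path)
conn.execute("CREATE TABLE testmodel (id INTEGER PRIMARY KEY)")  # hypothetical table name
conn.commit()
conn.close()

tables = list_tables(demo_path)
print(tables)
```

If the new table is missing from that list, the creation code most likely never ran against that file.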

If you're going to write a cache, I'd suggest using something other than the datastore storage strategies like PeeweeStorage. They shouldn't be complicated by caching, and caching should be storage-method independent.

@johan-bjareholt is working on implementing a storage method using plain SQL for better performance (see PR in aw-core), which is yet another reason to be storage-method independent.

I'm still not quite sure how you're planning to do the caching, how would it work with queries?

@johan-bjareholt
Member

> I'm still not quite sure how you're planning to do the caching, how would it work with queries?

When I attempted this previously, I had an ID for each query and saved the result. My plan was to later query the cache in a hierarchy like hour->day->week->month->year (which is likely very hard due to timezone issues, like you said), but I never got that far since I thought a good query system (query2) was a higher priority. It would also be good to save the last access time of each cache entry so we can auto-clean the cache and keep it from growing too much.
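
The idea of keying results by query ID and tracking the last access time could be sketched as follows. This is only an in-memory illustration, not the earlier attempt mentioned above; the `QueryCache` class, its method names, and the query IDs are all hypothetical.

```python
import time


class QueryCache:
    """In-memory cache keyed by query ID that tracks last access time."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # query_id -> (result, last_access)

    def put(self, query_id, result):
        self._entries[query_id] = (result, self._clock())

    def get(self, query_id):
        """Return the cached result or None, refreshing the access time on a hit."""
        entry = self._entries.get(query_id)
        if entry is None:
            return None
        result, _ = entry
        self._entries[query_id] = (result, self._clock())
        return result

    def clean(self, max_idle_seconds):
        """Drop entries not accessed within max_idle_seconds; return how many were dropped."""
        cutoff = self._clock() - max_idle_seconds
        stale = [qid for qid, (_, last) in self._entries.items() if last < cutoff]
        for qid in stale:
            del self._entries[qid]
        return len(stale)


# Usage with a fake clock so the example is deterministic.
now = [0.0]
cache = QueryCache(clock=lambda: now[0])
cache.put("q-day-2018-08-25", {"github.com": 120.0})
cache.put("q-day-2018-08-26", {"github.com": 30.0})
now[0] = 100.0
cache.get("q-day-2018-08-26")  # refreshes this entry's access time
removed = cache.clean(max_idle_seconds=50)
```

Only the entry that was never read again is dropped, which is the auto-clean behaviour described above.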

@ErikBjare
Member

@johan-bjareholt Yeah but queries can return all kinds of results, how would you know how to merge query results?

@johan-bjareholt
Member

johan-bjareholt commented Aug 27, 2018

> Yeah but queries can return all kinds of results, how would you know how to merge query results?

That didn't use to be the case in query1, good point.
We could index a few fields on certain buckettypes where the values are useful and not very unique (such as appname in currentwindow or language in app.editor.current). We probably shouldn't hardcode that though, so I'm not sure how to do it properly (an extra field when creating a bucket?). We could possibly also support different aggregation types, such as averages on numbers (average tabs open, for example). Since the fields in "data" are not hardcoded, though, this could be a bit sketchy.

EDIT: I do not think this is a good solution, just throwing around ideas
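
For illustration only, declaring one aggregate field per buckettype and summing durations over it might look like the sketch below. The `AGGREGATE_FIELDS` mapping, the `total_duration_by_field` helper, and the event layout (a dict with `duration` and `data`) are assumptions made for this example, not an existing API.

```python
from collections import defaultdict

# Hypothetical per-buckettype declaration of which "data" field to group on.
AGGREGATE_FIELDS = {
    "currentwindow": "appname",
    "app.editor.current": "language",
}


def total_duration_by_field(bucket_type, events):
    """Sum event durations per value of the field declared for bucket_type."""
    field = AGGREGATE_FIELDS.get(bucket_type)
    if field is None:
        return {}  # this buckettype has not declared an aggregate field
    totals = defaultdict(float)
    for event in events:
        value = event["data"].get(field)
        if value is not None:
            totals[value] += event["duration"]
    return dict(totals)


events = [
    {"duration": 60.0, "data": {"appname": "firefox", "title": "GitHub"}},
    {"duration": 30.0, "data": {"appname": "vim", "title": "peewee.py"}},
    {"duration": 15.0, "data": {"appname": "firefox", "title": "Docs"}},
]
totals = total_duration_by_field("currentwindow", events)
```

A buckettype without a declared field simply gets no aggregate, which is where the "extra field when creating a bucket" question comes in.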

@nicolae-stroncea
Member Author

nicolae-stroncea commented Aug 27, 2018

@ErikBjare @johan-bjareholt

This is the structure I was considering:
Store the data in the following columns:
Key (URLs, domains, app_events, title_events), Value (e.g. github.com, localhost:5600), Duration, and Date.
That would make it easy to insert data and run queries on it.

You can essentially run the same functions used for the summary (browserSummaryQuery, windowQuery), and then just insert their results into the table.
To summarize over a time period, you'd select by key (depending on what you want to summarize) and by date range.

What do you guys think?
EDIT: Instead of Duration you can have "Duration+AFK" and "Duration-AFK" depending on whether the user wants to filter afk time or not.
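
A minimal sketch of that table using Python's built-in sqlite3 module is shown here. The table name `summary_cache`, the column names `key_name`, `value`, `duration`, and `day` (standing in for Key, Value, Duration, and Date), and the `summarize` helper are hypothetical, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE summary_cache (
        key_name TEXT NOT NULL,   -- e.g. 'domains', 'URLs', 'app_events'
        value    TEXT NOT NULL,   -- e.g. 'github.com'
        duration REAL NOT NULL,   -- seconds
        day      TEXT NOT NULL,   -- ISO date, one row per key/value/day
        PRIMARY KEY (key_name, value, day)
    )
    """
)
rows = [
    ("domains", "github.com", 1200.0, "2018-08-25"),
    ("domains", "github.com", 600.0, "2018-08-26"),
    ("domains", "localhost:5600", 300.0, "2018-08-26"),
    ("app_events", "firefox", 2000.0, "2018-08-26"),
]
conn.executemany("INSERT INTO summary_cache VALUES (?, ?, ?, ?)", rows)


def summarize(conn, key_name, start_day, end_day):
    """Total duration per value for one key over an inclusive date range."""
    cursor = conn.execute(
        """
        SELECT value, SUM(duration) FROM summary_cache
        WHERE key_name = ? AND day BETWEEN ? AND ?
        GROUP BY value ORDER BY SUM(duration) DESC
        """,
        (key_name, start_day, end_day),
    )
    return cursor.fetchall()


summary = summarize(conn, "domains", "2018-08-25", "2018-08-26")
```

Splitting Duration into the two AFK variants from the EDIT would just mean two duration columns instead of one.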

@johan-bjareholt
Member

I still don't believe that this is a good idea though because it's not flexible enough.

On the other hand, we could just invalidate the cache every time the user upgrades activitywatch and we have a new format.
Still doesn't seem like a clean solution to me though.
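
One possible way to invalidate on a format change, sketched under the assumption that the cache lives in its own SQLite file, is to stamp the file with a format version and wipe the cache when the stamp differs. `CACHE_FORMAT_VERSION`, `open_cache`, and the `query_cache` table are hypothetical names; `PRAGMA user_version` is SQLite's built-in integer slot for application use.

```python
import sqlite3

CACHE_FORMAT_VERSION = 2  # hypothetical; bump whenever the cached layout changes


def open_cache(conn, current_version=CACHE_FORMAT_VERSION):
    """Ensure the cache table exists, wiping it if the stored version differs.

    Returns True if the stored version differed from current_version.
    """
    (stored_version,) = conn.execute("PRAGMA user_version").fetchone()
    invalidated = stored_version != current_version
    if invalidated:
        conn.execute("DROP TABLE IF EXISTS query_cache")
        # PRAGMA does not accept bound parameters, so the integer is formatted in.
        conn.execute("PRAGMA user_version = %d" % current_version)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_cache (query_id TEXT PRIMARY KEY, result TEXT)"
    )
    conn.commit()
    return invalidated


conn = sqlite3.connect(":memory:")
first = open_cache(conn, current_version=1)   # fresh file: stamp differs
conn.execute("INSERT INTO query_cache VALUES ('q1', '{}')")
conn.commit()
second = open_cache(conn, current_version=1)  # same version: cache kept
kept = conn.execute("SELECT COUNT(*) FROM query_cache").fetchone()[0]
third = open_cache(conn, current_version=2)   # new format: cache wiped
left = conn.execute("SELECT COUNT(*) FROM query_cache").fetchone()[0]
```

This keeps the cache disposable, but it does not address the flexibility concern raised above.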

@nicolae-stroncea
Member Author

What other features would a more flexible cache have?

@johan-bjareholt
Member

A flexible cache would not hardcode the columns of the DB table.
We want to allow third-party buckettypes to be cached as well.

3 participants