Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diff and Extract View Filtering Options #1749

Open
wants to merge 39 commits into
base: master
Choose a base branch
from

Conversation

jetchirag
Copy link
Contributor

@jetchirag jetchirag commented Jul 11, 2023

Description

WIP. Adds filtering option(s) to Vorta diff and extract view.

  • Add a Search button to avoid resource issue with large tree
  • Add syntax search
  • View specific search options
  • Tests

Related Issue

Fixes #1674

Screenshots (if appropriate):

image

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have read the CONTRIBUTING guide.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

I provide my contribution under the terms of the license of this repository and I affirm the Developer Certificate of Origin.


Follow ups

  • Document search syntax in Vorta docs
  • Link documentation somewhere near the search bar
  • Implement helper widget for inserting the search syntax
  • Show error message when search syntax is violated

Signed-off-by: Chirag Aggarwal <[email protected]>
@jetchirag
Copy link
Contributor Author

Any suggestions how I can expand parent from QSortFilterProxyModel?

@jetchirag jetchirag changed the title Initial Search box Diff and Extract View Filtering Options Jul 11, 2023
@real-yfprojects
Copy link
Collaborator

Any suggestions how I can expand parent from QSortFilterProxyModel?

What do you mean by 'expand' and 'parent'?

@jetchirag
Copy link
Contributor Author

@real-yfprojects When a match is found in an item, I want the parent items of that item to expand to reveal the matched child.

I've also spent un-necessarily high time thinking about the search syntax. There are lot of good suggestions in the linked issue. I'm thinking this currently:

Search input must start with either a normal string or re: depending on which the name will be searched using either in or re.search. The search input will be split into dictionary afterwards which can contain path: and other other syntax as needed. All of this will be joined and filtered using and condition.

It will also be easier to create a dialog box using this pattern as suggested in the issue and include regex search. Just wanted to run this by you before I start working on this.

@real-yfprojects
Copy link
Collaborator

When a match is found in an item, I want the parent items of that item to expand to reveal the matched child.

You could use QTreeView::scrollTo. I like the idea but what about multiple matches?

@real-yfprojects
Copy link
Collaborator

Just wanted to run this by you before I start working on this.

Sounds good! Can you describe the syntax in detail and present examples before implementing it?

src/vorta/views/partials/treemodel.py Show resolved Hide resolved
src/vorta/views/partials/treemodel.py Outdated Show resolved Hide resolved
src/vorta/views/partials/treemodel.py Outdated Show resolved Hide resolved
@jetchirag
Copy link
Contributor Author

It should give you an idea what I'm thinking

Valid:
important.txt path:/home/chirag/ <--- if search_pattern in name
re:^doc_.*.pdf$ path:/home/chirag/work size<:1MB <--- if re.search(search_pattern, name)

Invalid:
something re:^doc_.*.pdf$
path:/home/chirag/ important.txt

I've not look at implementation of size though so I'll have to figure this out when I start implementing this.

@real-yfprojects
Copy link
Collaborator

It should give you an idea what I'm thinking

Valid:
important.txt path:/home/chirag/ <--- if search_pattern in name
re:^doc_.*.pdf$ path:/home/chirag/work size<:1MB <--- if re.search(search_pattern, name)

Invalid:
something re:^doc_.*.pdf$
path:/home/chirag/ important.txt

Why is the last one invalid? And isn't path: the same as re:^?

I've not look at implementation of size though so I'll have to figure this out when I start implementing this.

Yes, start with the path-based filters. Then you could proceed with implemeting the same for the extract view. In the process it would be great if you moved the common GUI and code into a superclass implementation to reduce redundancy. Afterwards you can think about implemeting filtering by other fields.

@jetchirag
Copy link
Contributor Author

jetchirag commented Jul 14, 2023

Why is the last one invalid?

I was thinking allowing path: to come only after the name would give flexibility and not require us to use name: syntax. Otherwise it would be difficult to identify what part of search is path and what is name.

And isn't path: the same as re:^?

Was going to use path: to search the path either using in operator or using glob.glob to allow more flexibility. While re: would just search the name. Reason being regex would require escaping special characters in path so knowledge of regex would be needed for just filtering by path.

I think we will have to construct the path by going up and getting name from parent and then joining to form a path. I'm trying to find if there's a more effecient way. Do you have any clue?

@real-yfprojects
Copy link
Collaborator

I see, so by default only the subpath is matched. Does that mean when entering a, path/a will be shown but not path/a/b ?

@real-yfprojects
Copy link
Collaborator

Are you familiar with the concept of user stories?

@jetchirag
Copy link
Contributor Author

jetchirag commented Jul 14, 2023

I see, so by default only the subpath is matched. Does that mean when entering a, path/a will be shown but not path/a/b ?

No, if something is matched, its parents (path/) should be revealed but that item should be expandable to reveal child /b as well

Are you familiar with the concept of user stories?

I'm not but I did a quick lookup.

@real-yfprojects
Copy link
Collaborator

real-yfprojects commented Jul 16, 2023

While finished user stories compose a full description of a (sub)feature from a user perspective, I like to work with their basis that is describing a situation in which our feature might help. This allows us to better understand possible needs for our software and to design our software to better serve those needs. That's why I propose we think now think of some user (need) stories first before we continue thinking about the details of the search syntax. In the mean time you can work on this step (see previous comment):

In the process it would be great if you moved the common GUI and code into a superclass implementation to reduce redundancy.

I will start with a story:

Story 1: exact path

As someone that uses Vorta for accessing private backups I want to extract the old version of a specific file which I just modified erroneously. While I know the exact file path I find expanding each parent item in the path by clicking on it, very tedious. Expanding all items doesn't improve the situation since now there are so many elements that I need a lot of patience for finding the right path. I wish Vorta could help me with that.

This user need could be solved in several ways. This new filtering option might not be the best option to address this user story. However we should still describe this story although in the end we might decide that this isn't relevant enough to be addressed or needs a different feature entirely. Let's first write down these stories and think about how to solve them afterwards. Now its your turn!

@jetchirag
Copy link
Contributor Author

Pushed commit to move methods I could find similar to base class.

A story based on my last experience:

I was working on an app sometimes ago where I needed to restore .configuration file from backups. I provided the path as app/.configuration to borgmatic which was able to fetch the correct file for me despite not knowing/providing absolute path. I don't want to provide just the file name either as there maybe several files with same name in this project like __init__.py. So path option should be flexible.

Another story: Size based filters

I have a cronjob which deletes all files from Downloads folder which haven't been modified for more than 14 days. I realised a month later I had some important documents I downloaded but never moved. I now want to recover them from my Vorta backup but there are lot of unnecessary stuff in Downloads folder like large application files which I don't need.

They were simple documents and spreadsheet files which I am sure were small, surely under 50MB. I'd want a simple filter in Vorta where I can enter the size range and set path to /home/chirag/Downloads. I also haven't read the documentation and don't know how to use correct size format. Is it MB or M or does it only support bytes? It would be great if there was a dropdown to select the size format and perhaps a path selector.

Some small points

  • It would be even great if it supports unix like pathname (/home/*/Documents) but for personal use which is Vorta I personally don't need it.
  • I am not sure how resource intensive / fast adding these features would be as they are essentially run on every row. As the backup grows it would get slower and slower.
  • This made me realise, recovering a file requires knowing archive which has it or atleast the dates so you can filter through them. Can we have a feature to search all archives? Though this would be a bigger and seperate feature.

@real-yfprojects
Copy link
Collaborator

  • I am not sure how resource intensive / fast adding these features would be as they are essentially run on every row. As the backup grows it would get slower and slower.

There is nothing we can do about that. That's part of the nature of filtering/searching. However Qt does implement some optimizations afaik. Those should include only calculating the filter for expanded items.

Story 5: Name

As a user who has recently moved a file between backups. I want to find out where was located in earlier backups. Borg and Vorta do not detect the renaming or moving of a file (afaik). So filtering diff changes to only show the ones with a specific file name would be useful.

Story 6: Removed files

As a normal user I want to get an overview of all the files I deleted from my harddrive (between two backups). However sorting by change type still clutters the list (in tree view) with added or modified files. This is because it is only sorted in the sublist of each parent.

Story 7: Search for parent

As a user who wants to extract a document regarding GSoC I deleted previoiusly, I know it is located in a folder somewhere that contains GSoC in its name. Now I want to enter something simple like GSoC in the search bar and have the folder appear. Then I want to be able to expand the correct folder and look for the document in its children.

@real-yfprojects
Copy link
Collaborator

real-yfprojects commented Jul 18, 2023

Can you now lay out the current iteration of your syntax and describe in which way it satisfies the needs of each story? Please also think about how conveniently it solved each problem and whether the resulting order ov convenience is justified.

@jetchirag
Copy link
Contributor Author

jetchirag commented Jul 20, 2023

Here's my final draft of syntax. I chose to use command like syntax as it feels easier to read to me eyes and overall better. But let me know.

[ string ] [ -m ] [ -i ] [ -p ] [ -c ] [ -s ]
Note: Search String and other options are all optional

Options/Flags:

	--match -m <type>
	Type of match query
	Supported values:
		in (default): Filename should contain search string
		exact: Filename must match the search string exactly
		re: Use regex to match filename                        <-- Do we need this?

	--ignore-case -i    <-- inspired by grep
	Ignore Case. Supported by `in` and `exact` match type only.	

	--path -p <path>
	Specify path to match. Supports wildcard. Parsed using glob (?)
	Example:
		--path /home/vorta/Documents
		--path /var/www/public/*
		--path */src/*
	
	--change -c <type>
	Only available in Diff View. Supported values:
		A: Match only Added files
		R: Match only Removed files
		...
	
	--size -s <relation[<|>]number[B|KB|MB|GB>]>
	Match by size. Can be combined once to create a range. Range must be bounded.
	Example:
		--size <10MB
		--size >1GB
		--size >1GB --size <10GB

I am planning on using argparser for this. I was looking into implementing custom with startsWith, in and so but that would be just reinventing the wheel. I also think many users will be familiar with this type of syntax than the earlier proposed one.

Matching with the stories

1. exact path

importantfile.txt --path /home/chirag/work/

Or

--path /home/chirag/work/importantfile.txt

Satisfies the need with a clear syntax

2. A story based on my last experience

.configuration --path */app
# Or
--path */app/.configuration

Allows wildcards hence I can match relative path.

3. Size based filters

They were simple documents and spreadsheet files which I am sure were small, surely under 50MB. I'd want a simple filter in Vorta where I can enter the size range and set path to /home/chirag/Downloads.

--size <50GB --path /home/chirag/Downloads

4. Name

mylogins.txt --match exact

5. Removed files

--change R

6. Search for parent

I know it is located in a folder somewhere that contains GSoC in its name

GSoC

# was it GSoC, GSOC, gsoc, Gsoc?
gsoc -i

Overall I feel this fits every use case, is easier to read and organised. Took some time to write this, let me know what you think.

Some other special use specific options "may" add (from grep):
-v, --invert-match
--exclude-dir

@jetchirag
Copy link
Contributor Author

let me know what you think

I've pushed some code to match filenames and path to stay on the timeline. It's working well but can be easily modified shall we decide to change syntax.

@real-yfprojects
Copy link
Collaborator

I will have a look at this tomorrow.

I've pushed some code to match filenames and path to stay on the timeline. It's working well but can be easily modified shall we decide to change syntax.

You can implement the search button until we have settled on a syntax.

Signed-off-by: Chirag Aggarwal <[email protected]>
@real-yfprojects
Copy link
Collaborator

Can we use an icon for the search button?

@jetchirag
Copy link
Contributor Author

@real-yfprojects
Copy link
Collaborator

Yes, that works! @m3nu how do we have to handle licensing?

@real-yfprojects
Copy link
Collaborator

real-yfprojects commented Jul 22, 2023

I like the syntax you proposed since it can be (partly) implemented by using argparse.

--size >1GB --size <10GB

Why not combine those to a single argument: --size >1GB,<10GB and allow --size <=10GB?

Specify path to match. Supports wildcard. Parsed using glob (?)

Glob support is great for not so technical users but one should also be able to use a regex to match the path.

re: Use regex to match filename <-- Do we need this?

I don't think we need this if one can use a regex for path.

.configuration --path */app
# Or
--path */app/.configuration

Why have to ways of doing the same. If we make name and path exclusive we could add a flag to toggle between them and use match for choosing the pattern format for both options.
Like so:

[ <search_pattern> [ ( -m | --match ) ( ex | in | re | fm ) ] [ -p | --path ] [ -i | --ignore-case ] ] [ ( -c | --change ) <type> ] ...

The naming of the pattern types is inspired by borg exclude patterns since the user has to learn less when using the same design patterns as borg. The user would need to write less when using the borg pattern prefixes like pf:, re:, ... directly.

For the diff view there also should be an option to filter by balance. For extract view there should be options to filter by health, modification time and size.

I feel like when the path with a regex, vorta shouldn't show all children of a matched file even though they don't match as well. And I wondered whether we only want to support joining the filters like path or size with AND or with OR also?

@jetchirag
Copy link
Contributor Author

And I wondered whether we only want to support joining the filters like path or size with AND or with OR also?

I actually pondered over this for quite a while but settled on AND because that seems more intuitive and I couldn't find personal use case where I would need to use OR.

Why have to ways of doing the same. If we make name and path exclusive we could add a flag to toggle between them and use match for choosing the pattern format for both options.

I like it but there's a usecase I thought of. If I want to find a file in a given path, I thought we would require to use regex but the wildcard here seems to support nested directories so it should be fine.

print(fnmatch.fnmatch("/home/chirag/docs/latest/test", "/home/chirag/*/test"))
# True

What do you think about this?

@jetchirag
Copy link
Contributor Author

Thanks for patience.

Added tests for both views. We can add a info icon beside search button that directs user to a syntax page on vorta documentation which explain it. Unfortunately argparse issue is still open so we can't add error info yet.

@real-yfprojects
Copy link
Collaborator

We can add a info icon beside search button that directs user to a syntax page on vorta documentation which explain it. Unfortunately argparse issue is still open so we can't add error info yet.

Yes, that would be good!

@jetchirag
Copy link
Contributor Author

I didn't notice macos tests are failing. Couldn't reproduce locally, still looking.

@real-yfprojects
Copy link
Collaborator

I didn't notice macos tests are failing. Couldn't reproduce locally, still looking.

They time out on github. I think I saw the same behaviour in another PR.

@m3nu
Copy link
Contributor

m3nu commented Nov 8, 2023

The timeout could be a new macos popup. I need to connect to the CI instance and take screenshots to debug this. It's pretty annoying.

Copy link
Collaborator

@real-yfprojects real-yfprojects left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's almost done!

src/vorta/views/extract_dialog.py Outdated Show resolved Hide resolved
src/vorta/views/extract_dialog.py Outdated Show resolved Hide resolved
src/vorta/views/partials/treemodel.py Outdated Show resolved Hide resolved
src/vorta/views/partials/treemodel.py Outdated Show resolved Hide resolved
src/vorta/views/extract_dialog.py Outdated Show resolved Hide resolved
src/vorta/views/diff_result.py Outdated Show resolved Hide resolved
src/vorta/views/partials/treemodel.py Outdated Show resolved Hide resolved
src/vorta/views/partials/treemodel.py Outdated Show resolved Hide resolved
@jetchirag
Copy link
Contributor Author

Hi, thanks for review. I'm having my final exams currently, will update after that.

@real-yfprojects
Copy link
Collaborator

Thanks and good luck!

@real-yfprojects
Copy link
Collaborator

@jetchirag Nice to see you active again. Did the exams go well?

@jetchirag
Copy link
Contributor Author

@real-yfprojects Thanks, yes. They went pretty well!

@jetchirag
Copy link
Contributor Author

@m3nu @real-yfprojects Wanted to see if we can merge this up before new contributor(s) start their project so there's less conflict.

@m3nu
Copy link
Contributor

m3nu commented Mar 30, 2024

Can you add some examples as placeholder text in the search bar? I find this feature overly complicated to use considering what it does. This may help a little.

@real-yfprojects
Copy link
Collaborator

I noticed two bugs:

  1. The fm match doesn't seem to work. For some reason test -m fm is rejected as invalid syntax.
  2. When entering an invalid regex with -m re the re lib will through an uncached exception.

@real-yfprojects
Copy link
Collaborator

I find this feature overly complicated to use considering what it does.

How would design it to be less complicated? I was thinking about adding a context menu that adds syntax elements by clicking on menu items. Autosuggestion could be implemented as well.

@jetchirag
Copy link
Contributor Author

I noticed two bugs:

  1. The fm match doesn't seem to work. For some reason test -m fm is rejected as invalid syntax.
  2. When entering an invalid regex with -m re the re lib will through an uncached exception.

I am not sure why I added condition that fnmatch must only work with --path. I've removed it now. Also added check for invalid regex along with a test case for it.

Any idea on how to make it easier to use? I think the default search works well without any flag which most people can use like searching for file name without having to read docs. I can't decide on simple placeholder text.

@m3nu
Copy link
Contributor

m3nu commented Apr 8, 2024

How would design it to be less complicated?

I would focus on filtering by (partial) filename mainly. That's intuitive and doesn't need a separate manual. Maybe maybe the size too.

I realize my feedback is late though. 😬 So never mind. I just hope we will be able to maintain this feature in the future.

Do add some examples as placeholder, please. Since nobody will read the manual to filter a file list.

@real-yfprojects
Copy link
Collaborator

Any idea on how to make it easier to use?

One could add a button opening a context menu that allows inserting the syntax in a user friendly way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: Search a file in diff and extract view
3 participants