[Feature Request] Optionally never write personally identifiable information to metadata files #2640

Open
Scripter17 opened this issue May 30, 2022 · 5 comments

Comments

@Scripter17
Contributor

Scripter17 commented May 30, 2022

Sometimes when sharing archives of accounts, the metadata can reveal which account you're using, whether you're following the person, and so on.

Since this would affect so many extractors, it should probably be added as a postprocessor.

A basic implementation of the idea:

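# Each rule overwrites one key. An empty category or subcategory matches anything.
sanitizeRules = [
    {"key": "is_blocked",    "value": False, "category": "deviantart", "subcategory": ""},
    {"key": "is_favourited", "value": False, "category": "deviantart", "subcategory": ""},
    {"key": "is_watching",   "value": False, "category": "deviantart", "subcategory": ""},
]

def sanitizeMetadata(metadata):
    """Overwrite identifying fields in place and return the same dict."""
    for rule in sanitizeRules:
        # .get() avoids a KeyError when an extractor omits either field.
        catMatch = not rule["category"] or rule["category"] == metadata.get("category")
        subCatMatch = not rule["subcategory"] or rule["subcategory"] == metadata.get("subcategory")
        # Only touch keys that already exist, so no new fields are introduced.
        if catMatch and subCatMatch and rule["key"] in metadata:
            metadata[rule["key"]] = rule["value"]
    return metadata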

Of course, the hard part is getting a list of all metadata keys that can be used to identify you.

@mikf
Owner

mikf commented May 30, 2022

I've already had the idea of adding options to add, modify/change, or remove metadata fields to the existing metadata post processor, so this would pretty much fit right in, wouldn't it?

Something like this:

{
    "name": "metadata",

    "delete": ["thumbnails", "preview"],
    "set"   : {
        "is_blocked"   : false,
        "is_favourited": false,
        "is_watching"  : false
    }
}
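As a rough illustration of what the proposed "delete" and "set" options could do to a metadata dict, here is a minimal Python sketch. The helper name apply_field_options and the sample metadata are hypothetical, and this is not gallery-dl's implementation; in particular, whether "set" should add keys that are missing is an assumption made here.

```python
def apply_field_options(metadata, delete=(), set_values=None):
    """Return a copy of metadata with some fields removed or overridden.

    Hypothetical helper illustrating the proposed "delete" and "set"
    options; it is not part of gallery-dl.
    """
    result = dict(metadata)
    for key in delete:
        # Silently skip keys the extractor did not provide.
        result.pop(key, None)
    # Assumption: "set" overrides existing fields and adds missing ones.
    result.update(set_values or {})
    return result


# Made-up sample metadata, for illustration only.
sample = {
    "title": "Example",
    "thumbnails": ["a.png"],
    "is_watching": True,
}
cleaned = apply_field_options(
    sample,
    delete=["thumbnails", "preview"],
    set_values={"is_blocked": False, "is_watching": False},
)
print(cleaned)
```

Returning a copy keeps the original dict intact, which matters if other post processors still need the unmodified values.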

@Scripter17
Contributor Author

It's possible there's an edge case where, for example, one site uses "user" for your own username and another uses "user" for the person being downloaded.

But until someone runs into that, this should work perfectly!

@Scripter17
Contributor Author

Just realized you said this was an idea and not already added

Gonna go get some tea to wake myself up

@nisehime

Couldn't you just remove metadata fields with a custom content format?

@mikf
Owner

mikf commented Dec 2, 2022

This was implemented quite some time ago, but I never got back to this issue, and I'm also not sure whether the implementation is any good. It's needlessly verbose, but rather flexible because of that.

f3de6b7 and 0c73914 add two additional modes for the metadata post processor that allow deleting, modifying, or adding fields in the current metadata dict.

"deviantart": {
    "postprocessors": [
        {
            "name": "metadata",
            "mode": "delete",
            "fields": [
                "is_blocked",
                "is_favourited"
            ]
        },
        {
            "name": "metadata",
            "mode": "modify",
            "fields": {
                "is_watching": "\fE False",
                "custom_field": "{index!s}-{extension}"
            }
        },
        {
            "name": "metadata",
            "#": "write metadata"
        }
    ]
}
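To show how the entries above combine when they run in order, here is a simplified Python stand-in. It is not gallery-dl's code: run_stage is a hypothetical helper, the "\fE" prefix is assumed to mean "evaluate the rest as a Python expression", and any other value is treated as an ordinary str.format string.

```python
def run_stage(stage, metadata):
    """Apply one simplified "delete" or "modify" stage in place.

    Hypothetical stand-in for illustration; not gallery-dl's implementation.
    """
    mode = stage.get("mode")
    if mode == "delete":
        for key in stage["fields"]:
            metadata.pop(key, None)
    elif mode == "modify":
        for key, spec in stage["fields"].items():
            if spec.startswith("\fE"):
                # Assumption: the remainder is a Python expression that
                # may refer to metadata fields by name.
                metadata[key] = eval(spec[2:].strip(), {}, dict(metadata))
            else:
                # Assumption: everything else is a plain format string.
                metadata[key] = spec.format(**metadata)
    # Stages without a recognized mode (such as the final write) are
    # ignored in this sketch.


stages = [
    {"name": "metadata", "mode": "delete",
     "fields": ["is_blocked", "is_favourited"]},
    {"name": "metadata", "mode": "modify",
     "fields": {"is_watching": "\fE False",
                "custom_field": "{index!s}-{extension}"}},
    {"name": "metadata"},
]

# Made-up sample metadata, for illustration only.
metadata = {
    "index": 12345,
    "extension": "png",
    "is_blocked": True,
    "is_favourited": True,
    "is_watching": True,
}
for stage in stages:
    run_stage(stage, metadata)
print(metadata)
```

The order matters: the delete and modify entries come before the entry that writes the file, so the written metadata no longer carries the original values.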

3 participants