feature request: zotero integration #68

Open
andreifoldes opened this issue May 16, 2023 · 6 comments

Comments

@andreifoldes

Thanks for creating this.
It would be amazing if it could leverage metadata that already exists elsewhere, e.g. in a Zotero library.

@davidmezzetti
Member

Following up here, would you be able to explain the benefits of Zotero? What are the ways Python can integrate with it? Licensing? I will only integrate with projects that are FOSS and permissively licensed for a number of reasons.

@andreifoldes
Author

andreifoldes commented Sep 19, 2023

Following up here, would you be able to explain the benefits of Zotero?

Yes, happy to. Zotero is FOSS, is actively developed, and has the largest user base among open-source reference managers (https://github.com/zotero). It is released under the GNU Affero General Public License (AGPL). Consequently, it also has the strongest plugin ecosystem in the reference-management space, as far as I know.

The main benefit of using Zotero is that when researchers download new articles during a web browsing session, the Zotero browser plugin extracts and maintains the reference metadata and PDF very neatly. If this extraction fails, users can inspect and correct the references through the API or the user interface. Zotero can also store and organize related files, such as notes the user has written about an article, and attach supplements.

What are the ways Python can integrate with it?

There is a Python interface for Zotero maintained here: https://pyzotero.readthedocs.io/en/latest/
I also believe that developers can access the SQLite database that Zotero relies on: https://www.zotero.org/support/dev/client_coding/direct_sqlite_database_access
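
To make the SQLite route more concrete, here is a minimal sketch of reading item metadata with Python's standard `sqlite3` module. It assumes the table and column names as I understand the Zotero schema (`items`, `itemData`, `itemDataValues`, `fields`), which may differ between Zotero versions. The helper names `read_metadata`, `open_read_only`, and `demo_connection` are hypothetical, and the demo rows are invented stand-ins, not real library data.

```python
import sqlite3

# Assumed Zotero tables: items, itemData, itemDataValues, fields.
# The real schema has more columns and can change between versions.
METADATA_QUERY = """
SELECT i.itemID, i.key, f.fieldName, v.value
FROM items AS i
JOIN itemData AS d ON d.itemID = i.itemID
JOIN fields AS f ON f.fieldID = d.fieldID
JOIN itemDataValues AS v ON v.valueID = d.valueID
ORDER BY i.itemID
"""


def read_metadata(connection):
    """Return {item key: {field name: value}} for every item with field data."""
    records = {}
    for _item_id, key, field, value in connection.execute(METADATA_QUERY):
        records.setdefault(key, {})[field] = value
    return records


def open_read_only(path):
    """Open a database file read-only so the library cannot be modified."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def demo_connection():
    """Build a tiny in-memory stand-in for the assumed tables (invented data)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE items (itemID INTEGER PRIMARY KEY, key TEXT);
        CREATE TABLE fields (fieldID INTEGER PRIMARY KEY, fieldName TEXT);
        CREATE TABLE itemDataValues (valueID INTEGER PRIMARY KEY, value TEXT);
        CREATE TABLE itemData (itemID INT, fieldID INT, valueID INT);
        INSERT INTO items VALUES (1, 'ABCD1234');
        INSERT INTO fields VALUES (1, 'title'), (2, 'DOI');
        INSERT INTO itemDataValues VALUES (1, 'An example paper'), (2, '10.0000/example');
        INSERT INTO itemData VALUES (1, 1, 1), (1, 2, 2);
    """)
    return con


if __name__ == "__main__":
    print(read_metadata(demo_connection()))
```

Against a real library, something like `read_metadata(open_read_only("zotero.sqlite"))` should work the same way. It is probably safest to run it on a copy of the file, since direct access is generally advised to be read-only and the database may be locked while Zotero is running.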

@sdrakulich

bump, very interested

@SoenkevL

I just stumbled across this. I am actually working on such an integration for my own personal use right now, as I also think it would be a huge advancement. I don't have it fully working yet, but let me know if you are interested. It works purely by accessing Zotero's freely available SQLite database. The current PDF extractor works well on the PDFs in the Zotero library, but the metadata it produces is indeed mostly missing or wrong.
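
One piece such an integration likely needs is finding the PDF on disk for each library entry, so the extractor's output can be paired with Zotero's metadata. The sketch below is not the code from this integration. It assumes the layout I understand Zotero to use: stored attachments record their path as `storage:<filename>` and live under `<data directory>/storage/<attachment key>/`. The function name `resolve_attachment` and the example paths are hypothetical.

```python
from pathlib import Path

# Assumed prefix Zotero uses for files kept inside its own storage folder.
STORAGE_PREFIX = "storage:"


def resolve_attachment(data_dir, attachment_key, stored_path):
    """Map an attachment record to a file path.

    Returns None when the record does not use the assumed "storage:" form,
    for example a linked file or a URL-only attachment.
    """
    if not stored_path or not stored_path.startswith(STORAGE_PREFIX):
        return None
    filename = stored_path[len(STORAGE_PREFIX):]
    return Path(data_dir) / "storage" / attachment_key / filename


if __name__ == "__main__":
    # Hypothetical data directory and attachment key.
    print(resolve_attachment("/home/user/Zotero", "ABCD1234", "storage:paper.pdf"))
```

Returning `None` for anything outside the `storage:` form keeps the sketch conservative: linked files would need their own handling, which is left out here.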

@andreifoldes
Author

Amazing! It would be great if you could share it, maybe via a GitHub repo?

@SoenkevL

SoenkevL commented May 12, 2024

Took me a little longer than I thought.

https://github.com/SoenkevL/paperetl_zotero_integration.git

Here is a link to a GitHub repo with the added functionality. Most of the changes are in the added Zotero_extractor file, but some adaptations to the PDF reading and parsing were also needed. I haven't added any new metadata fields as of now.
