add support for zenodo open repository #20
Pull Request Overview
This PR adds comprehensive support for the Zenodo open repository to the paper search MCP server. Zenodo is a popular open access repository that hosts research outputs including papers, datasets, and other academic content.
- Implements complete Zenodo integration with search, download, and metadata retrieval capabilities
- Adds comprehensive test coverage for all Zenodo functionality
- Updates documentation to reflect the new Zenodo support
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| paper_search_mcp/academic_platforms/zenodo.py | Core implementation of ZenodoSearcher class with search, download, and API interaction methods |
| paper_search_mcp/server.py | Adds 7 MCP tools for Zenodo integration including search, download, community search, and metadata retrieval |
| tests/test_zenodo.py | Comprehensive test suite covering all Zenodo functionality with proper connectivity checks and error handling |
| README.md | Updates documentation to include Zenodo in supported platforms and adds detailed tool descriptions |

    async with httpx.AsyncClient() as client:
        papers = zenodo_searcher.search(
            query,
            max_results,
            community=community,
            year=year,
            resource_type=resource_type,
            subtype=subtype,
            creators=creators,
            keywords=keywords,
            sort=sort,
            order=order,
        )
        return [paper.to_dict() for paper in papers] if papers else []

Copilot
AI
Aug 25, 2025
The `async with httpx.AsyncClient() as client:` statement creates an HTTP client that is never used. The `zenodo_searcher.search` method uses its own `requests` session. Either remove the unused client or refactor `ZenodoSearcher` to accept an async client.
Suggested change:

    - async with httpx.AsyncClient() as client:
    -     papers = zenodo_searcher.search(
    -         query,
    -         max_results,
    -         community=community,
    -         year=year,
    -         resource_type=resource_type,
    -         subtype=subtype,
    -         creators=creators,
    -         keywords=keywords,
    -         sort=sort,
    -         order=order,
    -     )
    -     return [paper.to_dict() for paper in papers] if papers else []
    + papers = zenodo_searcher.search(
    +     query,
    +     max_results,
    +     community=community,
    +     year=year,
    +     resource_type=resource_type,
    +     subtype=subtype,
    +     creators=creators,
    +     keywords=keywords,
    +     sort=sort,
    +     order=order,
    + )
    + return [paper.to_dict() for paper in papers] if papers else []

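The second option named in the comment, refactoring the searcher to accept an async client, might look roughly like the sketch below. Everything here is illustrative and not taken from the PR: `AsyncZenodoSearcher`, `AsyncHTTPClient`, `FakeClient`, and `FakeResponse` are hypothetical names, and the endpoint URL, the `q`/`size` parameters, and the `hits.hits` response shape are assumptions about the Zenodo records API. The fake client stands in for `httpx.AsyncClient` so the example runs without network access.

```python
import asyncio
from typing import Any, Dict, List, Optional, Protocol


class AsyncHTTPClient(Protocol):
    """Minimal interface the searcher needs; httpx.AsyncClient.get fits it."""

    async def get(self, url: str, *, params: Optional[Dict[str, Any]] = None) -> Any: ...


class AsyncZenodoSearcher:
    """Hypothetical async variant of ZenodoSearcher (not the PR's class)."""

    BASE_URL = "https://zenodo.org/api/records"  # assumed endpoint

    def __init__(self, client: AsyncHTTPClient) -> None:
        # The client is injected, so the caller controls its lifetime.
        self.client = client

    async def search(self, query: str, max_results: int = 10) -> List[Dict[str, Any]]:
        response = await self.client.get(
            self.BASE_URL, params={"q": query, "size": max_results}
        )
        payload = response.json()
        # Assumed response shape: {"hits": {"hits": [record, ...]}}
        return payload.get("hits", {}).get("hits", [])[:max_results]


class FakeResponse:
    """In-memory response used only for this demonstration."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self._payload = payload

    def json(self) -> Dict[str, Any]:
        return self._payload


class FakeClient:
    """In-memory client that records each call instead of using the network."""

    def __init__(self) -> None:
        self.calls: List[Any] = []

    async def get(self, url: str, *, params: Optional[Dict[str, Any]] = None) -> FakeResponse:
        self.calls.append((url, params))
        return FakeResponse({"hits": {"hits": [{"id": 1, "title": "Example"}]}})


async def demo():
    client = FakeClient()
    searcher = AsyncZenodoSearcher(client)
    records = await searcher.search("open science", max_results=5)
    return records, client.calls


records, calls = asyncio.run(demo())
```

With this shape, the `async with httpx.AsyncClient() as client:` block in the tool would no longer be dead code, because `client` would be passed to the searcher. Whether that refactor is worth it compared with simply deleting the wrapper is a judgment call for the maintainers.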

        async with httpx.AsyncClient() as client:
            results = zenodo_searcher.search_communities(
                query=query, max_results=max_results, sort=sort, order=order
            )
            return results if results else []


    @mcp.tool()
    async def get_zenodo_record_details(paper_id: str) -> Dict:
        """Get the raw Zenodo record JSON for a research paper (or any Zenodo record) by ID or URL."""
        async with httpx.AsyncClient() as client:
            rec = zenodo_searcher.get_record_details(paper_id)
            return rec or {}


    @mcp.tool()
    async def list_zenodo_files(paper_id: str) -> List[Dict]:
        """List files attached to a research paper recorded on Zenodo (or any Zenodo record)."""
        async with httpx.AsyncClient() as client:
            files = zenodo_searcher.list_files(paper_id)
            return files if files else []

Copilot
AI
Aug 25, 2025
The `async with httpx.AsyncClient() as client:` statement creates an HTTP client that is never used. The `zenodo_searcher.search_communities` method uses its own `requests` session. Either remove the unused client or refactor `ZenodoSearcher` to accept an async client.
Suggested change:

    -     async with httpx.AsyncClient() as client:
    -         results = zenodo_searcher.search_communities(
    -             query=query, max_results=max_results, sort=sort, order=order
    -         )
    -         return results if results else []
    - @mcp.tool()
    - async def get_zenodo_record_details(paper_id: str) -> Dict:
    -     """Get the raw Zenodo record JSON for a research paper (or any Zenodo record) by ID or URL."""
    -     async with httpx.AsyncClient() as client:
    -         rec = zenodo_searcher.get_record_details(paper_id)
    -         return rec or {}
    - @mcp.tool()
    - async def list_zenodo_files(paper_id: str) -> List[Dict]:
    -     """List files attached to a research paper recorded on Zenodo (or any Zenodo record)."""
    -     async with httpx.AsyncClient() as client:
    -         files = zenodo_searcher.list_files(paper_id)
    -         return files if files else []
    +     results = zenodo_searcher.search_communities(
    +         query=query, max_results=max_results, sort=sort, order=order
    +     )
    +     return results if results else []
    + @mcp.tool()
    + async def get_zenodo_record_details(paper_id: str) -> Dict:
    +     """Get the raw Zenodo record JSON for a research paper (or any Zenodo record) by ID or URL."""
    +     rec = zenodo_searcher.get_record_details(paper_id)
    +     return rec or {}
    + @mcp.tool()
    + async def list_zenodo_files(paper_id: str) -> List[Dict]:
    +     """List files attached to a research paper recorded on Zenodo (or any Zenodo record)."""
    +     files = zenodo_searcher.list_files(paper_id)
    +     return files if files else []

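One related point the suggestion leaves open: once the wrapper is removed, the tool is still an `async def` that calls a synchronous, `requests`-based method, which can hold up the event loop while the request is in flight. A possible middle ground, sketched here under assumptions, is to offload the blocking call with `asyncio.to_thread` (Python 3.9+). `BlockingSearcher` and `search_communities_tool` are hypothetical stand-ins, not code from the PR, and `time.sleep` only simulates network latency.

```python
import asyncio
import time
from typing import Dict, List


class BlockingSearcher:
    """Stand-in for a synchronous, requests-based searcher (hypothetical)."""

    def search_communities(self, query: str, max_results: int = 10) -> List[Dict]:
        time.sleep(0.05)  # simulates blocking network I/O
        return [{"id": f"{query}-{i}"} for i in range(max_results)]


searcher = BlockingSearcher()


async def search_communities_tool(query: str, max_results: int = 10) -> List[Dict]:
    # Run the blocking call in a worker thread so the event loop stays free.
    results = await asyncio.to_thread(
        searcher.search_communities, query, max_results=max_results
    )
    return results if results else []


async def main() -> List[List[Dict]]:
    # The two tool calls can overlap instead of running back to back.
    return await asyncio.gather(
        search_communities_tool("physics", 2),
        search_communities_tool("biology", 1),
    )


results = asyncio.run(main())
```

This keeps the existing synchronous searcher untouched, at the cost of one worker thread per in-flight call.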
Copilot
AI
Aug 25, 2025
The `async with httpx.AsyncClient() as client:` statement creates an HTTP client that is never used. The `zenodo_searcher.get_record_details` method uses its own `requests` session. Either remove the unused client or refactor `ZenodoSearcher` to accept an async client.
Copilot
AI
Aug 25, 2025
The `async with httpx.AsyncClient() as client:` statement creates an HTTP client that is never used. The `zenodo_searcher.list_files` method uses its own `requests` session. Either remove the unused client or refactor `ZenodoSearcher` to accept an async client.

    async with httpx.AsyncClient() as client:
        papers = zenodo_searcher.search_by_creator(
            creator=creator,
            max_results=max_results,
            community=community,
            year=year,
            resource_type=resource_type,
            subtype=subtype,
            sort=sort,
            order=order,
        )
        return [p.to_dict() for p in papers] if papers else []

Copilot
AI
Aug 25, 2025
The `async with httpx.AsyncClient() as client:` statement creates an HTTP client that is never used. The `zenodo_searcher.search_by_creator` method uses its own `requests` session. Either remove the unused client or refactor `ZenodoSearcher` to accept an async client.
Suggested change:

    - async with httpx.AsyncClient() as client:
    -     papers = zenodo_searcher.search_by_creator(
    -         creator=creator,
    -         max_results=max_results,
    -         community=community,
    -         year=year,
    -         resource_type=resource_type,
    -         subtype=subtype,
    -         sort=sort,
    -         order=order,
    -     )
    -     return [p.to_dict() for p in papers] if papers else []
    + papers = zenodo_searcher.search_by_creator(
    +     creator=creator,
    +     max_results=max_results,
    +     community=community,
    +     year=year,
    +     resource_type=resource_type,
    +     subtype=subtype,
    +     sort=sort,
    +     order=order,
    + )
    + return [p.to_dict() for p in papers] if papers else []

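To check that dropping the wrapper leaves the tool's behaviour unchanged, the simplified body can be exercised against a stub searcher. The sketch below is only an illustration: the tool name `search_zenodo_by_creator` and its reduced parameter list are assumptions (the review excerpt does not show the function signature), and `MagicMock` objects stand in for the real `ZenodoSearcher` and paper objects.

```python
import asyncio
from typing import Dict, List
from unittest.mock import MagicMock

# Stub standing in for the module-level ZenodoSearcher instance.
zenodo_searcher = MagicMock()


async def search_zenodo_by_creator(creator: str, max_results: int = 10) -> List[Dict]:
    # Simplified tool body: no unused httpx.AsyncClient wrapper.
    papers = zenodo_searcher.search_by_creator(creator=creator, max_results=max_results)
    return [p.to_dict() for p in papers] if papers else []


# Case 1: the searcher returns one paper-like object.
paper = MagicMock()
paper.to_dict.return_value = {"title": "Example record"}
zenodo_searcher.search_by_creator.return_value = [paper]
found = asyncio.run(search_zenodo_by_creator("Doe, Jane"))

# Case 2: the searcher returns None, so the tool falls back to an empty list.
zenodo_searcher.search_by_creator.return_value = None
empty = asyncio.run(search_zenodo_by_creator("Nobody"))
```

Both paths, a populated result and the empty fallback, go through the same code the suggestion keeps, so the only thing removed is the client that was never used.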
No description provided.