It depends on what you mean by "recent" #243

Open
mark-donoghue opened this issue May 2, 2019 · 1 comment
Labels
api Tickets specifically about the metadata/query API backed by search bug enhancement

Comments

@mark-donoghue

When requesting the most recent articles via the /metadata endpoint, it appears that the definition of "recent" is not sufficiently documented :-)

The following request:

https://api.arxiv.org/metadata/?size=10&include=comments&include=journal_ref&include=submitted_date&include=submitted_date_all&include=announced_date_first&include=updated_date&include=modified_date

Elicits a response that puts articles with a submission date of 2018-12-03 first in the list, as shown below. Records submitted in November (2018-11-14, 2018-11-23, 2018-11-29) are interspersed among them, out of order.

Included in the request are parameters for all of the dates specified in the DocumentMetadata.json file. Unfortunately, only submitted_date and announced_date_first come back in the response, so it's unclear which date is being used to sort the result set.

To confound matters further, if start_date=2018-12-04 is added as a parameter, the result set starts with articles having a submission date of 2018-12-04.
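For anyone reproducing this, here is a minimal sketch of how the two requests can be built. The helper name `build_metadata_url` is hypothetical; the `size`, `include` and `start_date` parameters are the ones used in this report, and nothing here is confirmed API documentation.

```python
from urllib.parse import urlencode

# Endpoint taken from the request in this report.
BASE_URL = "https://api.arxiv.org/metadata/"


def build_metadata_url(includes, size=10, start_date=None):
    """Build a /metadata query URL (hypothetical helper, not part of any arXiv client)."""
    params = [("size", size)]
    # The API takes one `include` parameter per requested field.
    params += [("include", name) for name in includes]
    if start_date is not None:
        params.append(("start_date", start_date))
    return BASE_URL + "?" + urlencode(params)


url = build_metadata_url(["comments", "submitted_date"], start_date="2018-12-04")
print(url)
```

Calling the helper with and without `start_date` gives the two URLs whose result sets differ as described above.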

It would be useful to know which date is being used to decide recency.

It would also help to be able to specify which date is used for sorting, e.g. sort_by=submitted_date.

My intention is to request all articles submitted to arXiv since a specific date (typically since the last time my script ran).
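Until the sort order is documented, a client-side workaround is possible. The sketch below assumes each record carries a `submitted_date` in ISO 8601 form, as in the response included in this issue; the function name `submitted_since` is hypothetical, and the sample records are trimmed copies of three entries from that response.

```python
from datetime import datetime


def submitted_since(results, since):
    """Keep records submitted at or after `since`, ordered oldest first."""
    kept = [
        r for r in results
        if datetime.fromisoformat(r["submitted_date"]) >= since
    ]
    return sorted(kept, key=lambda r: datetime.fromisoformat(r["submitted_date"]))


# Trimmed copies of three records from the response in this issue.
sample = [
    {"paper_id": "1812.00974", "submitted_date": "2018-12-03T13:49:29-05:00"},
    {"paper_id": "1812.00894", "submitted_date": "2018-11-14T11:33:33-05:00"},
    {"paper_id": "1812.00892", "submitted_date": "2018-12-03T11:46:23-05:00"},
]

since = datetime.fromisoformat("2018-12-01T00:00:00-05:00")
recent = submitted_since(sample, since)
ids = [r["paper_id"] for r in recent]
print(ids)
```

This only filters and reorders whatever pages were fetched; it cannot guarantee that every article since the cutoff was returned, which is why a server-side sort option would still be needed.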

Here is the JSON response from the above request:

{
    "metadata": {
        "end": 10,
        "query": [
            {
                "parameter": "include",
                "value": "comments"
            },
            {
                "parameter": "include",
                "value": "journal_ref"
            },
            {
                "parameter": "include",
                "value": "submitted_date"
            },
            {
                "parameter": "include",
                "value": "submitted_date_all"
            },
            {
                "parameter": "include",
                "value": "announced_date_first"
            },
            {
                "parameter": "include",
                "value": "updated_date"
            },
            {
                "parameter": "include",
                "value": "modified_date"
            }
        ],
        "size": 10,
        "start": 0,
        "total": 1479853
    },
    "results": [
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00974v1",
            "comments": "",
            "href": "https://api.arxiv.org/1812.00974v1",
            "journal_ref": "",
            "paper_id": "1812.00974",
            "paper_id_v": "1812.00974v1",
            "submitted_date": "2018-12-03T13:49:29-05:00",
            "title": "Online Graph-Adaptive Learning with Scalability and Privacy",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00983v1",
            "comments": "20 pages",
            "href": "https://api.arxiv.org/1812.00983v1",
            "journal_ref": "",
            "paper_id": "1812.00983",
            "paper_id_v": "1812.00983v1",
            "submitted_date": "2018-12-03T13:58:22-05:00",
            "title": "A General Axiomatization for the logics of the Hierarchy ${\\mathbb{I}}^n {\\mathbb{P}}^k$",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00976v1",
            "comments": "",
            "href": "https://api.arxiv.org/1812.00976v1",
            "journal_ref": "",
            "paper_id": "1812.00976",
            "paper_id_v": "1812.00976v1",
            "submitted_date": "2018-12-03T13:50:08-05:00",
            "title": "The Gelfand-Tsetlin Realisation of Simple Modules and Monomial Bases",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00894v1",
            "comments": "",
            "href": "https://api.arxiv.org/1812.00894v1",
            "journal_ref": "",
            "paper_id": "1812.00894",
            "paper_id_v": "1812.00894v1",
            "submitted_date": "2018-11-14T11:33:33-05:00",
            "title": "Compact Graphene Plasmonic Slot Photodetector on Silicon-on-insulator with High Responsivity",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00883v1",
            "comments": "Machine Learning for Health (ML4H) Workshop at NeurIPS 2018",
            "href": "https://api.arxiv.org/1812.00883v1",
            "journal_ref": "",
            "paper_id": "1812.00883",
            "paper_id_v": "1812.00883v1",
            "submitted_date": "2018-11-23T09:51:55-05:00",
            "title": "Relation Networks for Optic Disc and Fovea Localization in Retinal Images",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00892v1",
            "comments": "",
            "href": "https://api.arxiv.org/1812.00892v1",
            "journal_ref": "",
            "paper_id": "1812.00892",
            "paper_id_v": "1812.00892v1",
            "submitted_date": "2018-12-03T11:46:23-05:00",
            "title": "Topologically Enabled Ultra-high-Q Guided Resonances Robust to Out-of-plane Scattering",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00858v1",
            "comments": "9 pages",
            "href": "https://api.arxiv.org/1812.00858v1",
            "journal_ref": "",
            "paper_id": "1812.00858",
            "paper_id_v": "1812.00858v1",
            "submitted_date": "2018-12-03T11:03:39-05:00",
            "title": "Evolution for Khovanov polynomials for figure-eight-like family of knots",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00881v1",
            "comments": "13 pages, 3 figures. Contribution to the proceedings of the XIII Quark Confinement and the Hadron Spectrum conference",
            "href": "https://api.arxiv.org/1812.00881v1",
            "journal_ref": "",
            "paper_id": "1812.00881",
            "paper_id_v": "1812.00881v1",
            "submitted_date": "2018-12-03T11:36:06-05:00",
            "title": "Forward particle production in proton-nucleus collisions at NLO",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00900v1",
            "comments": "16 pages",
            "href": "https://api.arxiv.org/1812.00900v1",
            "journal_ref": "",
            "paper_id": "1812.00900",
            "paper_id_v": "1812.00900v1",
            "submitted_date": "2018-12-03T11:56:35-05:00",
            "title": "Improved bounds for box dimensions of potential singular points to the Navier--Stokes equations",
            "version": 1
        },
        {
            "announced_date_first": "2018-12",
            "canonical": "https://api.arxiv.org/abs/1812.00793v1",
            "comments": "55 pages. arXiv admin note: text overlap with arXiv:1710.02736",
            "href": "https://api.arxiv.org/1812.00793v1",
            "journal_ref": "Advances in Neural Information Processing Systems 31 (2018)",
            "paper_id": "1812.00793",
            "paper_id_v": "1812.00793v1",
            "submitted_date": "2018-11-29T14:27:33-05:00",
            "title": "Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition",
            "version": 1
        }
    ]
}
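To make the gap between requested and returned fields easy to check, here is a small sketch that compares the `include` values echoed in `metadata.query` with the keys present in `results`. The function name `missing_includes` is hypothetical, and the sample is a cut-down version of the response above.

```python
def missing_includes(response):
    """List requested `include` fields that appear in no returned record."""
    requested = [
        q["value"]
        for q in response["metadata"]["query"]
        if q["parameter"] == "include"
    ]
    returned = set()
    for record in response["results"]:
        returned.update(record)  # adds the record's keys
    return [name for name in requested if name not in returned]


# Cut-down version of the response above: same query, first record only.
sample_response = {
    "metadata": {
        "query": [
            {"parameter": "include", "value": name}
            for name in [
                "comments", "journal_ref", "submitted_date",
                "submitted_date_all", "announced_date_first",
                "updated_date", "modified_date",
            ]
        ]
    },
    "results": [
        {
            "announced_date_first": "2018-12",
            "comments": "",
            "journal_ref": "",
            "paper_id": "1812.00974",
            "submitted_date": "2018-12-03T13:49:29-05:00",
            "title": "Online Graph-Adaptive Learning with Scalability and Privacy",
            "version": 1,
        }
    ],
}

missing = missing_includes(sample_response)
print(missing)
```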
@erickpeirson erickpeirson added api Tickets specifically about the metadata/query API backed by search bug enhancement labels May 2, 2019
@erickpeirson
Contributor

Thanks @mark-donoghue, this is excellent. We'll work through this feedback and update this thread as we do.
