Incorrect error message raised when underlying error is different Error: <Numer> appears to be an invalid tax name. This could be due to unexpected special characters. #448

Open
corneliusroemer opened this issue Feb 5, 2025 · 9 comments
Labels
bug Something isn't working

Comments

@corneliusroemer

corneliusroemer commented Feb 5, 2025

Describe the bug
Using version 16.40.1, Linux x86_64 from conda-forge, I'm getting the error:

Error: 3048448 appears to be an invalid tax name.  This could be due to unexpected special characters.

Interestingly, I can't reproduce this locally (macOS, ARM).

Command run: datasets download virus genome taxon 3048448 --no-progressbar --filename results/ncbi_dataset.zip --api-key None

Also getting this for other taxon ids, e.g. Error: 3052518 appears to be an invalid tax name. This could be due to unexpected special characters.

@corneliusroemer corneliusroemer added the bug Something isn't working label Feb 5, 2025
@corneliusroemer
Author

Here's with debug logs:

POST /datasets/v2/taxonomy/taxon_suggest HTTP/1.1
Host: api.ncbi.nlm.nih.gov
User-Agent: OpenAPI-Generator/1.0.0/go
Content-Length: 130
Accept: application/json
Content-Type: application/json
Ncbi-Phid: 18B066ABAA0DE0B4CC70A25C
X-Datasets-Client: datasets-cli
X-Datasets-Client-Arch: amd64
X-Datasets-Client-Cmd: download virus genome taxon 3052518 --no-progressbar --filename results/ncbi_dataset.zip --api-key None --debug
X-Datasets-Client-Os: linux
X-Datasets-Client-Version: 16.40.1
Accept-Encoding: gzip
{"exact_match":true,"tax_rank_filter":"higher_taxon","taxon_query":"3052518","taxon_resource_filter":"TAXON_RESOURCE_FILTER_ALL"}
Error: 3052518 appears to be an invalid tax name.  This could be due to unexpected special characters.
Use datasets download virus genome taxon <command> --help for detailed help about a command.

We're only seeing this on one server. Is it possible that you're actually applying 429 rate limiting but somehow reporting it as "invalid tax name"?

@theosanderson

This does seem to be about an error message that doesn't always reflect the underlying situation.

Repro:

datasets download virus genome taxon 3048448 --gateway-url https://nonexistent
Error: 3048448 appears to be an invalid tax name.  This could be due to unexpected special characters.

@RobertFalk

Yes, that error is currently displayed whenever the taxon lookup fails to return anything, so it can occur for different problems, as you demonstrated. I'll create a ticket to improve the error message. Unfortunately, I was not able to reproduce the error using 16.40.1 from conda-forge on my machine, and I don't see any results in our logging system based on the debugging information you provided. Maybe it failed to connect entirely? I can't say for sure. If it's still occurring, could you provide that same debug information, along with the time it was run?

@theosanderson

(I'm on the same team as Cornelius.) We have now found that we can't even connect with curl api.ncbi.nlm.nih.gov from the server in question, so it doesn't seem to be a datasets-specific issue.

@RobertFalk

Thanks for the update - I created a ticket to improve that message so you'll better know what's going on if similar issues occur.

@corneliusroemer corneliusroemer changed the title from "Get Error: 3048448 appears to be an invalid tax name. This could be due to unexpected special characters. on some clients" to "Incorrect error message raised when underlying error is different Error: <Numer> appears to be an invalid tax name. This could be due to unexpected special characters." Feb 6, 2025
@corneliusroemer
Author

Thanks @RobertFalk! By the way, thanks for releasing the client code, it allowed us to look where this specific error message was raised:

return nil, false, fmt.Errorf("%s appears to be an invalid tax name. This could be due to unexpected special characters.", taxName)

@aboffin

aboffin commented Mar 12, 2025

Hi, I get a somewhat similar error:

When I try to download using an NCBI taxon ID, with or without the --api-key option, I get either:

Error: Download error: stream error: stream ID 7; INTERNAL_ERROR; received from peer
Use datasets download genome taxon <command> --help for detailed help about a command.

or

Error: <taxonid> appears to be an invalid tax name.  This could be due to unexpected special characters.
Use datasets download genome taxon <command> --help for detailed help about a command.

Any pointers on how to resolve this are appreciated, thanks!

@ericcox1
Collaborator

Hi @aboffin,

Thanks for your report. Could you please share the exact command that you used so that we can try to reproduce this?

Best,
Eric

@aboffin

aboffin commented Mar 12, 2025

@ericcox1 the exact command was:

datasets download genome taxon "160490" --dehydrated --include protein --filename s_pyogenes_m1.zip --no-progressbar


5 participants