[Question]: Unable to download files when fine-tuning the uie-base model #9833

Open
hymumu opened this issue Feb 9, 2025 · 9 comments


hymumu commented Feb 9, 2025

Please ask your question

Python environment:
Python 3.9.21
paddlenlp 2.8.1
paddlepaddle-gpu 2.6.2.post120

Operating system:
ubuntu 24.04

Fine-tuning the model:
I am using this file: https://github.com/PaddlePaddle/PaddleNLP/blob/v2.8.1/applications/information_extraction/text/finetune.py
python finetune.py \
    --device gpu \
    --logging_steps 10 \
    --save_steps 100 \
    --eval_steps 100 \
    --seed 1000 \
    --output_dir ./checkpoint/model_best \
    --train_path /home/ubuntu/uie-data/data/train.txt \
    --dev_path /home/ubuntu/uie-data/data/dev.txt \
    --max_seq_len 512 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 16 \
    --num_train_epochs 20 \
    --learning_rate 1e-5 \
    --do_train \
    --do_eval \
    --do_export \
    --export_model_dir ./checkpoint/model_best \
    --overwrite_output_dir \
    --disable_tqdm True \
    --metric_for_best_model eval_f1 \
    --load_best_model_at_end True \
    --save_total_limit 1

Running the command above fails while downloading a file.
Error message:
	  warnings.warn(
		[2025-02-09 14:32:34,028] [ WARNING] - evaluation_strategy reset to IntervalStrategy.STEPS for do_eval is True. you can also set evaluation_strategy='epoch'.
		[2025-02-09 14:32:34,028] [    INFO] - The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
		[2025-02-09 14:32:34,028] [    INFO] - ============================================================
		[2025-02-09 14:32:34,028] [    INFO] -      Model Configuration Arguments
		[2025-02-09 14:32:34,028] [    INFO] - paddle commit id              :8ce0de584c570589117e403322f3d1a0de6554e5
		[2025-02-09 14:32:34,029] [    INFO] - export_model_dir              :./checkpoint/model_best
		[2025-02-09 14:32:34,029] [    INFO] - model_name_or_path            :uie-base
		[2025-02-09 14:32:34,029] [    INFO] - multilingual                  :False
		[2025-02-09 14:32:34,029] [    INFO] -
		[2025-02-09 14:32:34,029] [    INFO] - ============================================================
		[2025-02-09 14:32:34,029] [    INFO] -       Data Configuration Arguments
		[2025-02-09 14:32:34,029] [    INFO] - paddle commit id              :8ce0de584c570589117e403322f3d1a0de6554e5
		[2025-02-09 14:32:34,030] [    INFO] - dev_path                      :/home/ubuntu/uie-data/data/dev.txt
		[2025-02-09 14:32:34,030] [    INFO] - dynamic_max_length            :None
		[2025-02-09 14:32:34,030] [    INFO] - max_seq_length                :512
		[2025-02-09 14:32:34,030] [    INFO] - train_path                    :/home/ubuntu/uie-data/data/train.txt
		[2025-02-09 14:32:34,030] [    INFO] -
		[2025-02-09 14:32:34,030] [ WARNING] - Process rank: -1, device: gpu, world_size: 1, distributed training: False, 16-bits training: False
		[2025-02-09 14:32:34,030] [    INFO] - We are using (<class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'>, False) to load 'uie-base'.
		Traceback (most recent call last):
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/common.py", line 597, in raise_for_status
		    response.raise_for_status()
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/requests/models.py", line 1024, in raise_for_status
		    raise HTTPError(http_error_msg, response=self)
		requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_base_zh_vocab.txt

		The above exception was the direct cause of the following exception:

		Traceback (most recent call last):
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/__init__.py", line 169, in resolve_file_path
		    cached_file = bos_download(
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/bos_download.py", line 238, in bos_download
		    http_get(
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/common.py", line 138, in http_get
		    r = _request_wrapper(
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/common.py", line 369, in _request_wrapper
		    raise_for_status(response)
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/common.py", line 601, in raise_for_status
		    raise EntryNotFoundError(message, None) from e
		huggingface_hub.errors.EntryNotFoundError: 404 Client Error.

		Entry Not Found for url: https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_base_zh_vocab.txt.

		During handling of the above exception, another exception occurred:

		Traceback (most recent call last):
		  File "/home/ubuntu/work/uie/information_extraction/text/finetune.py", line 243, in <module>
		    main()
		  File "/home/ubuntu/work/uie/information_extraction/text/finetune.py", line 129, in main
		    tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path)
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/transformers/auto/tokenizer.py", line 317, in from_pretrained
		    return actual_tokenizer_class.from_pretrained(
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/transformers/tokenizer_utils.py", line 709, in from_pretrained
		    tokenizer, tokenizer_config_file_dir = super().from_pretrained(pretrained_model_name_or_path, *args, **kwargs)
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/transformers/tokenizer_utils_base.py", line 1495, in from_pretrained
		    resolved_vocab_files[file_id] = resolve_file_path(
		  File "/data/miniconda3/envs/uie/lib/python3.9/site-packages/paddlenlp/utils/download/__init__.py", line 275, in resolve_file_path
		    raise EnvironmentError(f"Does not appear one of the {filenames} in {repo_id}.")
		OSError: Does not appear one of the ['https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/ernie_3.0_base_zh_vocab.txt'] in uie-base

How can I get this file to download?
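
For anyone triaging the same failure, the last line of the traceback already names the file that could not be fetched. Below is a minimal sketch that pulls the repo id and the file URLs out of that `OSError` message. The helper name `parse_missing_files` and the regular expression are assumptions made for illustration, not part of PaddleNLP. The pattern is only inferred from the message text shown in this thread.

```python
import re

# Matches the message format seen in the traceback above:
#   "Does not appear one of the ['<url>', ...] in <repo_id>"
# The format is inferred from this thread only; other versions may differ.
_MISSING = re.compile(
    r"Does not appear one of the \[(?P<files>.*?)\] in (?P<repo>\S+)"
)


def parse_missing_files(message):
    """Return (repo_id, [file_or_url, ...]), or None if the text does not match."""
    match = _MISSING.search(message)
    if match is None:
        return None
    # Each entry is printed as a single-quoted Python string.
    files = re.findall(r"'([^']+)'", match.group("files"))
    # Some reports end the sentence with a period; drop it from the repo id.
    repo = match.group("repo").rstrip(".")
    return repo, files


line = (
    "OSError: Does not appear one of the "
    "['https://bj.bcebos.com/paddlenlp/models/transformers/ernie_3.0/"
    "ernie_3.0_base_zh_vocab.txt'] in uie-base"
)
print(parse_missing_files(line))
```

Pasting the extracted URL into a report makes it easier for maintainers to see exactly which file is missing.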

hymumu added the question label (Further information is requested) on Feb 9, 2025
BeginningOne commented

There is an error in downloading the pre-trained model from the official website

Image


zlyir commented Feb 10, 2025

All files under https://bj.bcebos.com/paddlenlp/models/transformers/* are no longer available. Could someone from the official team take a look?
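
To check which of the URLs reported in this thread are affected, one option is to probe each of them with an HTTP HEAD request. The sketch below uses only the Python standard library. The function names `head_status` and `find_unavailable` are hypothetical, and it assumes the server answers HEAD requests the same way it answers GET. Connection-level failures such as DNS errors or timeouts are not handled and will raise `urllib.error.URLError`.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is still useful.
        return err.code


def find_unavailable(urls, probe=head_status):
    """Return the URLs for which the probe does not report HTTP 200."""
    return [url for url in urls if probe(url) != 200]
```

Calling `find_unavailable([...])` with the URLs quoted in the error messages here would return the ones that do not answer with status 200. The `probe` parameter exists only so the filtering logic can be exercised without network access.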


ttangcc commented Feb 10, 2025

Same problem here.

environment:
paddlenlp 3.0.0b3
paddlepaddle-cuda 11.8

error message:
Traceback (most recent call last):
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\site-packages\paddlenlp\taskflow\utils.py", line 126, in dygraph_mode_guard
    yield
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\site-packages\paddlenlp\taskflow\task.py", line 341, in _get_inference_model
    self._construct_model(self.model)
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\site-packages\paddlenlp\taskflow\text_correction.py", line 121, in _construct_model
    ernie = ErnieModel.from_pretrained(TASK_MODEL_MAP[model])
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\site-packages\paddlenlp\transformers\model_utils.py", line 2424, in from_pretrained
    resolved_archive_file, resolved_sharded_files, sharded_metadata, is_sharded = cls._resolve_model_file_path(
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\site-packages\paddlenlp\transformers\model_utils.py", line 1807, in _resolve_model_file_path
    resolved_archive_file = resolve_file_path(
  File "D:\software\dev-env\conda\envs\paddle_csc\lib\site-packages\paddlenlp\utils\download\__init__.py", line 275, in resolve_file_path
    raise EnvironmentError(f"Does not appear one of the {filenames} in {repo_id}.")
OSError: Does not appear one of the ['https://bj.bcebos.com/paddlenlp/models/transformers/ernie/ernie_v1_chn_base.pdparams'] in ernie-1.0.

DrownFish19 (Collaborator) commented Feb 11, 2025

@hymumu @BeginningOne

The pre-trained models related to information extraction are now available for download. Please try again.

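ooooye21 commented

> @hymumu @BeginningOne
>
> The pre-trained models related to information extraction are now available for download. Please try again.

Image
Hello, sorry to bother you, but this file still cannot be downloaded at the moment: https://paddlenlp.bj.bcebos.com/models/transformers/skep/skep_ernie_1.0_large_ch.pdparams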

DrownFish19 (Collaborator) commented

> Hello, sorry to bother you, but this file still cannot be downloaded at the moment: https://paddlenlp.bj.bcebos.com/models/transformers/skep/skep_ernie_1.0_large_ch.pdparams

It can be downloaded now. I have already replied in the corresponding issue.


MattixLee commented Feb 12, 2025

@DrownFish19 Hello, I am getting "Entry Not Found for url: https://bj.bcebos.com/paddlenlp/models/transformers/unified_transformer/plato-mini.pdparams".
This file cannot be downloaded. Could you please take a look? Thank you!
Best regards!

DrownFish19 (Collaborator) commented

> @DrownFish19 Hello, I am getting "Entry Not Found for url: https://bj.bcebos.com/paddlenlp/models/transformers/unified_transformer/plato-mini.pdparams". This file cannot be downloaded. Could you please take a look? Thank you! Best regards!

done

MattixLee commented

> done

It can be downloaded now, thank you!
Best regards
