
How can I get the Model-based / TinyLlama (1.1B) / Always Retrieval match of 28.2? #1

Open
HNUZCC opened this issue Oct 9, 2024 · 1 comment


HNUZCC commented Oct 9, 2024

I ran bash run_lm.sh, and the output shows:
======= estimate no retrieval (q) API cost: 0.017889500000000003, total tokens #: 35779 ================
======= estimate always retrieval (q+context) API cost: 0.892045, total tokens #: 1784090 ================
======= total retrieval: [2785/2785] ================

{'data_source': 'retrievalqa', 'total_data_count': 2785, 'retrieval_frequency': 2785, 'retrieval_rate': 100.0, 'match_score': 59.9, 'f1_score': 15.2, 'em_score': 0.1, 'accuracy_score': 34.3, 'match_total': 1667, 'f1_total': 424.5294026557026, 'em_total': 4.0, 'accuracy_total': 954.0, 'total_q_tokens': 35779, 'total_context_tokens': 1748311, 'total_no_retrieval_tokens': 35779, 'total_always_retrieval_tokens': 1748311, 'estimate_no_retrieval_cost': 0.017889500000000003, 'estimate_always_retrieval_cost': 0.892045, 'saved_cost_rate': 0.9799455184435762, 'args': {'openai_config_path': './openai_config.txt', 'data_source': 'retrievalqa', 'retrieval_mode': 'always_retrieval', 'input_data_path': './data/retrievalqa.jsonl', 'output_score_path': './results/always_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0/score_retrievalqa_seed20.json', 'output_prediction_path': './results/always_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0/predict_retrievalqa_seed20.jsonl', 'model_name': 'TinyLlama/TinyLlama-1.1B-Chat-v1.0', 'max_tokens': 100, 'batch_size': 1, 'doc_top_n': 5, 'limit_input': 0, 'prompt_method': 'vanilla', 'seed': 20, 'temperature': 0.0, 'top_p': 1.0, 'world_size': 1}}
./results/always_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0
./results/always_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0

However, the paper reports that the Model-based TinyLlama (1.1B) Always Retrieval match score is 28.2. What does "match" mean? My reproduced result seems inconsistent with it. Is this a misunderstanding on my part, or an error in how I ran the experiment?

ZhangzihanGit (Collaborator) commented Nov 2, 2024

Hi HNUZCC,

I'm sorry for not getting back to you sooner. I just saw this message.

In your experiment, you ran on the full dataset, which contains 2,785 instances: 1,271 labelled as requiring retrieval and 1,514 labelled as not requiring retrieval. We present these results in Table 10 in the Appendix (see Appendix A.6). In contrast, the 28.2 Always Retrieval match score for Model-based TinyLlama (1.1B) in Table 1 was evaluated only on the 1,271 questions that require retrieval.
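If it helps, here is a minimal sketch (not the repo's actual code) of how you might split the data so you can re-score only the retrieval-required subset. The field name `param_knowledge_answerable` and the convention that 0 means retrieval is needed are assumptions; please check the actual schema in data/retrievalqa.jsonl.

```python
import json

def split_by_retrieval_need(path):
    """Split a JSONL dataset into (needs retrieval, answerable without retrieval).

    Assumes each line is a JSON object with a "param_knowledge_answerable"
    field where 0 means the question requires retrieval. This field name is
    a guess about the dataset schema, not confirmed from the repo.
    """
    need, no_need = [], []
    with open(path) as f:
        for line in f:
            ex = json.loads(line)
            if ex.get("param_knowledge_answerable") == 0:
                need.append(ex)
            else:
                no_need.append(ex)
    return need, no_need
```

Scoring the `need` list alone (1,271 questions) should put you in the setting of Table 1, while the full file corresponds to Table 10.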

As stated in Section 3.2, unlike strict matching, the match score measures whether a gold answer is included in the model prediction. For example, if "Canada" is the gold answer and the model prediction is "The answer is Canada", then the match score is 1, but the strict (exact) match score is 0.
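The distinction can be sketched as follows (illustrative only; the repo's actual scoring code and its text normalization may differ):

```python
import string

def normalize(text):
    # Lowercase, strip punctuation, and collapse whitespace before comparing.
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def match_score(prediction, gold_answers):
    # "Match": 1 if any gold answer appears as a substring of the prediction.
    pred = normalize(prediction)
    return int(any(normalize(g) in pred for g in gold_answers))

def exact_match_score(prediction, gold_answers):
    # Strict matching: the prediction must equal a gold answer after normalization.
    pred = normalize(prediction)
    return int(any(normalize(g) == pred for g in gold_answers))

print(match_score("The answer is Canada", ["Canada"]))        # 1
print(exact_match_score("The answer is Canada", ["Canada"]))  # 0
```

This is also why your reported em_score (0.1) is so much lower than your match_score (59.9): a chatty model rarely outputs the bare gold answer, but often contains it.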
