I just ran the code below and found that the number of examples that need to be evaluated by the LLM does not match what is reported in your paper.
```python
import json

rule_based_source = ["E2E", "WIKIEVENTS", "CONLL2003",
                     "text_editing", "cnn_dailymail", "xsum", "samsum", "gigaword", "arxiv",
                     "BBH_logical", "BBH_time", "self_made_space", "gsm_8k"]

for type in ["content", "situation", "format", "example", "mixed"]:
    data = json.load(open(f"./data/{type}_constraints.json"))
    rule, llm = 0, 0
    for d in data:
        level = d["level"]
        if level == 0:  # level-0 examples carry no constraint and are skipped
            continue
        source = d["source"]
        if source in rule_based_source:  # evaluated by rule-based checks
            rule += 1
        else:                            # evaluated by the LLM
            llm += 1
    print(f"type: {type}, rule: {rule}, llm: {llm}")
```
Is there something I am misunderstanding, or is there an inconsistency between the paper and the code?
Thanks for your reply.
I have also checked my gpt4_discriminative_eval_input and found that the numbers of examples that need to be evaluated by LLMs are (counted with the sketch below):
content: 65 | mixed: 45 | format: 140 | situation: 70
but your paper just reports:
content: 50 | mixed: 10 | format: 120 | situation: 55
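For reference, this is roughly how I obtained my counts. It is only a minimal sketch: the directory name, per-type file names, and the assumption that each file holds one JSON list of evaluation examples are guesses on my part, so the paths/keys may need to be adjusted to the actual repo layout.

```python
import json

# Hypothetical cross-check of the LLM-evaluated example counts.
# Assumption: gpt4_discriminative_eval_input stores one JSON list per
# constraint type under the file names used below.
for constraint_type in ["content", "situation", "format", "mixed"]:
    path = f"./gpt4_discriminative_eval_input/{constraint_type}_constraints.json"
    with open(path) as f:
        examples = json.load(f)
    print(f"{constraint_type}: {len(examples)} examples sent to the LLM evaluator")
```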
I am confused by this discrepancy and would appreciate your help. Thank you so much.