References directories to compare apples and apples #53

Open
jnothman opened this issue Jun 23, 2014 · 4 comments · May be fixed by #57

Comments

@jnothman
Member

I propose that under references/ we divide the system outputs into directories representing the different task settings:

  • references/gold-mentions: the system attempted to link all (including NILs) gold mentions (?schwa-linkable)
  • references/gold-linked-mentions: the system attempted to link only gold linked mentions (aida, houlsby)
  • references/system-mentions: the system identified its own mentions (schwa, tagme)

There's still the potential for the entries in the directories not to be altogether comparable with one another. For example, we could subdivide system-mentions into those that generate NEs only (schwa), and those that include other wikilinks (tagme); we could subdivide gold-mentions according to whether the system had access to CoNLL 2003 type annotations (although this may be harder to infer).

There is also the question of whether the directory structure should similarly be utilised to label (a) the corpus being evaluated (e.g. CoNLL vs ?IITB; testa vs testb), and (b) the ID mapping.
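The layout proposed above could be sketched as follows. This is only an illustration: the `reference_path` helper, the placement of an optional corpus level between the setting and the system, and the `conll-testb` label are assumptions, not something the repository defines. The setting and system names are taken from the list above.

```python
from pathlib import PurePosixPath

# Task settings and example systems as listed in the proposal.
# "schwa-linkable" was marked with a "?" there, so treat it as tentative.
SETTINGS = {
    "gold-mentions": ["schwa-linkable"],
    "gold-linked-mentions": ["aida", "houlsby"],
    "system-mentions": ["schwa", "tagme"],
}


def reference_path(system, root="references", corpus=None):
    """Return the directory a system's output would live in (hypothetical helper)."""
    for setting, systems in SETTINGS.items():
        if system in systems:
            parts = [root, setting]
            if corpus is not None:
                # Assumed position for a corpus label; the issue leaves this open.
                parts.append(corpus)
            parts.append(system)
            return PurePosixPath(*parts)
    raise KeyError(f"no task setting recorded for system {system!r}")


print(reference_path("aida"))
print(reference_path("tagme", corpus="conll-testb"))
```

Keeping the setting as the first level means outputs that sit side by side in one directory were produced under the same mention-detection conditions, which is the comparability this issue is after.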

@benhachey
Contributor

Also:

  • references/gold-linked-aidacandidates: Same as references/gold-linked-mentions, but uses aida_means.tsv.bz2 for candidate generation. I.e., the precise Hoffart et al. (2011) task setting.

@jnothman
Member Author

I still don't see the difference between that and the setting where a
system's input is those mentions in the gold that are linked... assuming
this version of the gold, which for now is all we have.

On 23 June 2014 21:27, Ben Hachey wrote:

Also:

  • references/gold-linked-aidacandidates: Same as
    references/gold-linked-mentions, uses YAGO means/label relationships
    for candidate generation. I.e., the precise Hoffart et al. (2011) task
    setting.


@wejradford
Contributor

I agree with the first structure points.

I think we keep the means dataset, as the goal is to demystify the evaluation (and its knobs and levers).

There is also the question of whether the directory structure should similarly be utilised to label (a) the corpus being evaluated (e.g. CoNLL vs ?IITB; testa vs testb), and (b) the ID mapping.

I favour putting in conll or similar, but am not sure about ID mappings. They're nice regression test fodder, but we shouldn't really need them, as a user can run the appropriate commands to generate them.

@jnothman linked a pull request on Jun 24, 2014 that will close this issue

@benhachey
Contributor

@jnothman - The difference is in the candidates (not the mentions).

On Tue, Jun 24, 2014 at 2:34 PM, jnothman wrote:

I still don't see the difference between that and the setting where a
system's input is those mentions in the gold that are linked... assuming
this version of the gold, which for now is all we have.
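One reading of the distinction, sketched with toy data: both settings receive the same gold linked mentions, but in the aidacandidates setting the linker may only choose among entities that a means-style dictionary lists for the mention string. The `candidates` function, the entity names, and the dictionary contents below are hypothetical and are not drawn from aida_means.tsv.bz2.

```python
def candidates(mention, means=None, all_entities=()):
    """Return the candidate entities for a mention string (hypothetical helper).

    With no dictionary, every known entity is a candidate; with a
    means-style dictionary, only the entities it lists for that mention are.
    """
    if means is None:
        return set(all_entities)
    return set(means.get(mention, ()))


# Toy knowledge base and toy dictionary, for illustration only.
all_entities = {"Paris", "Paris_Hilton", "Paris,_Texas", "London"}
means = {"Paris": {"Paris", "Paris,_Texas"}}

mention = "Paris"  # the same gold mention is the input in both settings
open_set = candidates(mention, all_entities=all_entities)
restricted = candidates(mention, means=means)

print(sorted(open_set))
print(sorted(restricted))
```

Under this reading, two systems given identical mentions can still face different disambiguation problems, because the candidate set bounds which answers are reachable.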
