Background, ref #1850
On MS MARCO v2, in addition to these previous corpora:
@MXueguang also created these:
The first set encodes only the segment; the second prepends the doc title to the segment.
We wanted to do the same with V1. These are the existing corpora:
And these are the new ones @MXueguang prepared:
However, we later discovered:

1. msmarco-passage-unicoil-noexp.tar - appears to be just segment encoded
2. msmarco-passage-unicoil.tar - actually appears to be title/segment encoded
3. msmarco-passage-unicoil-noexp-v2.tar - title/segment encoded
4. msmarco-passage-unicoil-v2.tar - title/segment encoded

So, it appears that (2) and (4) are the same condition.

Experimental results check out - (3) is more effective than (1), but (4) ~ (2), differences due to noise.

Here's my plan - let's replace (1) with (3) - i.e., just overwrite, but retain (2); so we'll discard (4). Thus, we will not be creating new YAML files for regressions.

- msmarco-passage-unicoil: remains exactly the same
- msmarco-passage-unicoil-noexp: updated to title/segment encoding
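
For anyone double-checking the reasoning, here is a rough sketch of the comparison in Python. It only restates the four tarballs and the encodings listed in this issue. The "expansion" flag is an assumption inferred from the `noexp` suffix in each name, and the helper names are made up for illustration.

```python
# Illustrative only: names and encodings are copied from the list in this issue.
# Assumption: a tarball uses document expansion unless its name contains "noexp".
corpora = {
    1: ("msmarco-passage-unicoil-noexp.tar", "segment"),
    2: ("msmarco-passage-unicoil.tar", "title/segment"),
    3: ("msmarco-passage-unicoil-noexp-v2.tar", "title/segment"),
    4: ("msmarco-passage-unicoil-v2.tar", "title/segment"),
}


def condition(name, encoding):
    """Return the (uses_expansion, encoding) pair describing a corpus."""
    return ("noexp" not in name, encoding)


def same_condition_pairs(corpora):
    """Return pairs of corpus numbers that share the same condition."""
    first_seen = {}
    pairs = []
    for number in sorted(corpora):
        key = condition(*corpora[number])
        if key in first_seen:
            pairs.append((first_seen[key], number))
        else:
            first_seen[key] = number
    return pairs


print(same_condition_pairs(corpora))
```

Under that assumption, the only pair sharing a condition is (2) and (4), which is why (4) can be discarded while (3) overwrites (1).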