Bad @DATA instance format: cannot handle data instances that end with a comma #101

Open
Anjin-Liu opened this issue Jan 6, 2020 · 2 comments

@Anjin-Liu

Hi,
Happy new year!

I recently used Weka to generate some .arff files.
The files look like this:

@relation 'SEA'

@Attribute attrib1 numeric
@Attribute attrib2 numeric
@Attribute attrib3 numeric
@Attribute class {groupA,groupB}

@DaTa

7.30967787376657,2.4053641567148585,6.374174253501082,groupB,
1.1700660880722513,7.815346320453048,2.5277616657598587,groupB,
9.84841540199809,8.791825178724801,9.412491794821143,groupB,
3.1293596519376554,3.6797575871052812,7.051747444754559,groupA,

Each row ends with a comma.
Weka reads these files correctly, but liac-arff cannot load them.
liac-arff reports:
"Bad @DaTa instance format in line 10: 7.30967787376657,2.4053641567148585,6.374174253501082,groupB,"

After removing the trailing commas, the files load fine.

I think this may be an inconsistency with Weka, so I am submitting this issue.
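
For anyone hitting the same error, here is a minimal workaround sketch in Python. It assumes the only problem is a trailing comma on each data row. The helper name `strip_trailing_commas` is made up for illustration and is not part of liac-arff. It leaves the header untouched and only edits lines after the `@DATA` marker.

```python
import re

# Matches a comma (plus optional whitespace) at the very end of a line.
_TRAILING_COMMA = re.compile(r",\s*$")


def strip_trailing_commas(arff_text: str) -> str:
    """Return ARFF text with trailing commas removed from data rows.

    Only lines after the @DATA marker are changed; header lines,
    blank lines, and % comment lines pass through unchanged.
    """
    cleaned = []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not in_data:
            # The @DATA keyword is case-insensitive (the file above uses "@DaTa").
            if stripped.lower() == "@data":
                in_data = True
            cleaned.append(line)
        elif stripped and not stripped.startswith("%"):
            cleaned.append(_TRAILING_COMMA.sub("", line))
        else:
            cleaned.append(line)
    return "\n".join(cleaned) + "\n"


# Shortened version of the file from this issue (values truncated for brevity).
sample = """@relation 'SEA'

@Attribute attrib1 numeric
@Attribute class {groupA,groupB}

@DaTa

7.3,groupB,
1.1,groupA,
"""

cleaned = strip_trailing_commas(sample)
```

The cleaned text could then be passed to liac-arff, for example with `arff.loads(cleaned)`. This is only a preprocessing step on the caller's side and does not change how liac-arff itself parses rows.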

@jnothman
Contributor

jnothman commented Jan 7, 2020

Can you give more information on how you generated this? It appears to contradict the specs.

@Anjin-Liu
Author

Hi jnothman,

Sorry for the late reply.
Actually, I used MOA, the software for machine learning on data streams (https://moa.cms.waikato.ac.nz/), to generate the ARFF files.

I used the SEAGenerator:
SEAGenerator seaG1 = new SEAGenerator();
seaG1.nextInstance().getData().toString();

The trailing comma can easily be removed by modifying the generated instance string.
My main concern is that Weka can load ARFF files with a comma at the end of each instance, but liac-arff cannot.
This is not a big issue,
but I think liac-arff should be able to load such files the same way Weka does.

Best,
