-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Represent problem data in atomspace #8
Represent problem data in atomspace #8
Conversation
@@ -458,11 +473,13 @@ FIND_PACKAGE(Doxygen) | |||
ADD_SUBDIRECTORY(doc EXCLUDE_FROM_ALL) | |||
|
|||
# Show a summary of what we got | |||
SUMMARY_ADD("AtomSpace" "A weighted and typed hypergraph database" HAVE_ATOMSPACE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to add that since AtomSpace is a requirement anyway.
if(_tpp.use_atomese) | ||
read_atomese_table(pms); | ||
else | ||
read_combo_table(pms); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a mixture of spaces and tabs here. You want to respect the same indentation format within the same file. So you may either use whitespaces or reformat the entire table-problems.cc to use tabs, it's up to you (as long as such reformatting takes place in a different commit, within that PR, it's OK).
class atomese_reprUTest : public CxxTest::TestSuite | ||
{ | ||
private: | ||
AtomSpace *as; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All your new code is tab indented, why not have this one tab indented too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, it's from @masrb but still.
{ | ||
eval->eval( | ||
"(load-from-path \"tests/comboreduct/atomese_representation/real_data_result.scm\")"); | ||
Handle expected = eval->eval_h("(cog-execute! real_data_repr)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to execute real_data_repr
as it's supposed to have already been executed at loading time.
{ | ||
eval->eval( | ||
"(load-from-path \"tests/comboreduct/atomese_representation/boolean_data_result.scm\")"); | ||
Handle expected = eval->eval_h("(cog-execute! boolean_data_repr)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark about executing.
* @param repr | ||
* @return | ||
*/ | ||
Handle& get_atomese_representation(std::istream& in, Handle& repr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any benefit in having the Handle both returned and taken as argument. Usually streaming functions do return the stream as well as taking it in argument in order to chain streaming, but seeing its usage I don't even think you would need to chain streaming. So basically you I suggest either of the following signatures
Handle get_atomese_representation(std::istream& in)
std::istream& get_atomese_representation(std::istream& in, Handle& repr)
The first one because it is simpler, and the second one because it allows, will you ever need it, to chain streaming. I probably go with 1 (cause I don't see chaining streaming in the future since csv info usually live in a standalone file) but it's up to you to evaluate which one is best.
type_tree tt; | ||
bool has_header, is_sparse; | ||
const string& target_feature = ""; | ||
const string& timestamp_feature = ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it's actually valid C++, what happens is that ""
is converted to a string before initializing the string reference, as explained here http://en.cppreference.com/w/cpp/language/reference_initialization. However I don't see what benefits it brings, you might just as well remove &
, wouldn't you?
|
||
// TODO: find a better way of finding the domain (boolean, real). | ||
// For instance, the inputs could be contins where as the outputs boolean. | ||
type_node otype = get_type_node(get_signature_output(tt)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you note it is a temporary shortcoming, but it's OK for now.
using namespace std; | ||
using namespace combo; | ||
|
||
Handle& create_repr(const vector<string>& labels, HandleSeq rows, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more efficient to pass rows
as a const ref, like labels
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, same remark as for get_atomese_representation
, why not just return a Handle
instead of taking the output Handle as ref argument?
repr = createLink(SIMILARITY_LINK, | ||
parse_header(labels), | ||
createLink(rows, SET_LINK)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and repr
is not returned.
} | ||
|
||
// TODO: Add other i/o varieties here. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repr
is not returned either.
@Yidnekachew I understand that in the long run it is preferable to load data directly from a file to Atomese. However for now it might be simpler to load the data first into a table or ctable (or whatever is at your disposal) and then convert it into Atomese. For instance your loader doesn't support optional header, and maybe other things that I am overseeing. The other option, perhaps the best one, would be to refactor the table loaders so you can reuse as much as possible from that (as you've done for instance with |
@Yidnekachew I think it's better if you close this PR and create a new one, as there is enough changes to apply, and there's a conflict (I know github may resolve that at the press of button but it's still introduces some junk in the git history). |
Yes, so what I mean, is that you would rebase your branch onto the master, then remove your commits via a soft reset (to not loose the changes), then apply your modifications, and then create cohesive commits possible by selecting hunks of code rather whole files (if necessary). |
@ngeiswei Thanks for the comments! I'm closing this PR now. |
Implemented as per #3