Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pht: Fix wrong output from Prefix::toString #183

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

sim590
Copy link
Contributor

@sim590 sim590 commented Apr 3, 2017

dht::indexation::blobToString had been written from older code. Therefore, minor refactoring errors led to wrong output from the function. This PR fixes those errors and also limits the output of Prefix::toString to the bits of Prefix::content_. An additional method for bits of Prefix::flags_ is now provided as Prefix::flagsToString.

toString function should only output the string representation of the prefix in
bits (0s and 1s). More information is confusing and not useful as it requires to
parse the string.
Copy link
Member

@aberaud aberaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, just a few minor things

*/
static std::string blobToString(const Blob &bl) {
static std::string blobToString(const Blob &bl, size_t nbr = 0) {
Copy link
Member

@aberaud aberaud Apr 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a method that returns a string representation of n first bits of a Blob.

  • method name is not accurate. It doesn't convert a Blob to a string.
  • nbr should not be optional (we manipulate bits, not bytes), allowing to simplify line 37. One could argue nbr is not a good argument name (type size_t already tells it's a number).
  • optional, at your preference: might be private, non-static in Prefix, using size_ directly (use and design are specific to Prefix).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method name is not accurate. It doesn't convert a Blob to a string.

May be firstBitsToString?

nbr should not be optional (we manipulate bits, not bytes), allowing to simplify line 37. One could argue nbr is not a good argument name (type size_t already tells it's a number).

I agree.

optional, at your preference: might be private, non-static in Prefix, using size_ directly (use and design are specific to Prefix).

If you look at the historic of the files pht.cpp and pht.h, you'll see that it was first the case (see 2d8fe50). Not much is said as to why it was changed. I guess that we found that the function was not depending on the state of the instance of Prefix and could apply to other objects directly like Blob. However, I think that we can limit ourselves to the Prefix class for our needs as of now, like you suggest.

A refactoring error led to blobToString outputing garbage. This patch fixes the
issue and enables output of a number of bits other than multiple of 8.
*/
static std::string blobToString(const Blob &bl) {
std::string Prefix::firstBitsToString(const Blob &bl, size_t len) const {
Copy link
Member

@aberaud aberaud Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you choose to make it a non-static method, you should remove the second argument and checks, and use size_ directly (otherwise shall remain static). Also compute len/8 out of loops like in previous implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aberaud: In the case of toString, the ouput is the bit string up to size_ bits since this is the most expected behavior. However, one could like to get the whole bit string of the prefix for debugging purposes. To be honest, this function is mostly being used in debugging purposes. That's why I left it this way.

Copy link
Collaborator

@kaldoran kaldoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok for me :)
But I would probably just move the BlobToString with the Blob definition in order to be able to print a Blob if needed ( without rewriting this function ).
Then just call BlobToString from firstBitsToString

@aberaud
Copy link
Member

aberaud commented Apr 4, 2017

Manipulating bits one-by-one is not common in IT, we usually manipulate bytes (groups of 8 bits). Prefix exists to represent a serie of bits and bitsToString is the method to print content or mask of a Prefix.
I suggest "bitsToString" because it doesn't print the "first" bits, it prints all the bits in the prefix. Using a Blob that might have additional bits for data storage is an implementation detail. Actually the signature of the static version could be something like bitsToString(uint8_t* data, size_t num_bits). Prefix manages storage and has the responsibility to makes sure that data is always long enough.
So bitsToString is not a generic method related to Blob but a specific method related to Prefix. For this reason I think it should be part of Prefix and not exposed elsewhere.

@sim590
Copy link
Contributor Author

sim590 commented Apr 4, 2017

Manipulating bits one-by-one is not common in IT, we usually manipulate bytes (groups of 8 bits). Prefix exists to represent a serie of bits and bitsToString is the method to print content or mask of a Prefix. I suggest "bitsToString" because it doesn't print the "first" bits, it prints all the bits in the prefix.

If you want to argue on this, I'll ask you then: "What is the data of which the Prefix class is the prefix of?" The answer is the Key under which is indexed an element. Then, in word theory, a word w of length n has w_0 as prefix, but also w_1, w_2, up to w_n = w where w_i is the first letters of the word w up to the ith. Therefore, you understand that an instance of the Prefix class doesn't represent only one prefix, but all the prefixes of a certain key. Limiting toString method to the "current" prefix represented (when size_ is considered at a given time) is fine as this is the most expected behavior. However, the private method which I name firstBitsToString could be useful to other extents than only yielding the bits up to size_. In other words, it can be extended to be used not only on the given state of the instance, but also on another state that it could extend to (by simply changing value of size_).

Using a Blob that might have additional bits for data storage is an implementation detail.

I don't think that it's an implementation "detail". It's an important information. It contains the whole key for which we yield all prefixes.

So bitsToString is not a generic method related to Blob but a specific method related to Prefix. For this reason I think it should be part of Prefix and not exposed elsewhere.

This is already the case in my last uploaded version.

Finally, keep in mind that another valuable information is part of Prefix for which firstBitsToString is required. I speak of flags_ member. Therefore, here's another reason why firstBitsToString should keep its signature.

Copy link
Member

@aberaud aberaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefix is the abstraction to represent a specific serie of bits, using Blob inside is an implementation detail, that's just the current design, not really up for interpretation. If you want to print a different prefix, then build this other Prefix and print it.

Anyway at this point I don't really care. The only actual issue to solve is to make sure that either the method is static or removes the len parameter.

@sim590
Copy link
Contributor Author

sim590 commented May 11, 2017

@kaldoran, @aberaud, I haven't forgotten about this PR. In fact, I think about it everyday. I'll jump back on this as soon as I can.

@sim590
Copy link
Contributor Author

sim590 commented Sep 5, 2019

I realize that this PR has been opened for a very long time and I have not found time to work on this again yet. Since I cannot say as of now when I will find the time to work on this again, I think that the best is to close this for consistency of the list of ongoing PRs. Let anyone be free to take this work and produce a mergeable version of it.

@sim590 sim590 closed this Sep 5, 2019
@aberaud aberaud reopened this Sep 5, 2019
@aberaud
Copy link
Member

aberaud commented Sep 5, 2019

@sim590 Thanks I'll follow up on this PR

@aberaud aberaud force-pushed the master branch 7 times, most recently from f05b60d to dfd0a09 Compare March 23, 2022 22:32
@aberaud aberaud force-pushed the master branch 14 times, most recently from 966e620 to fe6676f Compare April 4, 2022 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants