Skip to content

Path.hasSameTextualContentAs #375

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
robstoll opened this issue Feb 18, 2020 · 21 comments · Fixed by #458
Closed

Path.hasSameTextualContentAs #375

robstoll opened this issue Feb 18, 2020 · 21 comments · Fixed by #458
Assignees
Milestone

Comments

@robstoll
Copy link
Owner

robstoll commented Feb 18, 2020

Platform (jvm, js, android): jvm
Extension (none, kotlin 1.3, jdk8): jdk8

Code related feature

expect(path).hasSameTextualContentAs(otherPath)

per default UTF-8 should be used when reading both files.
However, one shall be able to overwrite the default, something like:

expect(path).hasSameContentAs(otherPath, sourceCharset=Charsets.ISO_8859_1, targetCharset=Charsets.UTF_16)

Bonus add also hasSameBinaryContentAs

@jGleitz
Copy link
Contributor

jGleitz commented Feb 18, 2020

should we rather call this hasSameStringContent for better clarity?

@robstoll
Copy link
Owner Author

AssertJ has hasSameTextualContentAs (they actually deprecated hasSameContentAs in favour of it) but IMO it is kind of the default case that one would want to compare textual content and thus would prefer a shorter version. But I am also not against using hasSameTextualContentAs if you think this is really necessary

@jGleitz
Copy link
Contributor

jGleitz commented Feb 18, 2020

What should be considered the default depends on what those assertions do. I expected hasSameStringContent to fail if the content cannot be parsed as a string, even if the two contents are identical. In that case, I think we should point it out in the name of the function.

Alternatively, we could also just implement hasSameContentAs, which does binary comparison, and tries to parse the content as UTF8 in reporting. In that case, we would not even need two different assertions. The function could, of course, still take an optional argument to change the encoding.

@robstoll
Copy link
Owner Author

robstoll commented Feb 18, 2020

I expected hasSameStringContent to fail if the content cannot be parsed as a string

You lost me there, I never encountered a problem reading a binary file even if the resulting text is gibberish. I also think that doing only binary comparison would not work if the source and target file is stored in a different encoding.

We could implement a heuristic à la grep reading the first 1024 and search for \0 to determine whether the file is binary or not but I think we are better of with hasSameTextualContentAs in this case

@jGleitz
Copy link
Contributor

jGleitz commented Feb 18, 2020

You lost me there

Not all sequences of bits are valid code points in all encodings. For example, in UTF-8, code points always start with 00 or 11, if I remember correctly.

I also think that doing only binary comparison would not work if the source and target file is stored in a different encoding.

Oh, wow, okay, you also want to support comparison between different encodings? I missed that (but now I see it clearly from your proposal)

We could implement a heuristic…

I only suggested using a heuristic in reporting, not in the actual assertion logic. But maybe that is still too complicated.

I personally prefer precision over conciseness, but my opinion is not very strong on this. I would prefer hasSameTextualContentAs, but I would also accept hasSameContentAs.

@robstoll
Copy link
Owner Author

So, let's go with hasSameTextualContentAs then

@robstoll robstoll changed the title Path.hasSameContentAs Path.hasSameTextualContentAs Feb 18, 2020
@tkech17
Copy link
Contributor

tkech17 commented Apr 14, 2020

where has to be added this method?

@robstoll
Copy link
Owner Author

atrium/apis/fluent-en_GB/extensions/jdk8/atrium-api-fluent-en_GB-jdk8-jvm/src/main/kotlin/ch/tutteli/atrium/api/fluent/en_GB/jdk8/pathAssertions.kt

@tkech17
Copy link
Contributor

tkech17 commented Apr 14, 2020

so i have to write method with 3 arguments, where last two are optionals with value UTF-8. is that right?

@tkech17
Copy link
Contributor

tkech17 commented Apr 14, 2020

and one simple hasSameBinaryContentAs method

@robstoll
Copy link
Owner Author

You got it right, let me know in case you need pointers.

@tkech17
Copy link
Contributor

tkech17 commented Apr 14, 2020

should i write @SInCE [version] in doc? or not?

@robstoll
Copy link
Owner Author

Yes please, we started to add the tag since v. 0.9.0 to everything which is new and public API

@tkech17
Copy link
Contributor

tkech17 commented Apr 14, 2020

should i add this method for infix part, too?

@robstoll
Copy link
Owner Author

I have not yet thought about how the API should look like, we cannot have default arguments in the infix API. Therefore I would say no unless you have an idea at hand.

@tkech17
Copy link
Contributor

tkech17 commented Apr 14, 2020

maybe automatically determining encoding, but don't think that there is any library that can guess the encoding 100% correctly.
the reason i was asking was that (ch.tutteli.atrium.specs.integration.PathAssertionsSpec)
is parent of both fluent and inflix PathAssertionsSpec classes.
and if i add (ch.tutteli.atrium.specs.integration.PathAssertionsSpec) these two method(from this issue), that inflix part of it should have these methods, too.
what should i do here?

@robstoll
Copy link
Owner Author

Workaround it similar to this one apis/cc-infix-en_GB/atrium-api-cc-infix-en_GB-jvm/src/test/kotlin/ch/tutteli/atrium/api/cc/infix/en_GB/CharSequenceAssertionsSpec.kt -> getContainsDefaultTranslationOfPair

use a text something like: "not supported in this API - hasSameTextualContentAs" to ...

@robstoll
Copy link
Owner Author

alternatively you can implement the following API for infix:

expect(path) hasSameTextualContentAs path
expect(path) hasSameTextualContentAs withEncoding(path, sourceEncoding=..., targetEncoding=...)

where withEncoding creates a PathWithEncoding similar to PathWithCreator

@jGleitz
Copy link
Contributor

jGleitz commented Apr 15, 2020

Could we do something like

expect(path) hasSameTextualContentAs otherPath withEncoding (UTF_8 to UTF_16)

That would feel more intuitive to me and I think it should be possible with infix?

@robstoll
Copy link
Owner Author

how would the version look like where one omits the encodings?

expect(path) hasSameTextualContentAs otherPath

would not work any more. IMO not providing encodings is the default case and should be simple, simpler than the above is probably not possible.
Also, I think it should fit into the existing infix API. For instance, resolve is:

expect(path) resolve otherPath
expect(path) resolve path(otherPath) { ... }

in this sense I would continue with helper functions dealing with use cases where there are more than 1 parameter.
But I am curious what idea you have for the default case

@jGleitz
Copy link
Contributor

jGleitz commented Apr 15, 2020

I cannot speak to consistency of the API since I don’t know the API well. I generally don’t see the point of an infix API if we don’t try to make it “as infix-ish as possible”.

To how I would realise this: I didn’t think my proposal through, I just liked it. I did not think of the problem that the hasSameTextualContentAs matcher doesn’t “know” that another method call will follow to specify the encoding. So I have no solution 🤷

Another, possible, solution, would be something like

expect(path) withEncoding UTF_8 hasSameTextualContentAs (otherPath withEncoding UTF_16)

But I am not sure that that is better.

tkech17 pushed a commit to tkech17/atrium that referenced this issue Apr 20, 2020
…ion.PathAssertionsSpec tests is what you intentded, and please also check ch.tutteli.atrium.api.infix.en_GB.jdk8.PathAssertionsSpec.
tkech17 pushed a commit to tkech17/atrium that referenced this issue Apr 22, 2020
@robstoll robstoll added this to the 0.12.0 milestone Apr 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants