
Grains: clarify what "total grains" means #1936

Closed
bhagerty opened this issue Jan 31, 2022 · 20 comments

Comments

@bhagerty

In the Grains exercise, the specification says code should show "the total number of grains on the chessboard." This is not entirely clear.

I read this to mean "the total number of grains on the chessboard, assuming it is full." And code returning this number passes the tests (at least in Python).

But from the community solutions, it's clear many people read this to mean "the total number of grains on the chessboard up to and including square X," where X is a number between 1 and 64.

I suggest clarifying the instructions to clearly reflect whether total grains means (1) the grains on a completed 64-square chess board, or (2) the grains up to and including an arbitrary square on the chess board.

@NobbZ
Member

NobbZ commented Feb 1, 2022

The exercise has 2 parts.

  1. Given a number from 1 through 64, calculate the grains on the given field, other fields not included.
  2. Called without arguments, return the total number of grains on the chessboard. Since we do not receive any argument here, we cannot calculate a "total for n fields covered" but have to decide between the 65 possible ways to fill the board, and putting grains on all 64 fields is the most obvious interpretation of "total". It is also the one expected by the canonical data (sketched below).

I'd be interested in seeing the solutions that calculate for an arbitrary count of filled fields.
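For illustration, here is a rough Python sketch of that two-part reading (the function names are illustrative, not taken from any particular track's stub):

def grains_on(square):
    # Part 1: grains on a single square, 1 through 64 (1, 2, 4, ... doubling each time).
    if not 1 <= square <= 64:
        raise ValueError("square must be between 1 and 64")
    return 2 ** (square - 1)

def total():
    # Part 2: total grains on the whole board, i.e. the sum over all 64 squares.
    return sum(grains_on(n) for n in range(1, 65))  # equals 2**64 - 1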

@siebenschlaefer
Contributor

@NobbZ I've seen a few of those solutions on the C++ track, where total() takes a parameter with a default value of 64 and the function then returns the sum of grains up to that field. (But that always felt like the student didn't like a function that returns a fixed value, or wanted to "up the game".)
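For context, that pattern looks roughly like this (a Python stand-in for the C++ solutions being described; the signature is an assumption, not quoted from any submission):

def total(last_square=64):
    # Sum of grains on squares 1 through last_square; defaults to the full board.
    return sum(2 ** (n - 1) for n in range(1, last_square + 1))

# total() still returns the canonical 2**64 - 1, while e.g. total(3) == 7.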

@ErikSchierboom
Member

There are 64 squares on a chessboard (where square 1 has one grain, square 2 has two grains, and so on).

Write code that shows:

how many grains were on a given square, and
the total number of grains on the chessboard

It might be that I just know how to interpret this, but to me this description states that:

  • Each square has a number of grains
  • There are 64 squares

I'm not sure that ", assuming it is full" adds anything, because of the first line implying that each square has grains.
We're of course open to improvements, I just don't think the suggested changes offer much value.
But others might disagree.

p.s. Taking a parameter for total does not offer much value, I think. For many languages, the point of the total property is that its value lies outside the range of a 32-bit integer.

@NobbZ
Member

NobbZ commented Feb 1, 2022

@siebenschlaefer Yes, I never liked that many tracks force you to write a function where more appropriate tools like a const would have been available, but that's the world we are living in…

@bhagerty
Author

bhagerty commented Feb 1, 2022

Thanks for all of these comments. To be clear, I am saying that there are, in fact—as shown by the reactions of other human beings—different ways to read the problem specification. I'm making a claim about the world, supported by evidence. People are, in fact, reading the problem statement differently, which shows that clarification would be helpful.

It seems like the commenters here read this problem statement as I do, to mean: Assuming every square is full, return the total number of grains. And I agree, that's the most-obvious reading.

But the evidence shows that this is not how everyone reads it. I saw many community solutions that reflected the second reading of the problem provided in my original comment—that is, return the total number of grains up to square X of an arbitrary number. And that is not a manifestly crazy reading of the problem. It's actually a reasonable, defensible inference from the fact that the first part of the statement says to calculate the number of grains in an arbitrary square. If I'm being asked to calculate grains in an arbitrary square, maybe I'm also being asked to calculate grains up to that arbitrary square, and maybe the problem writer was a little sloppy.

This may not be the most-obvious reading of the specification. Again, it's not how I read it. But it's how other people read it. So adding "assuming it is full" does add something: it clarifies something that, from the evidence, is not clear. It is hard for me to understand why anyone would object to a clarification that (1) real-world evidence shows is needed by some people, and (2) does not make the problem less clear.

In short, "It's clear to me" is not the same thing as "it is in fact clear," and it doesn't actually contradict evidence showing "it is not clear to other people."

@ErikSchierboom
Member

It is hard for me to understand why anyone would object to a clarification that (1) real-world evidence shows is needed by some people, and (2) does not make the problem less clear.

I don't object to the improvement, I'm just not entirely sure I like "assuming it is full". Maybe there's an alternative way to phrase this that makes it clearer?

It is hard for me to understand why anyone would object to a clarification that (1) real-world evidence shows is needed by some people

We likely have different experiences. I've not yet come across people making this mistake, but that may just be a coincidence.

@kotp
Member

kotp commented Feb 1, 2022

"Assuming it is full" also had a weird tinge to me. The fact is that it does not have to be full in order for it to have the total at the point of where that should be. Indeed the board that the grains are on may have very large squares. Indeed, the first 2 squares are unlikely to be full at all. The fact that the problem is a doubling of the grains at each location, and the statement that there are 64 locations, does not suggest that anything is full, only that there is a total.

Perhaps "total grains on the board" makes it clear, without inferring some capacity potential of the board upon which it is on.

@bhagerty
Author

bhagerty commented Feb 1, 2022

Thanks. I am not married to "assuming it is full." That was meant to convey the concept. Other options:

  • "the total number of grains on the chessboard after placing grains on every square."
  • "the total number of grains on the chessboard after grains have been placed on every square."
  • "the total number of grains on the chessboard, assuming every square has the required number of grains."
  • "the total number of grains on the chessboard once the process of filling squares is complete."

Other options are also possible. Again, I'm not married to "full," I just want to address the misunderstanding I saw reflected in community solutions to this problem.

@junedev
Member

junedev commented Feb 1, 2022

I am pro clarification and I like this one from above:

"the total number of grains on the chessboard after placing grains on every square."

@SleeplessByte
Member

I also am in favour of the clarification @junedev picked. Thanks @bhagerty!

@iHiD
Member

iHiD commented Feb 1, 2022

I am saying that there are, in fact—as shown by the reactions of other human beings—different ways to read the problem specification. I'm making a claim about the world, supported by evidence. People are, in fact, reading the problem statement differently, which shows that clarification would be helpful.

I try to stay out of these discussions, but I felt like such a strong statement needed addressing, as I don't believe it to be true. While I don't dispute that some people may be misled by this, I don't believe we have evidence to suggest that they are. We have a correlation between that statement in the instructions and a plethora of community solutions that take a param for total. But I do not believe we have any evidence that the README is the cause of those solutions.

Looking back in time, the version of the exercise that the top Python community solutions were solving actively required a parameter to be passed and the result to be calculated from it (this was changed here: https://github.com/exercism/python/pull/1794/files). This is much more likely to be why the community solutions look this way. Looking at the most recent community solutions for Python, it's extremely rare for people to require an argument.

I'd postulate that when people do require an argument, they're probably extending the exercise for themselves. That postulation may be wrong and they may indeed be confused, but I don't believe we have evidence to show that.


My stance on the actual issue is that more clarity is always better, so if changing that statement makes the exercise clearer, then I'm for it, but I don't believe we have any evidence that it's necessary. Thanks @bhagerty for wanting to improve things!

@SaschaMann
Contributor

SaschaMann commented Feb 1, 2022

These suggestions would be a breaking change to the exercise. On the Julia track, we interpret the task as

Calculate the total number of grains after square square.

where square is the argument. In other words, the total number of grains on the board after square squares have been filled. [Example solution]

This covers the case where all squares have been filled but isn't limited to that case.

I feel like there's an agreement in this issue that the current wording can be interpreted differently and evidently maintainers have done so in the past, too. Therefore I am against an Exercism-wide change that forces one interpretation as it would needlessly break implementations of the exercise. Instead, tracks should be encouraged to clarify this locally based on their test suite.

@bhagerty
Author

bhagerty commented Feb 1, 2022

So this is fascinating. The entire Julia track interprets the problem differently from how most of the posters here (including me) interpret it. Surely this is evidence that the problem statement is not clear, if any more is needed.

I don't, however, think it makes sense to leave it unclear to avoid "breaking" the Julia track. If the idea is that these exercises are language-agnostic, then they should be clear across every language, not be susceptible of different interpretations in different language tracks.

I actually think the Julia-track interpretation of the problem—give us total grains up to square X—is more interesting than the other interpretation—give us total grains if every square has grains.

I don't really know how best to resolve this. I only think that it should be resolved (otherwise the problem is unclear, and its use is not language-agnostic), and that the debate over whether the problem statement is unclear should be considered closed. The problem statement is unclear to a non-trivial number of people.

Though I appreciate @iHiD clarifying that what I saw as evidence of unclarity could have been evidence of a change in the problem statement. I wasn't aware of that change. That makes sense. So there may be less evidence than I thought, because I was misinterpreting the evidence based on a faulty supposition (that the solutions I saw answered the current problem statement). I still think clarification is needed, though, especially in light of @SaschaMann's comment.

@junedev
Member

junedev commented Feb 1, 2022

@SaschaMann To clarify, does that mean you would veto the discussed clarification PR if someone were to create it?

@SaschaMann
Contributor

SaschaMann commented Feb 1, 2022

I don't, however, think it makes sense to leave it unclear to avoid "breaking" the Julia track. If the idea is that these exercises are language-agnostic, then they should be clear across every language, not be susceptible of different interpretations in different language tracks.

IMO exercises being language-agnostic doesn't require that they're identical across languages. Each track can use the shared spec as a foundation to build an interesting exercise that makes sense in the language. To me it merely means that ideally they're written in a way that enables them to work well on the largest number of tracks (even if that requires them to be a bit more vague) and that they should avoid language-specific concepts or jargon unless strictly required.

I also don't think it should be left unclear, I just don't think that prob-specs is where this change needs to happen as it would restrict the exercise pointlessly. It should happen in the tracks based on their take on the common data of the exercise.

I want to point out that this conflict between people who just want to auto-generate exercises blindly based on the shared prob-spec, therefore requiring a more precise and exhaustive description and set of tests, and people who take prob-specs as a pool of interesting exercises that can and should be adapted by the maintainers, has led to endless discussions in the past. I'm very much in the latter camp; the vagueness is (to some extent) a feature of prob-specs.

I actually think the Julia-track interpretation of the problem—give us total grains up to square X—is more interesting than the other interpretation—give us total grains if every square has grains.

I agree. I'd like to claim that was intended but I think we just copied that from Python's implementation before it was changed.

@SaschaMann To clarify, does that mean you would veto the discussed clarification PR if someone were to create it?

Yes.

I'd be happy with a change similar to #1856, though, or a comment in the canonical data that points out that maintainers have to decide which approach they prefer. There is a discussion somewhere in an Exercism repo on how to handle optional parts in the description that become necessary due to implementing certain cases, which could also solve this. However, I can't find it. #1749

@petertseng
Member

In the other direction, if there were to be agreement that the problem is indeed made more interesting by having totalGrains take an argument, then the description and canonical data can both be made that way.

@ErikSchierboom
Member

These suggestions would be a breaking change to the exercise. On the Julia track, we interpret the task as

Calculate the total number of grains after square square.

This is interesting, because this does not match the canonical data:

{
  "uuid": "6eb07385-3659-4b45-a6be-9dc474222750",
  "description": "returns the total number of grains on the board",
  "property": "total",
  "input": {},
  "expected": 18446744073709551615
}

The test case does not have any arguments and also has a fixed, expected value, which seems to be in disagreement with how Julia has interpreted it.
That said, I can see the argument why it might be more interesting to calculate up until a certain number.
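For reference, that fixed expected value is simply the sum over all 64 squares, which a throwaway Python check (an assumption for illustration, not track code) confirms:

# Verify that the canonical expected value equals the sum over all 64 squares, i.e. 2**64 - 1.
assert sum(2 ** (n - 1) for n in range(1, 65)) == 18446744073709551615
assert 2 ** 64 - 1 == 18446744073709551615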
I think there are a couple of options:

  1. Do what @petertseng suggested and add a totalGrains property that either reimplements the total property or is a new property that tracks can choose to implement. This would mean that all tracks then have to decide whether they want to implement the new property or keep the existing one.
  2. Change the description to one of the suggested options. This is problematic for tracks (like Julia) that have interpreted total as taking an argument, as their implementation then does not match the description.
  3. Do nothing. The OP suggests that the current description is problematic, but in practice it might not be very significant. No one has to change anything.
  4. Some other option?

I also don't think it should be left unclear, I just don't think that prob-specs is where this change needs to happen as it would restrict the exercise pointlessly. It should happen in the tracks based on their take on the common data of the exercise.

So there's definitely a tension here. Fewer restrictions could mean more work for tracks, as they'd need to e.g. create an instructions.append.md file instead of being able to "just" use the description from prob-specs. That doesn't mean that we should always allow for adding restrictions, but I do feel that imposing more work on track maintainers when implementing an exercise is something we need to consider (especially as newer maintainers might not be familiar with the appends system).

Taking a step back and looking at the bigger picture, we'll keep having these textual discussions in the future. Could we perhaps alleviate the issue by allowing tracks to customize the description a bit more? Right now, we support text being appended, but maybe this is not enough?


@SaschaMann To clarify, does that mean you would veto the discussed clarification PR if someone were to create it?

Yes.

p.s. I must admit that I was slightly taken aback by veto-ing being mentioned as an option here. By its shared nature, prob-specs is often a repository of compromises (and thus discussion), whereas veto-ing is a rather uncompromising approach. I think there's a big step from being opposed to a change to veto-ing the change. Just my 2 cents 🤷

@SaschaMann
Contributor

SaschaMann commented Feb 2, 2022

So there's definitely a tension here. Fewer restrictions could mean more work for tracks, as they'd need to e.g. create an instructions.append.md file instead of being able to "just" use the description from prob-specs. That doesn't mean that we should always allow for adding restrictions, but I do feel that imposing more work on track maintainers when implementing an exercise is something we need to consider (especially as newer maintainers might not be familiar with the appends system).

That effort is trivial compared to the effort of coming up with an idiomatic test suite and implementation, going through tests one by one and deciding which cases to implement.

I'd (obviously) prefer option 1 or 3. Option 2 removes an interesting implementation of this exercise. And, at least IMO, defining a function that returns a pre-calculated constant is covered well enough by hello-world already.

This is interesting, because this does not match the canonical data:

I was wondering why that is, and it turns out that the original grains spec did not specify that there shouldn't be an input; it only specified a result. So it seems that #1191 inadvertently removed an ambiguity from the canonical data that was there previously, while that ambiguity was still kept in the description.


p.s. I must admit that I was slightly taken aback by veto-ing being mentioned as an option here. By its shared nature, prob-specs is often a repository of compromises (and thus discussion), whereas veto-ing is a rather uncompromising approach. I think there's a big step from being opposed to a change to veto-ing the change. Just my 2 cents 🤷

I understood veto'ing as in requesting changes in the review (which obviously admins can overrule anyway), which seems like the proper way when there isn't a compromise but rather one implementation/interpretation is forced upon everyone.

@ErikSchierboom
Member

That effort is trivial compared to the effort of coming up with an idiomatic test suite and implementation, going through tests one by one and deciding which cases to implement.

True, although there's far less of that if a track uses a test generator. We'll also likely have different tracks using different texts for the same exercise, which is not bad but could be slightly confusing to a student doing multiple tracks (though this is not a big issue).

I understood veto'ing as in requesting changes in the review (which obviously admins can overrule anyway), which seems like the proper way when there isn't a compromise but rather one implementation/interpretation is forced upon everyone.

Well, that's still veto-ing, and I still get a "this is not going to happen" feeling, but that could just be me, of course 🤷

@iHiD
Member

iHiD commented Feb 2, 2022

As per my policy on not letting problem-spec issues that don't have consensus linger, I'm going to close this with two decisions:

  • We won't make a change to the description.
  • I would welcome a PR for @petertseng's suggestion (Erik's point 1) but am agnostic to it.

I'm making this decision because I don't think the description is ambiguous when paired with a particular track's stub and test-suite. If the stub provides a param, then the sentence is interpreted as "based on the number of squares passed". If the stub doesn't provide a param, the sentence is interpreted to mean the whole board. To my knowledge, we've never had a student complain about being confused about this, and the context of this thread is us trying to guess what others are thinking. I don't see data to show that people are confused.

I have considered that #1191 clarified that the intention of this exercise is for the total board to be counted, but left the description ambiguous. We could clarify this and use track inserts for Julia to fix it, but that feels unnecessary and a bit messy in this case. (However, as a normal rule, if only one or two tracks were different from the rest, I'd optimise for the majority and expect the minority to add inserts.) Due to the ambiguity, I think adding an extra optional test case that a maintainer can choose is reasonable. If/when we solve #1749, I'd then suggest we have optional descriptions for the two possibilities.

Finally, to clarify the "veto" issue: even though most maintainers understandably have their personal tracks at the forefront of their minds, our responsibility is to do what's best for Exercism and our students as a whole, not just for their individual tracks. Problem-specs in particular is a space where we make Exercism-wide choices - sometimes that means making choices that go against our own best wishes (e.g. ones that make more work for an individual track maintainer), but that's part of our collective responsibility. Therefore I'm pretty uncomfortable with the idea that anyone would veto a change that benefits 50 tracks for the sake of their own individual track. I don't think anyone has said that here, but I want to reiterate the principle that collective responsibility and pragmatism are key parts of making this repository, and problem-specs in general, work.


@bhagerty Thanks for opening this. And everyone else, thanks for the discussion 🙂

iHiD closed this as completed Feb 2, 2022