-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistent random identifier names #185
Consistent random identifier names #185
Conversation
Thanks for the PR. I think this is a worthwhile change, but I wonder if it could be a little simpler. This is a tough change to test, and probably an easy one to break accidentally, so I'd like it to be "obviously correct" absent a good regression test. What if |
So it's OK to slap an argument on |
Yeah, go for it. Just keep in mind that inlining means you'll have to use a fully-qualified field name (something like |
Had another attempt at this by seeding with nearby names. I left two debug lines in, commented out, in case you want to see what it uses for the seed. Here's the output on mine after running
I thought it might also be worth reducing the collision chance by increasing the length of the identifier or adding a set to the idgen that prevents duplicates. What do you think? |
I've been experimenting with hashing each file's filename, and appending to that an incrementing counter for each gensym. Seems to work pretty well. The filename hash prevent collisions in the same directory. https://github.com/glycerine/greenpack/blob/master/gen/cry.go#L29 |
@glycerine, that doesn't look like it makes the identifiers stable for unmodified functions - this is what I was hoping to accomplish with this PR. |
@shabbyrobe kindly define the term |
Oops, should've said "unmodified structs". If I have a file with two structs in it and change a field in the first one, the random identifier names in the Decode/Marshal/etc functions for the second struct will change (provided it is lower in an alphabetical sort than the first). As I commit all my generated code, some of the diffs are getting a bit out of control just from random identifiers changing. |
@shabbyrobe I see. Yes, files are processed top-to-bottom, so a sequential numbering of gensyms is stable from the top down until the change point. While this isn't the minimal change possible, it is quite stable and easy to observe correct. Anyway, its free to try. |
Sorry for the bump, @philhofer, but every day our diffs get more and more crazy. It's nudging 3,000 lines per generation just to add one field! Is there any chance we could get this up and running soon? I'm happy to do whatever is required to make it a reality. |
gen/elem.go
Outdated
idxChars = "abcdefghijlkmnopqrstuvwxyz" | ||
idxLen = 3 | ||
idxChars = "abcdefghijlkmnopqrstuvwxyz" | ||
idxCharsLen = int64(len(idxChars)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this constant added but never used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous version of the same idea. Didn't clean it up properly. Fixed now.
@shabbyrobe how well does the approach here work? Can you quantify? I only see implementation for Array, Map, and Slice. What about Struct, Ptr, and BaseElem? Additionally, It sounds like you aren't running your own fork in production... not exactly a vote of confidence in it. |
Had another stab at this. It's easier to follow what has been generated and why this way. I ran into trouble with the fact that the |
Awesome. I like this approach. This change now seems much more "obviously correct." Want to add a test in the /gen directory? You'll have to write it like the issue94 test, where you generate code and run a shell script both through
|
Awesome thanks for the feedback. I've made a first pass at this, I went a slightly different route by pulling the declarations out of the AST. I could clean it up a bit and use the same building blocks to run a stack more tests against the identifier generation. Let me know if you're happy enough with the approach and I'll expand it to cover more cases, otherwise I'll have another stab if you'd rather I do it a different way. |
gen/issue185_test.go
Outdated
|
||
// Ensure that consistent identifiers are generated on a per-method basis by msgp. | ||
// | ||
// Also eusnre that no duplicate identifiers appear in a method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/eusnre/ensure/
Makefile
Outdated
@@ -28,6 +28,7 @@ $(MGEN): ./msgp/defs_test.go | |||
|
|||
test: all | |||
go test -v ./msgp | |||
go test -v ./gen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add this line to the travis
rule as well (sorry; the Makefile is a mess, as it always is...)
This approach is fine with me too. It's about time we had better test harnesses. |
I fixed up the makefile, the spelling error and separated aspects of the test case and added another more complex generation template. Can you think of any other scenarios I can add? |
Do you test for identifier collision between two or more files generated in the same directory? I had to add a per-file hash of the filename to avoid that. (But then, I generate global vars and I'm not sure if you are...) |
gen/issue185_test.go
Outdated
|
||
// Loose sanity check to make sure the tests expectations aren't broken. | ||
// If the prefix ever changes, this needs to change. | ||
for i := 0; i < len(vars); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a range loop here
msgp doesn't generate global variables, so I'm not worried about conflicts across generated files. Can we move the test to the root of the tree and have it call |
Seems much nicer to me with |
issue185_test.go
Outdated
return err | ||
} | ||
|
||
var mode = gen.Encode | gen.Decode | gen.Size | gen.Marshal | gen.Unmarshal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mode := ...
issue185_test.go
Outdated
} | ||
|
||
tfile := filepath.Join(tempDir, "msg.go") | ||
genFile := filepath.Join(tempDir, "msg_gen.go") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just use newFilename
here, since it's already defined for you.
You can suppress the output by putting the following in the top of the test file: func init() {
os.Stdout = ioutil.Discard
} |
Unfortunately |
D'oh. Yeah. For now lets leave the output as-is and then entertain a refactoring of |
gen/issue185_test.go
Outdated
@@ -0,0 +1,308 @@ | |||
package gen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this file just a duplicate of the one in the root of the file tree? Remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Done.
Makefile
Outdated
@@ -27,7 +27,9 @@ $(MGEN): ./msgp/defs_test.go | |||
go generate ./msgp | |||
|
|||
test: all | |||
# keep in sync with 'make travis' | |||
go test -v ./msgp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and the travis
rule can probably just be go test ./...
now. At one point long ago I think there was an ordering dependency that needed to be satisfied, but I can't see why that wouldn't be a valid way to run the tests now.
Hello! Just a cheeky bump, is there anything else I need to do with this PR? |
Awesome! Thank you. |
We choose to commit our generated msgpack files to our git repository, but this can yield some pollution in our diffs when certain changes are made to our structs.
Considering these structs:
If you add a new member to
Bar
that causes the generator to callrandIdent()
(for examplePants []string
), every subsequent ident in the generated file will be changed, even in the generated methods forFoo
.We are leaning very heavily on this generator in a part of our project which is under heavy development. Some of our generated files get quite large and our diffs are getting a little silly: adding a single field towards the top can have the unwanted side effect of changing hundreds of totally unrelated lines in the diff as random identifiers are bumped!
This PR has a stab at making the identifier names consistent based on the method and depth, i.e. every scope at a particular depth will have exactly the same sequence of identifiers. I'd be very grateful if you'd consider it for inclusion, happy to make any revisions necessary.
This change is