Skip to content

Conversation

@dbtsai
Copy link
Member

@dbtsai dbtsai commented Aug 3, 2018

What changes were proposed in this pull request?

In Linux community, Co-authored-by and Signed-off-by git trailer have been used for awhile.

Until recently, Github adopted Co-authored-by to include the work of co-authors in the profile contributions graph and the repository's statistics. It's a convention for recognizing multiple authors, and can encourage people to collaborate in OSS communities.

Git provides a command line tools to read the metadata to know who commits the code to upstream, but it's not as easy as having Signed-off-by as part of the message so developers can find who is the relevant committers who can help with certain part of the codebase easier.

For a single author PR, I purpose to use Authored-by and Signed-off-by, so the message will look like

Authored-by: Author's name <[email protected]>
Signed-off-by: Committer's name <[email protected]>

For a multi-author PR, I purpose to use Lead-authored-by: and Co-authored-by: for the lead author and co-authors. The message will look like

Lead-authored-by: Lead Author's name <[email protected]>
Co-authored-by: CoAuthor's name <[email protected]>
Signed-off-by: Committer's name <[email protected]>

It's also useful to include Reviewed-by: to give credits to the people who participate on the code reviewing. We can add this in the next iteration.

@dbtsai
Copy link
Member Author

dbtsai commented Aug 3, 2018

There are couple other common used git trailers used in linux community, https://git.wiki.kernel.org/index.php/CommitMessageConventions

We may consider to add Reviewed-by: which can be helpful to know who participates the code review.

@dbtsai dbtsai changed the title [SPARK-25018] [Infra] Use Co-Authored-By git trailer in merge_spark_pr.py [SPARK-25018] [Infra] Use Co-authored-by and Signed-off-by git trailer in merge_spark_pr.py Aug 3, 2018
# The string "Closes #%s" string is required for GitHub to correctly close the PR
merge_message_flags += ["-m", "Closes #%s from %s." % (pr_num, pr_repo_desc)]

authors = "\n".join(["Co-authored-by: %s" % a for a in distinct_authors])
Copy link
Member Author

@dbtsai dbtsai Aug 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For single author commit, should we do Authored-by instead of Co-authored-by?

Or we can have the first author as Authored-by like the following.

Authored-by: David <[email protected]>
Co-authored-by: Bill <[email protected]>

Copy link
Member

@felixcheung felixcheung Aug 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or leave it as before Author: to be safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these particular keys have a meaning to github or other tools? if "Author:" is what something is looking for, we'd want to leave it, I presume. But adding more lines for co-authors seems fine. I would have just listed multiple "Author:" lines, but sounds like there's another convention around "Co-authored-by:" that another tool looks for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that those keys are widely used in Linux community for a long time, and Github adopted Co-authored-by to include the work of cu-authoers in the profile contributions graph and the repository's statistics.

This has to be in the end of the message, and Github will just ignore those keys that are not recognized.

If we want to keep both multiple lines of Author: and Co-authored-by:, the information will be duplicated twice. I would like to purpose to just keep Co-authored-by:. For single developer commit, we can just use Authored-by: for consistency.

@SparkQA
Copy link

SparkQA commented Aug 3, 2018

Test build #94161 has finished for PR 21991 at commit 2563311.

  • This patch fails from timeout after a configured wait of `300m`.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 4, 2018

Test build #94181 has finished for PR 21991 at commit 6732913.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

+1

@felixcheung
Copy link
Member

Jenkins, retest this please

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@SparkQA
Copy link

SparkQA commented Aug 4, 2018

Test build #94204 has finished for PR 21991 at commit 6732913.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 6, 2018

Test build #94265 has finished for PR 21991 at commit fbd8cb4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@HyukjinKwon
Copy link
Member

@dbtsai, I sent a PR to your branch dbtsai#3 in case you want to test this and if I understood correctly.

dbtsai added 2 commits August 6, 2018 10:21
Empty commit to test 'Co-authored-by' and 'Signed-off-by'
@dbtsai
Copy link
Member Author

dbtsai commented Aug 6, 2018

@HyukjinKwon Thanks for pushing a test. I used this script to merge this PR into my own repo. Here is the result. https://github.com/dbtsai/spark/commits/PR_TOOL_MERGE_PR_21991_MASTER As you can see, three authors are properly shown in git status.

I also tested it with single author PR, and here is the result. dbtsai@16dd571

@SparkQA
Copy link

SparkQA commented Aug 6, 2018

Test build #94284 has finished for PR 21991 at commit fbd8cb4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 6, 2018

Test build #94295 has finished for PR 21991 at commit 9016851.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 6, 2018

Test build #94305 has finished for PR 21991 at commit a69cfb3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Add manually given primary author at the head of author list
primary_author = distinct_authors[0]
else:
# When primary author is specified manually, put it at the head of author list.
distinct_authors.insert(0, primary_author)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya good catch. When we put it at the head of the author list, do we need to dedupe it again?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a thought, but didn't do it. If you think it is necessary, I will push another commit for it.

@SparkQA
Copy link

SparkQA commented Aug 6, 2018

Test build #94299 has finished for PR 21991 at commit 4500d60.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 6, 2018

Test build #94310 has finished for PR 21991 at commit 483d182.

  • This patch fails from timeout after a configured wait of `300m`.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@HyukjinKwon
Copy link
Member

retest this please

@dbtsai
Copy link
Member Author

dbtsai commented Aug 7, 2018

@HyukjinKwon thanks! Is it possible to use this script to merge this PR which has many people involve? A good demonstration of collaboration in the community.

@HyukjinKwon
Copy link
Member

Yes, I think so. I was about to merge this in that way :-). Seems to me we are good to merge now since the current change is only checked by Python lint which is already passed.

@SparkQA
Copy link

SparkQA commented Aug 7, 2018

Test build #94322 has finished for PR 21991 at commit 11887ae.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Let me leave it to you @dbtsai. I thought you live in a timezone completely different with me.

@HyukjinKwon
Copy link
Member

Merged to master.

@HyukjinKwon
Copy link
Member

51bee7a looks good :-)

@asfgit asfgit closed this in 51bee7a Aug 7, 2018
@SparkQA
Copy link

SparkQA commented Aug 7, 2018

Test build #94324 has finished for PR 21991 at commit 272d8fd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Aug 7, 2018

The failed test is FlatMapGroupsWithStateSuite.flatMapGroupsWithState. I saw it fails some times occasionally. I think it should not be related to this change. @HyukjinKwon @dbtsai

@SparkQA
Copy link

SparkQA commented Aug 7, 2018

Test build #94323 has finished for PR 21991 at commit b3347f8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 7, 2018

Test build #94329 has finished for PR 21991 at commit 272d8fd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbtsai
Copy link
Member Author

dbtsai commented Aug 7, 2018

Thanks all!

@dbtsai dbtsai deleted the script branch August 14, 2018 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants