Ceph contributors list maintenance guide » History » Version 19

Yann Dupont, 10/18/2016 07:56 PM


Ceph contributors list maintenance guide

Introduction

The mailmap file fixes spelling mistakes in commit authors' names or emails. It can also be used to map authors to the organizations sponsoring their commits. For instance, the .mailmap file found in the Ceph repository normalizes author names, and the result is then passed through the .organizationmap file to map the authors to their sponsoring organizations.
Immediately after each Ceph release, an acknowledgement of the contributors who participated can be published.

Published contributor credits

Display the contributor list

Note: The check-mailmap command requires git version 1.9 or higher.
The credits.sh script (attached to this page) can be called with

./credits.sh tags/v0.83..origin/next

to show contributors who authored commits between the v0.83 tag and the next branch.
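
The version requirement in the note above can be checked before running the script. The following is a minimal sketch, not part of credits.sh: the version_at_least helper is a hypothetical name, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# version_at_least MINIMUM ACTUAL (hypothetical helper, not part of credits.sh)
# Succeeds when ACTUAL is the same as or newer than MINIMUM.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_at_least() {
    minimum=$1
    actual=$2
    # The older of the two versions sorts first; it must be MINIMUM.
    lowest=$(printf '%s\n%s\n' "$minimum" "$actual" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

# Usage (assumes git is installed; `git --version` prints "git version X.Y.Z"):
#   actual=$(git --version | awk '{print $3}')
#   version_at_least 1.9 "$actual" || echo "git $actual is too old for check-mailmap" >&2
```

Comparing with a version sort rather than a string comparison matters here because 1.10 is newer than 1.9 even though it sorts earlier alphabetically.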

Fixing the contributor list

Look at the Commits by organizations section displayed by the credits.sh script. For instance, if it shows

Commits by organizations
...
4 19 Intel <contact@intel.com>
...
9 5 xinxin shu <xinxin.shu@intel.com>
...
17 2 xinxinsh <xinxin.shu@intel.com>
18 2 Xiaoxi Chen <xiaoxi.chen@intel.com>

...

Because of the domain name, the contributions of Shu Xinxin and Chen Xiaoxi should be counted for Intel; this can be fixed by updating the .organizationmap. Shu Xinxin's contributions appear on two lines instead of one because two different names were used; this can be fixed by updating the .mailmap.
The following one-liner can be used to spot duplicates in the Commits by authors section instead of looking for them manually:

git log --pretty='%aN <%aE>' $range | git -c mailmap.file=.peoplemap check-mailmap --stdin | sort | uniq | sed -e 's/\(.*\) \(<.*\)/\2 \1/' | uniq --skip-field=1 --all-repeated | sed -e 's/\(.*>\) \(.*\)/\2 \1/'

It will for instance show Ross Turk twice, once with his Inktank email and once with his Red Hat email. These must not be normalized in the .mailmap because the contributions are affiliated with two different organizations. Instead, the mapping can be added to the .peoplemap.
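
The last stages of the one-liner rely on GNU uniq long options. As a rough sketch of the same idea using only sort and awk, the hypothetical duplicate_authors function below reads Name <email> lines and prints those whose name appears with more than one email. The sample identities in the usage comment are made up for illustration.

```shell
#!/bin/sh
# duplicate_authors (hypothetical helper): read "Name <email>" lines on stdin
# and print every line whose name appears with more than one distinct email.
duplicate_authors() {
    sort -u | awk '
        {
            name = $0
            sub(/ <[^<]*>$/, "", name)          # strip the trailing " <email>"
            count[name]++
            lines[name] = lines[name] $0 "\n"   # remember each identity of that name
        }
        END {
            for (name in count)
                if (count[name] > 1)
                    printf "%s", lines[name]
        }
    ' | sort
}

# Usage with made-up identities:
#   printf '%s\n' 'Jane Doe <jane@example.com>' 'Jane Doe <jane@example.org>' \
#                 'John Roe <john@example.com>' | duplicate_authors
```

In practice the input would come from the git log and check-mailmap stages of the one-liner above.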

Commit conventions

The commit message starts with the mailmap: prefix, which allows people working on Ceph to quickly identify the commits related to the maintenance of the contributor list. Although a commit may modify more than just the .mailmap file, the prefix stays the same.
The subject of the commit message is usually mailmap: Firstname Lastname affiliation, which is helpful for scripting.
There is one commit per author because this helps the author review the change related to her / his contribution without considering unrelated changes.
Here is an example commit: https://github.com/ceph/ceph/commit/...dd6e547828df41
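
As an illustration of why the fixed subject format helps scripting, the sketch below extracts the author name from a subject. The author_from_subject function is a hypothetical helper; it assumes subjects end with either affiliation or name normalization, the two suffixes that the mail automation script on this page strips.

```shell
#!/bin/sh
# author_from_subject SUBJECT (hypothetical helper)
# Print the author name from a "mailmap: Firstname Lastname affiliation" or
# "mailmap: Firstname Lastname name normalization" subject.
# Fails (returns 1) when the subject does not follow the convention.
author_from_subject() {
    case $1 in
        "mailmap: "*" affiliation")        suffix=" affiliation" ;;
        "mailmap: "*" name normalization") suffix=" name normalization" ;;
        *) return 1 ;;
    esac
    name=${1#"mailmap: "}      # drop the prefix
    name=${name%"$suffix"}     # drop the trailing keyword
    printf '%s\n' "$name"
}

# Usage with a made-up author:
#   author_from_subject 'mailmap: Jane Doe affiliation'
```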

Mail notification to the author about an update

It is best to send a pull request with one commit per author so that each commit can be reviewed independently. A mail should then be sent to each author asking for a review of the update. For example:

Subject: Ceph contributions : organization affiliation
BCC: loic-bcc@dachary.org
Hi,
You are participating in the making of Ceph[1]. The mailmap files[2] were updated to reflect your affiliation and possibly to normalize your mail, in case it was misspelled. You can verify the update at:
https://github.com/dachary/ceph/commit/{HASH}
This work is done to publish a map of Ceph contributors, as well as the organizations to which they are affiliated.
Cheers
[1] http://ceph.com/
[2] https://github.com/ceph/ceph/blob/master/.mailmap and https://github.com/ceph/ceph/blob/master/.peoplemap and https://github.com/ceph/ceph/blob/ma...rganizationmap

Sending these emails can be automated with:

git log --pretty='%H %s' @{u}.. | # assume all commit subjects contain the author
sed -e 's/affiliation//' -e 's/name normalization//' | # remove subject noise
while read -r hash mailmap name ; do
    to=$(sed -ne "s/.*$name/$name/p" < .organizationmap | head -1) # fetch the author mail
    echo "$to" # debug information
    ( echo "To: $to"
      sed -e "s/{HASH}/$hash/" < mailmap.txt # the template mail above
    ) | # create the mail from the template
    sendmail -v -F 'Loic Dachary' -f 'loic@dachary.org' -t
done
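
Before sending anything, it can be useful to preview a generated mail. The sketch below isolates the templating step of the loop above in a hypothetical render_mail function that prints the mail instead of piping it to sendmail. The template path is passed as an argument; mailmap.txt is the file name used by the loop.

```shell
#!/bin/sh
# render_mail RECIPIENT HASH TEMPLATE (hypothetical helper)
# Print the mail that would be sent for one commit, without sending it.
render_mail() {
    recipient=$1
    hash=$2
    template=$3
    printf 'To: %s\n' "$recipient"
    # Substitute the commit hash for the {HASH} placeholder of the template.
    sed -e "s/{HASH}/$hash/" "$template"
}

# Usage with made-up values:
#   render_mail 'Jane Doe <jane@example.com>' 0123abcd mailmap.txt
```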

The Reviewed-by: field is added to the commit message when the author answers.
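
Adding the field can also be scripted. The following is a minimal sketch with a hypothetical add_reviewed_by function that works on the commit message text only. It assumes that trailer lines such as Signed-off-by: end the message, and inserts a blank line first when they do not.

```shell
#!/bin/sh
# add_reviewed_by "Firstname Lastname <email>" (hypothetical helper)
# Read a commit message on stdin and print it with a Reviewed-by: line
# appended, unless that exact line is already present.
add_reviewed_by() {
    trailer="Reviewed-by: $1"
    message=$(cat)
    printf '%s\n' "$message"
    # Nothing to add when the reviewer is already recorded.
    if printf '%s\n' "$message" | grep -qxF -- "$trailer"; then
        return 0
    fi
    # Assumption: a final line like "Signed-off-by: ..." means the message
    # already ends with a trailer block; otherwise start one after a blank line.
    last=$(printf '%s\n' "$message" | tail -n 1)
    case $last in
        *-by:\ *) ;;
        *) printf '\n' ;;
    esac
    printf '%s\n' "$trailer"
}

# Usage (made-up reviewer; amends the most recent commit):
#   git log -1 --format=%B | add_reviewed_by 'Jane Doe <jane@example.com>' | git commit --amend -F -
```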

Affiliation change over time

A given email reflects the affiliation of an author. There is no way to say, for instance, that a given address is affiliated with Cloudwatt until August 2014 and with Red Hat after this date. To avoid duplicates in the author list, the .peoplemap file is updated and will be used by the credits.sh script above:

David Zafman <dzafman@redhat.com> David Zafman <david.zafman@inktank.com>
John Spray <jspray@redhat.com> John Spray <john.spray@inktank.com>

...
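
To illustrate the effect of such entries, the sketch below applies a map file to a list of Name <email> identities. It is a simplified stand-in for what git check-mailmap does, not a replacement: the hypothetical apply_people_map function only handles the four-field form shown above and matches identities exactly, whereas git compares them case-insensitively.

```shell
#!/bin/sh
# apply_people_map MAPFILE (hypothetical helper, simplified mailmap lookup)
# Read "Name <email>" lines on stdin; replace every identity found on the
# right-hand side of a MAPFILE line with the identity on its left-hand side.
apply_people_map() {
    awk -v mapfile="$1" '
        BEGIN {
            # Map line: "Canonical Name <canonical@mail> Other Name <other@mail>"
            while ((getline line < mapfile) > 0) {
                split_at = index(line, "> ")
                if (split_at > 0)
                    canonical[substr(line, split_at + 2)] = substr(line, 1, split_at)
            }
            close(mapfile)
        }
        {
            if ($0 in canonical)
                print canonical[$0]
            else
                print $0
        }
    '
}

# Usage: collapse both identities of an author into one line
#   git log --pretty='%aN <%aE>' | apply_people_map .peoplemap | sort -u
```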

Sponsoring organizations

Searching the internet for the author's current employer is likely to be the simplest way to discover the organization sponsoring the commit. Note that organizations can be companies, non-profit organizations or government agencies. If nothing shows up, or if the author contributes as a hobby, she/he should be listed as unaffiliated in the .organizationmap file, for example:

Unaffiliated <no@organization.net> Michael Nelson <mikenel@tnld.net>
Unaffiliated <no@organization.net> Michael Riederer <michael@riederer.org>
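
Authors that still need an entry can be listed with a small filter. The sketch below is one possible approach, assuming every .organizationmap line has the form shown above (organization first, author identity last). The unmapped_authors function is a hypothetical helper that matches identities exactly.

```shell
#!/bin/sh
# unmapped_authors ORGMAP (hypothetical helper)
# Read "Name <email>" lines on stdin and print those that do not appear
# as the author part (right-hand side) of any ORGMAP line.
unmapped_authors() {
    awk -v mapfile="$1" '
        BEGIN {
            # Map line: "Organization <org@mail> Author Name <author@mail>"
            while ((getline line < mapfile) > 0) {
                split_at = index(line, "> ")
                if (split_at > 0)
                    mapped[substr(line, split_at + 2)] = 1
            }
            close(mapfile)
        }
        !($0 in mapped)
    '
}

# Usage:
#   git log --pretty='%aN <%aE>' | sort -u | unmapped_authors .organizationmap
```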

Sorting entries

The .organizationmap, .peoplemap and .mailmap content should be alphabetically sorted because it makes it easier to spot unintended duplicates and it reduces the risk of a merge conflict.
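
Sorting and duplicates can be verified mechanically. The sketch below uses a hypothetical check_sorted function and assumes plain byte-order sorting (LC_ALL=C). The files in the repository may follow a different collation or contain header lines, in which case the check would need adjusting.

```shell
#!/bin/sh
# check_sorted FILE... (hypothetical helper)
# Report files that are not sorted or that contain duplicate lines.
# Assumption: byte-order (C locale) sorting is the expected order.
check_sorted() {
    status=0
    for file in "$@"; do
        # `sort -c` only checks the order, it does not rewrite the file.
        if ! LC_ALL=C sort -c "$file" 2>/dev/null; then
            echo "$file: not sorted" >&2
            status=1
        fi
        # `uniq -d` prints one copy of each repeated line.
        if [ -n "$(LC_ALL=C sort "$file" | uniq -d)" ]; then
            echo "$file: duplicate lines" >&2
            status=1
        fi
    done
    return $status
}

# Usage:
#   check_sorted .mailmap .peoplemap .organizationmap
```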

HOWTO maintain the contributor list

  1. Run ./credits.sh origin/firefly...origin/master on a fresh checkout of the master branch
  2. Check each entry in the Commits by authors section for lines that show the same author under two mails
    • If the duplicate is an accident, update the .mailmap to map the two names into one
    • If the duplicate is intentional (change of organization), update the .peoplemap to map the two names into one
  3. Check each entry in the Commits by organizations section for lines that show an author instead of an organization
    • Look for the name in git log --pretty='%aN <%aE>' | sort | uniq
    • If the author is found under a different name, update the .mailmap to map the two names into one
    • If the author is not found, update the .organizationmap with the organization sponsoring the commit
    • Commit the change with a subject mailmap: Firstname Lastname affiliation
    • Go back to step 1
  4. Send a pull request containing all commits, titled DNM: mailmap updates (DNM meaning do not merge), so that others can see the work in progress and avoid duplicating it
  5. Send a mail to each author with a link to the commit of their update for review
  6. When an author replies
    • If modifications are suggested, amend the commit
    • Update the commit message with Reviewed-by: Firstname Lastname <email>
    • git push --force so the change shows in the pull request
    • Reply with a link to the updated commit
  7. A week after sending the mails, merge all updates
  8. After the announcement of a Ceph release, send the mail to ceph-devel, including the output of ./credits.sh tags/previous-version...tags/released-version (for instance ./credits.sh tags/v0.85...tags/v0.86)
  9. Update the schedule at https://wiki.ceph.com/Community/Ceph...ntenance_guide

Attachment: credits.sh - display contributor credits (2.2 KB), Loïc Dachary, 07/25/2015 11:37 AM