HTSJDK requires defined processes and governance #871

Open
tfenne opened this issue May 3, 2017 · 31 comments

Comments

@tfenne
Member

tfenne commented May 3, 2017

HTSJDK is an open source project within the samtools organization that grew out of work performed by many individuals presently or previously at the Broad Institute. The project has been operating informally since its move to GitHub, and has no defined processes, no agreed-upon ways to make decisions or resolve conflicts, etc. While the vast majority of work was being performed by individuals employed at the Broad this more or less worked. However, over the last few years things have changed. While the Broad still makes significant contributions to HTSJDK, several key contributors have left Broad (myself, @nh13), the CRAM team at EBI have become core contributors, and at least two people with no Broad affiliation have become significant contributors (@magicDGS & @lindenb).

As one of the largest contributors to the project over time, I find the lack of a clear decision-making process makes me less and less willing to put in any effort. I would personally like this resolved, and I believe it would be greatly beneficial to the project to have clearly defined leadership and processes.

The project currently has five admins: @alecw, @nh13, @tfenne (me), @ktibbett and @bradtaylor. In addition there are 32 members with write access, though many are not active contributors.

I propose two possible paths forward:

  1. HTSJDK - A Community Project: on this path we accept that HTSJDK was founded at Broad but that the contributorship is wide enough that it should no longer be controlled by the Broad. I would suggest that the existing project admins engage the remaining contributors, and are then responsible for developing or adopting a process for deciding who can be an admin, a member with write access, and how changes get approved and merged into the project.
  2. HTSJDK - A Broad Project: on this path the Broad takes more explicit ownership of HTSJDK. I would suggest that @nh13 and myself relinquish admin rights, and perhaps even that all non-Broad employees lose write privileges (i.e. have to fork and make PRs from forks). It would then be incumbent on Broad to select admins, and provide some guidance on how to contribute to the project (such as they do for GATK 4 development).

While I would personally prefer option 1, I absolutely believe that either option is better than where we are now.

Since I do not believe that the current set of admins is a good representation of those with stakes in the project, it's not even clear how to begin this. Perhaps the first question is, to those of you at Broad, what is your reaction to the two paths laid out above?

@alecw
Contributor

alecw commented May 3, 2017

Hi @tfenne ,

I agree that some clarity about the process would be good. I have hardly touched HTSJDK in > 2 years, so I think I should no longer be an admin. Also, I think it's not up to me how to proceed. I don't know what the GATK/Picard team has in mind.

-Alec

@d-cameron
Contributor

I couldn't agree more. It would be nice to know exactly what hoops one needs to jump through to contribute. The issues I had getting #576 accepted are a good example of the problems with the current state of affairs. Yes, I can contribute to a nominally public open source project, but my contributions are conditional on a whole bunch of tests passing in non-public projects I have no visibility into.

@yfarjoun
Contributor

yfarjoun commented May 4, 2017

👍

@magicDGS
Member

magicDGS commented May 4, 2017

I vote for the first one too, and even if I'm starting to be repetitive with my suggestion, a brand-new htsjdk3 API and repository may be useful to clean up the library and resolve the issues and discussions between the teams from the very beginning...

@tfenne
Member Author

tfenne commented May 5, 2017

@yfarjoun (and maybe also @ktibbett?) would you be able to raise this for discussion internally at Broad and figure out who wants to be involved, and who can represent the Broad position? I'm happy for anyone who wants to be involved to do so, but it would be great if there were 1-2 people from Broad who could officially represent Broad for this discussion.

I'd like to drive this process to completion in such a way that most or all parties see the result as an improvement, and nobody feels the need to create their own fork of htsjdk and depart long term from the official project. One fear I have is a lack of engagement from the necessary parties, resulting in either a lack of decision or a de facto decision to go with option 1 that then leads to a bifurcation of the project or a later attempt to reverse direction.

I understand that we can't resolve this in a day, but I would like to put a timeframe on this. Is it reasonable to think that by 5/19 (two weeks from today) we could have agreement on:

  1. General direction (option 1 vs. option 2 vs. something totally different)
  2. A small set of people who will take that direction and define more detailed roles, policies, etc. (with community input) and then implement them?

@droazen
Contributor

droazen commented May 5, 2017

@tfenne If your fear is "lack of engagement from the necessary parties", then you're going to have to give us at the Broad, at least, a bit more than the proposed 2 weeks to engage in this topic. As you know, we're currently too swamped to deal even with a major deprecation in the codebase due to an internal conference deadline, so you can easily believe that we're too swamped to engage in a discussion like this! However, since this discussion affects our work intimately, we'd naturally like to be involved.

I'd suggest proposing a timeline that is actually realistic given our current time constraints here at the Broad, such as 1 month. In the meantime, @yfarjoun has proposed what seems to me like a reasonable way forward on your PR #868.

@tfenne
Member Author

tfenne commented May 7, 2017

@droazen, a month is reasonable - that's why I asked if two weeks was reasonable. I figured I was more likely to get a response by proposing a timeline than just asking "what's a reasonable timeline".

I'd like to ask a few questions of you in order to help calibrate my own, and others', expectations. For context, while I would love to resolve this quickly, realistically I would prefer us to have a clear timeframe that we can stick to rather than an ambitious timeframe that we fail to keep. With that in mind:

  1. Are you able to represent/speak on behalf of Broad interests in HTSJDK, or are there others who also need to be directly involved? I ask because I don't want us to agree on a timeline only to be surprised later by someone else coming along and objecting.
  2. Is one month realistic either? I don't want to extend the timeline more than necessary, but on the other hand I'd rather set a timeline once and keep to it.
  3. Understanding that you are busy, could you give some insight into your availability to participate in the discussion? Is it effectively zero until the start of Bio IT on May 23rd, then ramping up? Or are you able to engage minimally before then? Something else?
  4. Do you have any objections to myself and others continuing the discussion in your absence and perhaps even drafting policies etc.? E.g. I think it would be really useful for everyone who's interested to share and discuss goals for HTSJDK governance, preferences for how it might work etc, such that we are not starting from zero in 2-3 weeks.

If it seems like I'm impatient for this, it's because I am. I'm not sure how much of this you're aware of, but this has been a topic of discussion between myself and some Broad folks for more than two years. I also made an attempt about 18 months ago to engage more formally with folks at Broad to try and resolve this. That last attempt dead-ended with the ball in the Broad's court. I'm not trying to lay blame; certainly I could have kept pushing more consistently. And it's hard to know, from the outside, why the last attempt fizzled. But I think it's also fair to interpret that outcome as a lack of prioritization, at Broad, of solving this particular problem. That's why I say I fear a lack of engagement, because it's how the last attempt failed. And that's why I'm trying to build some momentum and commit to a timeline that's not many months long.

I'm happy to slow things down by a couple of weeks if that's what it takes to get you engaged. I would also genuinely appreciate either or both of a thin thread of involvement sooner and an openness to letting others move the conversation forward (but not to completion) while you play catch-up when you're ready to join us.

@vdauwera

vdauwera commented May 7, 2017 via email

@nh13
Member

nh13 commented May 7, 2017

@vdauwera I think one of the things that is unclear is the difference between individuals contributing to this discussion who happen to work at the Broad and having an explicit person (or 1-2 people) who represent the Broad as an organization and can make decisions or the like on their behalf. I think the latter is what @tfenne is looking for, and what the question (1) to @droazen is about. Would you be able to help clarify?

@droazen
Contributor

droazen commented May 9, 2017

@tfenne To answer your question, unfortunately neither myself nor anyone else on our side has bandwidth to participate in this discussion until June. To be on the safe side, I'd propose a start date of June 10.

I know that you're impatient to resolve this (or at least understandably nervous that the topic will get dropped again), but I feel strongly that starting the discussion without all the stakeholders having an active seat at the table and fully available to participate from the beginning would be a pretty terrible way to start out, and would not bode well for the future of the project. I'd ask that you hold off until the requested date of June 10, and in return we'll promise to stick to that date and ensure that the topic doesn't get forgotten a second time. Sound reasonable?

@yfarjoun
Contributor

yfarjoun commented May 9, 2017

In my opinion, the current status of non-active (@kt, @bradtaylor, @alecw) and non-empowered (@nh13, @tfenne) maintainers is unworkable. I suspect that the conflicts we have been seeing recently are due to this problem.

To me this is a "drop-everything" issue. We cannot afford to have htsjdk without active and empowered maintainers any longer. There are folks at the Broad who do have time to engage, and waiting until "all the parties" have time is impossible unless folks are willing to change their schedule. Given that the maintainers have to be able to make time, I do not share the opinion that this should be postponed any longer.

@droazen
Contributor

droazen commented May 9, 2017

@yfarjoun If we can't even agree on how to begin this discussion in a way that's fair to all parties, then that does not bode well for our ability to make collective decisions in the future! It seems to me that our request to hold this important discussion at a time that works for all major stakeholders is not an unreasonable one. Starting this discussion without a representative of the GATK project as an active participant would do great harm to our willingness to participate in htsjdk as a community project going forward.

@yfarjoun
Contributor

yfarjoun commented May 9, 2017

(following some internal discussion at Broad)
I move that there will be a new set of maintainers:

Who will start with equal veto power and set the "processes and governance" going forward.

This motion comes with the implicit understanding that conflicts need to be discussed (perhaps offline) till resolution.

Given that there are currently 5 official maintainers and no clear governance model, I am not sure what the official rules are for such a change. So, in lieu of any guidance I will ask that someone second this motion and then the (current) maintainers will vote and I hope there will be no objections. (I already spoke in private with the 3 nominations and have their personal agreement to the arrangement).

I hope I'm not making too much of a fool of myself here....

cc:

@tfenne
@ktibbett
@alecw
@bradtaylor
@nh13

@bradtaylor

bradtaylor commented May 10, 2017

I second this proposal. I am no longer in a position to offer administrative support to this very important software project, and I wish to be removed as an admin.

The proposed set of three maintainers are deeply knowledgeable about this repository and its goals / design standards. I trust them to engage with the developer community and propagate a governance model that ensures the improved health of this project and its role in the broader genomics software ecosystem.

Thank you for the suggestion @yfarjoun. This proposal has my vote.

@nh13
Member

nh13 commented May 10, 2017

I agree.

@vdauwera

vdauwera commented May 10, 2017 via email

@magicDGS
Member

I agree with speeding up this issue. As a contributor and user of the library, I would like to have this solved soon because I have several projects depending on this....

@ktibbett
Contributor

👍

@tfenne
Member Author

tfenne commented May 10, 2017

👍

@lbergelson
Member

👍

@jacarey
Contributor

jacarey commented May 10, 2017

adding my 👍

@alecw
Contributor

alecw commented May 10, 2017 via email

@droazen
Contributor

droazen commented May 11, 2017

👍

@tfenne
Member Author

tfenne commented May 14, 2017

@yfarjoun I think I see 👍 from all the existing maintainers, and proposed future maintainers, and most of the folks who joined the conversation too. Are we ready to move forward, or were there any last people you wanted to make sure we heard from?

@yfarjoun
Contributor

yfarjoun commented May 15, 2017 via email

@tfenne
Member Author

tfenne commented May 15, 2017

Thanks @yfarjoun. I think the only major thing that needs to happen that I cannot make happen is that all three of @jacarey, @lbergelson and myself should have access to edit who has access to the project. This is currently administered through a pair of groups in the samtools org, Java admins and Java developers. It looks like @jacarey is already an "Owner" within samtools, but that @lbergelson and myself are only "Member"s. Can you or @jacarey upgrade @lbergelson and me, please?

After that I would suggest that:

  1. The three new maintainers have a couple of brief off-line conversations about boot-strapping the governance process
  2. We then put in place a very minimal process for the project, and use that process to reasonably rapidly evolve and flesh out the full set of governance processes for the project.

@jacarey
Contributor

jacarey commented May 15, 2017

@tfenne and @lbergelson are now "Owner" as well.

@d-cameron
Contributor

@tfenne has there been any progress on defining the project governance? I'm particularly interested in how the project roadmap (if any) is defined, and in the processes around accepting PRs from community members who do not have any historical links to the Broad (such as myself).

@magicDGS
Member

I am actually quite interested in this too, because I am in the same position as @d-cameron. Something like what is happening at the Hadoop-BAM project would be nice, including an online meeting to make some decisions.

I would definitely like to move HTSJDK forward to an interface-based library with SemVer (from v3 onwards), but this does not look likely to happen unless the governance process is defined and a roadmap written. For example, long-standing PRs which might be useful for the community are not reviewed (some of my own examples: tribble writing support in #822 or compressed reference FASTA support in #1014) - even if they do not break compatibility with previous versions (and obviously are not part of v3).

From my point of view, this project needs the same number of Broad and non-Broad maintainers, plus designated community reviewers, to move some developments forward (although ultimate decisions might be made by maintainers).

@droazen
Contributor

droazen commented Apr 20, 2018

Speaking on behalf of the Broad, we've been waiting for some people to free up on our end (in particular, @lbergelson and @cmnbroad) before moving forward with this and starting the process of drafting a roadmap for HTSJDK 3.0 with input from the community. Obviously our participation in this effort has been delayed considerably by the recent GATK 4.0 release and its aftermath. The good news, though, is that @lbergelson is expected to free up sometime in May, and will be able to devote full-time effort to this project for the foreseeable future.

@magicDGS
Member

@droazen - thanks for the update. It will be great if htsjdk 3 moves forward!
