Opened 4 months ago

Last modified 3 weeks ago

#160 new task

Proposal to use GitHub instead of trac

Reported by: jonathan Owned by: cf-conventions@…
Priority: medium Milestone:
Component: cf-conventions Version:
Keywords: Cc:

Description

Dear CF committee and all

I'm pleased to say that Tanya Reshel at PCMDI is making good progress with working through the tickets which had been accepted by the end of Jan, which we agreed would define CF1.7, and implementing them in the new AsciiDoc source of the CF document on GitHub, with some advice from Jeff Painter, David Hassell and me. Thank you, Tanya.

It's been suggested several times, and most recently by Rich Signell on the email list, that we should consider moving CF conventions discussions to GitHub. We agreed as a committee (recorded in cf-trac.llnl.gov/trac/ticket/146) to revisit this issue when we had more experience of managing the CF document on GitHub, so perhaps this is now the time to discuss it. The suggestion is to replace trac tickets, for discussions of changes to the convention, with GitHub issues; the email list, for more general discussion, not directed to agreeing a particular proposal, could continue as it is.

At the moment we have a system, maintained by Jeff, to synchronise the CF email list, maintained at NCAR, with an email distribution list for trac ticket updates, so that everyone receives both after subscribing to the email list. The present system in principle allows people to be on one but not the other, but in practice there is no-one who's chosen to do that, so it does not seem to be a requirement. I think it's important that everyone should be notified of conventions discussions, because otherwise not enough people will engage with them, since they won't know.

Jeff thinks it would be simple to forward all GitHub notifications from CF conventions discussions to the CF email list. People who don't want them can filter them out. Subscribers who are mentioned by GitHub name in an issue would receive two copies; that's a minor inconvenience, but if everyone is getting them anyway there's probably no need to mention anyone by name. Clearly this would be an easier system to maintain. There would probably be no need to have a list of subscribers to CF on GitHub; anyone with a GitHub account could open an issue.

trac is quite familiar to us now and has served our purpose well. GitHub is a bit more complicated because it can do more. GitHub is probably more popular now. Would GitHub be suitable for us, do you think?

It would have the advantage that, when there is text to discuss, the proposer could make a branch of the CF conventions document and edit it, to show exactly what is proposed. I do not think that it should be required to do it that way, though, because (a) it's not always the clearest way to see things, when they're scattered through several parts of the document, (b) it could be an obstacle to some proposers, who would prefer to write out their proposed changes in their postings to the issue. An editor would then still be needed to implement the changes, once agreed, in the document source.

If we decide to migrate, I think we should do it once CF1.7 is finalised, so that there are no agreed tickets to be migrated. We could then leave the trac system in place for reference, but not permit new tickets. Any existing active tickets could be allowed to come to a conclusion on trac.

Standard name proposals could also be done as GitHub issues, I suppose. They come from a much wider range of contributors than conventions proposals. Would GitHub be a barrier for proposers?

Best wishes

Jonathan

Change History (23)

comment:1 Changed 4 months ago by balaji

Dear Jonathan and others: I have been following the exchanges and I am in favour of moving to Github. The branching alone, as outlined, would be a significant step forward toward a way of working on provisional standards within small groups.

The danger is of branches becoming long-lived enough to become more of a fork than a branch. Developers of software that is dependent on a particular version of the conventions have to be aware of the risks of committing themselves to a version not on the trunk; but beyond that I can very easily see a lot more flexibility built into a github-based approach.

comment:2 Changed 4 months ago by ngalbraith

I agree that github is a good option, although I'd like to keep the email list available for standard name proposals. For some people, having to use github might be enough of a challenge that they'd simply bypass the process altogether.

Last edited 4 months ago by ngalbraith

comment:3 Changed 4 months ago by david.arctur

On using github for the discussion of standard names, one part of me prefers the current approach of using the mail list without even trac; it keeps the cognitive barrier as low as possible to encourage the broadest discussion. The other part of me feels that a reference like standard names needs stronger guidance and would perhaps benefit from a more rule-based approach like that used for CSDMS names. How are CSDMS names proposed, discussed & decided?

comment:4 Changed 4 months ago by lowry

GitHub seems very popular in the technical fora that I follow whereas Trac is virtually unknown outside CF. I would therefore support the migration and the clear, thought-through migration strategy proposed by Jonathan. I would also strongly support Nan and David's view that GitHub isn't the place for CF-Standard Name discussions. As they both point out, participation in those discussions should be made as easy as possible. I would also recommend that we decouple any discussions on the Standard Name process and possible alternatives from this ticket.

comment:5 Changed 4 months ago by mggrant

Moving to GitHub is a good idea. It will be more accessible to more people - getting a trac account here is a bit more manual, which inevitably imposes barriers to participation. It'll reduce maintenance burdens for LLNL too.

While I think issues might be a good way to organise standard name requests, there might be some problems ensuring that everyone relevant hears about them and is able to easily respond - the mailing list is certainly simplest. Perhaps trial github with the tickets/issues as a trac replacement first, and consider a process for standard names later if the community seems to like the github workflow.

comment:6 Changed 4 months ago by cameronsmith1

I am using github on a different project. I think the setup and learning curve barriers for basic commenting should be small, especially if/when github gets integrated with the email list. Making basic changes to documents should also be fairly straightforward. Managing branches is powerful, with a bigger learning curve, but that may only be needed by a few people. The process will certainly be smoother if decisions get made on branches quickly.

Overall, I think that a shift to github is a reasonable step.

comment:7 Changed 3 months ago by davidhassell

Thank you, Tanya, for the work you are putting in to CF-1.7.

I'm in favour of moving the trac discussions of changes/defects/etc. to GitHub issues, especially in the light of the forthcoming move of the definitive conventions document to github. I would have thought that the current CF-metadata email list should carry on as is, unchanged - many readers and participants do not want or need to contribute to changes, and so would not then need a github account.

To contribute to an issue, would you need your github account to be registered, in some way, with the cf-conventions repository? There would need to be some access restrictions to protect the master repository which, I think (?), might preclude the creation of branches by anyone. Anyone can always fork the repository, though. (Do correct me if my git know-how is wrong!)

I think that keeping the iterations to changes within the issue is important as there they can be wrapped up with explanations and examples more easily than as edits directly to a fork. Perhaps a guideline could be that new changes are described in the issue's initial statement and accompanied by a fork of the document which contains the proposed changes. The changes would then be iterated in the issue discussion with updates to the fork only happening by agreement, and certainly at the discussion's successful conclusion. Just a thought.

Thanks, David

comment:8 Changed 3 months ago by rsignell

I am in favor of moving technical discussions surrounding both conventions and standard name issues to github issues.

The reasons are:

  • Technical discussion is easier (drag-and-drop screenshots, code syntax highlighting work better)
  • You can bring anyone from the github community into the discussion (people don't need to create a special login on CF to just discuss)
  • Integration with CF convention code on github
  • Github remembers your login information (I always have to dig out my trac user/pass because I hardly ever use it)
  • If you don't want to get email notification from github issues you can turn it off easily.

As others have mentioned, we can maintain the current mailing list for things that are not issue related.

comment:9 Changed 3 months ago by rsignell

Also, it looks like it would be possible to migrate our existing Trac tickets to Github issues: http://stackoverflow.com/questions/6671584/how-to-export-trac-to-github-issues

Rich
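
As a rough illustration of what such a migration involves, the sketch below maps one Trac ticket onto the JSON body that GitHub's REST API accepts when creating an issue (`POST /repos/{owner}/{repo}/issues`, with `title`, `body` and `labels` fields). The function name, the dictionary layout of the ticket and the `trac-` label prefix are assumptions made for this example and are not taken from the linked answer; sending the request, authentication and the migration of comment history are all left out.

```python
import json


def trac_ticket_to_issue(ticket, trac_base_url):
    """Map a Trac ticket (a plain dict, hypothetical layout) to an issue payload."""
    # Keep a link back to the original ticket so the history stays traceable.
    source = f"{trac_base_url}/ticket/{ticket['id']}"
    body = (
        f"*Migrated from Trac ticket #{ticket['id']} ({source}), "
        f"reported by {ticket['reporter']}.*\n\n{ticket['description']}"
    )
    # The label scheme is an assumption: ticket type plus component.
    labels = [f"trac-{ticket['type']}"]
    if ticket.get("component"):
        labels.append(ticket["component"])
    return {"title": ticket["summary"], "body": body, "labels": labels}


# Example values based on this ticket's header; the description is abbreviated.
example = {
    "id": 160,
    "summary": "Proposal to use GitHub instead of trac",
    "reporter": "jonathan",
    "type": "task",
    "component": "cf-conventions",
    "description": "Dear CF committee and all ...",
}
payload = trac_ticket_to_issue(example, "https://cf-trac.llnl.gov/trac")
print(json.dumps(payload, indent=2))
```

Keeping the mapping separate from the network call makes it easy to review the generated issues before anything is posted.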

comment:10 Changed 3 months ago by davidhassell

Just a recap:

There appears to be general support for moving the current Trac tickets to github issues, but there are currently conflicting opinions on whether to move the CF-metadata mailing list to github.

To my mind, the current CF-metadata mailing list comprises general queries (how do I do this?), reconnaissance (would this be a good idea?) and standard name proposals (which do not impact on the conventions document). Comments on trac tickets are not posted there, but are forwarded to this list so that as wide an audience as possible has the potential to comment on convention changes.

All the best,

David

comment:11 Changed 3 months ago by rsignell

I would argue for standard_names to have their own repo, and that we raise requests for new standard names (and discuss) as github issues also.

I would vote for not forwarding all Github issue discussion to email, but letting people watch or unwatch as they desire. (watch sends email notification, unwatch does not)

https://help.github.com/articles/unwatching-repositories/

comment:12 Changed 3 months ago by cameronsmith1

I see the decision on whether to move Trac tickets and standard_name as separate. My recommendation is that we start by moving the Trac tickets. It will take a little while for people to become comfortable with the new system, and to work out kinks. We will then be in a good position to consider whether or not to move standard_name discussions too.

comment:13 Changed 3 months ago by davidhassell

I agree with just dealing with the Trac tickets first.

I'm not sure about making it possible to turn off a discussion by "un-watching" it in github. I think that all github activity should be automatically copied to the CF metadata mailing list, just like the current Trac activity is. I believe that it is important for all potentially interested parties to get a chance to see all parts of a discussion.

All the best,

David

comment:14 Changed 3 months ago by lowry

I totally agree with David. I have never used github but am very interested in CF, especially Standard Name discussions. As I'm unlikely to use github in the future a total switch to github with no e-mail alerts could alienate me from CF.

Cheers, Roy.

comment:15 Changed 3 months ago by jonathan

Comments from Dave Blodgett posted at https://github.com/cf-convention/cf-conventions/issues/106

@rsignell-usgs has urged me to comment on the [thread related to GitHub and its use in place of trac](https://cf-trac.llnl.gov/trac/ticket/160#comment:11). I don’t have a trac account and couldn’t figure out how to sign up, so I've decided to respond in github to demonstrate what it's about. As someone who uses GitHub extensively for project planning/management, as a source code repository, and as a registry for development of an in-process OGC standard, I don’t think it’s worth debating the merits of github’s community facilitation model. Rather, the discussion should be how this community wants to migrate its existing activities to GitHub and how the community wants to leverage the github infrastructure.

A few points to note about github's functionality that may be of use to the community.

  1. The CF email list should probably live on near term, at some level, and repeating GitHub notifications through the list is fine. That said, this is the last email list I’m on and I REALLY wish it would move to a searchable indexed list of issues, as I’d like to get the conversations out in the open and not buried in email formatting and archived inboxes. [Subscribing is really easy!](https://help.github.com/articles/watching-repositories/) [Joining github is too!](https://github.com/join)
  2. [Github issues](https://guides.github.com/features/issues/) work just like email if you want to use them that way. Once you’ve [watched a repository](https://help.github.com/articles/watching-repositories/), you [can respond directly](https://github.com/blog/811-reply-to-comments-from-email) to an issue email and your comment shows up in the issue’s discussion.
  3. Using GitHub is easy if you don’t care to use all the software repository features, e.g. branches. [There’s super simple wiki functionality,](https://guides.github.com/features/wikis/) [forking a project](https://guides.github.com/activities/forking/) and [editing documents in the browser](https://github.com/blog/844-forking-with-the-edit-button) are super simple and you don’t need to know all the complexities behind it.
  4. A lot more cool stuff can be done... and things can get kind of out of hand... [peruse the back issues here](https://github.com/twhiteaker/netCDF-CF-simple-geometry/issues) or [just check out this cherry bomb of a 60-comment thread!](https://github.com/twhiteaker/netCDF-CF-simple-geometry/issues/40)

On and on... Like I said above though, the discussion should be how does the community want to use this system. What the tagging scheme will be, things like repository ownership raised by @marqh in #63, how to deal with stale old pull requests like #35, etc. etc.

Finally, regarding sequencing, I hope we could get 1.7 done and dusted prior to suggesting a full stop change to the infrastructure underlying CF governance. It would make a lot of sense to move 1.7+ into the new space though.

Regards,

Dave

p.s. It's always good practice to finish a new issue with closure criteria so its original intent is clear. This issue can be closed once planning of a process to decide how the community wants to use github has started.

p.p.s. Also, note that there is a “mute the thread” link at the bottom of every GitHub notification as well as a “view it on GitHub” link. These are very convenient.

comment:16 Changed 3 months ago by jonathan

Comments from Jeff Painter posted at https://github.com/cf-convention/cf-conventions/issues/106:

I would like to see a move to Github for specific issues, but I think a mailing list is more convenient for unstructured discussions. I see no reason to stick with Trac for issue tracking, although the transition between issue trackers will take some time.

To sign up for the Trac issue tracker, email me! Instructions are in large blue letters on the bottom of every page. Registration used to be automated but then we got spam. This was the surest way I could see to avoid spam on Trac.

Finally, version 1.7 of the CF Conventions document is well along now. The conformance document will follow.

comment:17 Changed 3 months ago by jonathan

Dear all

I too think it's important to forward the GitHub notifications to the email list. Otherwise there are many people who would never be made aware that a conventions change was under discussion. It's important that these should be brought to everyone's attention. Jeff suggested that people who really did not want to see contributions to conventions discussions could suppress them by using email filters, but at the moment CF mailing list subscribers receive them and no-one has asked to opt out.

I think we should retain the mailing list for unstructured discussions, and for standard name proposals, at least for the moment.

More views are welcome on this ticket.

Best wishes

Jonathan
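
For anyone who does want to filter the forwarded notifications, a client-side rule could look roughly like the sketch below. It rests on two assumptions that would need checking against real forwarded messages: that the list keeps GitHub's sender address (`notifications@github.com`) in the `From` header, and that the subject still carries GitHub's `[owner/repo]` tag. The function name and the two sample messages (including the issue title) are invented for the illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# Assumptions: GitHub's notification sender address and its "[owner/repo]"
# subject tag both survive forwarding through the mailing list.
GITHUB_SENDER = "notifications@github.com"
REPO_TAG = "[cf-convention/cf-conventions]"


def is_github_notification(raw_message):
    """Return True if a raw email looks like a forwarded GitHub notification."""
    msg = message_from_string(raw_message)
    _, address = parseaddr(msg.get("From", ""))
    subject = msg.get("Subject", "")
    return address.lower() == GITHUB_SENDER or REPO_TAG in subject


# Two made-up messages: one notification, one ordinary list posting.
sample = (
    "From: Some Contributor <notifications@github.com>\n"
    "Subject: [cf-convention/cf-conventions] Simple geometries (#109)\n"
    "\n"
    "Comment text\n"
)
plain = (
    "From: A Subscriber <subscriber@example.org>\n"
    "Subject: [CF-metadata] day of year\n"
    "\n"
    "A general question\n"
)
print(is_github_notification(sample), is_github_notification(plain))
```

Most mail clients can express the same two conditions directly in their filter settings, so no script is strictly needed.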

comment:18 Changed 3 months ago by edavis

Hi all,

I would like to see CF transition from Trac tickets to GitHub issues for all new and open issues. I agree that forwarding all GitHub issue notifications to the CF email list would, in many ways, smooth the transition for those not already using GitHub. My only concern is how responses to those forwarded emails from people without GitHub accounts would make their way back into the GitHub issue.

Cheers,

Ethan

comment:19 Changed 3 months ago by jonathan

Dear Ethan

To clarify this point: I think the situation would be much like now, in which all subscribers to the CF email list receive updates to trac tickets by email, but they cannot post updates to trac by email. It seems that this hasn't caused confusion; people have very rarely replied to the email list about trac tickets.

Cheers

Jonathan

comment:20 Changed 3 months ago by apamment

I've been following the discussion in this ticket with interest.

I support the move to Github as a forum to manage proposals for changes to the conventions document. Since the document now resides on Github, along with the pages that form the CF website, there seems little reason to retain trac as an additional technology for discussing issues. Signing up for a Github account is a very simple process and one does not need to be an expert in git/Github to open issues or comment on them. Considering that the discussions tend to involve a relatively small number of contributors I think it is reasonable to ask them to obtain a Github account for this purpose.

I use git/Github in a simple way to keep the standard name table up to date, but have no real experience of using branches. I'm not sure whether using branches would make managing the document easier or more difficult in the long run. Would we end up with a lot of overlapping changes that require manual merging thus making the process of editing the document more complicated than it need be?

I'm opposed to the suggestion that we move standard name discussions to Github. There are a number of reasons for this:
1) Anyone can search for a public Github repository and see associated issues. However, if one tries to open a new issue or comment on an existing one Github asks the user to log in or create an account. It is not uncommon for someone who is very new to CF to propose just one or two new names. Asking them to create a Github account in order to do this is overkill and will frighten some people away. Similarly, someone who wants to contribute a single comment to a discussion (and this does happen on the mailing list) should not be expected to create a Github account.
2) Many of the people who propose names are primarily scientists, not developers. I know that Github is currently the version control tool of choice for many software developers, but many scientists have never heard of it. For example, I recently helped to teach a course to environmental science PhD students in which git version control was one of the topics. At the start of the week only five or six people out of a group of thirty-seven had even heard of Github, let alone used it.
3) Often something that starts on the mailing list as a general discussion along the lines of “how do I do X in CF?” then leads to proposals for a few standard names. Recent mailing list discussions on “day of year” and “tripolar grids” are examples of this. I don't think it's sensible to make people switch from the mailing list to Github issues half way through a discussion as that just spreads the information across multiple sources making it harder to follow, not easier.
4) Discussing standard names as Github issues offers no practical advantage compared to discussing them on the CF mailing list. My primary tool for preparing updates to the standard name table is not in fact git/Github but the CEDA vocabulary editor. This is software written in Python, using Django to create web pages, and sitting on top of a database containing the standard name proposals and the “standard phrases” that are used to form the definition text. This software produces the standard names status pages linked from the CF website on the Discussion and Standard Names pages, e.g. http://cfeditor.ceda.ac.uk/proposals/1?status=active&namefilter=&proposerfilter=&descfilter=&filter+and+display=filter. I use the editor to keep track of all proposals and report their status, to generate the xml file that gets uploaded to the CF website and also to generate files in a format suitable for uploading the standard names to the NERC vocabulary server. When I generate a new version of the xml standard name table I commit to my local git repository (containing a copy of the CF website) and then push to Github, but this is only the final step in a more complex process.

I remember having a conversation very similar to this one when we started using trac and some people thought standard names ought to be proposed in trac tickets. My arguments then were pretty much the same as above. Now we are talking about dropping trac so I'm glad we kept standard names on the mailing list and I think we should continue to do so.

comment:21 Changed 3 months ago by cameronsmith1

FYI: If you get an email from a github discussion, then you can respond by replying to that email, and the message will be inserted into the github discussion (so it isn't necessary to go to github to respond, although some of the interactive features won't work). I just tested this, and it worked smoothly.

Note: I don't know whether it would still work if I didn't have a github account, but it might.

comment:22 Changed 8 weeks ago by jonathan

Dear all

Dave Blodgett has posted his proposed simple geometries convention as a GitHub issue at https://github.com/cf-convention/cf-conventions/pull/109. This is a good example for us. I would like to make detailed suggestions on his text, as we do on trac tickets, but I'm not sure how to go about it. This is what I have commented on GitHub.

Thanks for doing this. I think it's in pretty good shape but I have quite a few detailed comments and suggestions to make, nearly all on the proposed text rather than the convention itself. I'm not sure how to do that in GitHub, so it's a useful exercise to see how this would work. If your proposal was in trac, I would reply to your posting on the ticket, edit the wiki-markup text to show my changes and suggestions for the parts affected, and repost it to the trac ticket. If it was on a wiki, I would make a copy of it on the wiki page, and edit it similarly.

The way GitHub is set up, I suppose the natural way to do it is to make a new branch and edit that, but (a) I don't know how to do that, (b) it's not obvious to me that the changes I suggested would be clear to you. In fact I find the proposal in this form not as easy to follow as it would be in trac. I can view it as deltas of the files, but these have little context and are hard to read as text because the markup isn't translated, or I can read the properly rendered modified files, but these don't show what's been changed, and of course they show much more that isn't affected, and it's several different complete files.

So I'm inclined to think that it would be easier to use GitHub issues in the same way as we use trac. That is, you would post your entire text to the "issue". Then I presume I could copy your posting and edit it, as in trac. Unfortunately, the markup isn't the same, is it - the issues use markdown, I believe, whereas the convention text uses AsciiDoc. This is a technical obstacle. Is there an automatic translator? If not, once the text is agreed, it would have to be manually transposed into the conventions document, as we have been doing from trac.

I wonder if people could comment on these issues about how to use GitHub, in view of its relevance to this ticket. (If you want to comment on Dave's proposal, that would belong on the GitHub issue.)

Best wishes

Jonathan

comment:23 Changed 3 weeks ago by jonathan

In commenting on Dave's geometries proposal (https://github.com/cf-convention/cf-conventions/pull/109) it turned out that I had a lot of suggestions to make. It didn't seem possible to do this with GitHub comments so instead I edited Dave's AsciiDoc text. AsciiDoc is not the same as Markdown, so GitHub can't render it correctly, but it's not very different. Having gone through this exercise, I wonder whether in future it would be best to develop conventions changes on GitHub issues in Markdown format, and convert them to AsciiDoc after agreement. There is at least one program (pandoc) which is said to be able to do this. Or could we store the conventions document in Markdown rather than AsciiDoc? I assume that Markdown doesn't have all the required facilities, since AsciiDoc appears to be more powerful.

Jonathan
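
A minimal sketch of the pandoc route, assuming pandoc is installed and that agreed text has been saved to a Markdown file: pandoc's `-f`, `-t` and `-o` options select the input format, output format and output file. The helper names and the file names `proposal.md` and `proposal.adoc` are hypothetical, and the converted AsciiDoc would still need checking by an editor before it goes into the conventions source.

```python
import shutil
import subprocess


def pandoc_command(src, dst):
    """Build the pandoc call that converts a Markdown file to AsciiDoc."""
    return ["pandoc", "-f", "markdown", "-t", "asciidoc", "-o", dst, src]


def convert(src, dst):
    """Run the conversion; return False if pandoc is not on the PATH."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_command(src, dst), check=True)
    return True


# Hypothetical file names; only the command is printed here, nothing is run.
cmd = pandoc_command("proposal.md", "proposal.adoc")
print(" ".join(cmd))
```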
