GitHub Workflow – Contributing and Diverging from Upstream Repos

etiquette, github, open source, workflows

I'm new to GitHub and VCS in general. I've been programming in various languages for years, but I've always worked solo on custom projects (no public releases). I recently started using a jQuery UI widget I downloaded from GitHub in a project I am working on. The repo is no longer maintained by the original author. Another fork has incorporated some of the original pull requests. This is the one I forked from.

I found a couple of bugs and have come up with the fixes for them. I'd like to contribute these fixes, but I also have a whole lot of other changes I want to make, for our own use, that will break some of the existing features. Plus, I'd like to incorporate an idea from another fork.

I'm still learning Git and GitHub and I'm trying to figure out the best way to go about everything. I've done a lot of reading (here, SO, GitHub help pages, Pro Git) about different concepts/tasks: workflows, merging, pull requests, cherry picking, rebasing, branching. My grey matter is swimming and I need to start doing things so I can better understand what I've read.

Main Issues:

  1. I think I read (somewhere) that you can only have one pull request on a branch at a time. So does that mean I should have a separate branch for each bug and then do a separate pull request for each one?

  2. I want to clean up whitespace issues and I seem to remember reading that it's best to do this in a separate commit. Should I do this in my master or a separate branch? I don't want to do a pull request for something so trivial, but if I make whitespace changes before branching, will that affect the pull request for the bug fixes? Some forks did whitespace cleanup and it effectively made the diff pretty useless.

  3. I was thinking of creating issues against my fork as a way of documenting the bugs even though I already have the fix for them. Is that a good idea? How do I go about linking together the issue, the commit, and the merge to master? If I do a pull request upstream, will my issue appear upstream as well or will that documentation link be lost? I can't open an issue against the upstream repo (there is no issue tab).

  4. What's the best way to give credit to the other fork author for the idea of his that I want to use? I can't use his code exactly, especially since his change is applied against an older version of the upstream and is not compatible with my other changes as is. But I want to use the idea and I want to give credit where credit is due. Should I just link to his repo (or profile or specific commit) in my commit message?

  5. What is the etiquette regarding changing the readme file and the DocBlock at the top of the main file? Is it ok to make changes, add my name, add links to my repo and demo, remove links to the original demo (since my fork will end up being incompatible with the original)? Of course, I will leave the original author name and license information. For the record, it's licensed under the MIT license.

As a solo developer who's never used VCS, I am accustomed to rewriting history. I'm a perfectionist and like things to be neat and tidy. The idea of recorded history is making me a little nervous and I want to do it right the first time. I've created a new repo to play/learn with, but I am anxious to get moving on fixing up the jQuery UI widget so I can move on with my project.

Best Answer

  1. Correct: a pull-request is linked to a branch in your repository. If you modify the branch, you are then also modifying what you're submitting as a pull-request.

    So yes, you do have to create a branch (and pull-request) per bug fix. It might be wise to start with one and see how the maintainer reacts to that one before going on to do the rest. Open source is an inherently social process.

  2. Do make a pull-request for your whitespace changes! Speaking as someone who's sometimes a maintainer, I love these types of pull-requests: I either approve them or don't, and they take little time to process.

    You might also find that the maintainer does not agree with your whitespace changes! So beware.

  3. Hmm... It's not clear what you're trying to achieve here. It sounds like over-documentation and not that good an idea -- maybe you can clarify why you would want to do this?

  4. Linking to his repo in your commit message (or even in a comment in the code) is a great way to give credit. Be careful though -- make explicit that you are thanking him for his ideas and not for his code. If you have copied code, then I would e-mail him about it, unless it's very clear which license he's using for his code. If the licensing is clear (and it's a different license from the repository you're submitting the commit to) then you need to add the different license in your pull-request and also mention that in your pull-request message.

  5. This is a really good question, and the answer differs depending on who you talk to. My opinion is that you should never add your name to any commit or code you write. The main reason is that it implies "ownership of and responsibility for the code" -- it might prevent others from modifying the code because "it's yours". But now we're getting into a huge discussion about the nature of open source, so I'll stop here and say -- ask the project maintainer, or just do it and see what his reaction is.

  6. You can rewrite (your local, not-yet-published) history with Git! Learn the git rebase command -- this is one of the main reasons that I love Git. It's a really bad idea to (force) push rewritten commits/history to a shared repository (GitHub, for example). Doing so will screw with the repositories that the other developers have -- they will have to do difficult things when pulling your (rewritten history) changes.
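To make point 6 a bit more concrete, here is a minimal sketch of tidying local history before anything is pushed. It assumes a throwaway repository created with `mktemp`; the file name, commit messages, and identity are made up for illustration. Rather than editing the rebase todo list by hand, it uses `git commit --fixup` together with `git rebase -i --autosquash`, which is one of several ways to fold a follow-up change into an earlier commit.

```shell
set -e

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity
git config user.email "dev@example.com"

echo "v1" > widget.js                       # hypothetical file
git add widget.js
git commit -q -m "Initial import"

echo "v2" > widget.js
git commit -q -am "Fix off-by-one in pager"

# The fix turned out to be incomplete. Record the follow-up as a
# fixup of the previous commit instead of a separate "oops" commit.
echo "v3" > widget.js
git commit -q -a --fixup=HEAD

# Fold the fixup into the commit it amends. GIT_SEQUENCE_EDITOR=true
# accepts the generated todo list as-is, so no editor is opened.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

# History now shows two tidy commits instead of three.
git log --oneline
```

This is only safe because none of these commits have been pushed. Once a branch is public, the rewritten commits get new hashes, which is exactly what causes trouble for everyone who already pulled the old ones.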

[#6: Thanks @toxalot!]
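For points 1 and 2, the branch-per-fix layout might look roughly like the sketch below. It assumes a throwaway repository whose `master` stands in for the code as you forked it; the branch names, file contents, and commit messages are invented for the example. The key idea is that every branch starts from the untouched `master`, so each pull request carries only its own change and the whitespace cleanup never pollutes the bug-fix diffs.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # name the initial branch "master"
git config user.name "Example Dev"          # placeholder identity
git config user.email "dev@example.com"

# Stand-in for the widget as forked; line 1 has trailing spaces.
printf 'var pages = total / size;   \nvar first = 1;\n' > widget.js
git add widget.js
git commit -q -m "Import widget"

# Bug fix 1 gets its own branch, started from the untouched master.
git checkout -q -b fix-page-count master
printf 'var pages = Math.ceil(total / size);   \nvar first = 1;\n' > widget.js
git commit -q -am "Round page count up"

# Bug fix 2 also starts from master, so it does not drag fix 1 along.
git checkout -q -b fix-first-index master
printf 'var pages = total / size;   \nvar first = 0;\n' > widget.js
git commit -q -am "Start page index at zero"

# Whitespace cleanup lives on a third branch, in a commit of its own.
git checkout -q -b whitespace-cleanup master
printf 'var pages = total / size;\nvar first = 1;\n' > widget.js
git commit -q -am "Strip trailing whitespace"

# Each branch is exactly one commit ahead of master.
for b in fix-page-count fix-first-index whitespace-cleanup; do
  echo "$b: $(git rev-list --count master.."$b") commit(s) ahead of master"
done
```

Each branch can then be pushed to your fork and opened as a separate pull request. Your own incompatible changes can live on yet another branch (or on your fork's `master`) without affecting what you submit upstream.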