Need advice on contributing

Hello all,

I have recently started to contribute to pytorch, and some of the doubts I have are:

  1. Every time, I have to pull from upstream into my forked repo and then commit my changes. This causes at least 2+ commits every time I push to my fork. Most other pull requests I have seen contain at most 1 or 2 commits, so how do they keep the number of commits down like that?

  2. I want to build the repo locally so that I can test the code I have written, but this is not possible on my potato PC. Can I download the wheels from the tests that run every time I push changes to the PR?

  3. On the PR page, can I ask pytorchbot to run just a couple of specific tests instead of the full suite? I just want to check whether a particular part of the code works/compiles.


Hi,

Thanks for contributing!

  1. To update my branches with upstream, I usually “pull master”, then “checkout my_branch”, then “rebase master”. I’m sure you’ll find a better explanation of rebase on Google than I can give, but basically it takes master as the base and replays your commits on top of it, so there is no merge commit!

  2. As far as I know, the tests don’t actually build wheels; they do an install from source every time.
    On your local machine, if you use ninja, incremental builds are really good now: the first build takes a (very) long time, but later ones are very fast.
    Also, for debugging purposes, I usually build with USE_DISTRIBUTED=0 USE_CUDA=0. In particular, if you don’t work on CUDA code, this speeds up the build A LOT.

  3. I don’t think you can ask the bot to do this. It is really built to run the full test suite, not to check whether the code compiles :confused:
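For point 2, the CPU-only build could look roughly like the sketch below. The two environment variables come from the reply above; `python setup.py develop` is an assumption about the build entry point (check CONTRIBUTING for the command that matches your checkout), and the `setup.py` guard is only there so the snippet does nothing when run outside a source tree.

```shell
# Run from the root of a PyTorch source checkout.
# Disable the distributed and CUDA parts of the build to shorten compile time.
export USE_DISTRIBUTED=0
export USE_CUDA=0

if [ -f setup.py ]; then
    # Assumed entry point for a development (in-place) build.
    python setup.py develop
else
    echo "setup.py not found: run this from the PyTorch source root"
fi
```

If ninja is installed, `ninja --version` prints its version, which at least confirms it is on your PATH before the build starts.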

Thanks for the info. Now I have a couple more questions: I have already made 15+ commits, so do I have to squash them? And if yes, how?

also, can you please give me a rundown of how you would approach submitting a PR?

right now, how I did it was:

  1. fork pytorch
  2. clone my forked copy
  3. write my code and then push to master of my forked copy
  4. opened a PR at the issue
  5. now, every time I just keep pushing to my fork’s master and it triggers the whole CI suite

I am not sure this is the ideal way to do it.

EDIT: also, how do I substitute ninja for cmake? I have it installed, but I am not sure the build is actually using it.

New feature ideas are best discussed on a specific issue. Please include as much information as you can, any accompanying data, and your proposed solution. The PyTorch team and community frequently reviews new issues and comments where they think they can help. If you feel confident in your solution, go ahead and implement it.

This is from the official guide.

I am sure you also checked the CONTRIBUTING document.
It contains the procedure explained by @AlbanD.

I want to specifically know if you create a branch for your feature in your fork, and then pull from upstream and then “git rebase master” on that… I am very confused on this.

Here is what I would do:

  • Create a new “feature” branch called my-new-feature from a base branch, such as master or develop
  • Do some work and commit the changes to the feature branch.
  • Push the feature branch to the centralized shared repo.
  • Open a new Pull Request for my-new-feature
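The steps above can be tried out end to end in a throwaway sandbox. This is only a sketch: the bare repository `fork.git` is a hypothetical stand-in for a fork on GitHub, and the remote name `fork`, the file names, and the commit messages are made up for illustration.

```shell
set -e
# Identity for the demo commits (placeholder values).
export GIT_AUTHOR_NAME="Example Contributor" GIT_AUTHOR_EMAIL="contributor@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# A bare repo stands in for the fork hosted on GitHub.
git init -q --bare fork.git
git -C fork.git symbolic-ref HEAD refs/heads/master

# Local working repo with the fork as a remote.
git init -q clone
cd clone
git symbolic-ref HEAD refs/heads/master
git remote add fork "$work/fork.git"

# A first commit on master so there is a base to branch from.
echo "base" > README
git add README
git commit -q -m "Initial commit"
git push -q fork master

# 1. Create the feature branch from master.
git checkout -q -b my-new-feature master
# 2. Do some work and commit it on the feature branch.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add my new feature"
# 3. Push the feature branch (not master) to the fork.
git push -q fork my-new-feature
# 4. The pull request is then opened from my-new-feature in the web UI.

pushed=$(git ls-remote --heads fork my-new-feature | wc -l | tr -d ' ')
echo "feature branches on fork: $pushed"
```

Keeping the work on its own branch leaves the fork's master untouched, so it can keep following upstream.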

When I looked deeper, it turned out this is the old way to do it. This document explains the new way.

Same thing here.
The trick is to have one repo with two remotes, so that in my case:
origin/master points to pytorch’s master branch
albanD/my-feature points to a branch in my fork.
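A minimal sketch of that two-remote layout, using local bare repositories as stand-ins so it can be run anywhere. The remote name `myfork` and the directory names are hypothetical; in a real setup `origin` would point at the pytorch repository and the second remote at your own fork's URL.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-ins: upstream.git plays pytorch's repo, fork.git plays the personal fork.
git init -q --bare upstream.git
git init -q --bare fork.git

# One local repository that knows about both.
git init -q pytorch
cd pytorch
git remote add origin "$work/upstream.git"
git remote add myfork "$work/fork.git"

# List the configured remotes and their URLs.
git remote -v
remote_count=$(git remote | wc -l | tr -d ' ')
echo "number of remotes: $remote_count"
```

With this layout you pull from `origin` and push feature branches to the fork remote.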

If I want to update with pytorch’s master:

git checkout master
git pull
git checkout my-feature
git rebase master

And now your branch is up to date with master.
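To see the effect without touching a real checkout, here is a self-contained sketch that replays the same four commands in a sandbox. Everything around them is an assumption for illustration: the bare repos stand in for pytorch and the fork, the remote name `myfork` is made up, and a second clone simulates someone else landing a commit upstream. It also shows one step that is not mentioned above: because a rebase rewrites the branch, a branch that was already pushed needs a forced push afterwards.

```shell
set -e
export GIT_AUTHOR_NAME="Example Contributor" GIT_AUTHOR_EMAIL="contributor@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master
git init -q --bare fork.git

# Local repo: origin is the upstream stand-in, myfork is the fork stand-in.
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/upstream.git"
git remote add myfork "$work/fork.git"
echo "v1" > core.txt
git add core.txt
git commit -q -m "Upstream: initial commit"
git push -q -u origin master

# Feature work on its own branch, pushed to the fork.
git checkout -q -b my-feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git push -q myfork my-feature

# Meanwhile, someone else lands a commit upstream.
git clone -q "$work/upstream.git" "$work/other"
cd "$work/other"
echo "v2" > core.txt
git commit -q -am "Upstream: update core"
git push -q origin master
cd "$work/repo"

# The update sequence from the post.
git checkout -q master
git pull -q --ff-only
git checkout -q my-feature
git rebase -q master

# No merge commits were created on the feature branch.
merges=$(git rev-list --merges --count master..my-feature)
echo "merge commits on my-feature: $merges"

# The branch history changed, so the fork needs a (guarded) forced update.
git push -q --force-with-lease myfork my-feature
```

`--force-with-lease` refuses to overwrite the remote branch if it has changed since you last fetched it, which makes it a safer choice than a plain `--force`.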

Also, don’t worry about having too many commits on the PR: our merging process will take care of cleaning up any extra commits!
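If you would still like to tidy the history yourself, one common approach (not something the thread prescribes) is a soft reset back to the point where the branch left master, followed by a single new commit. A sketch in a throwaway repo, with made-up branch, file, and commit names:

```shell
set -e
export GIT_AUTHOR_NAME="Example Contributor" GIT_AUTHOR_EMAIL="contributor@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# Three small work-in-progress commits on a feature branch.
git checkout -q -b my-feature
for i in 1 2 3; do
    echo "step $i" >> feature.txt
    git add feature.txt
    git commit -q -m "WIP $i"
done
before=$(git rev-list --count master..my-feature)

# Move the branch back to where it forked from master, keeping all changes staged,
# then record them as a single commit.
git reset -q --soft "$(git merge-base master my-feature)"
git commit -q -m "Add my feature"
after=$(git rev-list --count master..my-feature)

echo "commits before: $before, after: $after"
```

An interactive rebase (`git rebase -i master`) gives finer control when only some of the commits should be combined.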

thanks, I get it now :slight_smile: