TL;DR
Create a new Git repository:
$ git init
or copy a Git repository:
$ git clone <URL>
Submit changes:
$ git add <path> [<path> ...]
$ git commit
$ git push
Get and apply changes:
$ git pull
Git is a distributed version control system (DVCS) designed to handle everything from small to very large projects with speed and efficiency. It is a widely adopted version control system (VCS), and the tool we have chosen within SNO+ for all code development, with hosting on GitHub. If you are unfamiliar with version control systems, refer to What is Version Control?.
Git has detailed documentation, and a free book (Pro Git) available online in many formats. Andy Mastbaum has written a fantastic overview of Git for SNO+, SNO+-doc-1462; it is a suggested read. Otherwise, use this page for reference.
Your identity
Git has many configuration options. You are welcome to tweak your setup as you see fit. The first thing you should do when using Git is to set your identity; it is important to know who committed and pushed each change to the codebase.
$ git config --global user.name "John Doe"
$ git config --global user.email j.doe@lancaster.ac.uk
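As a minimal sketch, the same settings can also be applied to a single repository by dropping --global, and read back to check them. The scratch directory and the e-mail address below are placeholders, not values used by SNO+.

```shell
# Work in a throwaway repository so no real configuration is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Without --global the values are stored in this repository's .git/config
# and take precedence over the global ones.
git config user.name "John Doe"
git config user.email "j.doe@example.com"

# Read the effective values back.
name="$(git config user.name)"
email="$(git config user.email)"
echo "$name <$email>"
```

A per-repository identity can be handy when one machine is used for several projects with different e-mail addresses.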
Creating a Git Repository
To make your current working directory into a Git repository, you only need to run,
$ git init
which creates the Git directory, .git, where all the metadata and the object database for your project reside. Do not delete this directory, or change anything within it, unless you know what you are doing. It also creates the master branch and sets it as the current branch.
If you have a repository already set up somewhere else (another computer, GitHub, etc.) that you would like a copy of, you clone it,
$ git clone <URL>
which copies the repository at URL, checks out its default branch (usually master), and sets up URL as the remote repository named origin; for example,
$ git clone git@github.com:snoplus/rat.git
will make a copy (clone / local fork) of the SNO+ RAT repository.
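Cloning also works with a path on the local disk, which makes it easy to try without network access. The following is a sketch only: the temporary directories and the README file are invented for illustration.

```shell
# Build a small source repository to stand in for a remote.
src="$(mktemp -d)"
cd "$src"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"
echo "hello" > README
git add README
git commit -q -m "Add README"

# Clone it by path; Git records the source as the remote named origin.
dst="$(mktemp -d)/copy"
git clone -q "$src" "$dst"
cd "$dst"
remotes="$(git remote)"
origin_url="$(git remote get-url origin)"
echo "$remotes -> $origin_url"
```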
Three States
It is important to know that Git has three main states that your files can reside in: committed, modified, and staged.
- committed
  The data is safely stored in your local database
- modified
  You have changed the file, but have not committed it to your database yet
- staged
  You have marked a modified file in its current version to be in the next commit (snapshot)
The working directory is a single snapshot from your database. When you check out a snapshot, the files are pulled out of the compressed database in the Git directory and placed on disk for use and modification.
The staging area is a file within your Git directory (.git/index); it stores information about your next commit.
A simple workflow would be:
- You modify your files in your working directory
  $ vim changed-file.cxx
- Stage the files that you have modified, which adds the files to the staging area
  $ git add changed-file.cxx
- Commit the changes, which takes the files as they were when staged, and stores that snapshot permanently in your database
  $ git commit -m "Fix issue #42"
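The workflow above can be traced with git status --short, which prints a two-column code per file. This is a sketch in a scratch repository; notes.txt and the identity values are made up for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"

echo "first draft" > notes.txt
untracked="$(git status --short)"      # new file, not yet tracked

git add notes.txt
staged="$(git status --short)"         # staged for the next commit

git commit -q -m "Add notes"
committed="$(git status --short)"      # clean: nothing to report

echo "second draft" >> notes.txt
modified="$(git status --short)"       # changed since the last commit

printf '[%s]\n' "$untracked" "$staged" "$committed" "$modified"
```

The first column of the code describes the staging area and the second the working directory, so "A " means added to the staging area and " M" means modified but not staged.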
Just running git commit will open your editor for a commit message. Git picks the editor from your environment, typically the EDITOR environment variable. If you wish to override this editor for Git only, you can set it via:
$ git config --global core.editor <editor-command>
Please write good commit messages; each should be a record of what was changed and why. How to Write a Git Commit Message is a blog post which summarises the preferred method of writing commit messages.
In short, treat the commit message as an email, with the first line being the subject:
- Separate subject from body with a blank line
- Limit the subject line to 50 characters
- Capitalise the subject line
- Do not end the subject line with a period
- Use the imperative mood in the subject line
- Wrap the body at 72 characters
- Use the body to explain what and why rather than how
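One way to follow these rules from the command line, sketched here with an invented file and message, is to pass -m twice: the first value becomes the subject and the second the body, separated by a blank line.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"

echo "timeout = 30" > settings.cfg
git add settings.cfg

# Subject: imperative, capitalised, no trailing period, under 50 characters.
# Body: explains what changed and why, wrapped at 72 characters.
git commit -q \
    -m "Add default connection timeout" \
    -m "Requests previously waited indefinitely when the server was
unreachable. A 30 second default lets callers fail and retry."

subject="$(git log -1 --format=%s)"
body="$(git log -1 --format=%b)"
echo "$subject"
echo "$body"
```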
Status and Changes
- git status
  Shows a summary of what files have been added, modified, staged, and any untracked files (usually new files). It even gives the relevant commands to add and remove files from the staging area, in case you have forgotten.
- git diff [--staged] [<ref>] [<path> ...]
  Shows the unstaged changes in the working directory. If ref is given, the working directory is compared with ref; if it is omitted, it is compared with the staging area. If --staged is given, it shows the staged changes relative to ref (HEAD by default) instead. If paths are given, only the changes to those paths are shown.
- git diff <ref1> <ref2> [<path> ...]
  Shows all the changes between ref1 and ref2. If paths are given, only the changes to those paths are shown.
- git log
  Shows the commit history of the repository
Here, refs are references to commits. A ref can be a hash, a tag name, a branch name, or a special identifier such as HEAD.
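To make the different comparisons concrete, here is a small sketch with one staged and one unstaged change. The file names are invented, and --name-only is used only to keep the output short.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"

echo "one" > a.txt
echo "one" > b.txt
git add a.txt b.txt
git commit -q -m "Add a and b"

echo "two" >> a.txt        # this change will be staged
echo "two" >> b.txt        # this change stays unstaged
git add a.txt

unstaged="$(git diff --name-only)"            # working directory vs staging area
staged="$(git diff --staged --name-only)"     # staging area vs HEAD
all_changes="$(git diff --name-only HEAD)"    # working directory vs HEAD

echo "unstaged: $unstaged"
echo "staged:   $staged"
echo "all:      $all_changes"
```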
Undoing Things
A common mistake is forgetting a file, or having a typo, in your most recent commit; you can correct the last commit with the --amend option. Just make the relevant changes, stage them, and:
$ git commit --amend
However, if you have already pushed your commits to a server, it is suggested that you do not edit the history in any way. There is an advanced command (which will not be covered here), git rebase, which gives the user the ability to change anything in the history. Do not use it unless you are knowledgeable, and are certain the changes you make will affect only commits that are purely local to your version.
If you have staged a file by mistake, you can remove it with a quick:
$ git reset HEAD <file>
If you have modified a file and decide that was not what you wanted, you can just check out the file again. Note that the changes you made will be lost, so make sure you want to do this.
$ git checkout -- <file>
Anything that has been committed in Git can almost always be recovered, even if you believe you have lost it; however, if you lose something you haven’t committed, it is lost forever — commit early, commit often!
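The three undo operations above can be tried safely in a scratch repository. This is a sketch with invented file names; --no-edit is added to the amend only so that no editor opens.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"

echo "v1" > code.txt
git add code.txt
git commit -q -m "Add code"

# Amend: fold a forgotten file into the previous commit.
echo "docs" > README
git add README
git commit -q --amend --no-edit
commits="$(git rev-list --count HEAD)"       # still a single commit

# Unstage: the edit stays in the working directory.
echo "v2" > code.txt
git add code.txt
git reset -q HEAD code.txt
staged="$(git diff --staged --name-only)"    # nothing staged any more

# Discard: restore the committed version (the edit is lost).
git checkout -- code.txt
content="$(cat code.txt)"

echo "commits=$commits staged='$staged' content=$content"
```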
Working with Remote Repositories
Remote repositories are versions of your project hosted somewhere else; usually these are your fork of the project hosted on some service like GitHub, the original/upstream project, or a collaborator’s version that you may wish to use something from.
We use remotes a lot in SNO+; it is the way we update our code to the upstream version, or submit our changes for others to use. In SNO+ a remote named origin usually refers to our own fork on GitHub, whereas upstream usually refers to the SNO+ version on GitHub.
To list what remotes you currently have set up, use git remote; you might find git remote -v more useful, as it lists the URLs of the remote repositories as well.
Adding Remotes
Adding a remote is fairly simple, providing you know the URL of the repository.
$ git remote add <remote-name> <url>
For example, to add the upstream development version of SNO+ RAT to your local version:
$ git remote add upstream git@github.com:snoplus/rat.git
Fetching and Pulling from Remotes
To get the data from a remote you need to fetch it,
$ git fetch [--all | <remote-name>]
if remote-name is omitted, the remote tracked by the current branch is used (origin if there is none); if --all is given, all remotes are fetched. Once you have fetched the data, you can merge it into your repository.
This is common practice, and if you have a local branch tracking a remote branch, you can simply run:
$ git pull
Which is shorthand for:
$ git fetch && git merge FETCH_HEAD
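The two steps can be observed separately using a local directory as the remote. The sketch below assumes nothing about SNO+ repositories; the directory names, file and identity are placeholders.

```shell
# Identity for the commits made below (placeholder values).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="j.doe@example.com"
export GIT_COMMITTER_NAME="John Doe" GIT_COMMITTER_EMAIL="j.doe@example.com"

work="$(mktemp -d)"
git init -q "$work/remote"
echo "one" > "$work/remote/file.txt"
git -C "$work/remote" add file.txt
git -C "$work/remote" commit -q -m "Add file"

git clone -q "$work/remote" "$work/local"

# A new commit appears on the remote after cloning.
echo "two" >> "$work/remote/file.txt"
git -C "$work/remote" commit -q -am "Extend file"

cd "$work/local"
git fetch -q                                 # download; local branch unchanged
before="$(git rev-list --count HEAD)"
git merge -q FETCH_HEAD                      # fast-forward to what was fetched
after="$(git rev-list --count HEAD)"
echo "commits before merge: $before, after merge: $after"
```

Fetching first and merging later gives you a chance to inspect the incoming changes before they touch your branch.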
Pushing to Remotes
$ git push [<remote-name>] [<branch-name>]
Renaming and Removing Remotes
Sometimes you want to rename a remote,
$ git remote rename <old-remote-name> <new-remote-name>
this is suggested if you are using snoing to install rat-dev, as origin is set to the SNO+ version, rather than your fork:
$ git remote rename origin upstream
$ git remote add origin git@github.com:USERNAME/rat.git
Once you are done with a remote, you can delete it, with either a short or long form:
$ git remote rm <remote-name>
$ git remote remove <remote-name>
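The add, rename and remove commands can be rehearsed with local bare repositories standing in for GitHub. In this sketch official.git and fork.git are hypothetical stand-ins for the SNO+ repository and your fork.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/official.git"      # stand-in for the SNO+ version
git init -q --bare "$work/fork.git"          # stand-in for your own fork
git init -q "$work/local"
cd "$work/local"

# Start as snoing would leave things: origin points at the official version.
git remote add origin "$work/official.git"

# Rename it, then point origin at the fork instead.
git remote rename origin upstream
git remote add origin "$work/fork.git"
remotes="$(git remote)"
git remote -v

# Remove a remote that is no longer needed.
git remote remove upstream
remaining="$(git remote)"
```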
Hash
You should be aware that everything is check-summed using a SHA-1 hash in your database; thus everything in Git can be referenced with a 40-character string of hexadecimal characters (0-9, a-f). It is unlikely you need to use all 40 characters for a unique reference, usually 7 characters is more than sufficient; e.g., using d0bc7a2 over d0bc7a204e36b47e8855d0f1e511c8c97d259323.
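As a quick sketch, git rev-parse prints both the full and an abbreviated hash for a commit, and accepts the abbreviation back as a reference. The scratch repository below is only for illustration, and assumes Git's default SHA-1 object format.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"
echo "data" > file.txt
git add file.txt
git commit -q -m "Add file"

full="$(git rev-parse HEAD)"                 # full 40-character hash
short="$(git rev-parse --short=7 HEAD)"      # unique prefix, at least 7 characters
echo "$full"
echo "$short"

# The abbreviation resolves to the same commit.
resolved="$(git rev-parse "$short")"
```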
Tagging
Tags are a useful tool for referring to a specific point in the history of your repository; commonly used for versioning. A tag can be used like any other reference, e.g., be checked out,
$ git checkout 5.3.2
however it is usually better to check out the tag to a branch.
- git tag
  lists available tags
- git tag -a <tag> [<ref>]
  will create a tag, tag, for ref, or HEAD if ref is omitted
- git show <tag>
  will print a summary of the tag, tag
Tags are not automatically shared when pushing, so you need to pass the --tags option, e.g.,
$ git push origin --tags
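A sketch of the full tag round trip, using a local bare repository as the remote. The tag name v1.0.0, the branch name from-v1 and the directories are invented for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "John Doe"
git config user.email "j.doe@example.com"
echo "release" > file.txt
git add file.txt
git commit -q -m "Add file"

# Create an annotated tag on HEAD; -m avoids opening an editor.
git tag -a v1.0.0 -m "First release"
tags="$(git tag)"
tagged="$(git rev-parse 'v1.0.0^{commit}')"  # the commit the tag points at
head="$(git rev-parse HEAD)"

# Check the tag out onto a branch rather than directly.
git checkout -q -b from-v1 v1.0.0

# Tags only reach the remote when asked for explicitly.
remote="$(mktemp -d)/remote.git"
git init -q --bare "$remote"
git remote add origin "$remote"
git push -q origin --tags
pushed="$(git -C "$remote" tag)"
echo "local: $tags, remote: $pushed"
```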
Branches
A lot of other VCSs store information as a list of file-based changes over time (diffs), whereas Git stores information as a list of snapshots of the state of the files, without storing unchanged files again. By treating data like this, Git is like a mini filesystem; this gives rise to cheap and simple branching and merging.
The term ‘branch’ is used as you can think of the Git commit history as an ancestry tree; each commit is a node, and a branch indicates a divergence in the code.
In terms of the trivial local version control system, you would have one directory filled with timestamped subdirectory copies of different versions of the codebase (commits); the parent directory to these timestamped subdirectories would be akin to a branch.
Let’s say you want to make a new feature, which would involve a lot of refactoring; you want to keep the code in its current state, as you want to continue developing it. You would make a copy of that parent directory and continue editing both simultaneously; one master directory (branch), and another feature directory (branch).
This analogy should give you a feel of what branches are; another copy of your code for parallel development. In Git they are a little different to the above explanation, but are akin to it.
You start with a master branch, but wish to make some feature, whilst keeping the working version you have on master,
$ git branch <new-branch> [<ref>]
will create a new branch named new-branch, based off ref, or HEAD if ref is omitted,
$ git checkout -b <new-branch> [<ref>]
would also change to the new branch, and is equivalent to:
$ git branch <new-branch> [<ref>]
$ git checkout <new-branch>
- git branch
  will list all local branches
- git branch -d <branch>
  will delete a local branch, branch, if it has been merged
- git branch -D <branch>
  will delete a local branch, branch, even if it hasn’t been merged
- git checkout <branch>
  will change the current branch to branch if it exists; if it doesn’t but the remote branch origin/branch exists, then it is created and set to track origin/branch
- git branch -u <upstream-branch> [<branch>]
  will set the upstream branch of branch (or the current branch if omitted) to upstream-branch
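The commands above fit together as in the following sketch, which creates a branch, commits on it, merges it back and deletes it. The branch name feature and the files are invented; git symbolic-ref is used only to make sure the first branch is called master whatever the local defaults are.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the initial branch master
git config user.name "John Doe"
git config user.email "j.doe@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Add file"

# Create and switch to a new branch, then commit on it.
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Back on master the new file is absent until the branch is merged.
git checkout -q master
branches="$(git branch --format='%(refname:short)')"
git merge -q feature                         # fast-forward

# A merged branch can be deleted with the safe -d form.
git branch -q -d feature
remaining="$(git branch --format='%(refname:short)')"
echo "before: $branches"
echo "after:  $remaining"
```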
Merging
Once you have decided your feature is completed and you want to merge those changes into your master branch,
$ git checkout master
$ git merge <branch-name>
this also works with remote branches.
Conflicts
Development moves fast; you should stay up-to-date and merge in upstream changes often to ensure there are minimal conflicts and incompatible changes.
$ git fetch upstream
$ git merge upstream/master
It is still possible that you will get the occasional conflict, even if you try to stay as up-to-date as you can.
When merging, Git will inform you if there is a conflict, and where. This occurs when the same part of the same file has been changed differently on each side. git status will give you more information, and instructions on what to do.
The merge keeps both versions; opening the file, you will see the conflicting code in the form:
<<<<<<<
version in the current branch (yours)
=======
version in the merging branch (theirs)
>>>>>>>
As shown, between <<<<<<< and ======= is the code in your branch, the current branch; whereas the code between ======= and >>>>>>> is the code in their branch, the branch you are merging into the current branch.
Edit this block of code, removing the <<<<<<<, =======, and >>>>>>> lines, so that it reflects the correct resolution; then stage the file and commit.
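A conflict is easy to provoke deliberately, which is a safe way to practise resolving one. In this sketch the file settings.txt and the branch feature are invented, and the resolution simply keeps the incoming version.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the initial branch master
git config user.name "John Doe"
git config user.email "j.doe@example.com"
echo "colour = red" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# Change the same line differently on two branches.
git checkout -q -b feature
echo "colour = green" > settings.txt
git commit -q -am "Use green"
git checkout -q master
echo "colour = blue" > settings.txt
git commit -q -am "Use blue"

# The merge stops and exits non-zero; || true keeps the script going.
git merge feature > /dev/null 2>&1 || true
conflicted="$(git diff --name-only --diff-filter=U)"
markers="$(grep -c '^<<<<<<<' settings.txt)"
echo "conflict in: $conflicted"

# Resolve: write the wanted content without markers, stage, commit.
echo "colour = green" > settings.txt
git add settings.txt
git commit -q -m "Merge feature, keeping green"
parents="$(git log -1 --format=%P | wc -w | tr -d ' ')"
```

The resulting commit has two parents, one from each branch, which is what marks it as a merge in the history.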
Squashing
An untidy commit history can be hard to follow. If you have lots of small, trivial changes, you can commit them together as one squashed commit, which combines your changes and gives you an opportunity to give a meaningful commit message, summarising all the changes.
To squash your changes onto the current upstream master, in their own branch:
$ git fetch upstream
$ git checkout -b feature-squash upstream/master
$ git merge --squash feature
$ git commit
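The effect of a squash can be checked locally. In the sketch below the local master branch stands in for upstream/master (an assumption made so that no remote is needed), and the three small commits on feature are invented.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the initial branch master
git config user.name "John Doe"
git config user.email "j.doe@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Add file"

# Three small commits on a feature branch.
git checkout -q -b feature
for n in 1 2 3; do
    echo "step $n" >> feature.txt
    git add feature.txt
    git commit -q -m "WIP $n"
done

# Squash them into a single commit on a fresh branch.
git checkout -q -b feature-squash master
git merge --squash feature > /dev/null       # stages the changes, no commit yet
git commit -q -m "Add feature notes"

original="$(git rev-list --count master..feature)"
squashed="$(git rev-list --count master..feature-squash)"
echo "feature: $original commits, feature-squash: $squashed commit"

# Both branches end up with identical content.
git diff --quiet feature feature-squash && echo "same content"
```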