Before class
- Set up a class organization on GitHub
- Check the “Allow members to create repositories for this organization” permission
- Set the “Default permissions” for the organization to None if you want to avoid students accessing each other’s repositories
- Have students create a GitHub account and email their username to the instructor.
- Add students’ usernames to the organization.
For class
- Download Gaeta_etal_CLC_data.csv.
- Either arrange to have a teaching partner attend class or be logged into GitHub as another user in the browser for collaboration demos.
- Open the following links in a browser and zoom in to make the images fill the screen.
Live coding demo and assignment are intertwined and designed to work in order. Instructions for both Posit Cloud & RStudio are included so just use the setup pieces that fit your class.
Sign into GitHub
- Who has a directory on their computer with a bunch of filenames
- Get rid of messy folders and track changes to things like data files and code in a more manageable way.
Benefits of version control
- Track changes (but better)
- Tracks every change ever made in groups called commits
- Every commit stores the full state of all of your files at that time
- Never lose anything
- Revert or restore to any commit
- Easily unbreak your code/data/manuscript
- No more file name changes
- Collaboration
- Work on things simultaneously
- See what changes others have made
- Everyone has the most recent version of everything
Version control using Git & RStudio
Connecting Posit Cloud to GitHub (only if teaching in Posit Cloud)
- Go to Posit Cloud (log in if necessary)
- Click “Posit User Settings”
- Go to the GitHub section and click “Private repo access also enabled”
- You will be redirected to GitHub to authorize Posit Cloud
- Click “Authorize posit-hosted”
Create a Git repo
- Navigate to GitHub in a web browser and log in.
- Click the + at the upper right corner of the page and choose “New repository”.
- Choose the class organization as the Owner of the repo.
- Fill in a “Repository name” that follows the form FirstnameLastname.
- Select Private.
- Select “Initialize this repository with a README”.
- Click “Create Repository”.
- From the new GitHub repository, click the green button -> click “Copy url to clipboard”.
Connect to the Git repo
Posit Cloud
- Posit Cloud (in class org): New Project -> New Project from Git Repo
- Paste copied URL in “URL of your Git Repository:”.
- Click the confirmation button.
- Check to make sure you have a Git tab in the upper right window.
- RStudio, File -> New Project -> Version Control -> Git
- Paste copied URL in “Repository URL:”.
- Leave “Project directory name:” blank; it is automatically given the repo name.
- Choose where to “Create project as subdirectory of:”.
- Click “Create Project”.
- Check to make sure you have a Git tab in the upper right window.
install.packages(c('dplyr', 'readr', 'usethis', 'gitcreds'))
Introduce yourself to Git
library(usethis)
use_git_config(user.name = "[name]", user.email = "[email]")
That was Exercise 1 - Set-up Git. Have students confirm that this all worked and fix any issues.
First commits
Commit data
- Download the data file Gaeta_etal_CLC_data.csv to your project directory.
- Add the data file to version control
- Two step process:
- Add the data file (checkbox)
- Commit it
- Git -> Select the data file.
- Commit with message.
Add fish size and growth rate data
- History:
- One commit
- Shows that the file has been added to version control
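The two-step add-then-commit process that the Git tab performs can also be sketched in code. This is a minimal sketch, assuming the gert package is available (it is installed as a dependency of usethis). The temporary repository, the file name `example_data.csv`, its contents, and the author details are all made up for illustration and are not part of the class setup.

```r
library(gert)

# Throwaway repository in a temporary directory (hypothetical, not the class repo)
repo <- tempfile("commit_demo_")
dir.create(repo)
git_init(repo)

# A stand-in data file with made-up contents
writeLines(c("length,scalelength", "250,3.1"), file.path(repo, "example_data.csv"))

# Step 1: stage the file (the checkbox in the Git tab)
git_add("example_data.csv", repo = repo)

# Step 2: commit the staged file with a message
demo_author <- git_signature("Demo User", "demo@example.com")
git_commit("Add example data", author = demo_author, committer = demo_author, repo = repo)

# The history now holds a single commit
history <- git_log(repo = repo)
print(history[, c("commit", "message")])
```

Until the commit step runs, the file is only staged; it does not appear in the history.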
Commit R script
- Read in data to new R script.
library(readr)
fish_data = read_csv("Gaeta_etal_CLC_data.csv")
Make sure to have a new line at the end for a clean diff
- Save the script.
- Git -> Select the script.
- Changes in staged files will be included in next commit.
- Can also see changes by selecting
- Commit with message.
Start script comparing fish length and scale size
- History:
- Two commits
- See what changes were made to
Building a history
- The script doesn’t currently show on the Git tab
- No saved changes since last commit
- Add some more code to the script
- Create new categorical size column
library(dplyr)
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 200, "big", "small"))
- Save the script.
- Now we see the file on the Git tab.
- M indicates that it’s been modified.
- To commit these changes, we need to stage the file.
- Check the box next to the script
- Commit with message.
Add categorical fish length column
- History:
- Three commits
- Each commit shows the additions we made in that commit.
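To make the new column concrete, here is a small worked example of the same `ifelse()` rule. It is a sketch using base R assignment instead of `mutate()` so that it runs without any packages, and the lengths are made-up values, not rows from Gaeta_etal_CLC_data.csv.

```r
# Made-up fish lengths in mm, not values from the class data file
example_fish <- data.frame(length = c(150, 200, 201, 320))

# Same rule as the script: strictly greater than 200 is "big", otherwise "small"
example_fish$length_cat <- ifelse(example_fish$length > 200, "big", "small")

print(example_fish)
```

Note that a fish of exactly 200 is labeled "small" because the comparison is strict.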
- Modify this code in
- Change category cut-off size
fish_data_cat = fish_data %>%
  mutate(length_cat = ifelse(length > 300, "big", "small"))
- Save file -> stage -> commit
Change size cutoff for new column
- Green sections for added lines, red for deleted
- Git works line by line.
- The previous version of the line is shown as deleted.
- The new version of the line is shown as added.
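The line-by-line idea can be illustrated with a rough sketch in base R. It compares two copies of the script as character vectors and reports which lines disappeared and which are new. This is a simplification: Git's real diff is order-aware, while `setdiff()` only compares sets of lines. The variable names are illustrative.

```r
# Two versions of the same script, one element per line
old_lines <- c(
  'fish_data_cat = fish_data %>%',
  '  mutate(length_cat = ifelse(length > 200, "big", "small"))'
)
new_lines <- c(
  'fish_data_cat = fish_data %>%',
  '  mutate(length_cat = ifelse(length > 300, "big", "small"))'
)

# Lines only in the old version are "deleted"; lines only in the new one are "added"
deleted <- setdiff(old_lines, new_lines)
added <- setdiff(new_lines, old_lines)

# Print in a diff-like style: "-" for deleted (red), "+" for added (green)
cat(paste0("- ", deleted), paste0("+ ", added), sep = "\n")
```

Even though only the number changed, the whole line is reported as deleted and re-added, which matches what the diff view shows.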
Do Exercise 2 - First Solo Commit and Exercise 3 - Second Solo Commit
Instructor also do exercises
Committing multiple files
- Commits can include multiple files at once
- Let’s move our data file into a data subdirectory
- Create the subdirectory with “New Folder”
- Checkbox
- Change code to read from new subdirectory
fish_data = read_csv("data/Gaeta_etal_CLC_data.csv")
- Changes to R script indicated by M
- Original datafile has a red D next to it which indicates “deleted”
- New, untracked, data directory
- Git initially thinks we’ve deleted Gaeta_etal_CLC_data.csv and created a new Gaeta_etal_CLC_data.csv file in a new directory.
- Click on both the old and new files to stage them
- Git then recognizes that we have moved (or renamed) the file by making the two files into one and marking this with an R for “rename”.
- Commit:
Move data file into subdirectory
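The move itself can also be done from the R console instead of the Files pane. The following is a minimal sketch using base R only. It works in a temporary directory with a placeholder file so it can be run safely; in a real project the paths would be relative to the project directory.

```r
# Throwaway project directory with a placeholder file (not the real data)
project_dir <- tempfile("move_demo_")
dir.create(project_dir)
old_path <- file.path(project_dir, "Gaeta_etal_CLC_data.csv")
writeLines("placeholder", old_path)

# Create the data subdirectory and move the file into it
dir.create(file.path(project_dir, "data"))
new_path <- file.path(project_dir, "data", "Gaeta_etal_CLC_data.csv")
moved <- file.rename(old_path, new_path)

# The file now exists only at the new location
print(list.files(project_dir, recursive = TRUE))
```

From Git's point of view this looks the same as the move made in the Files pane: one deleted file and one new, untracked file until both are staged.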
Instructor also do exercise
Git as a time machine
Experiment with impunity
fish_data_cat = fish_data %>% mutate(length_cat = ifelse(length > 300, "large", "small"))
- Save and show that the changes are staged
- Get previous state of a file
- History -> select commit -> View file @ ...
- Save file over current file
- Copy and paste relevant piece into current file
Delete with impunity
- Both of these also work for deleted files
- Close the upper left window with the script.
- Choose the Files tab in the lower right window.
- Select the script and delete it.
- Stage deleted file ->
GitHub Remotes
Draw diagram to link local machine with GitHub
- So far we’ve worked with a local repository.
- One of the big benefits of version control is easy collaboration.
- To do this, we synchronize our local changes with a remote repository called origin.
- Our remote repository is on GitHub.
- By far the most popular hosted version control site
- Public and private hosted repositories
- Private free for students and academics
- For the assignment, we’re using private repositories that we made at the beginning.
Push to a remote
Connect to GitHub
- To push to your remote we first have to connect to GitHub, which is a little tricky
- First, log in to GitHub in your browser
- Then create a GitHub token; this is like a special password just for one computer
You may need to allow popups and try again
- Select defaults
- Create token
Copy token
- Now add this token to our local Git setup so that it can use it to connect to GitHub
- Paste your password
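The R calls for these two steps do not appear in the notes above. A plausible sketch, assuming the usethis and gitcreds packages installed earlier are the intended tools, is shown below. Both functions are interactive, so the sketch only runs them in an interactive session.

```r
# Both calls need a browser or a console prompt, so skip them in scripts
if (interactive()) {
  # Opens GitHub's page for creating a new personal access token
  usethis::create_github_token()

  # Prompts for the token and stores it in the Git credential store
  gitcreds::gitcreds_set()
}
```

The token is pasted at the prompt from the second call, after it has been copied from the GitHub page.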
- Push sends your recent commits to the origin remote
Draw push arrow on diagram on board from local to origin
- Before a push your commits show in your local history but not on the remote.
Show local commit history and lack of history in remote.
- To push to your remote, select the Push button at the top of the Git tab.
- Now your changes and commit history are also stored on the remote.
Show local commits now on the remote
Have students email a link to their repo to their instructor once they have finished Pushing Changes
The instructor should then commit the following code to their repo with the commit message:
Plot histogram of scale length by categorical size
ggplot(fish_data_cat, aes(x = scalelength, fill = length_cat)) +
  geom_histogram()
Either you (logged in as another user) or your teaching partner should make the same change to your repository
- Big advantage to remotes is easy collaboration
- Avoids emailing files and shared folders where you are never sure if you actually have the most recent version
- Makes it easy to see what collaborators have done
- Automatically combines non-overlapping changes
- While I’ve been talking, a collaborator has added a plot of scale size and fish length to the code.
Show the remote with the collaborator commit.
Add collaborator local repo to diagram and a pull arrow from origin to locals.
- Pull the changes from the remote repo with the Pull button on the Git tab
Show updates to history following the pull and run code
Do Tasks 3-6 in Exercise 6 - Pulling and Pushing.
Demo merges either with a partner or by logging into GitHub as another user in the browser.
- What happens if two people make changes at the same time?
- If they edit different parts of the code git will combine them automatically
- If they edit the same areas of the code this requires human intervention
- You decide to change the number of histogram bins to 10
geom_histogram(bins = 10)
- Your collaborator reassesses the measurement device and decides it is accurate down to 0.5 mm and pushes the change to the remote repository [make this change in the remote]
filter(scalelength >= 0.5)
- You try to push your change
- Get an error that shows someone else has made a change & you need to incorporate it to push
- Pull
- Merge happens automatically
- You have both sets of changes
- Remote still only has the collaborator’s changes
- Push to add the merged version to the remote
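The automatic merge can be pictured with a toy model in base R. This is a simplified sketch of a three-way, line-by-line merge, not Git's actual algorithm: it assumes all three versions have the same number of lines. The function name `three_way_merge` and the starting `filter()` line are hypothetical, since the original threshold is not shown in these notes.

```r
# Toy three-way merge: compare each side with the common starting version
three_way_merge <- function(base, mine, theirs) {
  stopifnot(length(base) == length(mine), length(base) == length(theirs))
  mine_changed <- mine != base
  theirs_changed <- theirs != base
  # A conflict is a line both sides changed, in different ways
  conflict <- mine_changed & theirs_changed & (mine != theirs)
  # Take my version where I changed the line, otherwise take theirs
  merged <- ifelse(mine_changed, mine, theirs)
  list(merged = merged, conflict = conflict)
}

base   <- c("filter(scalelength > 1)", "geom_histogram()")  # hypothetical starting lines
mine   <- c("filter(scalelength > 1)", "geom_histogram(bins = 10)")
theirs <- c("filter(scalelength >= 0.5)", "geom_histogram()")

result <- three_way_merge(base, mine, theirs)
print(result$merged)
print(any(result$conflict))
```

Because the two edits touch different lines, the merged result contains both changes and no line is flagged as a conflict.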
Merge conflicts
- If both you and your collaborator edit the same location in the code, Git doesn’t know how to combine the changes.
A human has to make this kind of decision.
- You decide to change “big” to “large”
mutate(length_cat = ifelse(length > 300, "large", "small"))
- Your collaborator changes the size threshold and pushes to the remote
mutate(length_cat = ifelse(length > 250, "big", "small"))
- You attempt to push your changes
- Merge conflict when pulling the collaborator’s changes
- This shows as U for “unmerged” in RStudio
- First block of code is your version
- Second block is the version on the remote
- Combine into a single block that includes everything
mutate(length_cat = ifelse(length > 250, "large", "small"))
- Click check box next to file
- Commit indicating that it is a merge
- Still not on remote yet
- Push
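For reference, Git marks the two competing blocks in the file with `<<<<<<<`, `=======`, and `>>>>>>>` lines. The sketch below, in base R, shows roughly what the conflicted script looks like and a small check that no markers are left before committing. The helper name `has_conflict_markers` and the label after `>>>>>>>` are illustrative; Git prints the incoming commit id there.

```r
# Roughly what the file contains after the conflicting pull
conflicted <- c(
  "fish_data_cat = fish_data %>%",
  "<<<<<<< HEAD",
  '  mutate(length_cat = ifelse(length > 300, "large", "small"))',
  "=======",
  '  mutate(length_cat = ifelse(length > 250, "big", "small"))',
  ">>>>>>> collaborator-commit-id"
)

# The hand-edited version that combines both changes
resolved <- c(
  "fish_data_cat = fish_data %>%",
  '  mutate(length_cat = ifelse(length > 250, "large", "small"))'
)

# TRUE if any line still starts with a conflict marker
has_conflict_markers <- function(lines) {
  any(grepl("^(<<<<<<<|=======|>>>>>>>)", lines))
}

print(has_conflict_markers(conflicted))  # TRUE
print(has_conflict_markers(resolved))    # FALSE
```

Leftover marker lines are not valid R, so the script will fail to run if they are committed by mistake.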
Full GitHub flow
- Collaborating on GitHub can get more complex with “forks” and “branches”.
Optional: Redraw diagram with local, origin, and upstream. Arrows from origin are pull requests and merges.
Show an example of a working repository with branches and forks. Navigate to pull requests.