May 26-27, 2016
9:00 am - 4:30 pm
Instructors: Jason Williams, Ryan Dale, Adam Thomas
Helpers: Vinai Roopchansingh, John Lee
Data Carpentry workshops are for any researcher who has data they want to analyze, and no prior computational experience is required. This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data.
We will cover cloud computing and the command line for genomics, followed by data analysis and visualization in R. Participants should bring their laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and apply the tools and approaches directly to their ongoing research.
Who: The course is aimed at graduate students and other researchers.
Where: 10 Center Dr, Bethesda, MD 20892. Get directions with OpenStreetMap or Google Maps.
The workshop is in the NIH Library training room in Building 10 (interior map). To get to the Library, enter Building 10 through the South Entrance; the Library is the only door down the left corridor.
Requirements: Participants must bring a laptop with a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.
Contact: Please email williams@cshl.edu for more information. Register directly through NIH at https://datascience.nih.gov/community/workforce/upcoming
Morning: Intro, Data Processing, and Organization | |
Intro to Data Carpentry | Jason |
Intro to the Data Set | Jason |
Genomics Data Tidiness | Adam |
Connecting to the Cloud in 5 Minutes or Less | Jason |
R and RStudio Orientation | Ryan |
Intro to R and RStudio | Ryan |
Dataframes and Metadata | Ryan |
Dataframes Continued | Ryan |
Data Cleaning and Manipulation with dplyr | Adam |
Afternoon: Data Cleaning and Visualization in R - Intro to Linux | |
Data Cleaning and Manipulation with dplyr (cont'd) | Adam |
Plotting and Visualizing in R | Ryan |
Data Importing and Uploading | Jason |
Intro to the Linux Shell - Filesystem and Navigation | Jason |
Morning: Using Linux to organize and process Genomics Data | |
Intro to the Linux Shell - Searching and Metadata | Ryan |
Project Organization and Documentation | Adam |
'For' loops - QC of Sequencing Data | Jason |
Afternoon: Using Linux to Automate | |
Automating Analyses - Shell Scripting | Adam |
Creating Workflows - Variant Calling Workflow | Jason |
Workshop Conclusion | Please take the post-survey |
How to Make This Work on Your Own | |
Launching Your Own Cloud Instances | On Your Own |
Etherpad: http://pad.software-carpentry.org//2016-05-26-NIH.
We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.
To participate in a Data Carpentry workshop, you will need working copies of the described software. Please make sure to install everything (or at least to download the installers) before the start of your workshop. You should bring and use your own laptop to ensure that your tools are set up properly for an efficient workflow once you leave the workshop.
Please follow these Setup Instructions.
We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is intended as a reference for instructors, but you may find it useful too.