Syllabus
Syllabus for “Bioinformatics for Beginners”
Instructors:
Co-Instructors:
- To participate in these courses, you need a computer, a reliable internet connection, and a web browser.
- All classes and help sessions will be held online.
- This class will be taught with the GOLD learning environment on the DNAnexus platform. Every learner will get their own login and password.
This is the first of three courses we have developed for beginning bioinformatics learners, built around the scenario “I’ve just gotten my sequence data back from the Sequencing Center. How do I understand, analyze, and work with it?”
Introduction to the Bioinformatics Training and Education Program (BTEP) website
- The front page has the NIH Bioinformatics Calendar (search by topic or organizer; information on current, future, and past events)
- Licenses are available for online learning platforms (Dataquest and Coursera), where you can learn R, Python, SQL, Statistics and more!
- Distinguished Speakers Seminar Series
- FAQ Forums for ChIP-Seq and Single Cell RNA-Seq
- Information on training on point-and-click software provided to NCI and CCR (Partek Flow, Qiagen IPA, Qlucore, NIDAP)
Getting help in this course:
- Breakout room during class if you are lost or confused
- Email: ncibtep@nih.gov
- Scheduled help sessions
DNAnexus and the GOLD learning platform
- Each of you will need to create an account on DNAnexus, which hosts the GOLD online learning platform. This login and password are different from the login and password you will use for the GOLD system. If you haven't yet created an account on DNAnexus, we are going to do that right now.
- Go to DNAnexus.com, click on "Sign up", and create your new account. After setting up your account, put the email address and username you used to set up your DNAnexus account in the chat (only if you haven't done this already) so we can activate your account.
- After you log in to DNAnexus, you will see the main page of the GOLD (Genome Analysis Unit Online Learning Domain) learning system.
- You will see a chart with User name, Login, and View columns.
- Click on your name under "Login".
- Enter your login (your first name with the first letter capitalized) and your password (provided in class).
- Please note that the cursor will not move when you type your password. Don't let this fool you. Type the password exactly as provided to you.
- You will see a black screen with a LOT of output printed to it.
- Your first Unix command is "clear". It brings your command line back to the top of the screen so it's easier to see what you are doing. I will use "clear" frequently in class so you can see what I am doing. You do not need to type "clear" every time I do, but you can if you like.
- Now you should see your prompt. It may look like "username:~>", where "username" is your username. This is your command line.
- We will learn about the "View" option later in class. We can move files there for viewing.
Schedule
- Course One – Why learn Bioinformatics? And Beginner Unix.
- Course Two – Working with NGS data, bulk RNA-Seq example
- Course Three – Visualizing RNA-Seq data, pathway analysis, and intro to Biowulf and Globus
Course One
Why learn Bioinformatics? And Beginner Unix.
Who should take this course:
Learners who want to work with Next Gen Sequence data
Prerequisites: None. This class is for beginner-level bioinformatics learners.
Learning Objectives: By the end of the class, learners will be able to:
- Understand why every bench scientist should learn some bioinformatics
- Log into and utilize the GOLD learning environment for class content and lessons
- Work with Unix files and directories to manage Next Gen Sequencing data and associated files
- Understand data formats (FASTA, FASTQ) and learn how to work with them at the Unix command line
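As a small preview of the last objective, here is a minimal sketch of counting the records in FASTQ and FASTA files at the Unix command line. The file names and the reads in them are made up for illustration, and the sketch assumes simple files in which each FASTQ record takes exactly four lines. It is not necessarily the approach used in class.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a tiny FASTQ file with two made-up reads (illustrative data only).
cat > example.fastq <<'EOF'
@read1
ACGTACGTAC
+
IIIIIIIIII
@read2
TTGGCCAATT
+
IIIIIIIIII
EOF

# Each FASTQ record here takes four lines:
# header (@...), sequence, "+" separator, and quality scores.
total_lines=$(wc -l < example.fastq)
fastq_reads=$((total_lines / 4))
echo "FASTQ reads: $fastq_reads"

# Create a tiny FASTA file with two made-up sequences.
cat > example.fasta <<'EOF'
>seq1
ACGTACGTAC
>seq2
TTGGCCAATT
EOF

# Each FASTA record starts with a ">" header line, so count those lines.
fasta_seqs=$(grep -c '^>' example.fasta)
echo "FASTA sequences: $fasta_seqs"
```

Counting ">" lines works for FASTA because only header lines start with that character. The same trick with "@" is not safe for FASTQ, since a quality line can also begin with "@", which is why the sketch divides the line count by four instead.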
Course Two
Working with NGS data, bulk RNA-Seq example
Who should take this course:
Learners who want to understand bulk RNA-Seq experimental and analysis techniques.
Prerequisites: Learners should:
- be familiar with the GOLD learning environment
- have beginner level knowledge of working in a Unix environment
Learning Objectives:
- Understand the basics of bulk RNA-Seq, how the experiments should be set up for best results and problems to avoid such as batch effects
- Interpret the information returned by the sequencing centers and determine next steps of analysis (Quality control, etc.)
- Perform the beginning steps of bulk RNA-Seq analysis, including assessing sequence quality and trimming sequence data to remove adapters
- Identify software resources available to them within NCI to help them analyze their data
Course Three
Visualizing RNA-Seq data, pathway analysis, and intro to Biowulf and Globus
Who should take this course:
Learners who want to learn to do a complete bulk RNA-Seq analysis from FASTQ files to biological pathway analysis.
Prerequisites: Learners should:
- be familiar with the GOLD learning environment
- have beginner-level Unix command line skills
- understand the steps of bulk RNA-Seq analysis, including data formats, inputs, and outputs
Learning Objectives:
- Understand all the steps of a complete bulk RNA-Seq analysis, from FASTQ files to biological pathway analysis
- Visualize sequence data graphically with the Integrative Genomics Viewer (IGV) tool
- Understand how to get an account on Biowulf, the NIH high-performance computing cluster, how to use the /home and /data directories, and how to be a good citizen on Biowulf
- Learn how to use the NIH “Globus” service to transfer and share data files