#Unix Essentials: Hands-on
#Parsing HBI array data
#Goal: Process a gene expression file to find genes of interest, sort by expression values, and subset the data for further investigation.

#THIS SYMBOL "#" PRECEDES A COMMENT LINE

#0. In your browser, open the page: http://jura.wi.mit.edu/bio/education/hot_topics/unix_essentials_2015/UnixEssentials_HandsOn.txt
#You can copy and paste the commands from that page as we need them.

#1. Log in to tak. See the handouts.

#2. Go to the BaRC training folder:
cd /nfs/BaRC_training
#Create a folder named after your login with the mkdir command:
mkdir your_login_name
#[Note: Replace your_login_name with your tak login name]
#Go to the directory that you just created:
cd your_login_name
#Check where you are:
pwd

#3. Copy the HBI data we will be working with:
cp ../HBI.partial.txt .
#If you are following these instructions after the Hot Topics session is over, use this command instead:
#cp /nfs/BaRC_Public/Hot_Topics/Unix_Essentials_Oct2015/HBI.partial.txt .

#4. View the file in your favorite editor. What is the first field?
gedit HBI.partial.txt &
more HBI.partial.txt
head -1 HBI.partial.txt | cut -f1

#5. How many genes are in the HBI data? [Note: the line count includes the header line]
wc -l HBI.partial.txt

#6. Get the first column and columns 20-22, and write them to a file called HBI.partial.new.txt. Use this new file for the rest of the questions.
cut -f 1,20-22 HBI.partial.txt > HBI.partial.new.txt

#7. What tissues are included in the new file?
head -1 HBI.partial.new.txt

#8. Are there any duplicate genes? [Hint: uniq needs a sorted list]
cut -f 1 HBI.partial.new.txt | sort | uniq -d

#9. Sort the expression values based on the second column. Which gene has the highest expression level?
#[Note: the difference between the sort options -g (general numeric sort) and -n (numeric sort)]
sort -k 2,2gr HBI.partial.new.txt | head

#10. Get only the genes that begin with "ZNF" from the new file, and write them to another file.
#Make sure to include the header line by writing just the header to the output file first. [Hint: use grep]
head -1 HBI.partial.new.txt > ZNF_genes.txt
grep "^ZNF" HBI.partial.new.txt >> ZNF_genes.txt

#11. At the end of the class, make a folder with your name in your lab folder and copy all the material to it.
#Refer to the handout for the location of your lab folder on tak.
#These are example commands:
mkdir /lab/PI_name_lab/username
mkdir /lab/PI_name_lab/username/unix_essentials_class
cp -r * /lab/PI_name_lab/username/unix_essentials_class
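
#Optional practice: the sketch below rehearses steps 5, 8, 9 and 10 on a tiny made-up table,
#so you can see what each command does before running it on the real data.
#The file demo.txt, the gene names and the values are hypothetical and are not part of the HBI data.
#It assumes GNU sort (as on a typical Linux server), because -g is a GNU option.

```shell
# Build a small tab-delimited table in a temporary folder (all values are made up).
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.txt"
printf 'Gene\tTissueA\n'  > "$demo"
printf 'ZNF1\t9.5\n'     >> "$demo"
printf 'ABC1\t1e3\n'     >> "$demo"
printf 'ZNF2\t20\n'      >> "$demo"
printf 'ABC1\t3\n'       >> "$demo"

# Step 5: count genes without the header by skipping the first line.
n_genes=$(tail -n +2 "$demo" | wc -l)
echo "genes (header excluded): $n_genes"

# Step 8: uniq only compares adjacent lines, so the list must be sorted first.
dups=$(tail -n +2 "$demo" | cut -f 1 | sort | uniq -d)
dups_unsorted=$(tail -n +2 "$demo" | cut -f 1 | uniq -d)
echo "duplicates after sorting: $dups"
echo "duplicates without sorting: $dups_unsorted"

# Step 9: -g parses scientific notation (1e3 is 1000); -n stops at the "e" and reads 1.
top_g=$(tail -n +2 "$demo" | sort -k 2,2gr | head -1 | cut -f 1)
top_n=$(tail -n +2 "$demo" | sort -k 2,2nr | head -1 | cut -f 1)
echo "top gene with -g: $top_g"
echo "top gene with -n: $top_n"

# Step 10: write the header first (>), then append the matching rows (>>).
head -1 "$demo" > "$tmpdir/ZNF_demo.txt"
grep "^ZNF" "$demo" >> "$tmpdir/ZNF_demo.txt"
znf_lines=$(wc -l < "$tmpdir/ZNF_demo.txt")
echo "lines in ZNF_demo.txt (header included): $znf_lines"

# Remove the temporary folder.
rm -rf "$tmpdir"
```

#Because one value is written in scientific notation, -g and -n should disagree on the top gene here,
#and uniq -d should report the repeated gene only when its input has been sorted.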