HOMEWORK 4

(For interactive page, go to /education/bioinfo
Answers are also available at that URL.)

  1. Follow the link "Connecting to the hebrides application server and transferring files" on the Genomic Resources and Unix website to learn how to access tak or hebrides.
    Then follow the link "Using X windows on tak or hebrides" on the same website to learn how to use X windows.

  2. After logging in to your tak or hebrides account

  3. In this question, you will make a database, BLAST a given sequence against this database, and finally retrieve conserved regions from the BLAST hits.

  4. Gene structure determination and promoter extraction of BMP4 (bone morphogenetic protein 4)
    1. Go to the LocusLink page for mouse BMP4 and get the RefSeq cDNA sequence (NM_007554) in FASTA format.
    2. Use BLAT to align the cDNA to the mouse genome:
    3. Look at the "browser" view for the best hit. Zoom out 10X and look at these tracks in the browser: "Your sequence from BLAT Search" and "RefSeq genes". Since the RefSeq track comes from a pre-computed BLAT alignment, the two tracks should be the same.
    4. Do any of these tracks show evidence that NM_007554 is not the full-length cDNA? You may want to turn on (set to "squish", "pack", or "full") these tracks:
      MGC Genes
      Ensembl Genes
      Mouse mRNAs
    5. Select one of the longest mRNAs and get what appears to be the true full-length mRNA sequence.
    6. Is there evidence of any alternative splicing? The "Spliced ESTs" track may also help.
    7. Extract the "promoter", defined as the sequence 2.5 kb upstream of the gene start.
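
    The promoter definition in step 7 depends on the strand of the gene: "upstream" means lower coordinates for a plus-strand gene and higher coordinates for a minus-strand gene, where the region must also be reverse-complemented. The following is a minimal sketch of that calculation, not the course's official answer. It assumes 0-based, half-open coordinates (as in the genome browser's table format), and the function names and example numbers are hypothetical.

```python
# Sketch only: coordinates are assumed to be 0-based, half-open [start, end).
# Function names and the toy sequence below are illustrative, not from the course.

def promoter_interval(tx_start, tx_end, strand, upstream=2500, chrom_size=None):
    """Return (start, end) of the region `upstream` bases before the gene start."""
    if strand == "+":
        # Gene start is tx_start; upstream lies at lower coordinates.
        start = max(0, tx_start - upstream)  # do not run off the chromosome start
        end = tx_start
    elif strand == "-":
        # Gene start is tx_end; upstream lies at higher coordinates.
        start = tx_end
        end = tx_end + upstream
        if chrom_size is not None:
            end = min(end, chrom_size)  # do not run off the chromosome end
    else:
        raise ValueError("strand must be '+' or '-'")
    return start, end


def reverse_complement(seq):
    """Reverse-complement a DNA string, preserving case and N."""
    table = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")
    return seq.translate(table)[::-1]


def extract_promoter(chrom_seq, tx_start, tx_end, strand, upstream=2500):
    """Slice the upstream region and report it in the gene's orientation."""
    start, end = promoter_interval(tx_start, tx_end, strand, upstream, len(chrom_seq))
    region = chrom_seq[start:end]
    return reverse_complement(region) if strand == "-" else region


if __name__ == "__main__":
    toy = "AAAACCCCGGGGTTTT"  # made-up 16-base "chromosome"
    print(extract_promoter(toy, 8, 12, "+", upstream=4))
    print(extract_promoter(toy, 4, 8, "-", upstream=4))
```

    In practice the genome browser can return this sequence directly; the sketch only makes the coordinate arithmetic explicit, so check the strand of the BLAT hit before deciding which end of the alignment is the gene start.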

  5. Create a directory called bioinfo_course, and inside it create another directory called homework_4. What commands do you need to issue? Move all the files you made previously into the homework_4 directory. What command could you use to do this?