Hu13K Promoter Microarray Production


Ideally, a DNA microarray would contain the entire human genome sequence, but technical limitations and cost led us to select the most relevant portion of the genome for inclusion on this microarray. Because over 80% of known and predicted transcription factor binding sites in proximal promoters lie within 1 kb of transcription start sites, we designed primers to amplify these genomic regions for printing onto a promoter array. We selected 15,000 cDNAs from the NCBI RefSeq database of sequenced mRNAs. Where multiple splice variants had been described, we used the most upstream start site and verified the 5’-end by alignment with the Database of Transcriptional Start Sites. For each cDNA, 200 bp of the sequence was aligned by BLAST against the Golden Path April 2001 alignment of the genome, and the surrounding genomic region was used to design primers.
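As an illustration of how a promoter region can be derived from a mapped transcription start site, the following is a minimal sketch. It is not the procedure used for this array: the function name `promoter_window`, the 1-based inclusive coordinate convention, and the default window of 1,000 bp upstream with no downstream extension are all assumptions made for the example.

```python
def promoter_window(tss, strand, upstream=1000, downstream=0):
    """Return (start, end) genomic coordinates of a window around a TSS.

    Hypothetical helper for illustration only. Coordinates are assumed to be
    1-based and inclusive, so the returned window contains the TSS base itself.
    """
    if strand == "+":
        # On the plus strand, upstream means smaller coordinates.
        start = tss - upstream
        end = tss + downstream
    elif strand == "-":
        # On the minus strand, upstream means larger coordinates.
        start = tss - downstream
        end = tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp to the start of the chromosome.
    return max(1, start), end


if __name__ == "__main__":
    print(promoter_window(5000, "+"))
    print(promoter_window(5000, "-"))
```

The strand check matters because "upstream" refers to the direction of transcription, not to the direction of increasing genomic coordinate.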

To control for nonspecific binding, nine amplified regions derived from long Arabidopsis open reading frames were included on the array. As a further negative control, and for use in data normalization, we chose ~150 ORF regions within long exons of human genes for amplification.

To prepare the DNA content of the arrays, the program Primer3 was used to design primers from the sequences described above. PCRs were performed with these primer pairs under standard conditions, except that 1 M betaine was present in all reactions; betaine was empirically observed to increase the success rate of the amplifications. Of the 13,000 PCR primer pairs, 70% gave a strong band of the appropriate size, as verified on 2% agarose gels. We have noted, however, that PCR products undetectable on ethidium bromide-stained agarose gels can give valid positive signals when concentrated and printed on the DNA arrays. PCR quality evaluations were performed using the BRIDNA suite of programs from the Biotechnology Research Institute of the National Research Council of Canada.

PCR products were recovered from the reaction mixture by ammonium acetate/isopropanol precipitation and resuspended in 3x SSC with 1.5 M betaine to minimize evaporation and improve spot quality. The resuspended DNA was transferred to 384-well plates and printed on GAPS-coated slides (Corning) using a robot from Cartesian Technologies. The quality of the arrays was assessed batch by batch by hybridization with sequence-neutral oligonucleotides covalently linked to Cy3 or Cy5, followed by calculation of the percentage of usable spots, combined with direct visual inspection of the chip.
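The percentage of usable spots can be computed in many ways, and the criterion applied to these arrays is not stated here. The sketch below assumes, purely for illustration, that a spot counts as usable when its signal is at least a chosen multiple of its local background; the function name `usable_spot_percentage` and the default ratio of 2.0 are hypothetical.

```python
def usable_spot_percentage(signals, backgrounds, min_ratio=2.0):
    """Return the percentage of spots whose signal/background ratio passes a cutoff.

    Illustrative only: the usability criterion (signal >= min_ratio * background)
    is an assumption, not the criterion used for the Hu13K arrays.
    """
    if len(signals) != len(backgrounds):
        raise ValueError("signals and backgrounds must have the same length")
    if not signals:
        return 0.0
    usable = sum(
        1
        for signal, background in zip(signals, backgrounds)
        # Spots with non-positive background are treated as unusable.
        if background > 0 and signal / background >= min_ratio
    )
    return 100.0 * usable / len(signals)


if __name__ == "__main__":
    # Four spots from a test hybridization, each with a local background of 100.
    print(usable_spot_percentage([1000, 150, 900, 80], [100, 100, 100, 100]))
```

A batch-level figure of this kind complements, but does not replace, visual inspection of spot morphology.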

Because the genome was only approximately 70% complete in the April 2001 alignment used to generate the primer sets for this array, the Hu13K array was remapped after production against the August 2003 final release of the completed human genome.

The dataset downloadable from this website will be updated as remapping continues, to reflect the corrected location of each arrayed promoter relative to the transcriptional start site.
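For readers working with the remapped coordinates, the position of an arrayed fragment relative to a transcriptional start site can be expressed as strand-aware offsets. The following is a minimal sketch under stated assumptions: the function name `position_relative_to_tss` is hypothetical, coordinates are assumed to lie on the same chromosome and in the same genome release, and negative values denote positions upstream of the start site. The format of the downloadable dataset itself is not described here and is not assumed.

```python
def position_relative_to_tss(amplicon_start, amplicon_end, tss, strand):
    """Return (five_prime_offset, three_prime_offset) of an amplicon relative to a TSS.

    Illustrative helper. Offsets are measured in the direction of transcription:
    negative values are upstream of the TSS, positive values are downstream.
    """
    if amplicon_start > amplicon_end:
        raise ValueError("amplicon_start must not exceed amplicon_end")
    if strand == "+":
        # Transcription runs toward larger coordinates.
        return amplicon_start - tss, amplicon_end - tss
    if strand == "-":
        # Transcription runs toward smaller coordinates, so the ends swap roles.
        return tss - amplicon_end, tss - amplicon_start
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    print(position_relative_to_tss(4000, 4900, 5000, "+"))
    print(position_relative_to_tss(5100, 6000, 5000, "-"))
```

Both example calls describe a fragment spanning 1,000 to 100 bp upstream of the start site, once on each strand.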