cDNA Libraries
Isolation of corresponding genetic information
Instead of synthesizing a desired gene, can we use the amino acid information to directly isolate the corresponding genetic information?
There are two general sources of genetic information:
Genomic DNA
mRNA
If we are considering genomic DNA from eukaryotes, then there are a couple of things to consider:
The coding region for a gene of interest may be interrupted by one or more intron regions, and thus the complete coding region could be quite long.
To a first approximation, it does not matter which tissue we use to isolate the genomic information, i.e. the genomic content is the same in all tissues.
If we are considering mRNA from eukaryotes, we may realize the following advantages:
Introns will be spliced out and the mRNA will contain a contiguous coding region.
Tissue specific expression of the protein of interest may allow us to isolate appropriate mRNA at enhanced levels, i.e. in tissues where the protein is expressed the mRNA levels are considerably higher than the corresponding genomic levels (there are many more molecules of mRNA than copies of the gene).
Libraries
A "library" is a convenient storage mechanism of genetic information.
Libraries typically hold either "genomic" or "cDNA" (i.e. mRNA in DNA form) genetic information.
Deduced genetic sequences from corresponding polypeptide information can be used to identify specific genetic information within a library.
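Because the genetic code is degenerate, a given stretch of amino acids can be encoded by many different nucleotide sequences, and a probe deduced from polypeptide information is really a mixture of candidates. The following is a minimal sketch of how that degeneracy might be counted, assuming the standard genetic code; the function names and the example peptide are hypothetical and chosen only for illustration.

```python
# Number of codons that specify each amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide):
    """Count the distinct DNA sequences that could encode the peptide."""
    total = 1
    for amino_acid in peptide.upper():
        total *= CODON_COUNTS[amino_acid]
    return total


def least_degenerate_window(peptide, length):
    """Return (start, window, degeneracy) for the stretch of `length`
    residues that requires the fewest candidate sequences."""
    best = None
    for start in range(len(peptide) - length + 1):
        window = peptide[start:start + length]
        degeneracy = probe_degeneracy(window)
        if best is None or degeneracy < best[2]:
            best = (start, window, degeneracy)
    return best


# Hypothetical peptide fragment, used only as an example.
print(least_degenerate_window("LSRMWKDF", 4))
```

Stretches rich in Met and Trp (one codon each) give the least degenerate probes, while Leu, Ser and Arg (six codons each) are best avoided.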
cDNA library construction
The enzyme responsible for converting mRNA into DNA is an RNA-dependent DNA polymerase called reverse transcriptase.
Reverse transcriptases have traditionally been isolated from viruses whose genome is actually in an RNA form and must be converted to duplex DNA.
These viruses typically carry a functional reverse transcriptase along with their RNA genome when they infect cells.
One of the most common commercially available reverse transcriptases is that of Moloney murine leukemia virus (MMLV).
This RNA-dependent DNA polymerase (as with all polymerases) adds nucleotides to a nascent polynucleotide in the 5' to 3' direction, using RNA as the template. It does not contain any 3'->5' exonuclease (proofreading) activity.
MMLV reverse transcriptase will use mRNA as a template, but requires a primer (it can extend a DNA primer but cannot synthesize one).
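One of the really neat things about eukaryotic mRNAs is the presence of the 3' poly(A) tract. A short oligo(dT) primer will anneal to this tract and can be extended by reverse transcriptase, giving a first-strand cDNA paired with its mRNA template as a DNA/RNA duplex.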
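First-strand synthesis primed from the poly(A) tract can be sketched in a few lines. This is an idealized illustration, assuming full-length extension and an oligo(dT) primer that anneals at the very 3' end of the tail; the function name, the primer length and the example mRNA are hypothetical.

```python
# DNA base that pairs with each RNA base of the template.
DNA_COMPLEMENT_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna, primer_length=12):
    """Return the first-strand cDNA, written 5'->3', made by extending an
    oligo(dT) primer annealed to the 3' poly(A) tract of the mRNA."""
    mrna = mrna.upper()
    # Length of the run of A's at the 3' end of the message.
    tail_length = len(mrna) - len(mrna.rstrip("A"))
    if tail_length < primer_length:
        raise ValueError("poly(A) tract is too short for the oligo(dT) primer")
    # The polymerase reads the template 3'->5' while building DNA 5'->3',
    # so the product is the reverse complement of the mRNA and begins
    # with the T's contributed by the primer.
    return "".join(DNA_COMPLEMENT_OF_RNA[base] for base in reversed(mrna))


# Hypothetical short message: start codon, one codon, stop codon, poly(A).
message = "AUGGCCUAA" + "A" * 15
print(first_strand_cdna(message))
```

The cDNA is antisense to the message, which is why a second strand must be made before the coding sequence is present in duplex DNA form.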
If we can introduce "nicks" into the RNA half of this DNA/RNA duplex then the situation would be very similar to that observed in "lagging strand" synthesis of prokaryotic genomic DNA.
Nicks in the RNA half of the molecule can be introduced via the action of the enzyme RNAse H.
This enzyme exhibits endonucleolytic cleavage of the RNA moiety of RNA/DNA hybrids, as well as 5'->3' and 3'->5' exoribonuclease activity.
In other words, it will nick the RNA and then proceed to digest back in both directions:
These RNA fragments can now serve as primers for DNA synthesis by E. coli Pol I. This enzyme will also translate the "nicks" to effectively remove the RNA primers:
Insertion of cDNA into plasmid.
To complete our construction of a useful cDNA library we need a way to maintain and propagate our cDNA.
We can accomplish this by inserting the cDNA into an appropriate plasmid.
There are two classical ways of accomplishing this feat:
Homopolymeric tailing
Linker addition
Homopolymeric tailing
Terminal transferase is an unusual DNA polymerase found only in a type of eukaryotic cell called a prelymphocyte.
In the presence of a divalent cation the enzyme catalyzes the addition of dNTP's to the 3'-hydroxyl termini of DNA.
When the nucleotide to be added is a purine, Mg2+ is the cation used.
When the nucleotide to be added is a pyrimidine, Co2+ is used.
Depending on the reaction conditions, anywhere from three to several thousand bases will be added.
If we cut our plasmid and also treat it with terminal transferase, except now we add the complementary base to the one we added to our cDNA, we can anneal and ligate the cDNA into the plasmid.
The utility of inserting the C-tailed cDNA insert into a G-tailed Pst I site in the vector is as follows:
The Pst I recognition sequence and cleavage site is
5' C T G C A G 3'
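3' G A C G T C 5'
Cleavage of this site by Pst I, followed by G-tailing, will produce
5' C T G C A (G)n          G 3'
3' G          (G)n A C G T C 5'
Once the C-tailed cDNA anneals and the remaining gaps are repaired, a Pst I site is regenerated at each junction, so the insert can later be excised with Pst I.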
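The effect of G-tailing a Pst I cut vector can be checked with a small sketch that writes out the top strand of the recombinant. It is idealized: it assumes the G and C tails are the same length and that the single-stranded gaps left at each junction are filled in after annealing. The function names and the short example sequences are hypothetical.

```python
PST_I_SITE = "CTGCAG"  # Pst I cuts the top strand after the A: CTGCA^G


def cut_with_pst_i(vector_top):
    """Split the top strand of a vector carrying exactly one Pst I site."""
    if vector_top.count(PST_I_SITE) != 1:
        raise ValueError("this sketch expects exactly one Pst I site")
    cut = vector_top.index(PST_I_SITE) + 5
    return vector_top[:cut], vector_top[cut:]


def recombinant_top_strand(vector_top, cdna_top, tail_length):
    """Top strand after a C-tailed cDNA anneals to the G-tailed vector
    and the gaps are repaired."""
    left, right = cut_with_pst_i(vector_top)
    g_tail = "G" * tail_length  # added to the 3' end of the left vector arm
    c_tail = "C" * tail_length  # added to the 3' end of the cDNA top strand
    # The right vector arm carries a 3' overhang (3' ACGT 5') on its bottom
    # strand; filling the gap opposite it restores TGCA on the top strand.
    return left + g_tail + cdna_top + c_tail + "TGCA" + right


# Hypothetical sequences, for illustration only.
vector = "TTAACTGCAGCATT"
recombinant = recombinant_top_strand(vector, "ATGGCC", tail_length=3)
print(recombinant)
print(recombinant.count(PST_I_SITE))
```

In this sketch the first G of the tail completes CTGCAG on the left, and the last C of the cDNA tail completes it on the right, so the insert ends up flanked by two Pst I sites.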
Linkers
An alternate method to insert cDNA fragments into a library vector is through the addition of "linkers".
Linkers are short oligonucleotides (~18 to 24 mers) which are typically palindromic and contain a single or repeated restriction endonuclease recognition sequence.
The palindromic nature allows the linker oligonucleotide to self-hybridize to form a blunt ended duplex.
If the ends of the cDNA fragments are blunt, then the linker can be ligated to both ends to introduce useful terminal restriction sites.
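A linker is "palindromic" in the molecular biology sense when it reads the same as its own reverse complement, which is what allows two copies to pair into a blunt-ended duplex. A minimal sketch of that check follows; the 12-mer used here is a hypothetical example containing an Eco RI site, not a sequence taken from these notes.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(seq):
    """True if two copies of the oligonucleotide can pair end to end,
    giving a fully base-paired, blunt-ended duplex."""
    return seq.upper() == reverse_complement(seq)


# Hypothetical linker: an Eco RI site (GAATTC) with symmetric flanks.
linker = "CCGGAATTCCGG"
print(is_self_complementary(linker))
print(is_self_complementary("CCGGAATTCCGA"))
```

Any type II restriction site that is itself palindromic, such as GAATTC, can be embedded this way as long as the flanking bases are chosen symmetrically.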
The steps in linker addition are as follows:
Treatment of cDNA with S1 nuclease (to remove a possible 5' cap mRNA fragment remaining in the cDNA duplex)
Convert potential "ragged" ends to blunt by treatment with Pol I (will fill in 5' overhangs and chew back 3' overhangs)
Methylate cDNA at potential internal Eco RI sites by treatment with Eco RI methylase (plus S-adenosyl methionine)
Ligate linkers to blunt, methylated cDNA using T4 DNA ligase
Cut linkers with Eco RI restriction endonuclease
Remove linker fragments from cDNA fragments by agarose gel electrophoresis
Ligate cDNA to vector DNA fragment (opened up by Eco RI restriction endonuclease)
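The reason for the methylation step can be seen by following a cDNA that happens to contain an internal Eco RI site through linker ligation and digestion. The sketch below tracks only the top strand, treats methylation as a set of protected site positions, and uses a hypothetical linker and cDNA; it is an illustration of the logic of the steps above, not a model of the enzymology.

```python
ECO_RI_SITE = "GAATTC"   # Eco RI cuts the top strand after the G: G^AATTC
LINKER = "CCGGAATTCCGG"  # hypothetical palindromic Eco RI linker


def site_positions(seq):
    """Return the start index of every Eco RI site in the sequence."""
    positions = []
    start = seq.find(ECO_RI_SITE)
    while start != -1:
        positions.append(start)
        start = seq.find(ECO_RI_SITE, start + 1)
    return positions


def eco_ri_digest(seq, protected=frozenset()):
    """Cut at every Eco RI site whose position is not protected."""
    fragments = []
    last = 0
    for pos in site_positions(seq):
        if pos in protected:
            continue  # a methylated site is not cleaved
        fragments.append(seq[last:pos + 1])
        last = pos + 1
    fragments.append(seq[last:])
    return fragments


def add_linkers_and_digest(cdna, methylate=True):
    """Methylate internal sites (optional), ligate linkers, then digest."""
    # Internal sites shift by one linker length once linkers are ligated on.
    internal = {pos + len(LINKER) for pos in site_positions(cdna)}
    protected = internal if methylate else set()
    ligated = LINKER + cdna + LINKER
    return eco_ri_digest(ligated, protected)


# Hypothetical cDNA with one internal Eco RI site.
cdna = "ATGGAATTCTAA"
print(add_linkers_and_digest(cdna, methylate=True))
print(add_linkers_and_digest(cdna, methylate=False))
```

With methylation the cDNA survives as a single fragment with Eco RI ends plus two small linker pieces; without it the cDNA is cut in two at its internal site.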