
The TRC shRNA Design Methods and Rules-2

2019.8.02

Score the Candidate Sequences for Specificity

We are forced to balance predicted knockdown efficiency against the desire to minimize interaction with off-target genes, without a clear understanding of how to predict off-target "hits". We calculate a "specificity score" to promote candidates without obvious off-target transcripts. Each candidate is compared by BLASTN to two distinct abstractions of the transcriptome: the NCBI Unigene "unique" database (vaguely defined by NCBI as the "longest, best" sequence from each unigene cluster), and the transcripts from Refseq. We deem a "miss" any sequence pair with at least three differences, with at least two of the differences in the core positions 3-19, i.e., not on the ends of the 21mer target region. We then determine whether each candidate hits one unigene cluster, one Locuslink transcript, one Locuslink gene, and, for those genes with multiple transcripts, all the transcripts in the gene. Using just the "hits-One-Unigene" and the "hits-One-NM" values, we apply a "specificity score" to each candidate: candidates that uniquely hit one unigene cluster AND one Locuslink transcript are rewarded; those that hit one unigene cluster OR one Locuslink transcript are rewarded, but less so; and those with neither unigene nor Locuslink specificity are penalized. After determining and storing this "specificityScore", we re-sort the candidates.
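The miss rule and the three-tier score described above can be sketched in Python. This is a minimal illustration, not the TRC implementation: it assumes the candidate and the BLASTN-matched target segment are already aligned as two ungapped strings of equal length, and the function names and the reward and penalty values (2.0, 1.0, -1.0) are hypothetical placeholders, since the text does not give the actual numbers.

```python
# Core positions of the 21mer, 1-based and inclusive, as stated in the text.
CORE_START, CORE_END = 3, 19


def mismatch_positions(candidate: str, target: str) -> list[int]:
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(candidate) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(candidate, target), start=1) if a != b]


def is_miss(candidate: str, target: str) -> bool:
    """A pair is a 'miss' with >= 3 differences, >= 2 of them in the core."""
    diffs = mismatch_positions(candidate, target)
    core_diffs = [p for p in diffs if CORE_START <= p <= CORE_END]
    return len(diffs) >= 3 and len(core_diffs) >= 2


def specificity_score(
    hits_one_unigene: bool,
    hits_one_nm: bool,
    reward_both: float = 2.0,  # assumed value, not from the source
    reward_one: float = 1.0,   # assumed value, not from the source
    penalty: float = -1.0,     # assumed value, not from the source
) -> float:
    """Reward unique hits in both databases most, in one less, in neither not at all."""
    if hits_one_unigene and hits_one_nm:
        return reward_both
    if hits_one_unigene or hits_one_nm:
        return reward_one
    return penalty
```

Under this rule, a pair that differs only at positions 1, 2 and 21 still counts as a hit, because none of its three differences falls in the core.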

Spacing the candidate 21mers along the transcript

Since we synthesize 5 oligo pairs for each transcript, and since we hypothesize a role for the secondary structure of the target transcript in the effectiveness of an shRNA, we want to have the candidates spread out along the transcript, with one from the 3-prime UTR region and 4 along the CDS. To pick the five candidates, the highest-scoring 3-prime UTR candidate, if available, is chosen first. Next, the top-scoring candidate among the CDS candidates is chosen. A position penalty is then applied to all the other CDS candidates, where the penalty is more severe the closer the candidate is to the first CDS candidate picked. After applying the position penalty, all the CDS candidates are re-sorted by their newly calculated, position-weighted score. From the list of remaining CDS candidates, the highest-scoring candidate is chosen, and the position penalty is applied to all the remaining candidates based upon the already picked CDS candidates. This process is repeated until all the candidates are rescored. Finally, the top 5 position- and specificity-weighted candidates are chosen for oligo synthesis.
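The greedy selection described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the TRC code: the `Candidate` fields, the linear `position_penalty` function with its `max_penalty` and `window` parameters, summing the penalties over all already picked CDS candidates, and filling the remaining slots from the CDS when no 3-prime UTR candidate exists are all hypothetical choices, since the text does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    position: int  # start of the 21mer on the transcript
    score: float   # score after specificity weighting
    region: str    # "CDS" or "3UTR"


def position_penalty(distance: int, max_penalty: float = 5.0, window: int = 200) -> float:
    """Hypothetical linear penalty: largest at distance 0, zero at or beyond `window`."""
    return max(0.0, max_penalty * (1 - distance / window))


def weighted_score(cand: Candidate, picked_cds: list[Candidate]) -> float:
    """Score minus the penalties from every CDS candidate picked so far (assumed additive)."""
    penalties = sum(position_penalty(abs(cand.position - p.position)) for p in picked_cds)
    return cand.score - penalties


def pick_candidates(candidates: list[Candidate], total: int = 5) -> list[Candidate]:
    """Pick the best 3-prime UTR candidate, then greedily pick spread-out CDS candidates."""
    picked: list[Candidate] = []
    utr = [c for c in candidates if c.region == "3UTR"]
    if utr:
        picked.append(max(utr, key=lambda c: c.score))

    remaining = [c for c in candidates if c.region == "CDS"]
    picked_cds: list[Candidate] = []
    while remaining and len(picked) + len(picked_cds) < total:
        # Rescore against everything picked so far, then take the best remaining one.
        best = max(remaining, key=lambda c: weighted_score(c, picked_cds))
        picked_cds.append(best)
        remaining.remove(best)
    return picked + picked_cds
```

With these assumed parameters, a CDS candidate that lies 10 bases from an already picked one loses 4.75 points, so a slightly lower-scoring candidate farther along the transcript is chosen ahead of it.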


