Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase

AIDS Res Hum Retroviruses. 2012 Aug;28(8):894-901. doi: 10.1089/aid.2011.0120. Epub 2011 Oct 26.

Abstract

Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature (http://hivdb.Stanford.edu). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or <0.5% or >15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
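
The count-based flagging rules listed in the abstract can be illustrated with a minimal sketch. This is not SQUAT's code (SQUAT runs in R); it is a Python illustration that assumes the per-sequence counts have already been computed after alignment and translation. The dictionary keys, `count_ambiguous_bases`, and `flag_sequence` are hypothetical names chosen for this example; only the numeric limits come from the abstract. The genetic distance rule is omitted because the abstract does not say how distances are computed.

```python
# Upper limits from the abstract; a sequence is flagged when a count
# EXCEEDS its limit. Key names are hypothetical, not SQUAT's own.
THRESHOLDS = {
    "PR": {
        "insertions_1_2_base": 5,
        "insertions_3_base": 1,
        "deletions": 1,
        "ambiguous_bases": 6,
        "consecutive_na_mutations": 3,
        "stop_codons": 0,
        "ambiguous_amino_acids": 3,
        "consecutive_aa_mutations": 3,
        "unique_amino_acids": 0,
    },
    "RT": {
        "insertions_1_2_base": 5,
        "insertions_3_base": 1,
        "deletions": 1,
        "ambiguous_bases": 18,
        "consecutive_na_mutations": 4,
        "stop_codons": 0,
        "ambiguous_amino_acids": 6,
        "consecutive_aa_mutations": 4,
        "unique_amino_acids": 0,
    },
}

UNAMBIGUOUS_BASES = set("ACGT")


def count_ambiguous_bases(seq):
    """Count characters other than A, C, G, T, ignoring alignment gaps ('-')."""
    return sum(
        1 for base in seq.upper()
        if base not in UNAMBIGUOUS_BASES and base != "-"
    )


def flag_sequence(metrics, gene, thresholds=THRESHOLDS):
    """Return one message per metric that exceeds its limit for the gene.

    `metrics` maps metric names to precomputed counts; missing metrics
    are treated as 0. Because thresholds are user modifiable in the tool,
    the limits are passed in rather than hard-coded in the function.
    """
    limits = thresholds[gene]
    return [
        f"{name}: {metrics[name]} > {limit}"
        for name, limit in limits.items()
        if metrics.get(name, 0) > limit
    ]
```

For example, `flag_sequence({"ambiguous_bases": 7}, "PR")` reports one flag, while the same count passes for RT, whose limit is 18. Supplying a modified copy of `THRESHOLDS` as the third argument mirrors the user-modifiable thresholds described above.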

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Base Sequence
  • Drug Resistance, Viral / genetics*
  • HIV Infections / virology*
  • HIV Protease / genetics*
  • HIV Reverse Transcriptase / chemistry
  • HIV Reverse Transcriptase / genetics*
  • HIV-1 / classification
  • HIV-1 / genetics*
  • Peptide Hydrolases / chemistry
  • Phylogeny
  • Sequence Analysis, DNA / methods*

Substances

  • HIV Reverse Transcriptase
  • Peptide Hydrolases
  • HIV Protease
  • p16 protease, Human immunodeficiency virus 1