Efficient In Silico Identification of a Common Insertion in the MAK Gene which Causes Retinitis Pigmentosa

PLoS One. 2015 Nov 11;10(11):e0142614. doi: 10.1371/journal.pone.0142614. eCollection 2015.

Abstract

Background: Next generation sequencing (NGS) offers a rapid and comprehensive method of screening for mutations associated with retinitis pigmentosa and related disorders. However, certain sequence alterations, such as large insertions or deletions, may remain undetected by standard NGS pipelines. One such mutation is a recently identified Alu insertion into the Male Germ Cell-Associated Kinase (MAK) gene, which is missed by standard NGS-based variant callers. Here, we developed an in silico method of searching NGS raw sequence reads to detect this mutation, without the need to recalculate sequence alignments or to screen every sample by PCR.

Methods: The Linux program grep was used to search for a 23 bp "probe" sequence containing the known junction sequence of the insert. A corresponding search was performed with the wildtype sequence. The matching reads were counted and further compared to the known sequences of the full wildtype and mutant genomic loci. (See https://github.com/MEEIBioinformaticsCenter/grepsearch.)
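
The probe-counting idea described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' grep-based program; it swaps in a plain substring search over FASTQ reads to show the same principle. The two 23 bp probe strings below are hypothetical placeholders, not the real MAK junction or wildtype sequences, and the function names, the reverse-complement handling, and the `min_reads` cutoff are all assumptions made for illustration.

```python
import gzip

# HYPOTHETICAL 23 bp placeholder probes; NOT the real MAK sequences.
MUTANT_PROBE = "ACGTACGTACGTTTTTTTTTTTT"    # stands in for the insert junction
WILDTYPE_PROBE = "ACGTACGTACGTGGCCAAGGTCA"  # stands in for the intact locus

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def iter_read_sequences(fastq_path):
    """Yield the sequence line of each 4-line FASTQ record (plain or .gz)."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of every record
                yield line.strip().upper()


def count_probe_reads(fastq_path, probe):
    """Count reads containing the probe on either strand (assumption:
    reads may come from both strands, so both orientations are checked)."""
    forward = probe.upper()
    reverse = reverse_complement(forward)
    return sum(
        1
        for read in iter_read_sequences(fastq_path)
        if forward in read or reverse in read
    )


def summarize(mutant_count, wildtype_count, min_reads=1):
    """Rough interpretation of the two counts; min_reads is an assumed cutoff."""
    has_mutant = mutant_count >= min_reads
    has_wildtype = wildtype_count >= min_reads
    if has_mutant and has_wildtype:
        return "both alleles observed (consistent with heterozygous)"
    if has_mutant:
        return "insertion only (consistent with homozygous)"
    if has_wildtype:
        return "wildtype only"
    return "no coverage"
```

With real data, one would call `count_probe_reads` once per probe on each sample's FASTQ file and pass the two counts to `summarize`; any candidate would still need to be checked against the full wildtype and mutant loci, as the Methods describe.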

Results: In a test sample set consisting of eleven previously published homozygous mutants, detection of the MAK-Alu insertion was validated with 100% sensitivity and specificity. As a discovery cohort, raw NGS reads from 1,847 samples (including custom and whole exome selective capture) were searched in ~1 hour on a local computer cluster, yielding an additional five samples with MAK-Alu insertions and solving two previously unsolved pedigrees. Of these, one patient was homozygous for the insertion, one was compound heterozygous with a missense change on the other allele (c.46G>A; p.Gly16Arg), and three were heterozygous carriers.

Conclusions: The MAK-Alu grep program proved to be a rapid and effective method of finding a known, disease-causing Alu insertion in a large cohort of patients with NGS data. This simple approach avoids wet-lab assays and computationally expensive algorithms, and could also be used for other known disease-causing insertions and deletions.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Alu Elements / genetics
  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Cohort Studies
  • Exons
  • Genetic Loci
  • Heterozygote
  • High-Throughput Nucleotide Sequencing
  • Homozygote
  • Humans
  • Male
  • Molecular Sequence Data
  • Mutagenesis, Insertional
  • Pedigree
  • Protein Serine-Threonine Kinases / genetics*
  • Retinitis Pigmentosa / genetics
  • Sequence Alignment
  • Sequence Analysis, DNA

Substances

  • Protein Serine-Threonine Kinases
  • MAK protein, human

Associated data

  • GENBANK/KT192064