Novel multigene families encoding highly repetitive peptide sequences. Sequence analyses of rat and mouse proline-rich protein cDNAs

J Biol Chem. 1985 Nov 5;260(25):13471-7.

Abstract

Multigene families encode the proline-rich proteins that are so prominent in human saliva and are dramatically induced in mouse and rat salivary glands by isoproterenol treatment and by feeding tannins. A cDNA encoding an acidic proline-rich protein of rat has been sequenced (Ziemer, M. A., Swain, W. F., Rutter, W. J., Clements, S., Ann, D. K., and Carlson, D. M. (1984) J. Biol. Chem. 259, 10475-10480). This study presents the nucleotide sequences of five additional proline-rich protein cDNAs complementary to both mouse and rat parotid and submandibular gland mRNAs. Amino acid compositions deduced from the nucleotide sequences are typical for proline-rich proteins: 25-45% proline, 18-22% glycine, and 18-22% glutamine, and generally an absence of sulfur-containing amino acids except for the initiator methionine. These proline-rich proteins display unusual repeating peptide sequences of 14-19 amino acids. The derived amino acid sequence of the cDNA insert of plasmid pMP1 from mouse has a 19-amino acid sequence which is repeated four times. The inserts of plasmids pUMP40 and pUMP4, also from mouse, encode 12 and 11 repeats of a 14-amino acid peptide, respectively. These repetitive sequences, and others from rat and mouse cDNAs and from human genomic clones, all show very high homologies and likely evolved from duplication of internal portions of an ancestral gene. Gene conversion could account for the high degree of conservation of nucleotide sequences of the repeat regions. Proteins derived from the nucleotide sequences are all characterized by four general regions: a putative signal peptide, a transition region, the repetitive region, and a carboxyl-terminal region. The 5'-flanking sequences and sequences encoding the putative signal peptides are highly conserved (greater than 94%) in all six cDNAs. This sequence conservation may be important in the regulation of the biosynthesis of these unusual proteins.
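The two quantities the abstract reports for each deduced protein, percent amino acid composition and the number of back-to-back copies of a fixed-length repeat, can be illustrated with a minimal Python sketch. This is not the authors' method. The function names, the `max_mismatch` parameter, and the toy sequence are all assumptions made for illustration. The toy sequence is invented and is not taken from any of the cDNAs described here.

```python
from collections import Counter


def composition_percent(protein: str) -> dict[str, float]:
    """Percent of each residue (one-letter code) in a protein sequence."""
    counts = Counter(protein)
    total = len(protein)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def longest_tandem_run(protein: str, period: int, max_mismatch: int = 0) -> tuple[int, int]:
    """Return (start, copies) for the longest run of consecutive units of
    length `period`, where each unit differs from the one before it at no
    more than `max_mismatch` positions. Ties keep the earliest start."""
    best = (0, 1)
    for start in range(len(protein) - period + 1):
        copies = 1
        pos = start
        # Extend the run while the next unit matches the current one.
        while pos + 2 * period <= len(protein):
            unit = protein[pos:pos + period]
            nxt = protein[pos + period:pos + 2 * period]
            if sum(a != b for a, b in zip(unit, nxt)) > max_mismatch:
                break
            copies += 1
            pos += period
        if copies > best[1]:
            best = (start, copies)
    return best


# Invented 14-residue unit (hypothetical, for illustration only).
TOY_UNIT = "PQGPPGQPPGQGPP"
# Hypothetical protein: short leader, three exact repeats, short tail.
toy_protein = "MKL" + TOY_UNIT * 3 + "RS"

comp = composition_percent(toy_protein)
start, copies = longest_tandem_run(toy_protein, period=14)
```

For a real deduced protein, raising `max_mismatch` above zero lets the scan tolerate the small differences between neighboring repeats that highly homologous but non-identical units would show.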

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • DNA / analysis*
  • Gene Expression Regulation
  • Mice
  • Nucleic Acid Hybridization
  • Peptides / genetics*
  • Proline-Rich Protein Domains
  • RNA, Messenger / analysis
  • Rats
  • Repetitive Sequences, Nucleic Acid
  • Sequence Homology, Nucleic Acid

Substances

  • Peptides
  • RNA, Messenger
  • DNA

Associated data

  • GENBANK/M11897
  • GENBANK/M11898
  • GENBANK/M11899
  • GENBANK/M11900
  • GENBANK/M11901
  • GENBANK/M11902
  • GENBANK/M19419