Isolation and characterization of an expressed hypervariable gene coding for a breast-cancer-associated antigen

Gene. 1990 Sep 14;93(2):313-8. doi: 10.1016/0378-1119(90)90242-j.

Abstract

A human gene and cDNA coding for a breast-cancer-associated antigen (H23Ag) were isolated and characterized. The gene contains two exons and one intron. Part of the second exon is a tandem repeat array (TRA) consisting of multiple 60-bp G + C-rich units. We report here the characterization of unique sequences that are found in the H23Ag gene and cDNA, in addition to the 60-bp repeats. Analysis of the cDNA sequences revealed a putative ATG start codon preceded by two overlapping initiation consensus sequences (CCACC). The open reading frame encodes an amino acid (aa) sequence consisting of three regions. The first region contains an initiating methionine and a highly hydrophobic putative signal peptide. This is followed by a variable number of highly conserved 20-aa repeat units (TRA). The last region, C-terminal to the TRA, contains four potential N-linked glycosylation sites. The genomic nucleotide sequences reveal a putative promoter region that includes a 'TATA' box. A putative estrogen regulatory element is located 5' to the promoter region. The characterization of the gene and cDNA coding for the H23Ag presented here may help to elucidate its possible function in human breast cancer.
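
As an illustration of the sequence features named in the abstract, the following is a minimal Python sketch and not the authors' method. It assumes the commonly used sequon definition for potential N-linked glycosylation (Asn-X-Ser/Thr, where X is any residue except Pro) and assumes that "preceded by" means the CCACC consensus sits immediately upstream of the ATG. The function names are hypothetical, and any input strings used with it here are toy examples, not the H23Ag sequence deposited under GenBank M35093.

```python
import re
from typing import List

# Assumed sequon for potential N-linked glycosylation: Asn-X-Ser/Thr, X != Pro.
# The lookahead keeps overlapping candidate sites from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glyc_sites(protein: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def start_codons_with_consensus(cdna: str, consensus: str = "CCACC") -> List[int]:
    """Return 0-based indices of ATG codons immediately preceded by `consensus`.

    Assumption: the consensus is directly adjacent to the ATG.
    """
    cdna = cdna.upper()
    motif = consensus + "ATG"
    hits = []
    i = cdna.find(motif)
    while i != -1:
        hits.append(i + len(consensus))  # index of the 'A' in ATG
        i = cdna.find(motif, i + 1)
    return hits


def repeat_unit_aa_length(unit_bp: int = 60) -> int:
    """An in-frame repeat of `unit_bp` nucleotides encodes unit_bp / 3 residues."""
    if unit_bp % 3 != 0:
        raise ValueError("repeat unit is not a whole number of codons")
    return unit_bp // 3
```

With the default argument, `repeat_unit_aa_length()` reproduces the correspondence stated in the abstract between the 60-bp repeat unit and the 20-aa repeat unit. Run against a real record, `n_glyc_sites` would only list candidate sites; whether they are actually glycosylated cannot be inferred from sequence alone.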

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Antigens, Neoplasm / genetics*
  • Base Sequence
  • Breast Neoplasms / genetics*
  • Breast Neoplasms / immunology
  • Consensus Sequence
  • DNA / chemistry
  • Genetic Variation*
  • Humans
  • Introns
  • Membrane Glycoproteins / genetics*
  • Molecular Sequence Data
  • Mucin-1
  • Mucins / genetics*
  • Neoplasm Proteins / genetics*
  • Repetitive Sequences, Nucleic Acid
  • Restriction Mapping
  • Sequence Homology, Nucleic Acid
  • TATA Box

Substances

  • Antigens, Neoplasm
  • Membrane Glycoproteins
  • Mucin-1
  • Mucins
  • Neoplasm Proteins
  • DNA

Associated data

  • GENBANK/M35093