Molecular cloning and bioinformatic analysis of SPATA4 gene

J Biochem Mol Biol. 2005 Nov 30;38(6):739-47. doi: 10.5483/bmbrep.2005.38.6.739.

Abstract

Full-length cDNA sequences of four novel SPATA4 genes in chimpanzee, cow, chicken and ascidian were identified by bioinformatic analysis using mouse or human SPATA4 cDNA fragments as electronic probes. All of these genes have 6 exons, encode proteins of similar molecular weight, and are not located on a sex chromosome. The mouse SPATA4 sequence was identified as significantly changed in cryptorchidism, and it shares no significant homology with any known protein in the Swiss-Prot database except for its homologs in various vertebrates. Our search results showed that all SPATA4 proteins have a putative conserved domain, DUF1042. Sequence identity among the putative SPATA4 proteins ranges from 30% to 99%. High similarity was also found in the 1 kb promoter regions of the human, mouse and rat SPATA4 genes, and the sequences upstream of the SPATA4 promoter are also highly similar. A search of SymAtlas (http://symatlas.gnf.org/SymAtlas/) showed that human SPATA4 is highly expressed in testis, especially in testis interstitium, Leydig cells, seminiferous tubules and germ cells. Mouse SPATA4 was observed exclusively in adult mouse testis, and almost no signal was detected in other tissues. The pI values of the proteins are basic, ranging from 9.44 to 10.15. The subcellular location of the protein is usually the nucleus, and the signal peptide probability for SPATA4 is always zero. Using SNP data from NCBI, we found 33 SNPs in the genomic DNA region of the human SPATA4 gene, 29 of which lie in introns. A CpG island search showed that the CpG island regions are highly similar to one another, although their lengths differ. This research is a fundamental work in the field of bioinformatic analysis and also puts forward a new approach to the bioinformatic analysis of other genes.
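
The abstract does not state which criteria were used for the CpG island search. As a minimal sketch only, the Python below scores a DNA sequence with the commonly used thresholds of length of at least 200 bp, GC content of at least 50% and an observed/expected CpG ratio above 0.6. The function names and the default thresholds are illustrative assumptions, not the authors' method.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Observed/expected CpG = (CpG count * length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply assumed thresholds (hypothetical defaults) to a whole sequence."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp > min_obs_exp
```

In practice such a test would be applied to sliding windows across a promoter region, so that islands of different lengths in different species can be located and then compared.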

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Cattle
  • Cell Nucleus / metabolism
  • Chickens
  • Cloning, Molecular
  • Computational Biology / methods*
  • CpG Islands
  • Mice
  • Molecular Sequence Data
  • Pan troglodytes
  • Promoter Regions, Genetic
  • Proteins / genetics*
  • Sequence Homology, Amino Acid
  • Sequence Homology, Nucleic Acid
  • Species Specificity
  • Urochordata

Substances

  • Proteins
  • SPATA4 protein, human
  • SPATA4 protein, mouse

Associated data

  • GENBANK/AF395083
  • GENBANK/AY040204
  • GENBANK/AY651919
  • GENBANK/AY651920
  • GENBANK/AY653229
  • GENBANK/AY660661
  • GENBANK/AY970819