Complete DNA sequence of yeast chromosome II

EMBO J. 1994 Dec 15;13(24):5795-809. doi: 10.1002/j.1460-2075.1994.tb06923.x.

Abstract

In the framework of the EU genome-sequencing programmes, the complete DNA sequence of the yeast Saccharomyces cerevisiae chromosome II (807 188 bp) has been determined. At present, this is the largest eukaryotic chromosome entirely sequenced. A total of 410 open reading frames (ORFs) were identified, covering 72% of the sequence. Similarity searches revealed that 124 ORFs (30%) correspond to genes of known function, 51 ORFs (12.5%) appear to be homologues of genes whose functions are known, 52 others (12.5%) have homologues the functions of which are not well defined and another 33 of the novel putative genes (8%) exhibit a degree of similarity which is insufficient to confidently assign function. Of the genes on chromosome II, 37-45% are thus of unpredicted function. Among the novel putative genes, we found several that are related to genes that perform differentiated functions in multicellular organisms or are involved in malignancy. In addition to a compact arrangement of potential protein coding sequences, the analysis of this chromosome confirmed general chromosome patterns but also revealed particular novel features of chromosomal organization. Alternating regional variations in average base composition correlate with variations in local gene density along chromosome II, as observed in chromosomes XI and III. We propose that functional ARS elements are preferably located in the AT-rich regions that have a spacing of approximately 110 kb. Similarly, the 13 tRNA genes and the three Ty elements of chromosome II are found in AT-rich regions. In chromosome II, the distribution of coding sequences between the two strands is biased, with a ratio of 1.3:1. An interesting aspect regarding the evolution of the eukaryotic genome is the finding that chromosome II has a high degree of internal genetic redundancy, amounting to 16% of the coding capacity.
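The abstract does not spell out how the 37-45% range follows from the ORF counts. The short Python sketch below shows one reading that reproduces it, using only the counts quoted above. The interpretation is an assumption, not something the abstract states: the lower bound takes the ORFs left over after the four similarity classes, and the upper bound adds the 33 ORFs whose similarity is too weak to assign function. The variable names are illustrative.

```python
# ORF counts as quoted in the abstract.
TOTAL_ORFS = 410
known_function = 124            # genes of known function
homologue_known = 51            # homologues of genes with known function
homologue_poorly_defined = 52   # homologues with poorly defined function
weak_similarity = 33            # similarity too weak to assign function


def pct(count, total=TOTAL_ORFS):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Assumption: ORFs outside all four classes have no detectable similarity.
no_similarity = TOTAL_ORFS - (
    known_function + homologue_known + homologue_poorly_defined + weak_similarity
)

# Assumed reading of the "37-45%" range:
#   lower bound = ORFs with no similarity at all
#   upper bound = lower bound plus the weak-similarity ORFs
lower = pct(no_similarity)
upper = pct(no_similarity + weak_similarity)

if __name__ == "__main__":
    print(f"ORFs with no similarity: {no_similarity}")
    print(f"Unpredicted function: {lower:.1f}% to {upper:.1f}%")
```

Under this reading, 150 ORFs fall outside the four classes, giving about 36.6% and 44.6%, which round to the quoted 37% and 45%. The same `pct` helper gives about 30.2%, 12.4%, 12.7% and 8.0% for the four classes, consistent with the rounded figures in the abstract.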

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition
  • Base Sequence
  • Chromosome Mapping / methods*
  • Chromosomes, Fungal / genetics*
  • Cloning, Molecular
  • Cosmids / genetics
  • DNA, Fungal / genetics*
  • Genes, Fungal / genetics*
  • Molecular Sequence Data
  • Open Reading Frames
  • Quality Control
  • Repetitive Sequences, Nucleic Acid
  • Reproducibility of Results
  • Saccharomyces cerevisiae / genetics*
  • Sequence Analysis, DNA
  • Sequence Homology, Amino Acid
  • Telomere / genetics

Substances

  • DNA, Fungal

Associated data

  • GENBANK/Z35762
  • GENBANK/Z35763
  • GENBANK/Z35764
  • GENBANK/Z35765
  • GENBANK/Z35766
  • GENBANK/Z35767
  • GENBANK/Z35768
  • GENBANK/Z35769
  • GENBANK/Z35770
  • GENBANK/Z35771
  • GENBANK/Z35773
  • GENBANK/Z35774
  • GENBANK/Z35775
  • GENBANK/Z35776
  • GENBANK/Z35777
  • GENBANK/Z35778
  • GENBANK/Z35779
  • GENBANK/Z35780
  • GENBANK/Z35781
  • GENBANK/Z35782
  • GENBANK/Z35783
  • GENBANK/Z35784
  • GENBANK/Z35785
  • GENBANK/Z35786
  • GENBANK/Z35787
  • GENBANK/Z35788
  • GENBANK/Z35789
  • GENBANK/Z35790
  • GENBANK/Z35791
  • GENBANK/Z35792