NSD1 nuclear receptor binding SET domain protein 1 [ Homo sapiens (human) ]

Gene ID: 64324, updated on 3-Apr-2024

Summary

Official Symbol
NSD1 (provided by HGNC)
Official Full Name
nuclear receptor binding SET domain protein 1 (provided by HGNC)
Primary source
HGNC:HGNC:14234
See related
Ensembl:ENSG00000165671; MIM:606681; AllianceGenome:HGNC:14234
Gene type
protein coding
RefSeq status
REVIEWED
Organism
Homo sapiens
Lineage
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
Also known as
STO; KMT3B; SOTOS; ARA267; SOTOS1
Summary
This gene encodes a protein containing a SET domain, 2 LXXLL motifs, 3 nuclear translocation signals (NLSs), 4 plant homeodomain (PHD) finger regions, and a proline-rich region. The encoded protein enhances androgen receptor (AR) transactivation, and this enhancement can be increased further in the presence of other androgen receptor associated coregulators. This protein may act as a nucleus-localized, basic transcriptional factor and also as a bifunctional transcriptional regulator. Mutations of this gene have been associated with Sotos syndrome and Weaver syndrome. One version of childhood acute myeloid leukemia is the result of a cryptic translocation with the breakpoints occurring within nuclear receptor-binding Su-var, enhancer of zeste, and trithorax domain protein 1 on chromosome 5 and nucleoporin, 98-kd on chromosome 11. Multiple transcript variants encoding distinct isoforms have been identified for this gene. [provided by RefSeq, Sep 2018]
Expression
Ubiquitous expression in testis (RPKM 4.4), thyroid (RPKM 4.3) and 25 other tissues
Orthologs

Genomic context

Location:
5q35.3
Exon count:
29
Annotation release | Status | Assembly | Chr | Location
RS_2023_10 | current | GRCh38.p14 (GCF_000001405.40) | 5 | NC_000005.10 (177131798..177300213)
RS_2023_10 | current | T2T-CHM13v2.0 (GCF_009914755.1) | 5 | NC_060929.1 (177675008..177843448)
105.20220307 | previous assembly | GRCh37.p13 (GCF_000001405.25) | 5 | NC_000005.9 (176560016..176727214)
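
As a rough check on the coordinates in the table above, the following Python sketch parses each Location string and computes the gene span on each assembly. It assumes the coordinates are 1-based and inclusive; `parse_location` and `span_length` are illustrative helper names, not part of any NCBI tool.

```python
import re

# Location strings copied from the table above, keyed by assembly name.
LOCATIONS = {
    "GRCh38.p14": "NC_000005.10 (177131798..177300213)",
    "T2T-CHM13v2.0": "NC_060929.1 (177675008..177843448)",
    "GRCh37.p13": "NC_000005.9 (176560016..176727214)",
}

# Matches "ACCESSION (start..end)".
_LOCATION_RE = re.compile(r"^(\S+)\s+\((\d+)\.\.(\d+)\)$")


def parse_location(text):
    """Split 'ACCESSION (start..end)' into (accession, start, end)."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    accession, start, end = match.groups()
    return accession, int(start), int(end)


def span_length(start, end):
    """Length of an interval, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


if __name__ == "__main__":
    for assembly, text in LOCATIONS.items():
        accession, start, end = parse_location(text)
        print(assembly, accession, span_length(start, end))
```

Under the inclusive-coordinate assumption, the spans come to 168,416 bp on GRCh38.p14, 168,441 bp on T2T-CHM13v2.0, and 167,199 bp on GRCh37.p13.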

Chromosome 5 - NC_000005.10

Neighboring genes:

  • H3K27ac hESC enhancer GRCh37_chr5:176449401-176449938
  • H3K27ac hESC enhancer GRCh37_chr5:176449939-176450475
  • ATAC-STARR-seq lymphoblastoid silent region 16670
  • zinc finger protein 346
  • CDK7 strongly-dependent group 2 enhancer GRCh37_chr5:176488686-176489885
  • FGFR4 5' regulatory region
  • H3K4me1 hESC enhancer GRCh37_chr5:176537235-176537734
  • fibroblast growth factor receptor 4
  • ATAC-STARR-seq lymphoblastoid active region 23693
  • H3K4me1 hESC enhancer GRCh37_chr5:176558105-176558658
  • NANOG-H3K27ac-H3K4me1 hESC enhancers GRCh37_chr5:176559765-176560316 and GRCh37_chr5:176560317-176560870
  • H3K27ac hESC enhancer GRCh37_chr5:176560871-176561422
  • H3K27ac hESC enhancer GRCh37_chr5:176561423-176561976
  • ATAC-STARR-seq lymphoblastoid silent region 16676
  • H3K4me1 hESC enhancer GRCh37_chr5:176692158-176693050
  • H3K4me1 hESC enhancer GRCh37_chr5:176693051-176693942
  • MED14-independent group 3 enhancer GRCh37_chr5:176696443-176697642
  • ribosomal protein L21 pseudogene 60
  • protein arginine methyltransferase 1 pseudogene 1
  • H3K4me1 hESC enhancer GRCh37_chr5:176722701-176723202
  • H3K4me1 hESC enhancer GRCh37_chr5:176723203-176723702
  • ReSE screen-validated silencer GRCh37_chr5:176727388-176727581
  • CDK7 strongly-dependent group 2 enhancer GRCh37_chr5:176728411-176729610
  • ATAC-STARR-seq lymphoblastoid silent region 16677
  • NANOG-H3K27ac-H3K4me1 hESC enhancer GRCh37_chr5:176730775-176731652
  • ATAC-STARR-seq lymphoblastoid active region 23699
  • ATAC-STARR-seq lymphoblastoid active region 23700
  • PRELI domain containing 1
  • RAB24, member RAS oncogene family
  • MAX dimerization protein 3

Genomic regions, transcripts, and products

Expression

  • Project title: HPA RNA-seq normal tissues
  • Description: RNA-seq was performed on tissue samples from 95 human individuals representing 27 different tissues to determine the tissue specificity of all protein-coding genes
  • BioProject: PRJEB4337
  • Publication: PMID 24309898
  • Analysis date: Wed Apr 4 07:08:55 2018

Bibliography

GeneRIFs: Gene References Into Functions


Phenotypes

Associated conditions

Description: Sotos syndrome
MedGen: C0175695; OMIM: 117550; GeneReviews: Sotos Syndrome

Copy number response

Triplosensitivity: No evidence available (Last evaluated 2020-07-27). Source: ClinGen Genome Curation Page
Haploinsufficiency: Sufficient evidence for dosage pathogenicity (Last evaluated 2020-07-27). Source: ClinGen Genome Curation Page, PubMed

EBI GWAS Catalog

  • Genetic associations for activated partial thromboplastin time and prothrombin time, their gene expression profiles, and risk of coronary artery disease.
  • Genome-wide meta-analysis identifies new susceptibility loci for migraine.

Pathways from PubChem

Interactions


General gene information

Markers

Clone Names

  • FLJ10684, FLJ22263, FLJ44628, DKFZp666C163

Gene Ontology Provided by GOA

Function | Evidence Code | Pubs
enables RNA polymerase II cis-regulatory region sequence-specific DNA binding | IDA (Inferred from Direct Assay) | PubMed
enables chromatin binding | ISS (Inferred from Sequence or Structural Similarity) |
enables histone H3 methyltransferase activity | TAS (Traceable Author Statement) |
enables histone H3K36 dimethyltransferase activity | IEA (Inferred from Electronic Annotation) |
enables histone H3K36 methyltransferase activity | IBA (Inferred from Biological aspect of Ancestor) |
enables histone H3K36 methyltransferase activity | IDA (Inferred from Direct Assay) | PubMed
enables histone H3K36 methyltransferase activity | IMP (Inferred from Mutant Phenotype) | PubMed
enables histone H3K36 methyltransferase activity | ISS (Inferred from Sequence or Structural Similarity) |
enables histone H4K20 methyltransferase activity | ISS (Inferred from Sequence or Structural Similarity) |
enables nuclear androgen receptor binding | IDA (Inferred from Direct Assay) | PubMed
enables nuclear estrogen receptor binding | ISS (Inferred from Sequence or Structural Similarity) |
enables nuclear retinoic acid receptor binding | ISS (Inferred from Sequence or Structural Similarity) |
enables nuclear retinoid X receptor binding | ISS (Inferred from Sequence or Structural Similarity) |
enables nuclear thyroid hormone receptor binding | ISS (Inferred from Sequence or Structural Similarity) |
enables protein binding | IPI (Inferred from Physical Interaction) | PubMed
enables transcription coregulator activity | IDA (Inferred from Direct Assay) | PubMed
enables transcription corepressor activity | ISS (Inferred from Sequence or Structural Similarity) |
enables zinc ion binding | IDA (Inferred from Direct Assay) | PubMed

Component | Evidence Code | Pubs
part_of chromatin | IBA (Inferred from Biological aspect of Ancestor) |
located_in nucleoplasm | TAS (Traceable Author Statement) |
is_active_in nucleus | IBA (Inferred from Biological aspect of Ancestor) |
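
The evidence codes above differ in how directly they are supported. As a minimal sketch, the Python below groups a hand-copied subset of the Function annotations by evidence category and lists the terms that have at least one experimental code. The category labels follow the Gene Ontology Consortium's published grouping of evidence codes. The names `EVIDENCE_GROUP`, `ANNOTATIONS`, `group_by_evidence`, and `experimentally_supported` are illustrative and do not come from this record.

```python
from collections import defaultdict

# Evidence-code categories (only the codes that occur in the tables above).
EVIDENCE_GROUP = {
    "IDA": "experimental",
    "IMP": "experimental",
    "IPI": "experimental",
    "IBA": "phylogenetic",
    "ISS": "computational",
    "TAS": "author statement",
    "IEA": "electronic",
}

# A hand-copied SUBSET of (term, evidence code) rows from the Function table.
ANNOTATIONS = [
    ("histone H3K36 methyltransferase activity", "IBA"),
    ("histone H3K36 methyltransferase activity", "IDA"),
    ("histone H3K36 methyltransferase activity", "IMP"),
    ("histone H3K36 methyltransferase activity", "ISS"),
    ("histone H3K36 dimethyltransferase activity", "IEA"),
    ("histone H4K20 methyltransferase activity", "ISS"),
    ("nuclear androgen receptor binding", "IDA"),
    ("nuclear estrogen receptor binding", "ISS"),
    ("zinc ion binding", "IDA"),
]


def group_by_evidence(annotations):
    """Map each term to the set of evidence categories that support it."""
    grouped = defaultdict(set)
    for term, code in annotations:
        grouped[term].add(EVIDENCE_GROUP[code])
    return dict(grouped)


def experimentally_supported(annotations):
    """Return the sorted terms that have at least one experimental code."""
    grouped = group_by_evidence(annotations)
    return sorted(term for term, groups in grouped.items() if "experimental" in groups)


if __name__ == "__main__":
    for term in experimentally_supported(ANNOTATIONS):
        print(term)
```

In this subset, H3K36 methyltransferase activity, androgen receptor binding, and zinc ion binding carry experimental codes, while the H4K20 activity rests on sequence similarity (ISS) alone.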
 

General protein information

Preferred Names
histone-lysine N-methyltransferase, H3 lysine-36 specific
Names
H3-K36-HMTase
H4-K20-HMTase
NR-binding SET domain-containing protein
androgen receptor coactivator 267 kDa protein
androgen receptor-associated coregulator 267
androgen receptor-associated protein of 267 kDa
histone-lysine N-methyltransferase, H3 lysine-36 and H4 lysine-20 specific
lysine N-methyltransferase 3B
nuclear receptor SET domain-containing protein 1
nuclear receptor-binding SET domain-containing protein 1
NP_001352613.2
NP_001396230.1
NP_001396231.1
NP_001396232.1
NP_001396233.1
NP_001396234.1
NP_001396235.1
NP_001396236.1
NP_001396237.1
NP_001396238.1
NP_071900.2
NP_758859.2

NCBI Reference Sequences (RefSeq)


RefSeqs maintained independently of Annotated Genomes

These reference sequences exist independently of genome builds.

These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by comparing the version of the RefSeq in this section to the one reported in Genomic regions, transcripts, and products above.
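
The version comparison described above can be sketched in Python. This is a minimal illustration, assuming RefSeq identifiers of the form `ACCESSION.VERSION`; the helper names `split_accession` and `version_mismatches` are hypothetical, and the `annotated` list in the usage example is invented for illustration rather than taken from this record.

```python
def split_accession(versioned):
    """Split 'NM_022455.5' into ('NM_022455', 5)."""
    accession, sep, version = versioned.strip().rpartition(".")
    if not sep or not accession or not version.isdigit():
        raise ValueError(f"not an ACCESSION.VERSION identifier: {versioned!r}")
    return accession, int(version)


def version_mismatches(curated, annotated):
    """Return {accession: (curated_version, annotated_version)} for
    accessions present in both lists whose versions differ."""
    annotated_versions = dict(split_accession(v) for v in annotated)
    mismatches = {}
    for versioned in curated:
        accession, version = split_accession(versioned)
        other = annotated_versions.get(accession)
        if other is not None and other != version:
            mismatches[accession] = (version, other)
    return mismatches


if __name__ == "__main__":
    # Two accessions listed in this section of the record.
    curated = ["NM_022455.5", "NM_172349.5"]
    # HYPOTHETICAL versions from a genome annotation, for illustration only.
    annotated = ["NM_022455.4", "NM_172349.5"]
    print(version_mismatches(curated, annotated))
```

Accessions that appear in only one of the two lists are ignored here; whether those should also be reported depends on the use case.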

Genomic

  1. NG_009821.1 RefSeqGene

    Range
    5001..172135
    Download
    GenBank, FASTA, Sequence Viewer (Graphics), LRG_512

mRNA and Protein(s)

  1. NM_001365684.2 → NP_001352613.2  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform a

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    Consensus CDS
    CCDS4413.1
    UniProtKB/TrEMBL
    A0A8I5QJP2, D6RA58
    Related
    ENSP00000343209.5, ENST00000347982.9
  2. NM_001409301.1 → NP_001396230.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform b

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    UniProtKB/Swiss-Prot
    Q96L73, Q96PD8, Q96RN7
  3. NM_001409302.1 → NP_001396231.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform b

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    UniProtKB/Swiss-Prot
    Q96L73, Q96PD8, Q96RN7
  4. NM_001409303.1 → NP_001396232.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform b

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    UniProtKB/Swiss-Prot
    Q96L73, Q96PD8, Q96RN7
    Related
    ENSP00000508426.1, ENST00000687453.1
  5. NM_001409304.1 → NP_001396233.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform c

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    Related
    ENST00000688613.1
  6. NM_001409305.1 → NP_001396234.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform d

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
  7. NM_001409306.1 → NP_001396235.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform e

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
  8. NM_001409307.1 → NP_001396236.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform e

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
  9. NM_001409308.1 → NP_001396237.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform a

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    UniProtKB/TrEMBL
    A0A8I5QJP2, D6RA58
    Related
    ENSP00000423372.3, ENST00000508896.7
  10. NM_001409309.1 → NP_001396238.1  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform f

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
  11. NM_022455.5 → NP_071900.2  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform b


    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    Consensus CDS
    CCDS4412.1
    UniProtKB/Swiss-Prot
    Q96L73, Q96PD8, Q96RN7
    UniProtKB/TrEMBL
    B2RWP5
    Related
    ENSP00000395929.2, ENST00000439151.7
    Conserved Domains (12) summary
    cd05837
    Location: 319..429
    MSH6_like; The PWWP domain is present in MSH6, a mismatch repair protein homologous to bacterial MutS. The PWWP domain of histone-lysine N-methyltransferase, also known as Nuclear SET domain-containing protein 3, is also included. Mutations in MSH6 have been ...
    cd05838
    Location: 1754..1848
    WHSC1_related; The PWWP domain was first identified in the WHSC1 (Wolf-Hirschhorn syndrome candidate 1) protein, a protein implicated in Wolf-Hirschhorn syndrome (WHS). When translocated, WHSC1 plays a role in lymphoid multiple myeloma (MM) disease, also known as ...
    smart00570
    Location: 1891..1941
    AWS; associated with SET domains
    smart00317
    Location: 1942..2065
    SET; SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain
    cd15648
    Location: 1545..1587
    PHD1_NSD1_2; PHD finger 1 found in nuclear receptor-binding SET domain-containing protein NSD1 and NSD2
    cd15650
    Location: 1592..1638
    PHD2_NSD1; PHD finger 2 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1)
    cd15653
    Location: 1639..1692
    PHD3_NSD1; PHD finger 3 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1)
    cd15656
    Location: 1709..1748
    PHD4_NSD1; PHD finger 4 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1)
    cd15659
    Location: 2120..2162
    PHD5_NSD1; PHD finger 5 found in nuclear receptor-binding SET domain-containing protein 1 (NSD1)
    cl26267
    Location: 1431..1587
    ING; Inhibitor of growth proteins N-terminal histone-binding
    cl26386
    Location: 2438..2663
    DNA_pol3_gamma3; DNA polymerase III subunits gamma and tau domain III
    cl26464
    Location: 2208..2574
    Atrophin-1; Atrophin-1 family
  12. NM_172349.5 → NP_758859.2  histone-lysine N-methyltransferase, H3 lysine-36 specific isoform a

    Status: REVIEWED

    Source sequence(s)
    AC008570, AC027314, AC146507
    Consensus CDS
    CCDS4413.1
    UniProtKB/TrEMBL
    A0A8I5QJP2, D6RA58
    Related
    ENSP00000346111.5, ENST00000354179.9
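
The conserved domain locations listed for NP_071900.2 (entry 11 above) can be ordered along the protein and checked for overlaps. The Python below is a sketch only: it reads each Location value as a start and end residue pair and assumes 1-based inclusive numbering, both of which are interpretations of the listing. The names `DOMAINS`, `domain_length`, and `overlapping_pairs` are illustrative.

```python
# (name, start, end) read from the conserved-domain summary for NP_071900.2.
# ASSUMPTION: each Location is a start/end residue pair, 1-based inclusive.
DOMAINS = [
    ("MSH6_like", 319, 429),
    ("WHSC1_related", 1754, 1848),
    ("AWS", 1891, 1941),
    ("SET", 1942, 2065),
    ("PHD1_NSD1_2", 1545, 1587),
    ("PHD2_NSD1", 1592, 1638),
    ("PHD3_NSD1", 1639, 1692),
    ("PHD4_NSD1", 1709, 1748),
    ("PHD5_NSD1", 2120, 2162),
    ("ING", 1431, 1587),
    ("DNA_pol3_gamma3", 2438, 2663),
    ("Atrophin-1", 2208, 2574),
]


def domain_length(start, end):
    """Number of residues in an inclusive start..end range."""
    return end - start + 1


def overlapping_pairs(domains):
    """Return (name_a, name_b) for every pair of domains whose ranges
    share at least one residue, ordered by start position."""
    ordered = sorted(domains, key=lambda d: d[1])
    pairs = []
    for i, (name_a, start_a, end_a) in enumerate(ordered):
        for name_b, start_b, end_b in ordered[i + 1:]:
            if start_a <= end_b and start_b <= end_a:
                pairs.append((name_a, name_b))
    return pairs


if __name__ == "__main__":
    for name, start, end in sorted(DOMAINS, key=lambda d: d[1]):
        print(f"{name}\t{start}..{end}\t{domain_length(start, end)} aa")
    print(overlapping_pairs(DOMAINS))
```

Read this way, the AWS and SET ranges are adjacent without overlapping, while the ING superfamily hit spans PHD finger 1 and the Atrophin-1 and DNA_pol3_gamma3 hits overlap near the C-terminus.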

RefSeqs of Annotated Genomes: GCF_000001405.40-RS_2023_10

The following sections contain reference sequences that belong to a specific genome build.

Reference GRCh38.p14 Primary Assembly

Genomic

  1. NC_000005.10 Reference GRCh38.p14 Primary Assembly

    Range
    177131798..177300213
    Download
    GenBank, FASTA, Sequence Viewer (Graphics)

Alternate T2T-CHM13v2.0

Genomic

  1. NC_060929.1 Alternate T2T-CHM13v2.0

    Range
    177675008..177843448
    Download
    GenBank, FASTA, Sequence Viewer (Graphics)