SON: SON DNA and RNA binding protein [Homo sapiens (human)]

Gene ID: 6651, updated on 7-Apr-2024

Summary

Official Symbol
SON (provided by HGNC)
Official Full Name
SON DNA and RNA binding protein (provided by HGNC)
Primary source
HGNC:HGNC:11183
See related
Ensembl:ENSG00000159140; MIM:182465; AllianceGenome:HGNC:11183
Gene type
protein coding
RefSeq status
REVIEWED
Organism
Homo sapiens
Lineage
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
Also known as
SON3; BASS1; DBP-5; NREBP; TOKIMS; C21orf50
Summary
This gene encodes a protein that contains multiple simple repeats. The encoded protein binds RNA and promotes pre-mRNA splicing, particularly of transcripts with poor splice sites. The protein also recognizes a specific DNA sequence found in the human hepatitis B virus (HBV) and represses HBV core promoter activity. There is a pseudogene for this gene on chromosome 1. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Jul 2013]
Expression
Ubiquitous expression in bone marrow (RPKM 43.7), lymph node (RPKM 33.4) and 25 other tissues

Genomic context

Location:
21q22.11
Exon count:
15
Annotation release  Status             Assembly                         Chr  Location
RS_2023_10          current            GRCh38.p14 (GCF_000001405.40)    21   NC_000021.9 (33543038..33577481)
RS_2023_10          current            T2T-CHM13v2.0 (GCF_009914755.1)  21   NC_060945.1 (31924877..31959322)
105.20220307        previous assembly  GRCh37.p13 (GCF_000001405.25)    21   NC_000021.8 (34915344..34949787)
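
As a quick consistency check on the coordinates above, the span of each range can be computed. The sketch below assumes the ranges are 1-based and inclusive (the usual GenBank convention); `span_length` and `parse_range` are illustrative helper names, not part of any NCBI tool.

```python
from typing import Tuple

# Coordinate ranges copied from the Location column above.
RANGES = {
    "GRCh38.p14 (NC_000021.9)": "33543038..33577481",
    "T2T-CHM13v2.0 (NC_060945.1)": "31924877..31959322",
    "GRCh37.p13 (NC_000021.8)": "34915344..34949787",
}


def parse_range(text: str) -> Tuple[int, int]:
    """Parse a 'start..end' string into a pair of integers."""
    start, end = text.split("..")
    return int(start), int(end)


def span_length(start: int, end: int) -> int:
    """Length in bp of an interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


for name, text in RANGES.items():
    start, end = parse_range(text)
    print(f"{name}: {span_length(start, end):,} bp")
```

With the coordinates listed here, the GRCh38.p14 and GRCh37.p13 ranges both span 34,444 bp, and the T2T-CHM13v2.0 range spans 34,446 bp.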

Chromosome 21 - NC_000021.9
Genomic context describing neighboring genes:

  • DnaJ heat shock protein family (Hsp40) member C28
  • phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, phosphoribosylaminoimidazole synthetase
  • BRD4-independent group 4 enhancer GRCh37_chr21:34903402-34904601
  • basic transcription factor 3 pseudogene 6
  • ATAC-STARR-seq lymphoblastoid silent region 13260
  • NANOG-H3K27ac-H3K4me1 hESC enhancer GRCh37_chr21:34914753-34915484
  • ATAC-STARR-seq lymphoblastoid silent region 13261
  • microRNA 6501
  • H3K27ac hESC enhancer GRCh37_chr21:34960679-34961178
  • DNA replication fork stabilization factor DONSON
  • crystallin zeta like 1
  • ATAC-STARR-seq lymphoblastoid active region 18382
  • ATAC-STARR-seq lymphoblastoid active region 18383

Genomic regions, transcripts, and products

Expression

  • Project title: HPA RNA-seq normal tissues
  • Description: RNA-seq was performed of tissue samples from 95 human individuals representing 27 different tissues in order to determine tissue-specificity of all protein-coding genes
  • BioProject: PRJEB4337
  • Publication: PMID 24309898
  • Analysis date: Wed Apr 4 07:08:55 2018
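
The expression values on this page are reported in RPKM (reads per kilobase of transcript per million mapped reads). As a reminder of what that unit measures, here is a minimal sketch of the standard formula. The function name and the example counts are hypothetical and are not taken from the HPA dataset, and this page does not describe the exact pipeline that produced the reported values.

```python
def rpkm(gene_reads: int, total_mapped_reads: int, gene_length_bp: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (total mapped reads * transcript length in bp),
    i.e. the read count normalised by sequencing depth (per million reads)
    and by transcript length (per kilobase).
    """
    if total_mapped_reads <= 0 or gene_length_bp <= 0:
        raise ValueError("total_mapped_reads and gene_length_bp must be positive")
    return gene_reads * 1e9 / (total_mapped_reads * gene_length_bp)


# Illustrative numbers only: 1,000 reads on a 2,000 bp transcript
# in a library of 20 million mapped reads.
print(rpkm(gene_reads=1_000, total_mapped_reads=20_000_000, gene_length_bp=2_000))
```

Because the value is normalised by both library size and transcript length, RPKM figures such as those quoted for bone marrow and lymph node can be compared across tissues within the same project.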

Bibliography

GeneRIFs: Gene References Into Functions


Phenotypes

Associated conditions

Description Tests
ZTTK syndrome
MedGen: C4310696 OMIM: 617140 GeneReviews: Not available

Copy number response

  • Triplosensitivity: No evidence available (Last evaluated 2019-04-24); ClinGen Genome Curation Page
  • Haploinsufficiency: Sufficient evidence for dosage pathogenicity (Last evaluated 2019-04-24); ClinGen Genome Curation Page, PubMed

HIV-1 interactions

Protein interactions

  • Protein: Pr55(Gag)
  • Gene: gag
  • Interaction: HIV-1 Gag interacts with SON as demonstrated by proximity dependent biotinylation proteomics
  • Pubs: PubMed

Interactions

General gene information

Markers

Clone Names

  • FLJ21099, FLJ33914, KIAA1019

Gene Ontology Provided by GOA

Function (Evidence Code; Pubs)

  • enables DNA binding: IEA (Inferred from Electronic Annotation)
  • enables RNA binding: HDA; PubMed
  • enables RNA binding: IBA (Inferred from Biological aspect of Ancestor)
  • enables RNA binding: IDA (Inferred from Direct Assay); PubMed
  • enables protein binding: IPI (Inferred from Physical Interaction); PubMed

Component (Evidence Code; Pubs)

  • located_in nuclear speck: IDA (Inferred from Direct Assay); PubMed

General protein information

Preferred Names
protein SON
Names
Bax antagonist selected in Saccharomyces 1
NRE-binding protein
SON DNA binding protein
negative regulatory element-binding protein

NCBI Reference Sequences (RefSeq)

RefSeqs maintained independently of Annotated Genomes

These reference sequences exist independently of genome builds.

These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by comparing the version of the RefSeq in this section to the one reported in Genomic regions, transcripts, and products above.
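
The version comparison described above can be sketched as follows. This is a minimal illustration, assuming accessions of the form `PREFIX_digits.version` such as `NM_138927.4`; the helper names are hypothetical, and the `annotated` list is invented for the example because this page does not list the versions used in the genome build.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Matches versioned RefSeq-style accessions such as NM_138927.4 or NR_103797.2.
_ACCESSION = re.compile(r"^([A-Z]{2}_\d+)\.(\d+)$")


def split_accession(accession: str) -> Tuple[str, int]:
    """Split 'NM_138927.4' into ('NM_138927', 4)."""
    match = _ACCESSION.match(accession.strip())
    if match is None:
        raise ValueError(f"not a versioned RefSeq accession: {accession!r}")
    return match.group(1), int(match.group(2))


def version_mismatches(
    curated: Iterable[str], annotated: Iterable[str]
) -> List[Tuple[str, int, int]]:
    """Return (accession, curated version, annotated version) for each mismatch."""
    annotated_versions: Dict[str, int] = dict(split_accession(a) for a in annotated)
    mismatches = []
    for accession in curated:
        base, version = split_accession(accession)
        other = annotated_versions.get(base)
        if other is not None and other != version:
            mismatches.append((base, version, other))
    return mismatches


curated = ["NM_138927.4", "NM_032195.3"]    # versions shown in this section
annotated = ["NM_138927.4", "NM_032195.2"]  # hypothetical genome-build versions
print(version_mismatches(curated, annotated))
```

Accessions that appear in only one of the two lists are ignored here; a fuller check might report those separately.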

Genomic

  1. NG_052981.2 RefSeqGene

    Range
    5002..39445

mRNA and Protein(s)

  1. NM_001291411.2 → NP_001278340.2  protein SON isoform E

    Status: REVIEWED

    Description
    Transcript Variant: This variant (e) lacks multiple 3' coding exons and contains an alternate 3' exon, resulting in a distinct 3' coding region and 3' UTR, compared to variant f. It encodes isoform E which is shorter and has a distinct C-terminus, compared to isoform F.
    Source sequence(s)
    AP000303
    Consensus CDS
    CCDS74784.1
    Related
    ENSP00000371095.4, ENST00000381679.8
    Conserved Domains (4) summary
    PHA03247
    Location: 170..460
    PHA03247; large tegument protein UL36; Provisional
    PHA03379
    Location: 340..673
    PHA03379; EBNA-3A; Provisional
    PRK10811
    Location: 1289..1481
    rne; ribonuclease E; Reviewed
    NF000535
    Location: 730..903
    MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
  2. NM_001291412.3 → NP_001278341.1  protein SON isoform H

    Status: REVIEWED

    Description
    Transcript Variant: This variant (h) represents the allele encoded by the GRCh38 reference genome and encodes isoform (H).
    Source sequence(s)
    AP000303, AP000304
    Consensus CDS
    CCDS77624.1
    UniProtKB/TrEMBL
    A0A994J4Y9, J3QSZ5
    Related
    ENSP00000371111.2, ENST00000381692.6
    Conserved Domains (2) summary
    pfam01585
    Location: 333..376
    G-patch; G-patch domain
    cl00054
    Location: 398..441
    DSRM; Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, ...
  3. NM_001412132.1 → NP_001399061.1  protein SON isoform I

    Status: REVIEWED

    Description
    Transcript Variant: This variant (i) uses the same exon combination as variant h but represents the allele encoded by the T2T genome assembly. The encoded isoform (I) has a slightly different sequence in the C-terminal region compared to isoform H.
    Source sequence(s)
    CP068257
    UniProtKB/TrEMBL
    Q6ZRV7
  4. NM_001412133.1 → NP_001399062.1  protein SON isoform J

    Status: REVIEWED

    Description
    Transcript Variant: This variant (j) uses the same exon combination as variant f but represents the allele encoded by the T2T genome assembly. The encoded isoform (J) has a slightly different sequence in the C-terminal region compared to isoform F.
    Source sequence(s)
    CP068257
  5. NM_032195.3 → NP_115571.3  protein SON isoform B

    Status: REVIEWED

    Description
    Transcript Variant: This variant (b) lacks multiple 3' coding exons and contains an alternate 3' exon, resulting in a distinct 3' coding region and 3' UTR, compared to variant f. The encoded isoform (B) is shorter and has a distinct C-terminus, compared to isoform F.
    Source sequence(s)
    AP000303, AP000304
    Consensus CDS
    CCDS13631.1
    Related
    ENSP00000300278.2, ENST00000300278.8
    Conserved Domains (4) summary
    PHA03247
    Location: 170..460
    PHA03247; large tegument protein UL36; Provisional
    PHA03379
    Location: 340..673
    PHA03379; EBNA-3A; Provisional
    PRK10811
    Location: 1289..1481
    rne; ribonuclease E; Reviewed
    NF000535
    Location: 730..903
    MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
  6. NM_138927.4 → NP_620305.3  protein SON isoform F

    Status: REVIEWED

    Description
    Transcript Variant: This variant (f) represents the allele encoded by the GRCh38 reference genome and encodes isoform (F).
    Source sequence(s)
    AF380184, AK307612, AP000303
    Consensus CDS
    CCDS13629.1
    UniProtKB/Swiss-Prot
    D3DSF5, D3DSF6, E7ETE8, E7EU67, E7EVW3, E9PFQ2, O14487, O95981, P18583, Q14120, Q6PKE0, Q9H7B1, Q9P070, Q9P072, Q9UKP9, Q9UPY0
    Related
    ENSP00000348984.4, ENST00000356577.10
    Conserved Domains (6) summary
    PHA03247
    Location: 170..460
    PHA03247; large tegument protein UL36; Provisional
    PHA03379
    Location: 340..673
    PHA03379; EBNA-3A; Provisional
    PRK10811
    Location: 1289..1481
    rne; ribonuclease E; Reviewed
    NF000535
    Location: 730..903
    MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
    pfam01585
    Location: 2305..2349
    G-patch; G-patch domain
    cl00054
    Location: 2369..2419
    DSRM_SF; double-stranded RNA binding motif (DSRM) superfamily

RNA

  1. NR_103797.2 RNA Sequence

    Status: REVIEWED

    Description
    Transcript Variant: This variant (c) contains an alternate internal exon and uses an alternate splice site at the 3' exon, compared to variant f. This variant is represented as non-coding because the use of the 5'-most expected translational start codon, as used in variant f, renders the transcript a candidate for nonsense-mediated mRNA decay (NMD).
    Source sequence(s)
    AP000303, AP000304
    Related
    ENST00000455528.5

RefSeqs of Annotated Genomes: GCF_000001405.40-RS_2023_10

The following sections contain reference sequences that belong to a specific genome build.

Reference GRCh38.p14 Primary Assembly

Genomic

  1. NC_000021.9 Reference GRCh38.p14 Primary Assembly

    Range
    33543038..33577481

Alternate T2T-CHM13v2.0

Genomic

  1. NC_060945.1 Alternate T2T-CHM13v2.0

    Range
    31924877..31959322

Suppressed Reference Sequence(s)

The following Reference Sequences have been suppressed.

  1. NM_003103.5: Suppressed sequence

    Description
    NM_003103.5: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.
  2. NM_138925.1: Suppressed sequence

    Description
    NM_138925.1: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.
  3. NR_103796.1: Suppressed sequence

    Description
    NR_103796.1: This RefSeq was permanently suppressed because it is now thought that this transcript variant does encode a protein.
  4. NR_103798.1: Suppressed sequence

    Description
    NR_103798.1: This RefSeq was temporarily suppressed because currently there is not sufficient data to support this transcript.