SUMF2 sulfatase modifying factor 2 [ Homo sapiens (human) ]

Gene ID: 25870, updated on 3-Apr-2024

Summary

Official Symbol
SUMF2 (provided by HGNC)
Official Full Name
sulfatase modifying factor 2 (provided by HGNC)
Primary source
HGNC:HGNC:20415
See related
Ensembl:ENSG00000129103; MIM:607940; AllianceGenome:HGNC:20415
Gene type
protein coding
RefSeq status
REVIEWED
Organism
Homo sapiens
Lineage
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
Also known as
pFGE
Summary
The catalytic sites of sulfatases are only active if they contain a unique amino acid, C-alpha-formylglycine (FGly). The FGly residue is posttranslationally generated from a cysteine by enzymes with FGly-generating activity. The gene described in this record is a member of the sulfatase-modifying factor family and encodes a protein with a DUF323 domain that localizes to the lumen of the endoplasmic reticulum. This protein has low levels of FGly-generating activity but can heterodimerize with another family member - a protein with high levels of FGly-generating activity. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. [provided by RefSeq, Jul 2008]
Expression
Ubiquitous expression in thyroid (RPKM 45.6), kidney (RPKM 41.8) and 25 other tissues

Genomic context

Location:
7p11.2
Exon count:
12
Annotation release | Status | Assembly | Chr | Location
RS_2023_10 | current | GRCh38.p14 (GCF_000001405.40) | 7 | NC_000007.14 (56064286..56087946)
RS_2023_10 | current | T2T-CHM13v2.0 (GCF_009914755.1) | 7 | NC_060931.1 (56224110..56247761)
105.20220307 | previous assembly | GRCh37.p13 (GCF_000001405.25) | 7 | NC_000007.13 (56131979..56148363)
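
The spans covered by the locations above can be compared with a small calculation. The following is a minimal sketch, assuming the `start..end` ranges are 1-based and inclusive; the helper name `span_length` is hypothetical and not part of any NCBI tool.

```python
def span_length(start, end):
    """Length in bases of a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Ranges copied from the Location column of this record.
ranges = {
    "GRCh38.p14": (56064286, 56087946),
    "T2T-CHM13v2.0": (56224110, 56247761),
    "GRCh37.p13": (56131979, 56148363),
}

lengths = {assembly: span_length(*coords) for assembly, coords in ranges.items()}
for assembly, length in lengths.items():
    print(assembly, length)
```

Under that assumption, the span annotated on the previous assembly (GRCh37.p13) comes out shorter than the spans on the two current assemblies, which may reflect a difference in annotation rather than in the underlying sequence.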

Chromosome 7 - NC_000007.14

Neighboring genes:

  • chaperonin containing TCP1 subunit 6A
  • small nucleolar RNA, H/ACA box 22B
  • Sharpr-MPRA regulatory region 2601
  • ATAC-STARR-seq lymphoblastoid silent region 18195
  • small nucleolar RNA, H/ACA box 15
  • ATAC-STARR-seq lymphoblastoid active region 26050
  • Sharpr-MPRA regulatory region 7863
  • ATAC-STARR-seq lymphoblastoid active region 26051
  • ATAC-STARR-seq lymphoblastoid active region 26052
  • phosphorylase kinase catalytic subunit gamma 1
  • uncharacterized LOC124901852
  • H3K4me1 hESC enhancer GRCh37_chr7:56171568-56172068
  • H3K4me1 hESC enhancer GRCh37_chr7:56172069-56172569
  • ATAC-STARR-seq lymphoblastoid active region 26053
  • coiled-coil-helix-coiled-coil-helix domain containing 2

Genomic regions, transcripts, and products

Expression

  • Project title: HPA RNA-seq normal tissues
  • Description: RNA-seq was performed on tissue samples from 95 human individuals, representing 27 different tissues, to determine the tissue specificity of all protein-coding genes
  • BioProject: PRJEB4337
  • Publication: PMID 24309898
  • Analysis date: Wed Apr 4 07:08:55 2018

Bibliography

GeneRIFs: Gene References Into Functions

Phenotypes

EBI GWAS Catalog

Description
Genome wide association study (GWAS) of Chagas cardiomyopathy in Trypanosoma cruzi seropositive subjects.

HIV-1 interactions

Protein interactions

Protein | Gene | Interaction | Pubs
Envelope surface glycoprotein gp120 | env | HIV-1 gp120 is identified to have a physical interaction with sulfatase modifying factor 2 (SUMF2) in human HEK293 and/or Jurkat cell lines by using affinity tagging and purification mass spectrometry analyses | PubMed
Envelope surface glycoprotein gp160, precursor | env | HIV-1 gp160 interacts with SUMF2; predicted interaction to be within the endoplasmic reticulum | PubMed

General gene information

Clone Names

  • MGC99485, DKFZp566I1024, DKFZp686I1024, DKFZp781L1035, DKFZp686L17160

Gene Ontology Provided by GOA

Function | Evidence Code | Pubs
enables identical protein binding | IPI (Inferred from Physical Interaction) | PubMed
enables metal ion binding | IEA (Inferred from Electronic Annotation) |
enables protein binding | IPI (Inferred from Physical Interaction) | PubMed

Component | Evidence Code | Pubs
is_active_in endoplasmic reticulum | IBA (Inferred from Biological aspect of Ancestor) |
located_in endoplasmic reticulum | IDA (Inferred from Direct Assay) | PubMed
located_in endoplasmic reticulum lumen | TAS (Traceable Author Statement) |

General protein information

Preferred Names
inactive C-alpha-formylglycine-generating enzyme 2
Names
C-alpha-formyglycine-generating enzyme 2
C-alpha-formylglycine-generating enzyme 2
epididymis secretory sperm binding protein
paralog of formylglycine-generating enzyme
paralog of the formylglycine-generating enzyme

NCBI Reference Sequences (RefSeq)

RefSeqs maintained independently of Annotated Genomes

These reference sequences exist independently of genome builds.

These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by comparing the version of the RefSeq in this section to the one reported in Genomic regions, transcripts, and products above.
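
The version comparison described above can be sketched in a few lines. This is a minimal illustration, assuming identifiers follow the `accession.version` pattern used throughout this record (for example NM_015411.4); the function names are hypothetical, and the build-side versions in the example are invented for illustration, not taken from this record.

```python
import re

# Two-letter prefix, underscore, digits, then ".version" (e.g. NM_015411.4).
_REFSEQ = re.compile(r"^([A-Z]{2}_\d+)\.(\d+)$")


def split_refseq(identifier):
    """Return (accession, version) for a versioned RefSeq identifier."""
    match = _REFSEQ.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a versioned RefSeq identifier: {identifier!r}")
    return match.group(1), int(match.group(2))


def version_mismatches(curated, annotated):
    """List (accession, curated_version, annotated_version) where versions differ."""
    annotated_versions = dict(split_refseq(i) for i in annotated)
    mismatches = []
    for identifier in curated:
        accession, version = split_refseq(identifier)
        other = annotated_versions.get(accession)
        # Accessions missing from the build are skipped, not reported.
        if other is not None and other != version:
            mismatches.append((accession, version, other))
    return mismatches


curated = ["NM_015411.4", "NM_001042468.3"]
# Hypothetical build-side versions, invented purely for illustration.
annotated = ["NM_015411.3", "NM_001042468.3"]
print(version_mismatches(curated, annotated))
```

Only accessions present in both lists are compared, so a transcript absent from the genome build is silently ignored; a stricter check might report those separately.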

mRNA and Protein(s)

  1. NM_001042468.3 → NP_001035933.3  inactive C-alpha-formylglycine-generating enzyme 2 isoform a precursor

    Status: REVIEWED

    Source sequence(s)
    BC084539, BP256067, BP296294, CR936757, DA130450, KU178583
    UniProtKB/TrEMBL
    B4DLK7
    Conserved Domains (1) summary
    pfam03781
    Location: 26 → 295
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  2. NM_001042469.3 → NP_001035934.3  inactive C-alpha-formylglycine-generating enzyme 2 isoform c precursor

    Status: REVIEWED

    Description
    Transcript Variant: This variant (3) lacks an alternate in-frame exon in the mid coding region, compared to variant 2, resulting in a shorter protein (isoform c), compared to isoform b.
    Source sequence(s)
    BC084539, BP296294, CR936749, CR936757, KU178583
    Consensus CDS
    CCDS43588.3
    UniProtKB/TrEMBL
    A8MXB9, B4DLK7
    Related
    ENSP00000498981.1, ENST00000651586.1
    Conserved Domains (1) summary
    pfam03781
    Location: 26 → 277
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  3. NM_001042470.3 → NP_001035935.3  inactive C-alpha-formylglycine-generating enzyme 2 isoform d precursor

    Status: REVIEWED

    Description
    Transcript Variant: This variant (4) lacks three alternate exons compared to variant 2, which removes an in-frame segment from the mid coding region. The resulting isoform (d) is shorter than isoform b.
    Source sequence(s)
    BC084539, BP296294, CR936757, DA917807, KU178583
    Consensus CDS
    CCDS43589.3
    UniProtKB/TrEMBL
    E9PBT8
    Related
    ENSP00000498997.1, ENST00000650735.1
    Conserved Domains (1) summary
    cl23855
    Location: 26 → 208
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  4. NM_001130069.4 → NP_001123541.2  inactive C-alpha-formylglycine-generating enzyme 2 isoform e precursor

    Status: REVIEWED

    Description
    Transcript Variant: This variant (5) lacks the penultimate coding exon, compared to variant 2, resulting in a frameshift and a protein (isoform e) with a longer and distinct C-terminus when compared to isoform b.
    Source sequence(s)
    BC000224, BC015600, DA924174, KU178583
    Consensus CDS
    CCDS47589.2
    UniProtKB/TrEMBL
    F8WA42
    Related
    ENSP00000341938.7, ENST00000342190.11
    Conserved Domains (1) summary
    pfam03781
    Location: 26 → 226
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  5. NM_001146333.3 → NP_001139805.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform g

    Status: REVIEWED

    Description
    Transcript Variant: This variant (7) differs in the 5' UTR, lacks a portion of the 5' coding region, and initiates translation at a downstream start codon, compared to variant 2. The encoded isoform (g) is shorter than isoform b.
    Source sequence(s)
    BC000224, BC065222, DA924174
    Consensus CDS
    CCDS55111.1
    Related
    ENSP00000275607.9, ENST00000275607.13
    Conserved Domains (2) summary
    COG1262
    Location: 8 → 207
    YfmG; Formylglycine-generating enzyme, required for sulfatase activity, contains SUMF1/FGE domain [Posttranslational modification, protein turnover, chaperones]
    pfam03781
    Location: 8 → 204
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  6. NM_001366647.2 → NP_001353576.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform h precursor

    Status: REVIEWED

    Source sequence(s)
    AC092101, BP357420, CV379204, DA581576
    Conserved Domains (1) summary
    pfam03781
    Location: 26 → 259
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  7. NM_001366648.2 → NP_001353577.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform i precursor

    Status: REVIEWED

    Source sequence(s)
    AC092101, BQ431971, DB260494, DB294596
    Consensus CDS
    CCDS94107.1
    UniProtKB/TrEMBL
    C9J660
    Related
    ENSP00000406445.1, ENST00000413756.5
    Conserved Domains (1) summary
    pfam03781
    Location: 26 → 274
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1
  8. NM_001366649.2 → NP_001353578.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform j

    Status: REVIEWED

    Source sequence(s)
    AC092101, AL528271, DA908093
  9. NM_015411.4 → NP_056226.3  inactive C-alpha-formylglycine-generating enzyme 2 isoform b precursor

    Status: REVIEWED

    Description
    Transcript Variant: This variant (2) represents the longest transcript and encodes isoform b.
    Source sequence(s)
    AK075477, BC084539, BP296294, CR936757, KU178583
    Consensus CDS
    CCDS5524.3
    UniProtKB/Swiss-Prot
    B4DU41, B4DWQ0, Q14DW5, Q53ZE3, Q8NBJ7, Q96BH2, Q9BRN3, Q9BWI1, Q9Y405
    UniProtKB/TrEMBL
    B4DLK7
    Related
    ENSP00000400922.3, ENST00000434526.8
    Conserved Domains (1) summary
    pfam03781
    Location: 26 → 292
    FGE-sulfatase; Sulfatase-modifying factor enzyme 1

RefSeqs of Annotated Genomes: GCF_000001405.40-RS_2023_10

The following sections contain reference sequences that belong to a specific genome build.

Reference GRCh38.p14 Primary Assembly

Genomic

  1. NC_000007.14 Reference GRCh38.p14 Primary Assembly

    Range
    56064286..56087946

mRNA and Protein(s)

  1. XM_047420120.1 → XP_047276076.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X1

    Related
    ENSP00000414434.3, ENST00000413952.7
  2. XM_047420122.1 → XP_047276078.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X2

  3. XM_011515254.4 → XP_011513556.2  inactive C-alpha-formylglycine-generating enzyme 2 isoform X1

  4. XM_047420121.1 → XP_047276077.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X2

  5. XM_047420123.1 → XP_047276079.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X3

Alternate T2T-CHM13v2.0

Genomic

  1. NC_060931.1 Alternate T2T-CHM13v2.0

    Range
    56224110..56247761

mRNA and Protein(s)

  1. XM_054357769.1 → XP_054213744.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X1

  2. XM_054357771.1 → XP_054213746.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X2

  3. XM_054357768.1 → XP_054213743.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X1

  4. XM_054357770.1 → XP_054213745.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X2

  5. XM_054357772.1 → XP_054213747.1  inactive C-alpha-formylglycine-generating enzyme 2 isoform X3

Suppressed Reference Sequence(s)

The following Reference Sequences have been suppressed.

  1. NM_001130070.1: Suppressed sequence

    Description
    NM_001130070.1: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.