Conserved domains on [gi|530369258|ref|XP_005263754|]

pre-mRNA 3' end processing protein WDR33 isoform X2 [Homo sapiens]

Protein Classification


List of domain hits

Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
121-402 9.08e-67

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.



Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 226.83  E-value: 9.08e-67
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:cd00200    11 GVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKTIRLWDLETGEcVRTLT 90
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:cd00200    91 GHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPDGTFVASSSQDGT--IKLWDLRTGK 168
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:cd00200   169 CVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLST-GKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVW 246
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 530369258  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200   247 DLRTGECVQTLS-GHTNSVTSLAWSPDGKRLASGSADGTIRIW 288
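
The WD40 description above mentions a GH dipeptide near the start of each repeat and a WD dipeptide at its end. As a rough illustration only, the following Python sketch scans a sequence for GH...WD spans. The function name `find_gh_wd` and the allowed gap of 20 to 40 residues between the two dipeptides are assumptions made for this example; they are not part of CD-Search, which scores hits with position-specific scoring matrices rather than a pattern like this. The test string is the first Cdd:cd00200 row of the alignment above (model positions 11-90).

```python
import re

# Assumed, deliberately loose pattern: "GH", then 20-40 residues, then "WD".
# The gap bounds are illustrative and are not taken from the CDD model.
GH_WD_RE = re.compile(r"GH(.{20,40}?)WD")


def find_gh_wd(seq):
    """Return (start_index, gap_length) for every GH...WD span in seq.

    Alignment gap characters ('-') are removed and lowercase (unaligned)
    residues are upper-cased before scanning.
    """
    cleaned = seq.replace("-", "").upper()
    return [(m.start(), len(m.group(1))) for m in GH_WD_RE.finditer(cleaned)]


# First Cdd:cd00200 row of the alignment (model positions 11-90).
CD00200_ROW_1 = (
    "GVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKTIRLWDLETGEcVRTLT"
)

for start, gap in find_gh_wd(CD00200_ROW_1):
    print(f"GH at index {start}, {gap} residues before WD")
```

The query rows of the same alignment do not always carry literal GH and WD dipeptides, so a scan like this will miss repeats that the profile-based search still detects.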
gly_rich_SclB super family cl45768
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
618-864 3.35e-10

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


The actual alignment was detected with superfamily member NF038329:

Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 63.77  E-value: 3.35e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  618 GPPGPQGQFRPPGPQGQMGPQGPPlhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQrhpgphgplgpqgppgpqg 697
Cdd:NF038329  123 GPAGPAGPAGEQGPRGDRGETGPA-------GPAGPPGPQGERGEKGPAGPQGEAGPQGPA------------------- 176
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  698 ssgpqghmgpqgppgpqghigpqgppGPQghlGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPLM 777
Cdd:NF038329  177 --------------------------GKD---GEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  778 GLNPRGMQGPPGPRENQGPApqgmimghppqemrGPHPPGGLlgHGPQEMRGPQEIRGMQGPPPQGSMLGPPqelrGPPG 857
Cdd:NF038329  228 GPAGDGQQGPDGDPGPTGED--------------GPQGPDGP--AGKDGPRGDRGEAGPDGPDGKDGERGPV----GPAG 287

                  ....*..
gi 530369258  858 SQSQQGP 864
Cdd:NF038329  288 KDGQNGK 294
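
The hit above is driven by collagen-like Gly-Xaa-Xaa repeats, which also explains why a glycine- and proline-rich region can match a bacterial adhesin model. A minimal sketch of how such repeats could be counted is given below. The helper `longest_gxx_run` is hypothetical and written only for this example; the input is the first 24 aligned query residues (618-641) from the alignment above.

```python
def longest_gxx_run(seq):
    """Return (start_index, n_triplets) of the longest run of consecutive
    Gly-Xaa-Xaa triplets in seq. On ties the earliest run is returned;
    (-1, 0) is returned if seq contains no complete G-X-X triplet."""
    seq = seq.upper()
    best_start, best_len = -1, 0
    for start in range(len(seq)):
        n = 0
        # Extend the run while a full triplet fits and begins with glycine.
        while start + 3 * n + 3 <= len(seq) and seq[start + 3 * n] == "G":
            n += 1
        if n > best_len:
            best_start, best_len = start, n
    return best_start, best_len


# Query residues 618-641, copied from the first alignment row.
QUERY_618_641 = "GPPGPQGQFRPPGPQGQMGPQGPP"

start, n_triplets = longest_gxx_run(QUERY_618_641)
print(f"longest run: {n_triplets} triplets starting at residue {618 + start}")
```

The quadratic scan is kept for readability; it is adequate for protein-length inputs.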
PABP-1234 super family cl31127
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-676 1.68e-05

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are respectively expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


The actual alignment was detected with superfamily member TIGR01628:

Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 49.03  E-value: 1.68e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   556 LEQLKIERLAQKQVE--QIQPP---PSSGTPLLG---PQPFPGQGPMSQIPQ-----GFQQPHPSQQMPMnmaqmGPPGP 622
Cdd:TIGR01628  360 LAQRKEQRRAHLQDQfmQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGPG-----GPLRP 434
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 530369258   623 QGqFRPPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 676
Cdd:TIGR01628  435 NG-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
PAT1 super family cl37801
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
796-946 3.71e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


The actual alignment was detected with superfamily member pfam09770:

Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 48.11  E-value: 3.71e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   796 PAPQGMIMGHPPQEmrGPHPPGGLLGHGPQEMRGPQEIRGMQGPPPQGSMLG-PPQELRGPPGSQSQQGPPQGSlgpppq 874
Cdd:pfam09770  211 AQQPAPAPAQPPAA--PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGhPVTILQRPQSPQPDPAQPSIQ------ 282
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 530369258   875 ggmqgppgpqGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQGAQGRIPPL 946
Cdd:pfam09770  283 ----------PQAQQFHQQPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQGSFGRQAPI 345
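
Each hit in the list above carries a summary line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The sketch below shows one way such a line could be parsed, and how the fraction of the domain model covered by an alignment could be estimated from the Cdd start and end positions. The names `parse_summary` and `model_coverage` are hypothetical, and the regular expression assumes the exact field order seen in this page text; it is not an official NCBI format specification.

```python
import re

# Assumes the field order and labels exactly as they appear in this page text.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<hit_type>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one hit summary line into a dict of typed fields."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(m.group("pssm_id")),
        "hit_type": m.group("hit_type"),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def model_coverage(cd_from, cd_to, cd_length):
    """Fraction of the domain model spanned by the alignment (inclusive ends)."""
    return (cd_to - cd_from + 1) / cd_length


line = "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 226.83  E-value: 9.08e-67"
hit = parse_summary(line)
# The cd00200 alignment spans model positions 11-288 of 289.
print(hit, round(model_coverage(11, 288, hit["cd_length"]), 3))
```

Coverage computed this way counts the span between the first and last aligned model positions and ignores internal gaps.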
 
Name Accession Description Interval E-value
WD40 COG2319
WD40 repeat [General function prediction only];
121-402 1.10e-57

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 204.76  E-value: 1.10e-57
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:COG2319   122 AVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGKlLRTLT 201
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   202 GHTGAVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGT--VRLWDLATGE 279
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:COG2319   280 LLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLAT-GKLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLW 357
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 530369258  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   358 DLATGELLRTLT-GHTGAVTSVAFSPDGRTLASGSADGTVRLW 399
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
618-864 3.35e-10

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 63.77  E-value: 3.35e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  618 GPPGPQGQFRPPGPQGQMGPQGPPlhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQrhpgphgplgpqgppgpqg 697
Cdd:NF038329  123 GPAGPAGPAGEQGPRGDRGETGPA-------GPAGPPGPQGERGEKGPAGPQGEAGPQGPA------------------- 176
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  698 ssgpqghmgpqgppgpqghigpqgppGPQghlGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPLM 777
Cdd:NF038329  177 --------------------------GKD---GEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  778 GLNPRGMQGPPGPRENQGPApqgmimghppqemrGPHPPGGLlgHGPQEMRGPQEIRGMQGPPPQGSMLGPPqelrGPPG 857
Cdd:NF038329  228 GPAGDGQQGPDGDPGPTGED--------------GPQGPDGP--AGKDGPRGDRGEAGPDGPDGKDGERGPV----GPAG 287

                  ....*..
gi 530369258  858 SQSQQGP 864
Cdd:NF038329  288 KDGQNGK 294
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
191-230 6.08e-09

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 52.70  E-value: 6.08e-09
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 530369258    191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
PTZ00421 PTZ00421
coronin; Provisional
205-319 1.33e-08

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 58.75  E-value: 1.33e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421   78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 530369258  273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421  153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
WD40 pfam00400
WD domain, G-beta repeat;
195-230 6.40e-08

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 49.65  E-value: 6.40e-08
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 530369258   195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400    4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
564-932 2.85e-07

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 54.63  E-value: 2.85e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   564 LAQKQVEQIQPPPSSGT--PLLGPQPFPGQGPMsqiPQGFQQPHPSQQMPMNmAQMGPPGPQG--QFRPPGPQGQMGPQG 639
Cdd:pfam09606   63 PQGGQGNGGMGGGQQGMpdPINALQNLAGQGTR---PQMMGPMGPGPGGPMG-QQMGGPGTASnlLASLGRPQMPMGGAG 138
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   640 PPlHQGGGGPQGFMGPQGPQGPPQGLPRPqdmhGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGP 719
Cdd:pfam09606  139 FP-SQMSRVGRMQPGGQAGGMMQPSSGQP----GSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGMPGPA 213
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   720 QGPPGPQGHLGPQGPPGTQGMQGPPGPRGMQGPPHPHGI----------------QGGPGSQGIQGPVSQGPLMGLNPRG 783
Cdd:pfam09606  214 DAGAQMGQQAQANGGMNPQQMGGAPNQVAMQQQQPQQQGqqsqlgmginqmqqmpQGVGGGAGQGGPGQPMGPPGQQPGA 293
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   784 M------QGPPGPRENQGPAPQ----GMIMGHPPQEMRGPHPPGGLLGHGPQEM---------RGPQEIRGMQGPPPQgs 844
Cdd:pfam09606  294 MpnvmsiGDQNNYQQQQTRQQQqqqgGNHPAAHQQQMNQSVGQGGQVVALGGLNhletwnpgnFGGLGANPMQRGQPG-- 371
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   845 MLGPPQELRG-------PPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAP 917
Cdd:pfam09606  372 MMSSPSPVPGqqvrqvtPNQFMRQSPQPSVPSPQGPGSQPPQSHPGGMIPSPALIPSPSPQMSQQPAQQRTIGQDSPGGS 451
                          410
                   ....*....|....*
gi 530369258   918 FNQEGQSTGPPPLIP 932
Cdd:pfam09606  452 LNTPGQSAVNSPLNP 466
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
618-963 4.82e-07

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 53.88  E-value: 4.82e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  618 GPPGPQGQFRPPGPQGQMGPqgpplhqggggpqgfmgpqgpqgppqglPRPQDMHGPQGMQrhpgphgplgpqgppgpqG 697
Cdd:COG5164    10 GPSDPGGVTTPAGSQGSTKP----------------------------AQNQGSTRPAGNT------------------G 43
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  698 SSGPQGHMgpqgppgpqghigpqgppgpqghlGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPlm 777
Cdd:COG5164    44 GTRPAQNQ------------------------GSTTPAGNTGGTRPAGNQGATGPAQNQGGTTPAQNQGGTRPAGNTG-- 97
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  778 GLNPRGMQG---PPGPRENQGPAPQGMIMGHPPQEMRGPHPPGGLLGHGPqemrGPQEIRGMQGPPPQGSMLGPPQE--L 852
Cdd:COG5164    98 GTTPAGDGGatgPPDDGGATGPPDDGGSTTPPSGGSTTPPGDGGSTPPGP----GSTGPGGSTTPPGDGGSTTPPGPggS 173
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  853 RGPPGSQSQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQSTGPPPLIP 932
Cdd:COG5164   174 TTPPDDGGSTTPPNKG---------------------ETGTDIPTGGTPRQGPDGPVKKDDKNGKGNPPDDRGGKTGPKD 232
                         330       340       350
                  ....*....|....*....|....*....|...
gi 530369258  933 GLGQQGAQGRIPPLNPGQGPGPNKV--SEEEPR 963
Cdd:COG5164   233 QRPKTNPIERRGPERPEAAALPAELtaLEAENR 265
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-676 1.68e-05

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are respectively expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 49.03  E-value: 1.68e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   556 LEQLKIERLAQKQVE--QIQPP---PSSGTPLLG---PQPFPGQGPMSQIPQ-----GFQQPHPSQQMPMnmaqmGPPGP 622
Cdd:TIGR01628  360 LAQRKEQRRAHLQDQfmQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGPG-----GPLRP 434
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 530369258   623 QGqFRPPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 676
Cdd:TIGR01628  435 NG-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
PHA03247 PHA03247
large tegument protein UL36; Provisional
731-972 3.46e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 3.46e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  731 PQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPLMGLNPRGMQGPPGPRENQGPAPQGMI------MG 804
Cdd:PHA03247 2757 PARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLppptsaQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  805 HPPQEMRGPHPP-----GGLLGHGPQEMRGPQeirGMQGPPPQGSMLGPPQELRGPPGSQSQQGPPQgslgppPQGGMQG 879
Cdd:PHA03247 2837 TAPPPPPGPPPPslplgGSVAPGGDVRRRPPS---RSPAAKPAAPARPPVRRLARPAVSRSTESFAL------PPDQPER 2907
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  880 PPGPQGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQSTGPPPLIPglgqqgaQGRIPPLNPGQGPGP-NKVS 958
Cdd:PHA03247 2908 PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVP-------QPWLGALVPGRVAVPrFRVP 2980
                         250
                  ....*....|....
gi 530369258  959 EEEPRRGMRAVLPP 972
Cdd:PHA03247 2981 QPAPSREAPASSTP 2994
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
796-946 3.71e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 48.11  E-value: 3.71e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   796 PAPQGMIMGHPPQEmrGPHPPGGLLGHGPQEMRGPQEIRGMQGPPPQGSMLG-PPQELRGPPGSQSQQGPPQGSlgpppq 874
Cdd:pfam09770  211 AQQPAPAPAQPPAA--PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGhPVTILQRPQSPQPDPAQPSIQ------ 282
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 530369258   875 ggmqgppgpqGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQGAQGRIPPL 946
Cdd:pfam09770  283 ----------PQAQQFHQQPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQGSFGRQAPI 345
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
509-629 1.76e-04

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 45.58  E-value: 1.76e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  509 MQNKVPIPAPNEV-----------LNDRKEDIKLEEKKKTQAEIEQEMATLQ-----YTNPQLLEQLKIERLAQ------ 566
Cdd:PRK13729   38 MSGNGEAVAEQEPvpdmtgvvdttFDDKVRQHATTEMQVTAAQMQKQYEEIRreldvLNKQRGDDQRRIEKLGQdnaala 117
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 530369258  567 KQVEQI---------QPPPSSGTPLLGPQ--PFPGQGPMSQIPQGFQQPHPSQQMPMNMAQMGPPGPQGQFRPP 629
Cdd:PRK13729  118 EQVKALganpvtatgEPVPQMPASPPGPEgePQPGNTPVSFPPQGSVAVPPPTAFYPGNGVTPPPQVTYQSVPV 191
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
744-989 1.60e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.67  E-value: 1.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  744 PGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPlmglnprgmQGPPGPRENQGPAPQGMIMGHPPQEMRGPHPPGGLLGHG 823
Cdd:PRK07764  590 PAPGAAGGEGPPAPASSGPPEEAARPAAPAAP---------AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVP 660
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  824 PQEMRGPQEIRGMQGPPPQGSMLGPPqelrgPPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 903
Cdd:PRK07764  661 DASDGGDGWPAKAGGAAPAAPPPAPA-----PAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAA 735
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  904 QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQgriPPLNPGQGPGPNKVSEEEPRRGMRAVLPPEEGMVFLVLKT 983
Cdd:PRK07764  736 DDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAA---APPPSPPSEEEEMAEDDAPSMDDEDRRDAEEVAMELLEEE 812

                  ....*.
gi 530369258  984 LVQRRI 989
Cdd:PRK07764  813 LGAKKI 818
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
533-643 1.99e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 42.33  E-value: 1.99e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   533 EKKKTQAEIEQEMATLQYTNPQL------LEQLKIERLAQKQVEQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGFQQPHP 606
Cdd:pfam09770  167 PKKAAAPAPAPQPAAQPASLPAPsrkmmsLEEVEAAMRAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQP 246
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 530369258   607 sQQMPMNMAQMGPPGP-------QGQFRPPGPQGQMGPQGPPLH 643
Cdd:pfam09770  247 -QQQPQQPQQHPGQGHpvtilqrPQSPQPDPAQPSIQPQAQQFH 289
 
Name Accession Description Interval E-value
WD40 COG2319
WD40 repeat [General function prediction only];
121-402 1.86e-52

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 189.74  E-value: 1.86e-52
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319    38 AVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGHTAAVLSVAFSPDGRLLASASADGTVRLWDlATGLLLRTLT 117
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   118 GHTGAVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGT--VRLWDLATGK 195
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLF 358
Cdd:COG2319   196 LLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLWDLAT-GKLLRTLTGHSGSVRSVAFSP--DGrLLASGSADGTVRL 272
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....
gi 530369258  359 WHVGvEKEVGGMEMAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   273 WDLA-TGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLW 315
WD40 COG2319
WD40 repeat [General function prediction only];
121-361 1.54e-47

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 175.48  E-value: 1.54e-47
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWqsNMNN---VKM 197
Cdd:COG2319   164 AVTSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLW--DLATgklLRT 241
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:COG2319   242 LTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGT--VRLWDLAT 319
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:COG2319   320 GKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGTVRLWDLAT-GELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVR 397

                  ....
gi 530369258  358 FWHV 361
Cdd:COG2319   398 LWDL 401
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
194-402 3.36e-43

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 159.42  E-value: 3.36e-43
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  194 NVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFW 273
Cdd:cd00200     1 LRRTLKGHTGGVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKT--IRLW 78
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  274 DPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEGLFASGGS- 352
Cdd:cd00200    79 DLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVET-GKCLTTLRGHTDWVNSVAFSP--DGTFVASSSq 155
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|
gi 530369258  353 DGSLLFWHVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200   156 DGTIKLWDLRTGKCVATLT-GHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204
WD40 COG2319
WD40 repeat [General function prediction only];
126-402 2.54e-42

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 160.08  E-value: 2.54e-42
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  126 RWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHND-MWMLTADHGGYVKYWQSNMNNVKMFQAHKEA 204
Cdd:COG2319     1 ALSADGAALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGaRLAAGAGDLTLLLLDAAAGALLATLLGHTAA 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  205 IREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQSLATL 284
Cdd:COG2319    81 VLSVAFSPDGRLLASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSPDGKTLASGSADGT--VRLWDLATGKLLRTL 158
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  285 HAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNLKeELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLFWHVGV 363
Cdd:COG2319   159 TGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGK-LLRTLTGHTGAVRSVAFSP--DGkLLASGSADGTVRLWDLAT 235
                         250       260       270
                  ....*....|....*....|....*....|....*....
gi 530369258  364 EKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   236 GKLLRTLT-GHSGSVRSVAFSPDGRLLASGSADGTVRLW 273
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
119-359 3.45e-39

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 147.48  E-value: 3.45e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAmtwshndmwmltadhggyvkywqsnmnnvkmf 198
Cdd:cd00200    93 TSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS-------------------------------- 140
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  199 qahkeaireASFSPtDNKF-ATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:cd00200   141 ---------VAFSP-DGTFvASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGT--IKLWDLST 208
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:cd00200   209 GKCLGTLRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRT-GECVQTLSGHTNSVTSLAWSP-DGKRLASGSADGTIR 286

                  ..
gi 530369258  358 FW 359
Cdd:cd00200   287 IW 288
WD40 COG2319
WD40 repeat [General function prediction only];
121-319 5.91e-39

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 150.45  E-value: 5.91e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319   206 AVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDlATGELLRTLT 285
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   286 GHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGT--VRLWDLATGE 363
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|
gi 530369258  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:COG2319   364 LLRTLTGHTGAVTSVAFSPDGRTLASGSADGTVRLWDLAT 403
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
119-274 1.25e-28

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 117.05  E-value: 1.25e-28
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNM-NNVKM 197
Cdd:cd00200   135 TDWVNSVAFSPDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTgKCLGT 214
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 530369258  198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:cd00200   215 LRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGT--IRIWD 289
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
618-864 3.35e-10

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 63.77  E-value: 3.35e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  618 GPPGPQGQFRPPGPQGQMGPQGPPlhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQrhpgphgplgpqgppgpqg 697
Cdd:NF038329  123 GPAGPAGPAGEQGPRGDRGETGPA-------GPAGPPGPQGERGEKGPAGPQGEAGPQGPA------------------- 176
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  698 ssgpqghmgpqgppgpqghigpqgppGPQghlGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPLM 777
Cdd:NF038329  177 --------------------------GKD---GEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  778 GLNPRGMQGPPGPRENQGPApqgmimghppqemrGPHPPGGLlgHGPQEMRGPQEIRGMQGPPPQGSMLGPPqelrGPPG 857
Cdd:NF038329  228 GPAGDGQQGPDGDPGPTGED--------------GPQGPDGP--AGKDGPRGDRGEAGPDGPDGKDGERGPV----GPAG 287

                  ....*..
gi 530369258  858 SQSQQGP 864
Cdd:NF038329  288 KDGQNGK 294
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
191-230 6.08e-09

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 52.70  E-value: 6.08e-09
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 530369258    191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
PTZ00421 PTZ00421
coronin; Provisional
205-319 1.33e-08

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 58.75  E-value: 1.33e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421   78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 530369258  273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421  153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
WD40 pfam00400
WD domain, G-beta repeat;
195-230 6.40e-08

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 49.65  E-value: 6.40e-08
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 530369258   195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400    4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
215-359 1.53e-07

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 55.86  E-value: 1.53e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  215 NKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWH---PTkgLVVSGSKDSQqpIKFWDPKTGQSLATLHAHKNTV 291
Cdd:PLN00181  546 SQVASSNFEGVVQVWDVARSQLVTEMKEHEKRVWSIDYSsadPT--LLASGSDDGS--VKLWSINQGVSIGTIKTKANIC 621
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 530369258  292 MEVKLNLNGNWLLTASRDHLCKLFDIRNLKEELQVFRGHKKEATAVAWhpVHEGLFASGGSDGSLLFW 359
Cdd:PLN00181  622 CVQFPSESGRSLAFGSADHKVYYYDLRNPKLPLCTMIGHSKTVSYVRF--VDSSTLVSSSTDNTLKLW 687
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
564-932 2.85e-07

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 54.63  E-value: 2.85e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   564 LAQKQVEQIQPPPSSGT--PLLGPQPFPGQGPMsqiPQGFQQPHPSQQMPMNmAQMGPPGPQG--QFRPPGPQGQMGPQG 639
Cdd:pfam09606   63 PQGGQGNGGMGGGQQGMpdPINALQNLAGQGTR---PQMMGPMGPGPGGPMG-QQMGGPGTASnlLASLGRPQMPMGGAG 138
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   640 PPlHQGGGGPQGFMGPQGPQGPPQGLPRPqdmhGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGP 719
Cdd:pfam09606  139 FP-SQMSRVGRMQPGGQAGGMMQPSSGQP----GSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGMPGPA 213
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   720 QGPPGPQGHLGPQGPPGTQGMQGPPGPRGMQGPPHPHGI----------------QGGPGSQGIQGPVSQGPLMGLNPRG 783
Cdd:pfam09606  214 DAGAQMGQQAQANGGMNPQQMGGAPNQVAMQQQQPQQQGqqsqlgmginqmqqmpQGVGGGAGQGGPGQPMGPPGQQPGA 293
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   784 M------QGPPGPRENQGPAPQ----GMIMGHPPQEMRGPHPPGGLLGHGPQEM---------RGPQEIRGMQGPPPQgs 844
Cdd:pfam09606  294 MpnvmsiGDQNNYQQQQTRQQQqqqgGNHPAAHQQQMNQSVGQGGQVVALGGLNhletwnpgnFGGLGANPMQRGQPG-- 371
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   845 MLGPPQELRG-------PPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAP 917
Cdd:pfam09606  372 MMSSPSPVPGqqvrqvtPNQFMRQSPQPSVPSPQGPGSQPPQSHPGGMIPSPALIPSPSPQMSQQPAQQRTIGQDSPGGS 451
                          410
                   ....*....|....*
gi 530369258   918 FNQEGQSTGPPPLIP 932
Cdd:pfam09606  452 LNTPGQSAVNSPLNP 466
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
618-963 4.82e-07

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 53.88  E-value: 4.82e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  618 GPPGPQGQFRPPGPQGQMGPqgpplhqggggpqgfmgpqgpqgppqglPRPQDMHGPQGMQrhpgphgplgpqgppgpqG 697
Cdd:COG5164    10 GPSDPGGVTTPAGSQGSTKP----------------------------AQNQGSTRPAGNT------------------G 43
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  698 SSGPQGHMgpqgppgpqghigpqgppgpqghlGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPlm 777
Cdd:COG5164    44 GTRPAQNQ------------------------GSTTPAGNTGGTRPAGNQGATGPAQNQGGTTPAQNQGGTRPAGNTG-- 97
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  778 GLNPRGMQG---PPGPRENQGPAPQGMIMGHPPQEMRGPHPPGGLLGHGPqemrGPQEIRGMQGPPPQGSMLGPPQE--L 852
Cdd:COG5164    98 GTTPAGDGGatgPPDDGGATGPPDDGGSTTPPSGGSTTPPGDGGSTPPGP----GSTGPGGSTTPPGDGGSTTPPGPggS 173
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  853 RGPPGSQSQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQSTGPPPLIP 932
Cdd:COG5164   174 TTPPDDGGSTTPPNKG---------------------ETGTDIPTGGTPRQGPDGPVKKDDKNGKGNPPDDRGGKTGPKD 232
                         330       340       350
                  ....*....|....*....|....*....|...
gi 530369258  933 GLGQQGAQGRIPPLNPGQGPGPNKV--SEEEPR 963
Cdd:COG5164   233 QRPKTNPIERRGPERPEAAALPAELtaLEAENR 265
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
238-274 4.94e-07

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 47.31  E-value: 4.94e-07
                            10        20        30
                    ....*....|....*....|....*....|....*..
gi 530369258    238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:smart00320    6 KTLKGHTGPVTSVAFSPDGKYLASGSDDGT--IKLWD 40
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
730-950 2.22e-06

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 51.93  E-value: 2.22e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   730 GPQGPPGTQgmQGPPG--------------PRGMQGPPHPHGIQGGPGSQGIQGPVSQgPLMGLNPRGMQGPPGPRENQG 795
Cdd:pfam09606  105 GPGGPMGQQ--MGGPGtasnllaslgrpqmPMGGAGFPSQMSRVGRMQPGGQAGGMMQ-PSSGQPGSGTPNQMGPNGGPG 181
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   796 --------PAPQGMIMGHPPQEMRGPHPPGGLLGHGpqemRGPQEIRGMQGPPPQGSMLGPPQELRgppgsqsQQGPPQG 867
Cdd:pfam09606  182 qgqaggmnGGQQGPMGGQMPPQMGVPGMPGPADAGA----QMGQQAQANGGMNPQQMGGAPNQVAM-------QQQQPQQ 250
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   868 SLGPPPQGGMQGPPGPQGQQNPARGPH--PSQ-GPIPFQQQKTPLLGDGPRAPFNQEGQSTGPPplipglgQQGAQGRIP 944
Cdd:pfam09606  251 QGQQSQLGMGINQMQQMPQGVGGGAGQggPGQpMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQQ-------QQQQGGNHP 323

                   ....*.
gi 530369258   945 PLNPGQ 950
Cdd:pfam09606  324 AAHQQQ 329
WD40 pfam00400
WD domain, G-beta repeat;
238-274 2.84e-06

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 45.03  E-value: 2.84e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 530369258   238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:pfam00400    5 KTLEGHTGSVTSLAFSPDGKLLASGSDDGT--VKVWD 39
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
321-360 3.62e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 44.61  E-value: 3.62e-06
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 530369258    321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFWH 360
Cdd:smart00320    2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
Collagen pfam01391
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ...
730-791 4.70e-06

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 44.79  E-value: 4.70e-06
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 530369258   730 GPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGpvsqgplmglnPRGMQGPPGPR 791
Cdd:pfam01391    7 GPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPG-----------PPGAPGAPGPP 57
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
373-402 6.08e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 44.23  E-value: 6.08e-06
                            10        20        30
                    ....*....|....*....|....*....|
gi 530369258    373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:smart00320   10 GHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
277-316 9.74e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 43.46  E-value: 9.74e-06
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 530369258    277 TGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
WD40 pfam00400
WD domain, G-beta repeat;
278-316 1.19e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 43.10  E-value: 1.19e-05
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 530369258   278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-676 1.68e-05

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one expressed in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 49.03  E-value: 1.68e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   556 LEQLKIERLAQKQVE--QIQPP---PSSGTPLLG---PQPFPGQGPMSQIPQ-----GFQQPHPSQQMPMnmaqmGPPGP 622
Cdd:TIGR01628  360 LAQRKEQRRAHLQDQfmQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGPG-----GPLRP 434
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 530369258   623 QGqFRPPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 676
Cdd:TIGR01628  435 NG-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
PTZ00421 PTZ00421
coronin; Provisional
288-398 3.37e-05

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 47.97  E-value: 3.37e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  288 KNTVMEVKLN-LNGNWLLTASRDHLCKLFDI------RNLKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGSLLFWH 360
Cdd:PTZ00421   75 EGPIIDVAFNpFDPQKLFTASEDGTIMGWGIpeegltQNISDPIVHLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWD 154
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|
gi 530369258  361 V--GVEKEVggmEMAHEGMIWSLAWHPLGHILCSGSNDHT 398
Cdd:PTZ00421  155 VerGKAVEV---IKCHSDQITSLEWNLDGSLLCTTSKDKK 191
PHA03247 PHA03247
large tegument protein UL36; Provisional
731-972 3.46e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 3.46e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  731 PQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPLMGLNPRGMQGPPGPRENQGPAPQGMI------MG 804
Cdd:PHA03247 2757 PARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLppptsaQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  805 HPPQEMRGPHPP-----GGLLGHGPQEMRGPQeirGMQGPPPQGSMLGPPQELRGPPGSQSQQGPPQgslgppPQGGMQG 879
Cdd:PHA03247 2837 TAPPPPPGPPPPslplgGSVAPGGDVRRRPPS---RSPAAKPAAPARPPVRRLARPAVSRSTESFAL------PPDQPER 2907
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  880 PPGPQGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQSTGPPPLIPglgqqgaQGRIPPLNPGQGPGP-NKVS 958
Cdd:PHA03247 2908 PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVP-------QPWLGALVPGRVAVPrFRVP 2980
                         250
                  ....*....|....
gi 530369258  959 EEEPRRGMRAVLPP 972
Cdd:PHA03247 2981 QPAPSREAPASSTP 2994
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
796-946 3.71e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 48.11  E-value: 3.71e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   796 PAPQGMIMGHPPQEmrGPHPPGGLLGHGPQEMRGPQEIRGMQGPPPQGSMLG-PPQELRGPPGSQSQQGPPQGSlgpppq 874
Cdd:pfam09770  211 AQQPAPAPAQPPAA--PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGhPVTILQRPQSPQPDPAQPSIQ------ 282
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 530369258   875 ggmqgppgpqGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQGAQGRIPPL 946
Cdd:pfam09770  283 ----------PQAQQFHQQPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQGSFGRQAPI 345
PHA03378 PHA03378
EBNA-3B; Provisional
753-952 3.83e-05

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 48.14  E-value: 3.83e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  753 PHPHGIQGGPGSQGI--------QGPVSQGPLmGLNPRGMQ----GPPGPRENQGPaPQGMIMGHPPQEMRGPHPPGGLL 820
Cdd:PHA03378  600 PHPSQTPEPPTTQSHipetsaprQWPMPLRPI-PMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHIPYQPS 677
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  821 GHGPQEMRGPQEIRGMQGPPPQGSMLGPPQElrGPPGSQSqqgPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPI 900
Cdd:PHA03378  678 PTGANTMLPIQWAPGTMQPPPRAPTPMRPPA--APPGRAQ---RPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPG 752
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....
gi 530369258  901 PFQQqktPLLGDGP-RAPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQGP 952
Cdd:PHA03378  753 RARP---PAAAPGRaRPPAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQAG 803
WD40 pfam00400
WD domain, G-beta repeat;
373-402 4.66e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 41.56  E-value: 4.66e-05
                           10        20        30
                   ....*....|....*....|....*....|
gi 530369258   373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:pfam00400    9 GHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
573-933 8.33e-05

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 46.56  E-value: 8.33e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  573 QPPPSSGTPLLGPQPFPGQGPMSqiPQGFQQPhpsqqmPMNMAQMGPPGPQGQFRPPGPQGQMGPqgpplhqggggpqgf 652
Cdd:COG5164    18 TTPAGSQGSTKPAQNQGSTRPAG--NTGGTRP------AQNQGSTTPAGNTGGTRPAGNQGATGP--------------- 74
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  653 mgpqgpqgppqglPRPQDMHGPQGMQrhpgphgplgpqgppgpqgssgpqghmgpqgppgpqghigpqgppgpqghlGPQ 732
Cdd:COG5164    75 -------------AQNQGGTTPAQNQ---------------------------------------------------GGT 90
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  733 GPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPlmglnprgmQGPPGPRenqgpapqgmimGHPPQEMRG 812
Cdd:COG5164    91 RPAGNTGGTTPAGDGGATGPPDDGGATGPPDDGGSTTPPSGGS---------TTPPGDG------------GSTPPGPGS 149
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  813 PHPPGGLLGHGPQEMRGPQEIRGMQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARG 892
Cdd:COG5164   150 TGPGGSTTPPGDGGSTTPPGPGGSTTPPDDGGSTTPPNKGETGTDIPTGGTPRQGPDGPVKKDDKNGKGNPPDDRGGKTG 229
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|.
gi 530369258  893 PHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQSTGPPPLIPG 933
Cdd:COG5164   230 PKDQRPKTNPIERRGPERPEAAALPAELTALEAENRAANPE 270
WD40 pfam00400
WD domain, G-beta repeat;
321-359 1.36e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.41  E-value: 1.36e-04
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 530369258   321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVW 38
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
509-629 1.76e-04

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 45.58  E-value: 1.76e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  509 MQNKVPIPAPNEV-----------LNDRKEDIKLEEKKKTQAEIEQEMATLQ-----YTNPQLLEQLKIERLAQ------ 566
Cdd:PRK13729   38 MSGNGEAVAEQEPvpdmtgvvdttFDDKVRQHATTEMQVTAAQMQKQYEEIRreldvLNKQRGDDQRRIEKLGQdnaala 117
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 530369258  567 KQVEQI---------QPPPSSGTPLLGPQ--PFPGQGPMSQIPQGFQQPHPSQQMPMNMAQMGPPGPQGQFRPP 629
Cdd:PRK13729  118 EQVKALganpvtatgEPVPQMPASPPGPEgePQPGNTPVSFPPQGSVAVPPPTAFYPGNGVTPPPQVTYQSVPV 191
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
612-967 1.83e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 45.77  E-value: 1.83e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   612 MNMAQMGPPGPQGQFRPPGPQGQMGPQGPPLHQGGGGPQGFMGPQGPQgppqglPRPQDMHGPQGMQRHPGPHGPLGPQG 691
Cdd:pfam09606   53 MSKKAAQQQQPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMG------PMGPGPGGPMGQQMGGPGTASNLLAS 126
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   692 PPGPQGSSGpqghmgpqgppgpqghiGPQGPPGPQGHLGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPV 771
Cdd:pfam09606  127 LGRPQMPMG-----------------GAGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMN 189
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   772 --SQGPLMGLNPRGM--QGPPGPRE--NQGPAPQGMIMGHPPQEMRGPhPPGGLLGHGPQEMRGPQEIRGMQGPPPQ--- 842
Cdd:pfam09606  190 ggQQGPMGGQMPPQMgvPGMPGPADagAQMGQQAQANGGMNPQQMGGA-PNQVAMQQQQPQQQGQQSQLGMGINQMQqmp 268
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   843 -----GSMLGPPQELRGPPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQqnpARGPHPSQGPipfQQQKTPLLGDGPRAP 917
Cdd:pfam09606  269 qgvggGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQQQQQ---QGGNHPAAHQ---QQMNQSVGQGGQVVA 342
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|..
gi 530369258   918 FNQEGQS-TGPPPLIPGLGQQGAQGRIPPLNPGQGPGP-NKVSEEEPRRGMR 967
Cdd:pfam09606  343 LGGLNHLeTWNPGNFGGLGANPMQRGQPGMMSSPSPVPgQQVRQVTPNQFMR 394
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
150-188 1.91e-04

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 39.99  E-value: 1.91e-04
                            10        20        30
                    ....*....|....*....|....*....|....*....
gi 530369258    150 TFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
PTZ00420 PTZ00420
coronin; Provisional
284-397 6.46e-04

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 43.79  E-value: 6.46e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  284 LHAHKNTVMEVKLN-LNGNWLLTASRDHLCKLFDIRN-------LKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGS 355
Cdd:PTZ00420   70 LKGHTSSILDLQFNpCFSEILASGSEDLTIRVWEIPHndesvkeIKDPQCILKGHKKKISIIDWNPMNYYIMCSSGFDSF 149
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*
gi 530369258  356 LLFWHVGVEKEVGGMEMAHEgmIWSLAWHPLGHIL---CSGSNDH 397
Cdd:PTZ00420  150 VNIWDIENEKRAFQINMPKK--LSSLKWNIKGNLLsgtCVGKHMH 192
PTZ00420 PTZ00420
coronin; Provisional
198-274 7.35e-04

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 43.40  E-value: 7.35e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  198 FQAHKEAIREASFSPTDNK-FATCSDDGTVRIWDfLRCHEER---------ILRGHGADVKCVDWHPTKGLVVSGSK-DS 266
Cdd:PTZ00420   70 LKGHTSSILDLQFNPCFSEiLASGSEDLTIRVWE-IPHNDESvkeikdpqcILKGHKKKISIIDWNPMNYYIMCSSGfDS 148

                  ....*...
gi 530369258  267 QqpIKFWD 274
Cdd:PTZ00420  149 F--VNIWD 154
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
744-989 1.60e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.67  E-value: 1.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  744 PGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPlmglnprgmQGPPGPRENQGPAPQGMIMGHPPQEMRGPHPPGGLLGHG 823
Cdd:PRK07764  590 PAPGAAGGEGPPAPASSGPPEEAARPAAPAAP---------AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVP 660
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  824 PQEMRGPQEIRGMQGPPPQGSMLGPPqelrgPPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 903
Cdd:PRK07764  661 DASDGGDGWPAKAGGAAPAAPPPAPA-----PAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAA 735
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  904 QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQgriPPLNPGQGPGPNKVSEEEPRRGMRAVLPPEEGMVFLVLKT 983
Cdd:PRK07764  736 DDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAA---APPPSPPSEEEEMAEDDAPSMDDEDRRDAEEVAMELLEEE 812

                  ....*.
gi 530369258  984 LVQRRI 989
Cdd:PRK07764  813 LGAKKI 818
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
533-643 1.99e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 42.33  E-value: 1.99e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   533 EKKKTQAEIEQEMATLQYTNPQL------LEQLKIERLAQKQVEQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGFQQPHP 606
Cdd:pfam09770  167 PKKAAAPAPAPQPAAQPASLPAPsrkmmsLEEVEAAMRAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQP 246
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 530369258   607 sQQMPMNMAQMGPPGP-------QGQFRPPGPQGQMGPQGPPLH 643
Cdd:pfam09770  247 -QQQPQQPQQHPGQGHpvtilqrPQSPQPDPAQPSIQPQAQQFH 289
WD40 pfam00400
WD domain, G-beta repeat;
151-188 2.14e-03

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 36.94  E-value: 2.14e-03
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 530369258   151 FNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
PRK10263 PRK10263
DNA translocase FtsK; Provisional
541-643 3.23e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 41.61  E-value: 3.23e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  541 IEQEMATLQYTNPQLLEQLKIERLA-QKQVEQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGFQQPHPSQQMPMNmaqmgP 619
Cdd:PRK10263  749 VEPVQQPQQPVAPQQQYQQPQQPVApQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQ-----P 823
                          90       100
                  ....*....|....*....|....
gi 530369258  620 PGPQGQFRPPGPQGQMGPQGPPLH 643
Cdd:PRK10263  824 VAPQPQYQQPQQPVAPQPQDTLLH 847
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
734-917 4.70e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.12  E-value: 4.70e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  734 PPGTQGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGPVSQGPLMGLNPRGmQGPPGPRENQGPAPQGMIMGHPPQEMRGP 813
Cdd:PRK07764  615 PAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASD-GGDGWPAKAGGAAPAAPPPAPAPAAPAAP 693
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  814 HPPGGllghGPQEMRGPQEIRGMQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGP 893
Cdd:PRK07764  694 AGAAP----AQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAA 769
                         170       180
                  ....*....|....*....|....
gi 530369258  894 HPSQGPIPFQQQKTPLLGDGPRAP 917
Cdd:PRK07764  770 PAAAPPPSPPSEEEEMAEDDAPSM 793
PHA03247 PHA03247
large tegument protein UL36; Provisional
733-974 6.41e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.08  E-value: 6.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  733 GPPGTQGMQGPPGPRGMQGPPH--PHGIQGGPGsqgiQGPVSQGPLMGLNPRGMQG----------PPGPRENQGPAPQG 800
Cdd:PHA03247 2639 DPHPPPTVPPPERPRDDPAPGRvsRPRRARRLG----RAAQASSPPQRPRRRAARPtvgsltsladPPPPPPTPEPAPHA 2714
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  801 MIMGHP----PQEMRGPHPP------------GGLLGHGPQEMRGPQEIRGMQGP-PPQGSMLGPPQELRGPPGSQSQQG 863
Cdd:PHA03247 2715 LVSATPlppgPAAARQASPAlpaapappavpaGPATPGGPARPARPPTTAGPPAPaPPAAPAAGPPRRLTRPAVASLSES 2794
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258  864 PPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKTPLLGDGPRAPFNQEGQSTGP-------PPLIPGLGQ 936
Cdd:PHA03247 2795 RESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPggdvrrrPPSRSPAAK 2874
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*
gi 530369258  937 QGAQGRI-------PPLNPGQGPGPNKVSEEEPRRGMRAVLPPEE 974
Cdd:PHA03247 2875 PAAPARPpvrrlarPAVSRSTESFALPPDQPERPPQPQAPPPPQP 2919
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
729-948 8.39e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.52  E-value: 8.39e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   729 LGPQGPPGTQgmqgpPGPRGMQGPPHPHG-IQGGPGSQGIQGPVSQGPLMGLNPrgmqgPPgprenqgPAPQGMIMGHPP 807
Cdd:pfam03154  206 VPPQGSPATS-----QPPNQTQSTAAPHTlIQQTPTLHPQRLPSPHPPLQPMTQ-----PP-------PPSQVSPQPLPQ 268
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   808 QEMRGPHPPGG-LLGHGPQEMRGP----------QEIRGMQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGS-------- 868
Cdd:pfam03154  269 PSLHGQMPPMPhSLQTGPSHMQHPvppqpfpltpQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPreqplppa 348
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 530369258   869 ------LGPPPQGGMQGPPGPQGQQNPargPHPSqGPIPFQQQKT----PLLGDGPRAPfNQEGQSTGPPPLipGLGQQG 938
Cdd:pfam03154  349 plsmphIKPPPTTPIPQLPNPQSHKHP---PHLS-GPSPFQMNSNlpppPALKPLSSLS-THHPPSAHPPPL--QLMPQS 421
                          250
                   ....*....|
gi 530369258   939 AQGRIPPLNP 948
Cdd:pfam03154  422 QQLPPPPAQP 431
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
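
The per-hit summary lines in the listing above (the ones beginning with "Pssm-ID:") share a fixed field order, so they can be read mechanically and compared against the E-value threshold of 0.01 given in the search parameters. The following is a minimal sketch, not an NCBI tool: the function names `parse_summary` and `passes_threshold` are hypothetical, the regular expression is inferred only from the lines shown on this page, and treating the threshold as inclusive is an assumption.

```python
import re

# Field layout inferred from the summary lines on this page (an assumption,
# not an official format specification).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one 'Pssm-ID: ...' line as a dict, or None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "kind": match.group("kind"),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """True if the hit's E-value is at or below the threshold.

    Whether the real cutoff is inclusive is an assumption made here.
    """
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    # Two summary lines copied from the listing above.
    lines = [
        "Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 43.10  E-value: 1.19e-05",
        "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.52  E-value: 8.39e-03",
    ]
    for text in lines:
        hit = parse_summary(text)
        print(hit["pssm_id"], hit["e_value"], passes_threshold(hit))
```

Lines that do not match the pattern return `None`, so the parser can be run over the whole page text and the non-summary lines (alignment rows, rulers, descriptions) skipped.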
