
Panel of HMMs describing the families of BTB proteins

BTB domains were identified using this collection of profile HMMs, generated by HMMER 2.3.2, that were trained on families of BTB proteins. These families were defined on the basis of sequence similarity, secondary structure content, and domain architecture of the full-length proteins. These HMMs allow detection of remote homologs of BTB domains and accurate alignment of the BTB domain.

The following steps were taken to generate all HMMs:

1. Generation of structure-based multiple sequence alignment (MSA) from structural superposition of solved BTB structures.
2. Generation of initial HMM based on structure-based MSA.
3. Search of genomes at Ensembl with this HMM.
4. Detection of "partner domains" (non-BTB domains) in the full-length proteins with Pfam, SMART and InterPro.
5. Phylogenetic clustering of BTB domain sequences with a variety of methods (distance-based, maximum parsimony, maximum likelihood). None of these methods produced a statistically supported phylogenetic tree, which indicates that the BTB domain sequences are highly divergent from each other.
6. Secondary structure predictions using PSIPRED and PHD on examples of BTB domain sequences.
* Steps 4 through 6 showed that the best way of grouping BTB domain sequences into families is by the domain architecture of the full-length proteins.
7. Generation of MSAs for each BTB family using early HMMs and ClustalW, followed by manual adjustment guided by secondary structure predictions.
8. Training of family-specific HMMs based on these MSAs.
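
As a rough illustration of steps 2, 3 and 8, the sketch below assembles the HMMER 2 command lines for building, calibrating and searching with one HMM. It only constructs the argument lists and does not run anything. The alignment and sequence file names are hypothetical, no command-line options are shown, and the exact invocations used for this panel are not documented here, so treat this as an assumption-laden outline rather than the actual procedure.

```python
from typing import List


def hmmer2_commands(alignment: str, hmm_file: str, sequence_db: str) -> List[List[str]]:
    """Return argv lists for building, calibrating and searching with one HMM.

    Assumes the HMMER 2 argument order, in which hmmbuild takes the
    output HMM file first and the alignment file second.
    """
    return [
        ["hmmbuild", hmm_file, alignment],     # train the HMM on the MSA
        ["hmmcalibrate", hmm_file],            # calibrate E-value statistics
        ["hmmsearch", hmm_file, sequence_db],  # search a protein set with the HMM
    ]


# Hypothetical input names; only mammalian_bbk.hmm appears on this page.
for argv in hmmer2_commands("bbk_family.msa", "mammalian_bbk.hmm", "proteins.fa"):
    print(" ".join(argv))
```

Each list could be passed to subprocess.run on a machine where HMMER 2 is installed.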

The BTB families, their features, and their corresponding HMMs are listed below.

Vertebrate BTB-ZF proteins
Two HMMs: btb_zf_with_nterminal_extension.hmm and btb_zf_without_nterminal_extension.hmm
All BTB-ZF proteins contain the secondary structure elements B1 through A5. Those that also contain the N-terminal extension elements beta1 and alpha1 score higher against btb_zf_with_nterminal_extension.hmm.
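
The comparison between the two BTB-ZF models can be sketched as follows. The function name and the example scores are hypothetical, and the tie-breaking rule (a tie goes to the model without the extension) is an assumption made only for this illustration.

```python
WITH_EXT = "btb_zf_with_nterminal_extension.hmm"
WITHOUT_EXT = "btb_zf_without_nterminal_extension.hmm"


def better_btb_zf_model(score_with: float, score_without: float) -> str:
    """Return the name of the BTB-ZF HMM that a sequence scores higher against.

    A strictly higher score against the "with extension" model is taken
    as a hint that the N-terminal extension is present.
    """
    return WITH_EXT if score_with > score_without else WITHOUT_EXT


# Illustrative bit scores, not real results.
print(better_btb_zf_model(52.3, 40.1))
```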

Vertebrate BTB-BACK-Kelch (BBK) proteins
mammalian_bbk.hmm
Most BBK proteins contain the secondary structure elements beta1 through A5 (this includes the N-terminal extension beta1, alpha1).

T1 domain
all_t1s_round2.hmm
T1 domains contain secondary structure elements B1 through A5 but lack beta5.

Skp1 domain
skp1.hmm
Skp1 domains contain secondary structure elements B1 through alpha7 (this includes the C-terminal extension elements alpha6 and alpha7).

BTB domain from BTB-NPH3 proteins
btb_from_btb-nph3.hmm
BTB domains from BTB-NPH3 proteins appear to contain two leading beta-strands in a region preceding the core fold (B1 through A5), with an additional beta-strand between A1 and A2. These BTB domains are thus very different from the typical BTB domain.

BTB domain from MATH-BTB proteins
btb_from_math-btb.hmm
BTB domains from MATH-BTB proteins are predicted to contain the core fold plus the N-terminal extension elements beta1 and alpha1.

"Catch-all huge alignment" HMM
huge_alignment.hmm
This HMM was generated with samples of BTB domains from many different protein types. It was built from a multiple sequence alignment with as much sequence diversity as could be accommodated. As such, this HMM is very sensitive to all BTB domain types, but as a trade-off it is the least specific and the least accurate with regard to BTB domain termini and alignment. This HMM is useful for identifying BTB domains that do not fit the subfamilies described by the above collection of HMMs, and is used to help guide future subfamily/HMM generation.
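
One way to combine the family-specific HMMs with the catch-all model is sketched below: prefer the best-scoring family HMM, and fall back to huge_alignment.hmm only when no family model matches. The score threshold, the function name and the example scores are assumptions for illustration; this page does not state the cutoffs actually used.

```python
from typing import Dict, Optional

CATCH_ALL = "huge_alignment.hmm"


def assign_family(scores: Dict[str, float], min_score: float = 0.0) -> Optional[str]:
    """Pick an HMM for a sequence from its bit scores against the panel.

    scores maps HMM file names to bit scores. Returns the best
    family-specific HMM scoring above min_score, otherwise the catch-all
    HMM if it scores above min_score, otherwise None.
    """
    family_hits = {
        name: score
        for name, score in scores.items()
        if name != CATCH_ALL and score > min_score
    }
    if family_hits:
        return max(family_hits, key=family_hits.get)
    if scores.get(CATCH_ALL, float("-inf")) > min_score:
        return CATCH_ALL  # a BTB domain that fits no described subfamily
    return None


# Illustrative scores, not real results.
print(assign_family({"skp1.hmm": 45.2, "all_t1s_round2.hmm": 12.0, CATCH_ALL: 60.1}))
```

Sequences that only the catch-all model recognizes are the natural candidates for new subfamilies.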

Further HMMs are on the way for more precise alignment and classification.


This page last updated December 14, 2005