Interpreting the Scores

The scores produced by this method do not guarantee that the alignments are structurally correct, but testing has shown that regions defined here as reliably aligned are much more likely to have the correct structural alignment. Used together with the optimal alignments and with information on functionally important sites and conserved residues, the scores calculated by SQUARE can give you important insight into the quality of your alignment.

Low-scoring regions of the alignment might be poorly aligned, might simply be regions of low sequence conservation, or might indicate that this region of the target sequence has diverged from the template sequence/structure towards a different structure or function.


Profile-derived alignment score graphs

These bar charts show the profile-derived alignment score for each template residue in the alignment. Gaps inserted into the template sequence and any start and end gaps in the alignment are ignored. Insertions in the target sequence are penalised equally, so long gaps in the target sequence will lead to flat, negative-scoring troughs.
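
The bookkeeping described above can be illustrated with a minimal Python sketch. It is not SQUARE's own code: the function name per_template_scores, the per-column score list and the gap_penalty value are all hypothetical, and the sketch assumes that a "long gap in the target sequence" means a template residue aligned to a gap character in the target.

```python
GAP = "-"

def per_template_scores(template_aln, target_aln, column_scores, gap_penalty=-1.0):
    """Return one score per template residue (illustrative only).

    template_aln, target_aln: aligned strings of equal length, "-" for gaps.
    column_scores: hypothetical score for each alignment column; values in
    gap columns are never read.
    gap_penalty: assumed constant applied to every template residue that is
    aligned to a gap in the target.
    """
    n = len(template_aln)
    if len(target_aln) != n or len(column_scores) != n:
        raise ValueError("alignment strings and scores must have equal length")

    # Columns where both sequences have a residue.
    paired = [i for i in range(n)
              if template_aln[i] != GAP and target_aln[i] != GAP]
    if not paired:
        return []

    # Start and end gaps lie outside the first and last paired columns.
    first, last = paired[0], paired[-1]

    scores = []
    for i in range(first, last + 1):
        if template_aln[i] == GAP:
            continue  # gap in the template: no template residue to plot
        if target_aln[i] == GAP:
            scores.append(gap_penalty)  # same penalty at every gap position
        else:
            scores.append(column_scores[i])
    return scores


example = per_template_scores(
    "ACD-EFGH",
    "-CDWE--H",
    [0.0, 2.0, 1.5, 0.0, 3.0, 0.0, 0.0, 1.0],
)
print(example)
```

In this toy alignment the leading template residue A falls in a start gap and is dropped, the column where the template has a gap is skipped, and the two template residues F and G facing a gap in the target both receive the same penalty, which is what produces a flat trough in the bar chart.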


Reliably aligned regions

This is a measure of per-residue reliability for alignments between query sequences and sequences of known structure (i.e. those from the PDB). Results are based on the profile-derived alignment scores and are output as a text alignment with an indication of the reliability of each residue. The size of the islands of reliable residues depends on the three options (Residues in Peak, Peak Cut-off and Tail Cut-off) on the front page.
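
One plausible way the three options could interact is sketched below in Python. This is an assumption made for illustration, not the documented SQUARE algorithm: it treats a "peak" as a run of at least residues_in_peak consecutive residues scoring at or above peak_cutoff, and then extends the island in both directions while scores stay at or above tail_cutoff. The function name reliable_mask and the default values are hypothetical.

```python
def reliable_mask(scores, residues_in_peak=3, peak_cutoff=2.0, tail_cutoff=0.5):
    """Mark residues belonging to reliable islands (illustrative only).

    Assumed rule: find runs of consecutive scores >= peak_cutoff that are at
    least residues_in_peak long, then grow each run outwards while the
    neighbouring scores are >= tail_cutoff.
    """
    n = len(scores)
    mask = [False] * n
    i = 0
    while i < n:
        if scores[i] < peak_cutoff:
            i += 1
            continue
        # Find the end of the current run of peak-level scores: [i, j).
        j = i
        while j < n and scores[j] >= peak_cutoff:
            j += 1
        if j - i >= residues_in_peak:
            # Extend the island into the tails on both sides.
            left = i
            while left > 0 and scores[left - 1] >= tail_cutoff:
                left -= 1
            right = j
            while right < n and scores[right] >= tail_cutoff:
                right += 1
            for k in range(left, right):
                mask[k] = True
        i = j
    return mask


scores = [0.1, 0.8, 2.5, 3.0, 2.2, 0.6, 0.2, 2.5, 0.1]
mask = reliable_mask(scores)
print("".join("*" if m else "." for m in mask))
```

Under this assumed rule, raising Residues in Peak or Peak Cut-off yields fewer islands, while lowering Tail Cut-off makes each island wider. The isolated high score near the end of the example is not marked because it does not form a long enough peak.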


Residue conservation (HSSP)

A conservation score is also included for each residue. The degree of conservation is calculated from the VAR column in the HSSP file that corresponds to the template chain. The conservation score output in SQUARE is calculated relative to the most conserved residue in the HSSP alignment: the higher the number, the more conserved the residue.
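
As a rough illustration of a score "relative to the most conserved residue", the Python sketch below converts variability values into relative conservation. It assumes the HSSP VAR column is a variability value between 0 and 100, takes conservation as 100 minus VAR, and divides by the highest conservation found in the chain. The exact formula and output scale used by SQUARE are not given here, so the function relative_conservation and its 0 to 1 output range are hypothetical.

```python
def relative_conservation(var_values):
    """Scale conservation relative to the most conserved residue.

    var_values: per-residue variability, assumed to lie between 0 and 100
    (higher means more variable).
    Returns values between 0.0 and 1.0, where 1.0 marks the most conserved
    residue(s) in the chain. Illustrative only.
    """
    conserved = [100 - v for v in var_values]
    if not conserved:
        return []
    best = max(conserved)
    if best <= 0:
        # Every position is maximally variable: nothing to scale against.
        return [0.0] * len(conserved)
    return [c / best for c in conserved]


var_column = [5, 24, 62, 5]
relative = relative_conservation(var_column)
print(relative)
```

Because the values are relative, the most conserved residue always receives the top score even when the HSSP alignment as a whole is quite variable, so scores are best compared within one template chain rather than between chains.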


Secondary structure

Secondary structure notation for the template is taken directly from the template chain HSSP file. This is not a secondary structure prediction for the target sequence. Aligned strand residues defined as reliable are the most likely to be structurally correct, while regions defined as loops are slightly less likely to indicate the correct structural alignment than the two main secondary structure classes.


PDB functionally important sites

These are residues defined as important sites by the crystallographers or other researchers who deposited the structures in the PDB. This is not a complete list of all sites (in fact most PDB files do not have a "SITES" section), and the definition of a site depends on the group that deposited the structure. However, where a site residue is also defined as reliably aligned, the residue is almost certainly still an important site in the target sequence.


Regions where there is little evolutionary information in the PSSM

Occasionally parts (or all) of the template profile may contain little information. This happens when the PDB template chain has only a few very close relatives at these residue positions, so there is not enough information to generate reliable scores. With so little evolutionary variation to go on, only those sequences that are closely related to the structural template will score well. As sequence databases grow, the number and size of these unreliable regions should decrease a little.


Tree-determinant residues

Tree-determinant residues have been withdrawn from this server and will soon be available on a related server.

