Current analyses of protein sequence/structure relationships have focused on expected similarity relationships for structurally similar proteins. To survey and explore the basis of these relationships, we present a general sequence/structure map that covers all combinations of similarity/dissimilarity relationships and provide novel energetic analyses of these relationships. To aid our analysis, we divide protein relationships into four categories: expected/unexpected similarity (S and S?) and expected/unexpected dissimilarity (D and D?) relationships. In the expected similarity region S, we show that trends in the sequence/structure relation can be derived based on the requirement of protein stability and the energetics of sequence and structural changes. Specifically, we derive a formula relating sequence and structural deviations to a parameter characterizing protein stiffness; the formula fits the data reasonably well, We suggest that the absence of data in region S? (high structural but low sequence similarity) is due to unfavorable energetics. In contrast to region S, region D? (high sequence but low structural similarity) is well-represented by proteins that can accommodate large structural changes. Our analyses indicate that there are several categories of similarity relationships and that protein energetics provide a basis for understanding these relationships.
ASJC Scopus subject areas