Research



General Research Interests of the Group:

  • Understanding Biochemical Mechanisms
    • Isofunctional clustering of proteins at the molecular functional level
    • Protein structure and function relationships
    • Protein active site/functional site characterization and analysis
    • Classification of enzymes
    • Role of cysteine modifications
  • Structural and Computational Biophysics
    • Protein electrostatics
    • Allostery and long-range communication in proteins
    • Protein motion/dynamics

Ongoing Projects:

Development of automatable methods to cluster protein sequences into functionally relevant groups

In collaboration with Professors Patsy Babbitt and Tom Ferrin (UCSF), Leslie Poole (Wake Forest), and Carol Parish (University of Richmond)

Our research group aims to understand the molecular functional details of proteins. This problem is significant because genome sequencing projects are determining an ever-increasing number of protein sequences. The vast majority of these proteins are of uncharacterized function; another large fraction are mis-annotated at the molecular functional level (Schnoes, et al. 2009; Fetrow, et al. 2001). It is impossible, both methodologically and monetarily, to experimentally determine the function of every protein. Robust and automated computational approaches to this problem are essential.

We have developed methods to automatically cluster proteins into functionally relevant groups and to identify the mechanistic and functional determinants within each group. The foundation of the work is active site profiling (Cammer, et al. 2003), which was expanded into a method for searching sequences called the Deacon Active Site Profiler (DASP; Huff 2005) and its successor, DASP3 (Leuthaeuser, et al. 2016). The group has also produced iterative search processes for identifying functionally relevant groups of proteins in the structure and sequence databases, TuLIP (Knutson, et al. 2017) and MISST (Harper, et al. 2017), respectively.

Our long-term goal is to develop automated methods that would allow us to cluster the universe of protein structures into functionally relevant groups. This analysis will yield insights into biological mechanisms, creating experimentally testable hypotheses and, ultimately, enabling more accurate identification and classification of many protein functions. The development of these general concepts will allow for the modification of enzymes (to improve or alter their activity) and the design of enzyme inhibitors (lead compounds), an early step in pharmaceutical drug discovery.

Relevant references:

  • Knutson ST, Westwood BM, Leuthaeuser JB, Turner B, Nguyendac D, Shea G, Kumar K, Hayden J, Harper A, Brown SD, Morris JH, Ferrin TE, Babbitt PC, Fetrow JS. An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences. Protein Sci. 2017 Apr;26(4):677-699.
  • Harper AF, Leuthaeuser JB, Babbitt PC, Morris JH, Ferrin TE, Poole LB, Fetrow JS. An atlas of peroxiredoxins created using an active site profile-based approach to functionally relevant clustering of proteins. PLoS Comput Biol. 2017 Feb 10;13(2):e1005284.
  • Leuthaeuser JB, Morris JH, Harper AF, Ferrin TE, Babbitt PC, Fetrow JS. DASP3: identification of protein sequences belonging to functionally relevant groups. BMC Bioinformatics. 2016 Nov 11;17(1):458.
  • Gober JG, Rydeen A, Gibson-O’Grady E, Leuthaeuser J, Fetrow J, Brustad E. Mutating a Highly Conserved Residue in Diverse Cytochrome P450s Facilitates Diastereoselective Olefin Cyclopropanation. Chembiochem. 2016 Mar 2;17(5):394-7.
  • Leuthaeuser JB, Knutson ST, Kumar K, Babbitt PC, Fetrow JS. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity. Protein Sci. 2015 Sep;24(9):1423-39.
  • Fetrow JS. Active site profiling to identify protein functional sites in sequences and structures using the Deacon Active Site Profiler (DASP). Current Protocols in Bioinformatics 2006, Chapter 8:Unit 8 10.
  • Huff RG, Bayram E, Tan H, Knutson ST, Knaggs MH, Richon AB, Santago II P, Fetrow JS. Chemical and Structural Diversity in Cyclooxygenase Protein Active Sites. Chemistry and Biodiversity. 2005; 2:1533-1552.
  • Huff R. DASP: Active Site Profiling for Identification of Functional Sites in Protein Sequences and Structures. M.S. Thesis. Winston-Salem: Wake Forest University; 2005.
  • Baxter SM, Rosenblum JS, Knutson ST, Nelson MR, Montimurro JS, Di Gennaro JA, Speir JA, Burbaum JJ, Fetrow JS. Synergistic computational and experimental proteomics approaches for more accurate detection of active serine hydrolases in yeast. Mol. Cell. Proteomics. 2004 Mar; 3(3):209-25.
  • Cammer SA, Hoffman BT, Speir JA, Canady M, Nelson MR, Knutson ST, Gallina M, Baxter SM, Fetrow JS. Structure-based active site profiles for genome analysis and sub-family classification. J. Mol. Biol. 2003; 334(3):387-401.

Historical Projects:

Integrated functional-site feature analysis, with application to peroxiredoxin and other redoxin proteins

In collaboration with Professors Leslie Poole (Wake Forest Baptist Medical Center, Biochemistry) and Freddie R. Salsbury Jr. (Wake Forest, Physics); Funded by the NSF

Sequence genomics projects have produced many methods for predicting protein function based on sequence motifs, pairwise sequence alignment, or multiple sequence alignment clustering, providing information on molecular function but not insight into biological mechanism. We are crossing the gap from molecular function to biological mechanism by developing and using computational sequence, structure, bioinformatics, and biophysical methods to characterize the molecular function sites of proteins within functionally divergent superfamilies. The thioredoxin fold family of proteins (Copley, Novak and Babbitt 2004; Atkinson and Babbitt 2009), a very large and biologically important fold family, is a focus of our efforts.

We use active site profiling (Cammer, et al. 2003), a method that we previously developed and implemented in the DASP software (Huff 2005). As validation of the approach, the method was applied to the peroxiredoxins, a protein superfamily within the thioredoxin fold family, where it readily recovered the members of the subgroups previously identified by experts (Nelson, et al. 2011). Those data are stored in the PREX database (Soito, et al. 2011). Most recently, our automatable MISST approach was validated in the peroxiredoxins (Harper, et al. 2017).

Classifications identified by our methods are compared to known biological features and mechanisms, allowing progressive improvement in computational approaches and our understanding of the underlying structure/function relationships. Previously unknown functionally relevant residues were predicted from our results, hypotheses subsequently supported by molecular dynamics and electrostatics calculations (Yuan, et al. 2010).

These approaches will provide significant understanding of the functionally relevant clusters and the mechanistic determinants of function within each cluster of the large and biologically important thioredoxin fold family of proteins.

Relevant references:

  • Harper AF, Leuthaeuser JB, Babbitt PC, Morris JH, Ferrin TE, Poole LB, Fetrow JS. An atlas of peroxiredoxins created using an active site profile-based approach to functionally relevant clustering of proteins. PLoS Comput Biol. 2017 Feb 10;13(2):e1005284.
  • Nelson KJ, Knutson ST, Soito L, Klomsiri C, Poole LB, Fetrow JS. Analysis of the peroxiredoxin family: Using active-site structure and sequence information for global classification and residue analysis. Proteins: Structure, Function, and Bioinformatics, 2011, 79:947-964.
  • Soito L, Williamson C, Knutson ST, Fetrow JS, Poole LB, Nelson KJ. PREX: PeroxiRedoxin classification indEX, a database of subfamily assignments across the diverse peroxiredoxin family. Nucleic Acids Res. 2011 Jan;39(Database issue):D332-7.
  • Yuan Y, Knaggs MH, Poole LB, Fetrow JS, and Salsbury FR Jr. Conformational and oligomeric effects on the cysteine pK(a) of tryparedoxin peroxidase. J Biomol Struct Dyn. 2010, 28(1):51-70.
  • Fetrow JS. Active site profiling to identify protein functional sites in sequences and structures using the Deacon Active Site Profiler (DASP). Current Protocols in Bioinformatics 2006, Chapter 8:Unit 8 10.
  • Huff RG, Bayram E, Tan H, Knutson ST, Knaggs MH, Richon AB, Santago II P, Fetrow JS. Chemical and Structural Diversity in Cyclooxygenase Protein Active Sites. Chemistry and Biodiversity. 2005; 2:1533-1552.
  • Baxter SM, Rosenblum JS, Knutson ST, Nelson MR, Montimurro JS, Di Gennaro JA, Speir JA, Burbaum JJ, Fetrow JS. Synergistic computational and experimental proteomics approaches for more accurate detection of active serine hydrolases in yeast. Mol. Cell. Proteomics. 2004 Mar; 3(3):209-25.
  • Cammer SA, Hoffman BT, Speir JA, Canady M, Nelson MR, Knutson ST, Gallina M, Baxter SM, Fetrow JS. Structure-based active site profiles for genome analysis and sub-family classification. J. Mol. Biol. 2003; 334(3):387-401.
  • Fetrow JS, Skolnick J. Method for prediction of protein function from sequence using the sequence-to-structure-to-function paradigm with application to glutaredoxins/thioredoxins and T1 ribonucleases. J. Mol. Biol. 1998 Sep 4; 281(5):949-968.

Flavonoid signaling and pathway modeling in Arabidopsis

In collaboration with Professors Gloria Muday, Edward Allen, and William Turkett (Wake Forest); and Brenda Winkel (Virginia Tech)

Phenylpropanoid biosynthesis is an important component of plant secondary metabolism that has been extremely well characterized at the genetic, biochemical, and molecular levels. Research interest has been spurred by the importance of the end products in such diverse functions as flower pigmentation, UV protection, signaling (including regulation of auxin transport), male fertility, and defense against pathogens, as well as their anti-oxidant and anti-cancer properties in humans. The pathway also offers a highly tractable genetic system that is characterized by easily identifiable (i.e., flower, seed, or leaf color), non-lethal mutations that factored into Mendel’s elucidation of heritable traits, McClintock’s work on transposable elements, and the discovery of cosuppression. Extensive molecular, biochemical, and physiological characterization of this pathway and its many branches makes it an ideal system in which to begin to address fundamental questions about Arabidopsis systems biology.

We are utilizing new methods for producing quantitative genomic, proteomic, and metabolomic data for identification of novel components and developing new tools for defining the relationships among those components. Recent insights into the physiological functions of the metabolic products of this pathway will allow us to place these molecular and biochemical events into a physiological context. This project is unique in attempting to collect time course gene expression, protein expression, and metabolite data and to combine these comprehensive data sets into integrated biological networks that aid in understanding the relationships between components.

The project combines modeling, theory, and experimentation to produce a systems-level understanding of the phenylpropanoid biosynthetic, transcriptional, and regulatory pathways as exemplary networks, and of the biological consequences of hormonal control of this pathway. This metabolic network synthesizes molecules that are important regulators of plant growth, development, and defense and that serve as important antioxidants in the human diet. Understanding the controls of this pathway will provide insights into how to engineer the synthetic, signaling, and regulatory pathways, both to improve plant growth and to facilitate production of these important compounds.

Relevant references:

  • Lewis DR, Olex AL, Lundy SR, Turkett WH, Fetrow JS, Muday GK. A kinetic analysis of the auxin transcriptome reveals cell wall remodeling proteins that modulate lateral root development in Arabidopsis. Plant Cell. 2013 Sep;25(9):3329-46.
  • Buer CS, Sukumar P, Muday GK. Ethylene induced flavonoid synthesis modulates root gravitropism. Plant Physiol. 2006 Apr;140(4):1384-96.
  • Buer CS, Muday GK. The transparent testa4 mutation prevents flavonoid synthesis and alters auxin transport and the response of Arabidopsis roots to gravity and light. Plant Cell. 2004 May;16(5):1191-205.
  • Brown DE, Rashotte AM, Murphy AS, Normanly J, Tague BW, Peer WS, Taiz L, Muday GK. Flavonoids act as negative regulators of auxin transport in vivo in Arabidopsis. Plant Physiol. 2001 Jun;126(2):524-35.

Modeling signaling networks and transcriptional regulatory networks in osteoarthritis

In collaboration with Professors David John, Edward Allen, William Turkett and James Norris (Wake Forest); Xiaoyan ‘Iris’ Leng, Cristin Ferguson and Richard Loeser (Wake Forest Baptist Medical Center); and Cathy Carlson (University of Minnesota)

The long-term goal of this project is to provide a better understanding of the basic cellular and molecular mechanisms driving joint tissue destruction during the development of osteoarthritis (OA). We are utilizing a systems and computational biology approach to map the transcriptional regulatory networks that underlie development of OA in a stage-specific, whole-organ manner. By integrating this transcriptional regulatory network with publicly available information on signaling pathways and protein-protein interaction networks, we are: 1) identifying key genes and proteins that could serve as novel targets for disease-modifying therapy, as well as novel stage-specific biomarkers; and 2) identifying pathways that are involved in the disease process, which will enhance our understanding of mechanism.

Our approach utilizes a recently developed mouse model of osteoarthritis (destabilization of the medial meniscus). Advantages of this model include: it is biomechanical; damage to the meniscus is a common feature of human OA; it mimics the joint pathology of human OA; and it allows for collection of time course data (early, middle, and late disease stages). Furthermore, the wide availability of transgenic animals permits the future manipulation of identified pathways to test the role of candidate genes and proteins in the network that underlies the development of OA.

This project brings together a team of scientists with expertise in computational biology, basic molecular and translational research in OA, surgical models of OA, and the histological evaluation of OA. We aim to provide a comprehensive picture of the OA disease process, thus providing unprecedented insight into the mechanism of that process with the future promise of discovering novel pathways and drug targets responsible for the initiation and progression of the disease.

Relevant references:

  • Loeser RF, Olex AL, McNulty MA, Carlson CS, Callahan M, Ferguson C, Fetrow JS. Disease progression and phasic changes in gene expression in a mouse model of osteoarthritis. PLoS One. 2013;8(1):e54633.
  • Loeser RF, Olex A, McNulty MA, Carlson CS, Callahan M, Ferguson C, Chou J, Leng X, Fetrow JS. Microarray analysis reveals age-related differences in gene expression during the development of osteoarthritis in mice. Arthritis Rheum. 2012 Mar;64(3):705-17.

Modeling the transcriptional regulation involved in dendritic cell maturation

In collaboration with Professors David John, Edward Allen, and William Turkett (Wake Forest); and Elizabeth (Hiltbold) Schwartz (Wake Forest Baptist Medical Center)

Dendritic cells (DC) are essential to the development of protective immunity to a number of infectious pathogens. These cells alert the adaptive immune system to the presence of pathogenic invaders and activate these cells to clear infections. To stimulate such activation, however, they must undergo a process termed maturation that increases their potency. DC maturation is a tightly regulated process involving changes in gene expression, intracellular trafficking, cytoskeletal modifications, and mobilization to lymphoid organs. The gene expression network, the dynamic process of interaction among gene expression, regulatory sequences, and trans-acting factors, underlying this process is extremely important for controlling many of the observed changes. Very few studies have examined this process over a comprehensive time course and none have attempted to derive network models of this process.

Our long-term goal is to understand, at a systems level, the biology that underlies DC maturation following stimulation by infectious agents. We aim to identify novel, previously undefined components of the DC maturation network and to identify cause-and-effect relationships that explain how DC maturation is controlled upon exposure to various infectious stimuli. In this project, we are assessing the dynamics of DC maturation by identifying and clustering genes that are significantly expressed during DC maturation over a comprehensive time course following treatment of DC with poly I:C as a model of viral infection. We are also identifying relationships between significantly expressed genes, thus beginning to identify networks of interactions. Ultimately, we will demonstrate that we can identify groups of genes involved in subnetworks and model the resulting network neighborhoods, thus beginning to establish cause-and-effect versus correlative relationships within the gene expression network.

Because DC maturation is such a pivotal event for protective immunity, a broader understanding of the gene expression program and the comprehensive transcriptional regulatory network underlying their maturation is a key to the identification of new targets for the design and development of vaccines and therapies against infectious agents.

Relevant references:

  • Olex AL, Turkett WH, Brzoza-Lewis KL, Fetrow JS, Hiltbold EM. Impact of the Type I Interferon Receptor on the Global Gene Expression Program During the Course of Dendritic Cell Maturation Induced by Polyinosinic Polycytidylic Acid. J Interferon Cytokine Res. 2016 Jun;36(6):382-400.
  • Olex, A.L., John, D.J., Hiltbold, E.M., and Fetrow, J.S. Additional limitations of the clustering validation method figure of merit. Proceedings of the 45th ACM Southeast Regional Conference, Winston-Salem, NC. March 2007.
  • Olex A.L., Hiltbold E.M., Leng X., and Fetrow J.S. Dynamics of dendritic cell maturation are identified through a novel filtering strategy applied to biological time-course microarray replicates. BMC Immunology 2010, Aug 8, 11:41.

Development of computational algebra and Bayesian tools for biological modeling

In collaboration with Professors David John, Edward Allen, James Norris and William Turkett (Wake Forest); and Leslie Poole, Larry Daniel and Richard Loeser (Wake Forest Baptist Medical Center); Funded by the NIH

Predicting biological networks that underlie experimental data is a major, unsolved problem in modern biology. Constructing models from time course experimental data is particularly difficult, as the number of time points is usually fewer than the number of measured genes or proteins. We are developing computational algebra and Bayesian approaches to modeling such data. Although the number of modified proteins and measured biological endpoints that respond (i.e., the number of variables) exceeds the number of time points that can be collected (i.e., the number of equations), by considering the network under various conditions and by applying game theoretic methods to multiple discretizations of the data, consensus models can be constructed. These models represent aspects of the underlying biological network, identifying dependencies between protein modifications and biological responses.

This collaboration among researchers in the departments of Biochemistry, Computer Science, Mathematics, and Physics at Wake Forest University aims to develop theory, algorithms, computational tools, and research methodologies for the network modeling of time course data.

Relevant references:

  • John, D.J., Fetrow, J.S. and J.L. Norris. Metropolis-Hastings Algorithm and Continuous Regression for finding Next-State Models of Protein Modification using Information Scores. Proceedings of the 7th International Symposium of IEEE Bioinformatics and Bioengineering. 2007. Jack Y. Yang and Mary Qu Yang and Michelle M. Zhu and Yanqing Zhang and Hamid R. Arabnia and Youping Deng and Nikolaos Bourbakis, eds. p. 35-41.
  • Allen, E.E., Diao, L., Fetrow, J.S., John, D.J., Loeser, R.F. Jr., and Poole, L.B. The shuffle index and evaluation of models of signal transduction pathways. Proceedings of the 45th ACM Southeast Regional Conference, Winston-Salem, NC. March 2007, p. 250-255.
  • Allen, E.E., Fetrow, J.S., John, D.J., Pecorella A. and Turkett, W. Re-constructing networks using co-temporal functions. Proceedings of the 44th ACM Southeast Conference, (Marius Silaghi, ed), Melbourne, Florida. March 2006, 417-422.
  • Allen, E.E., Fetrow, J.S., Daniel, L.W., Thomas, S.J., John, D.J. Algebraic dependency models of protein signal transduction networks from time-series data. J. Theor. Biol. 2006 Jan 21;238(2):317-30.
  • Allen, E.E., Fetrow, J.S., John, D.J., Thomas, S.J. Heuristic dependency conjectures in proteomic signaling pathways. Proceedings of the 43rd Annual Association for Computing Machinery Southeast Conference (Victor A. Clincy, ed.) Kennesaw, Georgia, March 2005.

Functional site analysis and drug discovery

In collaboration with Professors William Turkett and Fred Salsbury (Wake Forest); Leslie Poole (Wake Forest Baptist Medical Center); and Jeffrey Skolnick (SUNY Buffalo)

Sequence and structural genomics projects have identified and predicted molecular functions in proteins, yet researchers still cannot determine biological mechanisms of, for example, catalysis, substrate specificity, or inhibitor binding without detailed biochemical and biophysical analysis of a single protein. While structural genomics projects are providing the necessary data, those data are not being used to reveal the general principles underlying biological mechanism. We are using sequence, structure, bioinformatics, and biophysical methods to characterize the molecular function sites of protein superfamilies. Our tools include fuzzy functional forms (FFFs), active site profiling (DASP), PASSS, and MEAD for electrostatic analysis.

The research program focuses on the following objectives: 1) characterizing the sequence and structure of functional-site features and using the results to develop methods for clustering the peroxiredoxin family; 2) analyzing the electrostatics, including ionizable residue pKas, residues affecting these pKas, and electrostatic potential, at peroxiredoxin functional sites and testing them experimentally; 3) integrating the electrostatic, sequence and structural information to create a robust profiling method that can identify peroxiredoxin subfamilies, then making it available; and 4) using it to create active-site signatures and profiles for a well-studied and important set of protein superfamilies and making these data available. Crossing the gap from molecular function to biological mechanism requires integrating sequence, structure, and physical-chemical data.

The detailed functional site analysis of protein superfamilies is yielding insights into biological mechanisms, leading to hypotheses that can be experimentally tested. In the long term, the resulting methods will enable more accurate functional site identification from sequence. The development of general concepts for identifying and classifying molecular functional-site features will advance the design of enzymes with improved, altered, or novel activity, and of inhibitors (or lead compounds), an early step in the pharmaceutical drug-discovery process.

Relevant references:

  • Pryor, E.E., Jr. and Fetrow, J.S. PDBSQL: A Storage Engine for Macromolecular Data. Proceedings of the 45th ACM Southeast Regional Conference, Winston-Salem, NC. March 2007.
  • Huff, R. G., Bayram, E., Tan, H., Knutson, S.T., Knaggs, M.H., Richon, A.B., Santago II, P., and Fetrow, J.S. Chemical and Structural Diversity in Cyclooxygenase Protein Active Sites. Chemistry and Biodiversity. 2005. 2:1533-1552.
  • Baxter, S.M., Rosenblum, J.S., Knutson, S.T., Nelson, M.R., Montimurro, J.S., Di Gennaro, J.A., Speir, J.A., Burbaum, J.J. and Fetrow, J.S. Synergistic computational and experimental proteomics approaches for more accurate detection of active serine hydrolases in yeast. Mol Cell Proteomics. 2004 Mar;3(3):209-25.
  • Cammer, S.A., Hoffman, B.T., Speir, J.A., Canady, M., Nelson, M.R., Knutson, S.T., Gallina, M., Baxter, S.M., and Fetrow, J.S. Structure-based active site profiles for genome analysis and sub-family classification. J. Mol. Biol. 2003 Nov 28;334(3):387-401.
  • Di Gennaro, J.A., Siew, N., Hoffman, B.T., Zhang, L., Skolnick, J., Neilson, L.I., Fetrow, J.S. Enhanced functional annotation of protein sequences via the use of structural descriptors. J Struct Biol. 2001 May-Jun;134(2-3):232-245.
  • Fetrow, J.S., Siew, N., and Skolnick, J. Structure-based functional motif identifies a potential disulfide oxidoreductase active site in the serine-threonine protein phosphatase-1 subfamily. FASEB J. 1999 Oct;13(13):1866-1874.
  • Fetrow, J.S., Godzik, A. and Skolnick, J. Functional analysis of the Escherichia coli genome using the sequence-to-structure-to-function paradigm: Identification of proteins exhibiting the glutaredoxin/thioredoxin disulfide oxidoreductase activity. J. Mol. Biol. 1998 Oct 2;282(4):703-711.
  • Fetrow, J.S. and Skolnick, J. Method for prediction of protein function from sequence using the sequence-to-structure-to-function paradigm with application to glutaredoxins/thioredoxins and T1 ribonucleases. J. Mol. Biol. 1998 Sep 4;281(5):949-968.

Experimental and computational analysis of the interaction networks in proteins

In collaboration with Professors Freddie R. Salsbury Jr. (Wake Forest) and Marshall Hale Edgell (UNC-Chapel Hill)

Nonadditive effects (in which the sum of the free energy changes resulting from two single mutations does not equal the measured free energy change for the double mutant) are common in proteins. They are the basis of a crucial functional feature of proteins, allostery, and are also associated with site pairs that are not involved in allostery. The physical basis for nonadditive effects is poorly understood and our predictive capacity, in either qualitative or quantitative terms, is marginal at best. Current generalizations are based on the analysis of a modest number of site pairs and a small number of mutations at those sites.
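
The nonadditivity defined above can be made concrete with a double-mutant thermodynamic cycle. The following is a minimal sketch only: the function names, the sign convention, the uncertainty test, and the numeric values are all assumptions chosen for illustration, not measurements or code from this project.

```python
def coupling_free_energy(ddg_a: float, ddg_b: float, ddg_ab: float) -> float:
    """Return the nonadditivity of a double-mutant cycle (same units as the inputs).

    ddg_a, ddg_b: free energy changes of the two single mutants relative to wild type.
    ddg_ab: measured free energy change of the double mutant relative to wild type.
    A result of zero means the two sites behave additively.
    """
    return ddg_ab - (ddg_a + ddg_b)


def is_nonadditive(coupling: float, uncertainty: float) -> bool:
    """Call a site pair nonadditive only if the coupling exceeds the measurement uncertainty."""
    return abs(coupling) > uncertainty


# Hypothetical values in kcal/mol, chosen only to illustrate the arithmetic.
coupling = coupling_free_energy(ddg_a=1.0, ddg_b=0.5, ddg_ab=2.5)
print(coupling, is_nonadditive(coupling, uncertainty=0.2))
```

Comparing the coupling term against the measurement uncertainty, rather than against zero, is one simple way to avoid calling a pair nonadditive on the basis of experimental noise.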

We will develop new generalizations about the interaction network in proteins by performing thermodynamic cycle measurements with several thousand mutant proteins. This will be accomplished using previously developed high-throughput mutagenesis techniques and high-precision stability measurements. Another objective of this project is to identify parameters extractable from conformational ensembles generated by molecular dynamics simulations that correlate qualitatively and quantitatively with the equilibrium thermodynamic measurements. Extensive correlations between thermodynamic measurements and computer simulation parameters will be a significant step towards a capacity to predict features of the interaction network.

Relevant references:

  • Knaggs, M.H., Salsbury, F.R., Edgell, M.H., Fetrow, J.S. Insights into correlated motions and long-range interactions in CheY derived from molecular dynamics simulations. Biophys J. 2007 Mar 15;92(6):2062-79.
  • Fetrow, J.S., Knutson, S.T. and Edgell, M.H. Mutations in α-helical solvent exposed sites of eglin c have long-range effects: evidence from molecular dynamics simulations. Proteins: Struct Funct Bioinform. 2006 May 1; 63(2):356-72.

Classification and dynamics simulations of omega loops and other protein loops

The regular secondary structures, alpha helices and beta strands, are easily recognizable in protein structures. The non-regular secondary structures, such as the various types of loops and turns, are less easily recognized, but no less important in the structure and function of proteins. Omega loops, a type of non-regular secondary structure first described in 1986 (Leszczynski (Fetrow) and Rose, 1986; Fetrow 1995), are segments of non-regular secondary protein structure that are six or more residues in length and are shaped so that the loop ends are close in three-dimensional space. Omega loops constitute approximately 20-23% of protein structure and have been recognized as playing a variety of roles in protein function.
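
The two geometric criteria in this definition (length and closeness of the loop ends) can be sketched in a few lines. This is an illustration under stated assumptions, not the published identification procedure: the 10 Å end-to-end cutoff, the use of C-alpha coordinates, and the function name are all assumptions, and the caller is assumed to have already excluded helix and strand residues.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]

MIN_LOOP_LENGTH = 6              # from the text: six or more residues
MAX_END_TO_END_ANGSTROM = 10.0   # assumed cutoff for "ends are close"; not from the text


def is_omega_loop_candidate(ca_coords: Sequence[Coord],
                            max_end_distance: float = MAX_END_TO_END_ANGSTROM) -> bool:
    """Check segment length and end-to-end distance for one run of C-alpha coordinates."""
    if len(ca_coords) < MIN_LOOP_LENGTH:
        return False
    return math.dist(ca_coords[0], ca_coords[-1]) <= max_end_distance


# Toy coordinates (not real protein geometry): an extended chain and a horseshoe.
extended = [(3.8 * i, 0.0, 0.0) for i in range(8)]
horseshoe = [(0.0, 0.0, 0.0), (0.0, 3.8, 0.0), (1.5, 7.0, 0.0), (4.0, 8.5, 0.0),
             (6.5, 7.0, 0.0), (8.0, 3.8, 0.0), (8.0, 0.0, 0.0)]
print(is_omega_loop_candidate(extended), is_omega_loop_candidate(horseshoe))
```

Keeping the cutoff as a parameter makes it easy to test how sensitive the set of candidate loops is to the chosen definition of "close".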

Additional loop types, including S-loops and strap loops, were described in the 1990s. These structures are almost always found at the protein surface and it is generally assumed that these non-regular structures are more flexible than other parts of the protein.

In 1995, however, it was suggested that loops could be classified based on their roles in function, folding and stability. Recently, trigger loops were proposed to play a specific role in protein function. We hypothesize that loops playing these different roles will exhibit different dynamic characteristics. We are testing this hypothesis by performing simulations on proteins containing loops of various types that have been well-studied experimentally. Recently, omega loops have been identified as functionally relevant structures in many proteins (see, for example, Cui, et al. 2017; Karsisiotis, et al. 2016; Lahir, et al. 2016; Guo 2016; Mottin, et al. 2015; Sohail & Rashid 2014; Johnson & Holyoak 2012).

Relevant references:

  • Fetrow, J.S., Schaak, D.L., Dreher, U., Wiland, D.J., and Boose, T.L. Mutagenesis of histidine 26 demonstrates the importance of loop-loop and loop-protein attachments for the function of iso-1-cytochrome c. Protein Sci. 1998; 7(4):994-1005.
  • Fetrow, J.S., Horner, S.R., Oehrl, W., Schaak, D.L., Boose, T.L., and Burton, R.E. Analysis of the structure and stability of omega loop A replacements in yeast iso-1-cytochrome c. Protein Sci. 1997 Jan; 6(1):197-210.
  • Mulligan-Pullyblank, P., Spitzer, J.S., Gilden, B.M., and Fetrow, J.S. Loop replacement and random mutagenesis of omega loop D, residues 70-84, in iso-1-cytochrome c. J. Biol. Chem. 1996 Apr 12; 271(15):8633-8645.
  • Fetrow, J.S. Omega Loops: Nonregular secondary structures significant in protein function and stability. FASEB J. 1995 Jun; 9(9):708-717.
  • Murphy, M.E.P., Fetrow, J.S., Burton, R.E. and Brayer, G.D. The structure and function of omega loop A replacements in cytochrome c. Protein Sci. 1993; 2(9):1429-1440.
  • Fetrow, J.S., Cardillo, T.S., and Sherman, F. Deletions and replacements of omega loops in yeast iso-1-cytochrome c. Proteins. 1989; 6(4):372-381.
  • Leszczynski (Fetrow), J.F. and Rose, G.D. Loops in globular proteins: Identification of a novel category of secondary structure. Science. 1986 Nov 14; 234(4778):849-55.

Motion and dynamics, structure and function in yeast iso-1-cytochrome c

Proteins are not static structures. Rather, they exhibit many different kinds of motions on a very wide range of time scales. To better understand these motions in a well-studied protein system, we have used NMR and EPR spectroscopy to study protein motion and dynamics in yeast iso-1-cytochrome c. Beginning with the first successful isotopic labeling of cytochrome c, we have studied the psec-nsec dynamics of the protein backbone using both hydrogen exchange and 15N relaxation measurements. Site-directed spin labeling of the cysteine in the C-terminal helix of iso-1-cytochrome c provides an indication of the flexibility of the C-terminus of the protein.

To better understand the role that omega loops play in proteins, we have developed methods of directed, random mutagenesis of yeast cytochrome c. Yeast cytochrome c can be analyzed for both function and structure in vivo, which makes it an ideal protein for analysis of structure-function relationships. Using directed, random mutagenesis, we mutagenized several pairs of residues in yeast cytochrome c to identify which residue pairs are consistent with structure and function of this protein in vivo.

Relevant references:

  • DeWeerd, K., Grigoryants, V., Sun, Y., Fetrow, J.S., Scholes, C.P. EPR-detected folding kinetics of externally located cysteine-directed spin-labeled mutants of iso-1-cytochrome c. Biochemistry. 2001 Dec 25; 40(51):15846-15855.
  • Fetrow, J.S. and Baxter, S.M. Assignment of 15N chemical shifts and 15N relaxation measurements for oxidized and reduced iso-1-cytochrome c. Biochemistry. 1999 Apr 6; 38(14):4480-4492.
  • Baxter, S.M. and Fetrow, J.S. Hydrogen exchange behavior of [U15N]-labeled oxidized and reduced iso-1-cytochrome c. Biochemistry. 1999 Apr 6; 38(14):4493-4503.
  • Baxter, S.M., Boose, T.L., and Fetrow, J.S. 15N isotopic labeling and amide hydrogen exchange rates of oxidized iso-1-cytochrome c. J. Am. Chem. Soc. 1998; 119(41):9899-9900.
  • Qu, K., Vaughn, J.L., Sienkiewicz, A., Scholes, C.P., and Fetrow, J.S. Kinetics and motional dynamics of spin labeled yeast iso-1-cytochrome c: 1. Stopped-flow EPR as a probe for protein folding/unfolding of the C-terminal helix spin labeled at cysteine 102. Biochemistry. 1998; 36(10):2884-2897.
  • Fetrow, J.S., Spitzer, J.S., Gilden, B.M., Mellender, S.J., Begley, T., Haas, B., and Boose, T.L. Structure, function, and temperature sensitivity analysis of directed, random mutants of proline 76 and glycine 77 in omega loop D of yeast iso-1-cytochrome c. Biochemistry. 1998; 37(8):2477-2487.
  • Fetrow, J.S., Schaak, D.L., Dreher, U., Wiland, D.J., and Boose, T.L. Mutagenesis of histidine 26 demonstrates the importance of loop-loop and loop-protein attachments for the function of iso-1-cytochrome c. Protein Sci. 1998; 7(4):994-1005.
  • Fumo, G. and Fetrow, J.S. A method of directed random mutagenesis of the yeast chromosome shows that iso-1-cytochrome c heme ligand His18 is essential. Gene. 1995 Oct 16; 164(1):33-39.

Structural Building Blocks (SBBs) and automatic identification of protein secondary structures

Automated methods for identification of secondary structure are necessary for large-scale analysis of protein structure. We developed a method for identification and classification of protein secondary structure, without previous knowledge of the types of secondary structures. This method utilizes artificial neural networks to classify and cluster segments of protein structure based on their geometry.

Clustering of six-residue segments in a large group of protein structures results in the identification of six classes of secondary structures, which we term structural building blocks (SBBs). Two of these are the canonical alpha helix and beta strand structures, while two other SBBs coincide with N- and C-terminal helix capping structures.
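
As a rough illustration of what a geometric description of six-residue segments might look like as input to a clustering method, the sketch below slides a six-residue window along a chain and records the pairwise C-alpha distances within each window. The descriptor, the use of overlapping windows, and the function name are assumptions made for this example; the actual input representation used for the SBB neural networks is not described here.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
WINDOW = 6  # from the text: six-residue segments


def segment_descriptors(ca_coords: Sequence[Coord], window: int = WINDOW) -> List[List[float]]:
    """Return one descriptor per overlapping window: all pairwise C-alpha distances.

    Pairwise distances do not change under rotation or translation, so segments
    with the same shape get the same descriptor wherever they sit in the protein.
    """
    descriptors = []
    for start in range(len(ca_coords) - window + 1):
        seg = ca_coords[start:start + window]
        descriptors.append([
            math.dist(seg[i], seg[j])
            for i in range(window)
            for j in range(i + 1, window)
        ])
    return descriptors


# Toy chain of eight evenly spaced points (not real protein geometry).
toy_chain = [(float(i), 0.0, 0.0) for i in range(8)]
toy_descriptors = segment_descriptors(toy_chain)
print(len(toy_descriptors), len(toy_descriptors[0]))
```

Each descriptor is a fixed-length vector (15 distances for a six-residue window), which is the kind of input that a clustering algorithm or neural network can consume directly.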

Relevant references:

  • Fetrow, J.S., Palumbo, M.J., and Berg, G. Patterns, structures, and amino acid frequencies in structural building blocks, a protein secondary structure classification scheme. Proteins. 1997 Feb; 27(2):249-71.
  • Zhang, X., Fetrow, J.S., Rennie, W.A., Waltz, D.L., and Berg, G. Automatic derivation of substructures yields novel structural building blocks in globular proteins. (1993) Proceedings: First International Conference on Intelligent Systems for Molecular Biology. p. 438-446. L. Hunter, D. Searls, J. Shavlik, eds.