-
Development of hidden Markov modeling method for molecular orientations and structure estimation from high-speed atomic force microscopy time-series images
Tomonori Ogane, Daisuke Noshiro, Toshio Ando, Atsuko Yamashita, Yuji Sugita, Yasuhiro Matsunaga
High-speed atomic force microscopy (HS-AFM) is a powerful technique for capturing the time-resolved behavior of biomolecules. However, structural information in HS-AFM images is limited to the surface geometry of a sample molecule. Inferring latent three-dimensional structures from the surface geometry is thus important for gaining more insight into the conformational dynamics of a target biomolecule. Existing methods for estimating the structures are based on rigid-body fitting of candidate structures to each frame of the HS-AFM images. Here, we extend the existing frame-by-frame rigid-body fitting analysis to multiple frames to exploit the orientational correlations of a sample molecule between adjacent frames in HS-AFM data, which arise from its interaction with the stage. In the method, we treat HS-AFM data as time-series data and analyze them with hidden Markov modeling. Using simulated HS-AFM images of the taste receptor type 1 as a test case, the proposed method shows a more robust estimation of molecular orientations than the frame-by-frame analysis. The method is applicable to integrative modeling of conformational dynamics using HS-AFM data.
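The benefit of linking adjacent frames can be illustrated with a generic hidden Markov model decoder. The sketch below is not the authors' implementation: it assumes the orientations have been discretized into K states, that a per-frame log-likelihood for each orientation is already available (for example from a rigid-body fitting score), and that a "sticky" transition matrix encodes the orientational correlation between adjacent frames. The function name `viterbi` and all array names are hypothetical.

```python
import numpy as np

def viterbi(log_emission, log_transition, log_initial):
    """Most likely hidden-state path of a discrete HMM, computed in log space.

    log_emission:   (T, K) log-likelihood of frame t given orientation state k
    log_transition: (K, K) log P(state j at frame t+1 | state i at frame t)
    log_initial:    (K,)   log P(state at frame 0)
    """
    T, K = log_emission.shape
    score = log_initial + log_emission[0]
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score of a path that is in state i at t-1 and j at t
        cand = score[:, None] + log_transition
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(K)] + log_emission[t]
    # Trace the best path backwards from the final frame
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path
```

With a transition matrix that favors staying in the same orientation, a single noisy frame whose best frame-by-frame fit points to a different orientation is overruled by its neighbors, which is the kind of robustness the abstract describes.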
-
Use of multistate Bennett acceptance ratio method for free-energy calculations from enhanced sampling and free-energy perturbation
Yasuhiro Matsunaga, Motoshi Kamiya, Hiraku Oshima, Jaewoon Jung, Shingo Ito, and Yuji Sugita
Multistate Bennett acceptance ratio (MBAR) works as a method to analyze molecular dynamics (MD) simulation data after the simulations have been finished. It is widely used to estimate free-energy changes between different states and averaged properties at the states of interest. MBAR allows us to treat a wide range of states from those at different temperature/pressure to those with different model parameters. Due to the broad applicability, the MBAR equations are rather difficult to apply for free-energy calculations using different types of MD simulations including enhanced conformational sampling methods and free-energy perturbation. In this review, we first summarize the basic theory of the MBAR equations and categorize the representative usages into the following four: (i) perturbation, (ii) scaling, (iii) accumulation, and (iv) full potential energy. For each, we explain how to prepare input data using MD simulation trajectories for solving the MBAR equations. MBAR is also useful to estimate reliable free-energy differences using MD trajectories based on a semi-empirical quantum mechanics/molecular mechanics (QM/MM) model and ab initio QM/MM energy calculations on the MD snapshots. We also explain how to use the MBAR software in the GENESIS package, which we call mbar_analysis, for the four representative cases. The proposed estimations of free-energy changes and thermodynamic averages are effective and useful for various biomolecular systems.
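For orientation, the self-consistent MBAR equations can be solved with a few lines of NumPy. This is a minimal sketch and not the mbar_analysis tool in GENESIS: it assumes the input has already been prepared as a matrix of reduced potentials `u_kn[k, n]` (sample n, pooled from all simulations, evaluated in state k) together with the number of samples `N_k` drawn from each state. The function name and the plain fixed-point iteration are choices made here for illustration, and every state is assumed to contribute at least one sample.

```python
import numpy as np

def mbar_free_energies(u_kn, N_k, n_iter=1000, tol=1e-10):
    """Solve the MBAR equations by fixed-point iteration.

    u_kn: (K, N) reduced potential of pooled sample n evaluated in state k
    N_k:  (K,)   number of samples drawn from state k (all > 0 in this sketch)
    Returns dimensionless free energies f_k, shifted so that f_0 = 0.
    """
    u_kn = np.asarray(u_kn, dtype=float)
    log_N = np.log(np.asarray(N_k, dtype=float))
    f = np.zeros(u_kn.shape[0])
    for _ in range(n_iter):
        # log of the mixture denominator: log sum_k N_k exp(f_k - u_k(x_n))
        log_denom = np.logaddexp.reduce(log_N[:, None] + f[:, None] - u_kn, axis=0)
        # f_i = -log sum_n exp(-u_i(x_n)) / denominator(x_n)
        f_new = -np.logaddexp.reduce(-u_kn - log_denom[None, :], axis=1)
        f_new -= f_new[0]
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return f
```

The four usages listed in the abstract differ mainly in how `u_kn` is assembled from the trajectories; the solver itself stays the same.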
-
Multiple sub state structures of SERCA2b reveal conformational overlap at transition steps during the catalytic cycle
Yuxia Zhang, Chigusa Kobayashi, Xiaohan Cai, Satoshi Watanabe, Akihisa Tsutsumi, Masahide Kikkawa, Yuji Sugita, Kenji Inaba
Sarco/endoplasmic reticulum Ca2+ ATPase (SERCA) pumps Ca2+ into the endoplasmic reticulum (ER). Herein, we present cryo-electron microscopy (EM) structures of three intermediates of SERCA2b: Ca2+-bound phosphorylated (E1P·2Ca2+) and Ca2+-unbound dephosphorylated (E2·Pi) intermediates and another between the E2P and E2·Pi states. Our cryo-EM analysis demonstrates that the E1P·2Ca2+ state exists in low abundance and preferentially transitions to an E2P-like structure by releasing Ca2+ and that the Ca2+ release gate subsequently undergoes stepwise closure during the dephosphorylation processes. Importantly, each intermediate adopts multiple sub-state structures including those like the next one in the catalytic series, indicating conformational overlap at transition steps, as further substantiated by atomistic molecular dynamics simulations of SERCA2b in a lipid bilayer. The present findings provide insight into how enzymes accelerate catalytic cycles.
-
Formation of extramembrane β-strands controls dimerization of transmembrane helices in amyloid precursor protein C99
George A. Pantelopulos, Daisuke Matsuoka, Yuji Sugita, John E. Straub
The 99-residue C-terminal domain of amyloid precursor protein (APP-C99), precursor to amyloid beta (Aβ), is a transmembrane (TM) protein containing intrinsically disordered N- and C-terminal extramembrane domains. Using molecular dynamics (MD) simulations, we show that the structural ensemble of the C99 monomer is best described in terms of thousands of states. The C99 monomer has a propensity to form β-strand in the C-terminal extramembrane domain, which explains the slow spin relaxation times observed in paramagnetic probe NMR experiments. Surprisingly, homodimerization of C99 not only narrows the conformational ensemble from thousands to a few states through the formation of metastable β-strands in extramembrane domains but also stabilizes extramembrane α-helices. The extramembrane domain structure is observed to dramatically impact the homodimerization motif, resulting in the modification of TM domain conformations. Our study provides an atomic-level structural basis for communication between the extramembrane domains of the C99 protein and TM homodimer formation. This finding could serve as a general model for understanding the influence of disordered extramembrane domains on TM protein structure.
-
Modified protein-water interactions in CHARMM36m for thermodynamics and kinetics of proteins in dilute and crowded solutions
Daiki Matsubara, Kento Kasahara, Hisham Dokainish, Hiraku Oshima, Yuji Sugita
Proper balance between protein-protein and protein-water interactions is vital for atomistic molecular dynamics (MD) simulations of globular proteins as well as intrinsically disordered proteins (IDPs). The overestimation of protein-protein interactions tends to make IDPs more compact than those in experiments. Likewise, multiple proteins in crowded solutions aggregate with each other too strongly. To optimize the balance, Lennard-Jones (LJ) interactions between protein and water are often increased by about 10% (with a scaling parameter, λ = 1.1) relative to the existing force fields. Here, we explore the optimal scaling parameter of protein-water LJ interactions for CHARMM36m in conjunction with the modified TIP3P water model, by performing enhanced sampling MD simulations of several peptides in dilute solutions and conventional MD simulations of globular proteins in dilute and crowded solutions. In our simulations, a 10% increase of the protein-water LJ interaction for CHARMM36m cannot maintain the stability of a small helical peptide, (AAQAA)3, in a dilute solution, and only a small modification of the protein-water LJ interaction, up to a 3% increase (λ = 1.03), is allowed. The modified protein-water interactions are applicable to other peptides and globular proteins in dilute solutions without changing thermodynamic properties from the original CHARMM36m. However, they have a great impact on the diffusive properties of proteins in crowded solutions, avoiding the formation of overly sticky protein-protein interactions.
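One way to read the scaling parameter is as a multiplier on the LJ well depth of protein-water atom pairs only. The block below writes this out in the CHARMM functional form; applying λ to the pair well depth is an assumption made here for illustration, since the abstract does not spell out exactly how the scaling enters.

```latex
\begin{align}
U_{\mathrm{LJ}}(r_{ij}) &= \varepsilon_{ij}\left[\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{12}
  - 2\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{6}\right], \\
\varepsilon_{ij} &= \lambda\,\sqrt{\varepsilon_i\,\varepsilon_j}
  \quad \text{(protein-water pairs; } \lambda = 1 \text{ for all other pairs)}, \\
R_{\min,ij} &= \tfrac{1}{2}\left(R_{\min,i} + R_{\min,j}\right).
\end{align}
```

Under this reading, λ = 1.03 deepens every protein-water LJ well by 3% while leaving protein-protein and water-water interactions unchanged.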
-
General Design Strategy to Precisely Control the Emission of Fluorophores via a Twisted Intramolecular Charge Transfer (TICT) Process
Kenjiro Hanaoka, Shimpei Iwaki, Kiyoshi Yagi, Takuya Myochin, Takayuki Ikeno, Hisashi Ohno, Eita Sasaki, Toru Komatsu, Tasuku Ueno, Motokazu Uchigashima, Takayasu Mikuni, Kazuki Tainaka, Shinya Tahara, Satoshi Takeuchi, Tahei Tahara, Masanobu Uchiyama, Tetsuo Nagano, and Yasuteru Urano
Fluorogenic probes for bioimaging have become essential tools for life science and medicine, and the key to their development is a precise understanding of the mechanisms available for fluorescence off/on control, such as photoinduced electron transfer (PeT) and Förster resonance energy transfer (FRET). Here we establish a new molecular design strategy to rationally develop activatable fluorescent probes, which exhibit a fluorescence off/on change in response to target biomolecules, by controlling the twisted intramolecular charge transfer (TICT) process. This approach was developed on the basis of a thorough investigation of the fluorescence quenching mechanism of N-phenyl rhodamine dyes (commercially available as the QSY series) by means of time-dependent density functional theory (TD-DFT) calculations and photophysical evaluation of their derivatives. To illustrate and validate this TICT-based design strategy, we employed it to develop practical fluorogenic probes for HaloTag and SNAP-tag. We further show that the TICT-controlled fluorescence off/on mechanism is generalizable by synthesizing a Si–rhodamine-based fluorogenic probe for HaloTag, thus providing a palette of chemical dyes that spans the visible and near-infrared range.
-
Protein folding intermediates on the dimensionality reduced landscape with UMAP and native contact likelihood
Mao Oide and Yuji Sugita
To understand protein folding mechanisms from molecular dynamics (MD) simulations, it is important to explore not only folded/unfolded states but also representative intermediate structures on the conformational landscape. Here, we propose a novel approach to construct the landscape using the uniform manifold approximation and projection (UMAP) method, which reduces the dimensionality without losing data-point proximity. In the approach, native contact likelihood is used as feature variables rather than the conventional Cartesian coordinates or dihedral angles of protein structures. We tested the performance of UMAP for coarse-grained MD simulation trajectories of B1 domain in protein G and observed on-pathway transient structures and other metastable states on the UMAP conformational landscape. In contrast, these structures were not clearly distinguished on the dimensionality reduced landscape using principal component analysis or time-lagged independent component analysis. This approach is also useful to obtain dynamical information through Markov state modeling and would be applicable to large-scale conformational changes in many other biomacromolecules.
-
Crystal structure of the lipid flippase MurJ in a “squeezed” form distinct from its inward- and outward-facing forms
Hidetaka Kohga, Takaharu Mori, Yoshiki Tanaka, Kunihito Yoshikaie, Katsuhide Taniguchi, Kei Fujimoto, Lisa Fritz, Tanja Schneider, and Tomoya Tsukazaki
The bacterial peptidoglycan enclosing the cytoplasmic membrane is a fundamental cellular architecture. The integral membrane protein MurJ plays an essential role in flipping the cell wall building block Lipid II across the cytoplasmic membrane for peptidoglycan biosynthesis. Previously reported crystal structures of MurJ have elucidated its V-shaped inward- or outward-facing forms with an internal cavity for substrate binding. MurJ transports Lipid II using its cavity through conformational transitions between these two forms. Here, we report two crystal structures of inward-facing forms from Arsenophonus endosymbiont MurJ and an unprecedented crystal structure of Escherichia coli MurJ in a “squeezed” form, which lacks a cavity to accommodate the substrate, mainly because of the increased proximity of transmembrane helices 2 and 8. Subsequent molecular dynamics simulations supported the hypothesis that the squeezed form is an intermediate conformation. This study fills a gap in our understanding of the Lipid II flipping mechanism.
-
Modified Hamiltonian in FEP calculations for reducing the computational cost of electrostatic interactions
Hiraku Oshima and Yuji Sugita
The free-energy perturbation (FEP) method predicts relative and absolute free-energy changes of biomolecules in solvation and binding with other molecules. FEP is, therefore, one of the most essential tools in in-silico drug design. In conventional FEP, to smoothly connect two thermodynamic states, the potential energy is modified as a linear combination of the end state potential energies by introducing scaling factors. When the particle mesh Ewald is used for electrostatic calculations, conventional FEP requires two reciprocal-space calculations per time step, which largely decreases the computational performance. To overcome this problem, we propose a new FEP scheme by introducing a modified Hamiltonian instead of interpolation of the end state potential energies. The scheme introduces non-uniform scaling into the electrostatic potential as used in Replica-Exchange with Solute Tempering 2 (REST2) and does not require additional reciprocal-space calculations. We tested this modified Hamiltonian in FEP calculations in several biomolecular systems. In all cases, the calculated free-energy changes with the current scheme are in good agreement with those from conventional FEP. The modified Hamiltonian in FEP greatly improves the computational performance, which is particularly marked for large biomolecular systems whose reciprocal-space calculations are the major bottleneck of total computational time.
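The cost argument for conventional FEP can be seen from the simplest form of the linear interpolation. The block below is a schematic with a single coupling parameter λ; practical schemes typically use separate scaling factors for LJ and electrostatic terms and soft-core potentials, and the symbols used here are illustrative rather than taken from the paper.

```latex
\begin{align}
U(\lambda) &= (1-\lambda)\,U_A + \lambda\,U_B, \qquad 0 \le \lambda \le 1, \\
U^{\mathrm{rec}}(\lambda) &= (1-\lambda)\,U^{\mathrm{rec}}_A + \lambda\,U^{\mathrm{rec}}_B .
\end{align}
```

Because the reciprocal-space energies of the two end states, $U^{\mathrm{rec}}_A$ and $U^{\mathrm{rec}}_B$, are computed from different charge sets, each time step needs two particle mesh Ewald reciprocal-space evaluations, which is the overhead the modified Hamiltonian is designed to avoid.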
-
Practical protocols for efficient sampling of kinase-inhibitor binding pathways using two-dimensional replica-exchange molecular dynamics
Ai Shinobu, Suyong Re, and Yuji Sugita
Molecular dynamics (MD) simulations are increasingly used to study various biological processes such as protein folding, conformational changes, and ligand binding. These processes generally involve slow dynamics that occur on the millisecond or longer timescale, which are difficult to simulate by conventional atomistic MD. Recently, we applied a two-dimensional (2D) replica-exchange MD (REMD) method, which combines the generalized replica exchange with solute tempering (gREST) with the replica-exchange umbrella sampling (REUS) in kinase-inhibitor binding simulations, and successfully observed multiple ligand binding/unbinding events. To efficiently apply the gREST/REUS method to other kinase-inhibitor systems, we establish modified, practical protocols with non-trivial simulation parameter tuning. The current gREST/REUS simulation protocols are tested for three kinase-inhibitor systems: c-Src kinase with PP1, c-Src kinase with Dasatinib, and c-Abl kinase with Imatinib. We optimized the definition of kinase-ligand distance as a collective variable (CV), the solute temperatures in gREST, and replica distributions and umbrella forces in the REUS simulations. Also, the initial structures of each replica in the 2D replica space were prepared carefully by pulling each ligand from and toward the protein binding sites for keeping stable kinase conformations. These optimizations were carried out individually in multiple short MD simulations. The current gREST/REUS simulation protocol ensures good random walks in 2D replica spaces, which are required for enhanced sampling of inhibitor dynamics around a target kinase.
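Why replica spacing and umbrella force constants need tuning can be seen from the standard Metropolis criterion for exchanging two umbrella windows. The sketch below is a generic illustration and not the GENESIS implementation: it assumes harmonic umbrella potentials on a single collective variable and two replicas that share the same temperature and the same gREST parameters, so that only the bias energies enter. Function and argument names are hypothetical.

```python
import math

def umbrella_energy(k, center, cv):
    """Harmonic umbrella bias 0.5 * k * (cv - center)^2."""
    return 0.5 * k * (cv - center) ** 2

def reus_exchange_probability(beta, k_i, c_i, k_j, c_j, cv_i, cv_j):
    """Metropolis acceptance probability for swapping two umbrella windows.

    beta:      1 / (kB * T), shared by both replicas
    k_*, c_*:  force constant and center of windows i and j
    cv_*:      current collective-variable values of the two replicas
    """
    before = umbrella_energy(k_i, c_i, cv_i) + umbrella_energy(k_j, c_j, cv_j)
    after = umbrella_energy(k_i, c_i, cv_j) + umbrella_energy(k_j, c_j, cv_i)
    delta = beta * (after - before)
    return 1.0 if delta <= 0.0 else math.exp(-delta)
```

Windows that are too far apart, or too stiff, make `delta` large for typical configurations and suppress exchanges, which degrades the random walk in replica space.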
-
Computational analysis on the allostery of tryptophan synthase: relationship between α/β-ligand binding and distal domain closure
Shingo Ito, Kiyoshi Yagi, and Yuji Sugita
Tryptophan synthase (TRPS) is a bifunctional enzyme consisting of α- and β-subunits and catalyzes the last two steps of L-tryptophan (L-Trp) biosynthesis, namely, cleavage of 3-indole-D-glycerol-3′-phosphate (IGP) into indole and glyceraldehyde-3-phosphate (G3P) in the α-subunit, and a pyridoxal phosphate (PLP)-dependent reaction of indole and L-serine (L-Ser) to produce L-Trp in the β-subunit. Importantly, the IGP binding at the α-subunit affects the β-subunit conformation and its ligand-binding affinity, which, in turn, enhances the enzymatic reaction at the α-subunit. The intersubunit communications in TRPS have been investigated extensively for decades because of their fundamental and pharmaceutical importance, yet it is still difficult to answer how TRPS allostery is regulated in atomic detail. Here, we investigate the allosteric regulation of TRPS by all-atom classical molecular dynamics (MD) simulations and analyze the potential of mean force (PMF) along conformational changes of the α- and β-subunits. The present simulation has revealed a widely opened conformation of the β-subunit, which provides a pathway for L-Ser to enter the β-active site. The IGP binding closes the α-subunit and induces a wide opening of the β-subunit, thereby enhancing the binding affinity of L-Ser to the β-subunit. Structural analyses have identified critical hydrogen bonds (HBs) at the interface of the two subunits (αG181-βS178, αP57-βR175, etc.) and HBs between the β-subunit (βT110–βH115) and a complex of PLP and L-Ser (an α-aminoacrylate intermediate). The former HBs regulate the allosteric β-subunit opening, whereas the latter HBs are essential for closing the β-subunit in a later step. The proposed mechanism for how the interdomain communication in TRPS is realized upon ligand binding is consistent with the previous experimental data, giving a general idea for interpreting allosteric regulation in multidomain proteins.
-
Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations
Cheng Tan, Jaewoon Jung, Chigusa Kobayashi, Diego Ugarte La Torre, Shoji Takada, and Yuji Sugita
Residue-level coarse-grained (CG) models have become one of the most popular tools in biomolecular simulations in the trade-off between modeling accuracy and computational efficiency. To investigate large-scale biological phenomena in molecular dynamics (MD) simulations with CG models, unified treatments of proteins and nucleic acids, as well as efficient parallel computations, are indispensable. In the GENESIS MD software, we implement several residue-level CG models, covering structure-based and context-based potentials for both well-folded biomolecules and intrinsically disordered regions. An amino acid residue in protein is represented as a single CG particle centered at the Cα atom position, while a nucleotide in RNA or DNA is modeled with three beads. Then, a single CG particle represents around ten heavy atoms in both proteins and nucleic acids. The input data in CG MD simulations are treated as GROMACS-style input files generated from a newly developed toolbox, GENESIS-CG-tool. To optimize the performance in CG MD simulations, we utilize multiple neighbor lists, each of which is attached to a different nonbonded interaction potential in the cell-linked list method. We found that random number generations for Gaussian distributions in the Langevin thermostat are one of the bottlenecks in CG MD simulations. Therefore, we parallelize the computations with message-passing-interface (MPI) to improve the performance on PC clusters or supercomputers. We simulate Herpes simplex virus (HSV) type 2 B-capsid and chromatin models containing more than 1,000 nucleosomes in GENESIS as examples of large-scale biomolecular simulations with residue-level CG models. This framework extends accessible spatial and temporal scales by multi-scale simulations to study biologically relevant phenomena, such as genome-scale chromatin folding or phase-separated membrane-less condensations.
-
Protein assembly and crowding simulations
Lim Heo, Yuji Sugita, and Michael Feig
Proteins encounter frequent molecular interactions in biological environments. Computer simulations have become an increasingly important tool in providing mechanistic insights into how such interactions in vivo relate to their biological function. The review here focuses on simulations describing protein assembly and molecular crowding effects as two important aspects that are distinguished mainly by how specific and long-lived protein contacts are. On the topic of crowding, recent simulations have provided new insights into how crowding affects protein folding and stability, modulates enzyme activity, and affects diffusive properties. Recent studies of assembly processes focus on assembly pathways, especially for virus capsids, amyloid aggregation pathways, and the role of multivalent interactions leading to phase separation. Also discussed are technical challenges in achieving increasingly realistic simulations of interactions in cellular environments.
-
The inherent flexibility of receptor binding domains in SARS-CoV-2 spike protein
Hisham M Dokainish, Suyong Re, Takaharu Mori, Chigusa Kobayashi, Jaewoon Jung, and Yuji Sugita
Spike (S) protein is the primary antigenic target for neutralization and vaccine development for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It decorates the virus surface and undergoes large motions of its receptor binding domains (RBDs) to enter the host cell. Here, we observe Down, one-Up, one-Open, and two-Up-like structures in enhanced molecular dynamics simulations, and characterize the transition pathways via inter-domain interactions. Transient salt-bridges between RBD_A and RBD_C and the interaction with glycan at N343_B support RBD_A motions from Down to one-Up. Reduced interactions between RBD_A and RBD_B in one-Up induce RBD_B motions toward two-Up. The simulations overall agree with cryo-electron microscopy structure distributions and FRET experiments and provide hidden functional structures, namely, intermediates along Down-to-one-Up transition with druggable cryptic pockets as well as one-Open with a maximum exposed RBD. The inherent flexibility of S-protein thus provides essential information for antiviral drug rational design or vaccine development.
-
Weight average approaches for predicting dynamical properties of biomolecules
Kiyoshi Yagi, Suyong Re, Takaharu Mori, and Yuji Sugita
Recent advances in atomistic molecular dynamics (MD) simulations of biomolecules allow us to explore their conformational spaces widely, observing large-scale conformational fluctuations or transitions between distinct structures. To reproduce or refine experimental data using MD simulations, structure ensembles, which are characterized by multiple structures and their statistical weights on the rugged free-energy landscapes, are often used. Here, we summarize weight average approaches for various experimental measurements. Weight average approaches are now applied to hybrid quantum mechanics/molecular mechanics MD simulations to predict fast vibrational motions in a protein with high accuracy, for a better understanding of molecular functions from atomic structures.
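In its simplest form, a weight average combines a property computed for each representative structure with that structure's statistical weight. The sketch below assumes the weights are Boltzmann factors of relative free energies given in kcal/mol; this is only one possible source of weights (cluster populations would work equally well), and the function name and the default value of kT (about 0.596 kcal/mol near 300 K) are choices made for this illustration.

```python
import numpy as np

def weight_average(values, free_energies, kT=0.596):
    """Boltzmann-weighted average of a per-structure property.

    values:        property computed for each representative structure
    free_energies: relative free energy of each structure (same units as kT)
    Returns the weighted average and the normalized weights.
    """
    values = np.asarray(values, dtype=float)
    F = np.asarray(free_energies, dtype=float)
    # Subtract the minimum before exponentiating for numerical stability
    w = np.exp(-(F - F.min()) / kT)
    w /= w.sum()
    return float(np.dot(w, values)), w
```

When all structures have the same free energy this reduces to a plain mean, and a structure lying several kT above the minimum contributes almost nothing to the predicted observable.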