出版物2022-2026

2024

    • GENESIS 2.1: High-Performance Molecular Dynamics Software for Enhanced Sampling and Free-Energy Calculations for Atomistic, Coarse-Grained, and Quantum Mechanics/Molecular Mechanics Models

      Jaewoon Jung, Kiyoshi Yagi, Cheng Tan, Hiraku Oshima, Takaharu Mori, Isseki Yu, Yasuhiro Matsunaga, Chigusa Kobayashi, Shingo Ito, Diego Ugarte La Torre, Yuji Sugita
      The machine learning (ML) method emerges as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses an equivariant model, which can generate a high-order tensor covariant with rotations as a base model. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
    • Learning QM/MM potential using equivariant multiscale model

      Yao-Kun Lei, Kiyoshi Yagi, Yuji Sugita
      The machine learning (ML) method emerges as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses an equivariant model, which can generate a high-order tensor covariant with rotations as a base model. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
    • GENESIS CGDYN: large-scale coarse-grained MD simulation with dynamic load balancing for heterogeneous biomolecular systems

      Jaewoon Jung, Cheng Tan, Yuji Sugita
      Residue-level coarse-grained (CG) molecular dynamics (MD) simulation is widely used to investigate slow biological processes that involve multiple proteins, nucleic acids, and their complexes. Biomolecules in a large simulation system are distributed non-uniformly, limiting computational efficiency with conventional methods. Here, we develop a hierarchical domain decomposition scheme with dynamic load balancing for heterogeneous biomolecular systems to keep computational efficiency even after drastic changes in particle distribution. These schemes are applied to the dynamics of intrinsically disordered protein (IDP) droplets. During the fusion of two droplets, we find that the changes in droplet shape correlate with the mixing of IDP chains. Additionally, we simulate large systems with multiple IDP droplets, achieving simulation sizes comparable to those observed in microscopy. In our MD simulations, we directly observe Ostwald ripening, a phenomenon where small droplets dissolve and their molecules redeposit into larger droplets. These methods have been implemented in CGDYN of the GENESIS software, offering a tool for investigating mesoscopic biological processes using the residue-level CG models.
    • Molecular dynamics in multidimensional space explains how mutations affect the association path of neomycin to a riboswitch

      Piotr Chyży, Marta Kulik, Ai Shinobu, Suyong Re, Yuji Sugita, Joanna Trylska
      Riboswitches are messenger RNA (mRNA) fragments binding specific small molecules to regulate gene expression. A synthetic N1 riboswitch, inserted into yeast mRNA controls the translation of a reporter gene in response to neomycin. However, its regulatory activity is sensitive to single-point RNA mutations, even those distant from the neomycin binding site. While the association paths of neomycin to N1 and its variants remain unknown, recent fluorescence kinetic experiments indicate a two-step process driven by conformational selection. This raises the question of which step is affected by mutations. To address this, we performed all-atom two-dimensional replica-exchange molecular dynamics simulations for N1 and U14C, U14C, U15A, and A17G mutants, ensuring extensive conformational sampling of both RNA and neomycin. The obtained neomycin association and binding paths, along with multidimensional free-energy profiles, revealed a two-step binding mechanism, consisting of conformational selection and induced fit. Neomycin binds to a preformed N1 conformation upon identifying a stable upper stem and U-turn motif in the riboswitch hairpin. However, the positioning of neomycin in the binding site occurs at different RNA–neomycin distances for each mutant, which may explain their different regulatory activities. The subsequent induced fit arises from the interactions of the neomycin’s N3 amino group with RNA, causing the G9 backbone to rearrange. In the A17G mutant, the critical C6-A17/G17 stacking forms at a closer RNA–neomycin distance compared to N1. These findings together with estimated binding free energies coincide with experiments and elucidate why the A17G mutation decreases and U15A enhances N1 activity in response to neomycin.
    • Free-energy landscapes of transmembrane homodimers by bias-exchange adaptively biased molecular dynamics

      Shingo Ito, Yuji Sugita
      Membrane proteins play essential roles in various biological functions within the cell. One of the most common functional regulations involves the dimerization of two single-pass transmembrane (TM) helices. Glycophorin A (GpA) and amyloid precursor protein (APP) form TM homodimers in the membrane, which have been investigated both experimentally and computationally. The homodimer structures are well characterized using only four collective variables (CVs) when each TM helix is stable. The CVs are the interhelical distance, the crossing angle, and the Crick angles for two TM helices. However, conformational sampling with multi-dimensional replica-exchange umbrella sampling (REUS) requires too many replicas to sample all the CVs for exploring the conformational landscapes. Here, we show that the bias-exchange adaptively biased molecular dynamics (BE-ABMD) with the four CVs effectively explores the free-energy landscapes of the TM helix dimers of GpA, wild-type APP and its mutants in the IMM1 implicit membrane. Compared to the original ABMD, the bias-exchange algorithm in BE-ABMD can provide a more rapidly converged conformational landscape. The BE-ABMD simulations could also reveal TM packing interfaces of the membrane proteins and the dependence of the free-energy landscapes on the membrane thickness. This approach is valuable for numerous other applications, including those involving explicit solvent and a lipid bilayer in all-atom force fields or Martini coarse-grained models, and enhances our understanding of protein-protein interactions in biological membranes.
    • SPANA: Spatial decomposition analysis for cellular‐scale molecular dynamics simulations

      Isseki Yu, Takaharu Mori, Daisuke Matsuoka, Donatas Surblys, Yuji Sugita
      The rapid increase in computational power with the latest supercomputers has enabled atomistic molecular dynamics (MDs) simulations of biomolecules in biological membrane, cytoplasm, and other cellular environments. These environments often contain a million or more atoms to be simulated simultaneously. Therefore, their trajectory analyses involve heavy computations that can become a bottleneck in the computational studies. Spatial decomposition analysis (SPANA) is a set of analysis tools in the Generalized-Ensemble Simulation System (GENESIS) software package that can carry out MD trajectory analyses of large-scale biological simulations using multiple CPU cores in parallel. SPANA applies the spatial decomposition of a large biological system to distribute structural and dynamical analyses into individual CPU cores, which reduces the computational time and the memory size, significantly. SPANA opens new possibilities for detailed atomistic analyses of biomacromolecules as well as solvent water molecules, ions, and metabolites in MD simulation trajectories of very large biological systems containing more than millions of atoms in cellular environments.
    • Deciphering the Multi-state Conformational Equilibrium of HDM2 in the Regulation of p53 Binding: Perspectives from Molecular Dynamics Simulation and NMR Analysis

      Kazuki Watanabe, Qingci Zhao, Ryosuke Iwatsuki, Ryota Fukui, Weitong Ren, Yuji Sugita, Noritaka Nishida
      HDM2 negatively regulates the activity of the tumor suppressor p53. Previous NMR studies have shown that apo-HDM2 interconverts between an “open” state in which the N-terminal “lid” is disordered and a “closed” state in which the lid covers the p53-binding site in the core region. Molecular dynamics (MD) simulation studies have been performed to elucidate the conformational dynamics of HDM2, but the direct relevance of the experimental and computational analyses is unclear. In addition, how the phosphorylation of S17 in the lid contributes to the inhibition of p53 binding remains controversial. Here, we used both NMR and MD simulations to investigate the conformational dynamics of apo-HDM2. The NMR analysis revealed that apo-HDM2 exists in a fast-exchanging equilibrium within two closed states, closed 1 and closed 2, in addition to a previously demonstrated slow-exchanging “open-closed” equilibrium. MD simulations visualized two characteristic closed states, where the spatial orientation of the key residues corresponds well to the chemical shift changes of the NMR spectra. Furthermore, the phosphorylation of S17 induced an equilibrium shift toward closed 1, thereby suppressing the binding of p53 to HDM2. This study reveals a multi-state equilibrium of apo-HDM2 and provides new insights into the regulation mechanism of HDM2-p53 interactions.
    • Pseudo-Luciferase Activity of the SARS-CoV-2 Spike Protein for Cypridina Luciferin

      Ryo Nishihara, Hisham M Dokainish, Yoshiki Kihara, Hiroki Ashiba, Yuji Sugita, Ryoji Kurita
      Enzymatic reactions that involve a luminescent substrate (luciferin) and enzyme (luciferase) from luminous organisms enable a luminescence detection of target proteins and cells with high specificity, albeit that conventional assay design requires a prelabeling of target molecules with luciferase. Here, we report a luciferase-independent luminescence assay in which the target protein directly catalyzes the oxidative luminescence reaction of luciferin. The SARS-CoV-2 antigen (spike) protein catalyzes the light emission of Cypridina luciferin, whereas no such catalytic function was observed for salivary proteins. This selective luminescence reaction is due to the enzymatic recognition of the 3-(1-guanidino)propyl group in luciferin at the interfaces between the units of the spike protein, allowing a specific detection of the spike protein in human saliva without sample pretreatment. This method offers a novel platform to detect virus antigens simply and rapidly without genetic manipulation or antibodies.
    • Similarity scores of vibrational spectra reveal the atomistic structure of pentapeptides in multiple basins

      H Otaki, S Ishiuchi, M Fujii, Y Sugita, K Yagi
      Vibrational spectroscopy combined with theoretical calculations is a powerful tool for analyzing the interaction and conformation of peptides at the atomistic level. Nonetheless, identifying the structure becomes increasingly difficult as the peptide size grows large. One example is acetyl-SIVSF-N-methylamide, a capped pentapeptide, whose atomistic structure has remained unknown since its first observation [T. Sekiguchi, M. Tamura, H. Oba, P. Çarçarbal, R. R. Lozada-Garcia, A. Zehnacker-Rentien, G. Grégoire, S. Ishiuchi and M. Fujii, Angew. Chem., Int. Ed., 2018, 57, 5626–5629]. Here, we propose a novel conformational search method, which exploits the structure-spectrum correlation using a similarity score that measures the agreement of theoretical and experimental spectra. Surprisingly, the two conformers have distinctly different energy and geometry. The second conformer is 25 kJ mol−1 higher in energy than the other, lowest-energy conformer. The result implies that there are multiple pathways in the early stage of the folding process: one to the global minimum and the other to a different basin. Once such a structure is established, the second conformer is unlikely to overcome the barrier to produce the most stable structure due to a vastly different hydrogen bond network of the backbone. Our proposed method can characterize the lowest-energy conformer and kinetically trapped, high-energy conformers of complex biomolecules.

2023

    • Allosteric regulation of β-reaction stage I in tryptophan synthase upon the α-ligand binding

      Shingo Ito, Kiyoshi Yagi, Yuji Sugita
      Tryptophan synthase (TRPS) is a bifunctional enzyme consisting of α- and β-subunits that catalyzes the last two steps of L-tryptophan (L-Trp) biosynthesis. The first stage of the reaction at the β-subunit is called β-reaction stage I, which converts the β-ligand from an internal aldimine [E(Ain)] to an α-aminoacrylate [E(A-A)] intermediate. The activity is known to increase 3–10-fold upon the binding of 3-indole-D-glycerol-3′-phosphate (IGP) at the α-subunit. The effect of α-ligand binding on β-reaction stage I at the distal β-active site is not well understood despite the abundant structural information available for TRPS. Here, we investigate the β-reaction stage I by carrying out minimum-energy pathway searches based on a hybrid quantum mechanics/molecular mechanics (QM/MM) model. The free-energy differences along the pathway are also examined using QM/MM umbrella sampling simulations with QM calculations at the B3LYP-D3/aug-cc-pVDZ level of theory. Our simulations suggest that the sidechain orientation of βD305 near the β-ligand likely plays an essential role in the allosteric regulation: a hydrogen bond is formed between βD305 and the β-ligand in the absence of the α-ligand, prohibiting a smooth rotation of the hydroxyl group in the quinonoid intermediate, whereas the dihedral angle rotates smoothly after the hydrogen bond is switched from βD305-β-ligand to βD305-βR141. This switch could occur upon the IGP-binding at the α-subunit, as evidenced by the existing TRPS crystal structures.
    • Highly Charged Proteins and Their Repulsive Interactions Antagonize Biomolecular Condensation

      Cheng Tan, Ai Niitsu, Yuji Sugita
      Biomolecular condensation is involved in various cellular processes; therefore, regulation of condensation is crucial to prevent deleterious protein aggregation and maintain a stable cellular environment. Recently, a class of highly charged proteins, known as heat-resistant obscure (Hero) proteins, was shown to protect other client proteins from pathological aggregation. However, the molecular mechanisms by which Hero proteins protect other proteins from aggregation remain unknown. In this study, we performed multiscale molecular dynamics (MD) simulations of Hero11, a Hero protein, and the C-terminal low-complexity domain (LCD) of the transactive response DNA-binding protein 43 (TDP-43), a client protein of Hero11, under various conditions to examine their interactions with each other. We found that Hero11 permeates into the condensate formed by the LCD of TDP-43 (TDP-43-LCD) and induces changes in conformation, intermolecular interactions, and dynamics of TDP-43-LCD. We also examined possible Hero11 structures in atomistic and coarse-grained MD simulations and found that Hero11 with a higher fraction of disordered region tends to assemble on the surface of the condensates. Based on the simulation results, we have proposed three possible mechanisms for Hero11’s regulatory function: (i) In the dense phase, TDP-43-LCD reduces contact with each other and shows faster diffusion and decondensation due to the repulsive Hero11–Hero11 interactions. (ii) In the dilute phase, the saturation concentration of TDP-43-LCD is increased, and its conformation is relatively more extended and variant, induced by the attractive Hero11–TDP-43-LCD interactions. (iii) Hero11 on the surface of small TDP-43-LCD condensates can contribute to avoiding their fusion due to repulsive interactions. The proposed mechanisms provide new insights into the regulation of biomolecular condensation in cells under various conditions.
    • Towards de novo design of transmembrane α-helical assemblies using structural modelling and molecular dynamics simulation

      Ai Niitsu, Yuji Sugita
      Computational de novo protein design involves iterative processes consisting of amino acid sequence design, structural modelling and scoring, and design validation by synthesis and experimental characterisation. Recent advances in protein structure prediction and modelling methods have enabled the highly efficient and accurate design of water-soluble proteins. However, the design of membrane proteins remains a major challenge. To advance membrane protein design, considering the higher complexity of membrane protein folding, stability, and dynamic interactions between water, ions, lipids, and proteins is an important task. For introducing explicit solvents and membranes to these design methods, all-atom molecular dynamics (MD) simulations of designed proteins provide useful information that cannot be obtained experimentally. In this review, we first describe two major approaches to designing transmembrane α-helical assemblies, consensus and de novo design. We further illustrate recent MD studies of membrane protein folding related to protein design, as well as advanced treatments in molecular models and conformational sampling techniques in the simulations. Finally, we discuss the possibility to introduce MD simulations after the existing static modelling and screening of design decoys as an additional step for refinement of the design, which considers membrane protein folding dynamics and interactions with explicit membranes.
    • Structural Effects of Spike Protein D614G Mutation in SARS-CoV-2

      Hisham Dokainish, Yuji Sugita
      A single mutation from aspartate to glycine at position 614 has dominated all circulating variants of the severe acute respiratory syndrome coronavirus 2. D614G mutation induces structural changes in the spike (S) protein that strengthen the virus infectivity. Here, we use molecular dynamics simulations to dissect the effects of mutation and 630-loop rigidification on S-protein structure. The introduction of the mutation orders the 630-loop structure and thereby induces global structural changes toward the cryoelectron microscopy structure of the D614G S-protein. The ordered 630-loop weakens local interactions between the 614th residue and others in contrast to disordered structures in the wild-type protein. The mutation allosterically alters global interactions between receptor-binding domains, forming an asymmetric and mobile down conformation and facilitating transitions toward up conformation. The loss of salt bridge between D614 and K854 upon the mutation generally stabilizes S-protein protomer, including the fusion peptide proximal region that mediates membrane fusion. Understanding the molecular basis of D614G mutation is crucial as it dominates in all variants of concern, including Delta and Omicron.

2022

    • Development of hidden Markov modeling method for molecular orientations and structure estimation from high-speed atomic force microscopy time-series images

      Tomonori Ogane, Daisuke Noshiro, Toshio Ando, Atsuko Yamashita, Yuji Sugita, Yasuhiro Matsunaga
      High-speed atomic force microscopy (HS-AFM) is a powerful technique for capturing the time-resolved behavior of biomolecules. However, structural information in HS-AFM images is limited to the surface geometry of a sample molecule. Inferring latent three-dimensional structures from the surface geometry is thus important for getting more insights into conformational dynamics of a target biomolecule. Existing methods for estimating the structures are based on the rigid-body fitting of candidate structures to each frame of HS-AFM images. Here, we extend the existing frame-by-frame rigid-body fitting analysis to multiple frames to exploit orientational correlations of a sample molecule between adjacent frames in HS-AFM data due to the interaction with the stage. In the method, we treat HS-AFM data as time-series data, and they are analyzed with the hidden Markov modeling. Using simulated HS-AFM images of the taste receptor type 1 as a test case, the proposed method shows a more robust estimation of molecular orientations than the frame-by-frame analysis. The method is applicable in integrative modeling of conformational dynamics using HS-AFM data.
    • Use of multistate Bennett acceptance ratio method for free-energy calculations from enhanced sampling and free-energy perturbation

      Yasuhiro Matsunaga, Motoshi Kamiya, Hiraku Oshima, Jaewoon Jung, Shingo Ito, and Yuji Sugita
      Multistate Bennett acceptance ratio (MBAR) works as a method to analyze molecular dynamics (MD) simulation data after the simulations have been finished. It is widely used to estimate free-energy changes between different states and averaged properties at the states of interest. MBAR allows us to treat a wide range of states from those at different temperature/pressure to those with different model parameters. Due to the broad applicability, the MBAR equations are rather difficult to apply for free-energy calculations using different types of MD simulations including enhanced conformational sampling methods and free-energy perturbation. In this review, we first summarize the basic theory of the MBAR equations and categorize the representative usages into the following four: (i) perturbation, (ii) scaling, (iii) accumulation, and (iv) full potential energy. For each, we explain how to prepare input data using MD simulation trajectories for solving the MBAR equations. MBAR is also useful to estimate reliable free-energy differences using MD trajectories based on a semi-empirical quantum mechanics/molecular mechanics (QM/MM) model and ab initio QM/MM energy calculations on the MD snapshots. We also explain how to use the MBAR software in the GENESIS package, which we call mbar_analysis, for the four representative cases. The proposed estimations of free-energy changes and thermodynamic averages are effective and useful for various biomolecular systems.
    • Multiple sub state structures of SERCA2b reveal conformational overlap at transition steps during the catalytic cycle

      Yuxia Zhang, Chigusa Kobayashi, Xiaohan Cai, Satoshi Watanabe, Akihisa Tsutsumi, Masahide Kikkawa, Yuji Sugita, Kenji Inaba
      Sarco/endoplasmic reticulum Ca2+ ATPase (SERCA) pumps Ca2+ into the endoplasmic reticulum (ER). Herein, we present cryo-electron microscopy (EM) structures of three intermediates of SERCA2b: Ca2+-bound phosphorylated (E1P·2Ca2+) and Ca2+-unbound dephosphorylated (E2·Pi) intermediates and another between the E2P and E2·Pi states. Our cryo-EM analysis demonstrates that the E1P·2Ca2+ state exists in low abundance and preferentially transitions to an E2P-like structure by releasing Ca2+ and that the Ca2+ release gate subsequently undergoes stepwise closure during the dephosphorylation processes. Importantly, each intermediate adopts multiple sub-state structures including those like the next one in the catalytic series, indicating conformational overlap at transition steps, as further substantiated by atomistic molecular dynamic simulations of SERCA2b in a lipid bilayer. The present findings provide insight into how enzymes accelerate catalytic cycles.
    • Formation of extramembrane β-strands controls dimerization of transmembrane helices in amyloid precursor protein C99

      George A. Pantelopulos, Daisuke Matsuoka, Yuji Sugita, John E. Straub
      The 99-residue C-terminal domain of amyloid precursor protein (APP-C99), precursor to amyloid beta (Aβ), is a transmembrane (TM) protein containing intrinsically disordered N- and C-terminal extramembrane domains. Using molecular dynamics (MD) simulations, we show that the structural ensemble of the C99 monomer is best described in terms of thousands of states. The C99 monomer has a propensity to form β-strand in the C-terminal extramembrane domain, which explains the slow spin relaxation times observed in paramagnetic probe NMR experiments. Surprisingly, homodimerization of C99 not only narrows the conformational ensemble from thousands to a few states through the formation of metastable β-strands in extramembrane domains but also stabilizes extramembrane α-helices. The extramembrane domain structure is observed to dramatically impact the homodimerization motif, resulting in the modification of TM domain conformations. Our study provides an atomic-level structural basis for communication between the extramembrane domains of the C99 protein and TM homodimer formation. This finding could serve as a general model for understanding the influence of disordered extramembrane domains on TM protein structure.
    • Modified protein-water interactions in CHARMM36m for thermodynamics and kinetics of proteins in dilute and crowded solutions

      Daiki Matsubara, Kento Kasahara, Hisham Dokainish, Hiraku Oshima, Yuji Sugita
      Proper balance between protein-protein and protein-water interactions is vital for atomistic molecular dynamics (MD) simulations of globular proteins as well as intrinsically disordered proteins (IDPs). The overestimation of protein-protein interactions tends to make IDPs more compact than those in experiments. Likewise, multiple proteins in crowded solutions are aggregated with each other too strongly. To optimize the balance, Lennard-Jones (LJ) interactions between protein and water are often increased about 10% (with a scaling parameter, λ = 1.1) from the existing force fields. Here, we explore the optimal scaling parameter of protein-water LJ interactions for CHARMM36m in conjunction with the modified TIP3P water model, by performing enhanced sampling MD simulations of several peptides in dilute solutions and conventional MD simulations of globular proteins in dilute and crowded solutions. In our simulations, 10% increase of protein-water LJ interaction for the CHARMM36m cannot maintain stability of a small helical peptide, (AAQAA)3 in a dilute solution and only a small modification of protein-water LJ interaction up to the 3% increase (λ = 1.03) is allowed. The modified protein-water interactions are applicable to other peptides and globular proteins in dilute solutions without changing thermodynamic properties from the original CHARMM36m. However, it has a great impact on the diffusive properties of proteins in crowded solutions, avoiding the formation of too sticky protein-protein interactions.
    • General Design Strategy to Precisely Control the Emission of Fluorophores via a Twisted Intramolecular Charge Transfer (TICT) Process

      Kenjiro Hanaoka, Shimpei Iwaki, Kiyoshi Yagi, Takuya Myochin, Takayuki Ikeno, Hisashi Ohno, Eita Sasaki, Toru Komatsu, Tasuku Ueno, Motokazu Uchigashima, Takayasu Mikuni, Kazuki Tainaka, Shinya Tahara, Satoshi Takeuchi, Tahei Tahara, Masanobu Uchiyama, Tetsuo Nagano, and Yasuteru Urano
      Fluorogenic probes for bioimaging have become essential tools for life science and medicine, and the key to their development is a precise understanding of the mechanisms available for fluorescence off/on control, such as photoinduced electron transfer (PeT) and Förster resonance energy transfer (FRET). Here we establish a new molecular design strategy to rationally develop activatable fluorescent probes, which exhibit a fluorescence off/on change in response to target biomolecules, by controlling the twisted intramolecular charge transfer (TICT) process. This approach was developed on the basis of a thorough investigation of the fluorescence quenching mechanism of N-phenyl rhodamine dyes (commercially available as the QSY series) by means of time-dependent density functional theory (TD-DFT) calculations and photophysical evaluation of their derivatives. To illustrate and validate this TICT-based design strategy, we employed it to develop practical fluorogenic probes for HaloTag and SNAP-tag. We further show that the TICT-controlled fluorescence off/on mechanism is generalizable by synthesizing a Si–rhodamine-based fluorogenic probe for HaloTag, thus providing a palette of chemical dyes that spans the visible and near-infrared range.
    • Protein folding intermediates on the dimensionality reduced landscape with UMAP and native contact likelihood

      Mao Oide and Yuji Sugita
      To understand protein folding mechanisms from molecular dynamics (MD) simulations, it is important to explore not only folded/unfolded states but also representative intermediate structures on the conformational landscape. Here, we propose a novel approach to construct the landscape using the uniform manifold approximation and projection (UMAP) method, which reduces the dimensionality without losing data-point proximity. In the approach, native contact likelihood is used as feature variables rather than the conventional Cartesian coordinates or dihedral angles of protein structures. We tested the performance of UMAP for coarse-grained MD simulation trajectories of B1 domain in protein G and observed on-pathway transient structures and other metastable states on the UMAP conformational landscape. In contrast, these structures were not clearly distinguished on the dimensionality reduced landscape using principal component analysis or time-lagged independent component analysis. This approach is also useful to obtain dynamical information through Markov state modeling and would be applicable to large-scale conformational changes in many other biomacromolecules.
    • Crystal structure of the lipid flippase MurJ in a “squeezed” form distinct from its inward- and outward-facing forms

      Hidetaka Kohga, Takaharu Mori, Yoshiki Tanaka, Kunihito Yoshikaie, Katsuhide Taniguchi, Kei Fujimoto, Lisa Fritz, Tanja Schneider, and Tomoya Tsukazaki
      The bacterial peptidoglycan enclosing the cytoplasmic membrane is a fundamental cellular architecture. The integral membrane protein MurJ plays an essential role in flipping the cell wall building block Lipid II across the cytoplasmic membrane for peptidoglycan biosynthesis. Previously reported crystal structures of MurJ have elucidated its V-shaped inward- or outward-facing forms with an internal cavity for substrate binding. MurJ transports Lipid II using its cavity through conformational transitions between these two forms. Here, we report two crystal structures of inward-facing forms from Arsenophonus endosymbiont MurJ and an unprecedented crystal structure of Escherichia coli MurJ in a “squeezed” form, which lacks a cavity to accommodate the substrate, mainly because of the increased proximity of transmembrane helices 2 and 8. Subsequent molecular dynamics simulations supported the hypothesis that the squeezed form is an intermediate conformation. This study fills a gap in our understanding of the Lipid II flipping mechanism.
    • Modified Hamiltonian in FEP calculations for reducing the computational cost of electrostatic interactions

      Hiraku Oshima and Yuji Sugita
      The free-energy perturbation (FEP) method predicts relative and absolute free-energy changes of biomolecules in solvation and binding with other molecules. FEP is, therefore, one of the most essential tools in in-silico drug design. In conventional FEP, to smoothly connect two thermodynamic states, the potential energy is modified as a linear combination of the end state potential energies by introducing scaling factors. When the particle mesh Ewald is used for electrostatic calculations, conventional FEP requires two reciprocal-space calculations per time step, which largely decreases the computational performance. To overcome this problem, we propose a new FEP scheme by introducing a modified Hamiltonian instead of interpolation of the end state potential energies. The scheme introduces non-uniform scaling into the electrostatic potential as used in Replica-Exchange with Solute Tempering 2 (REST2) and does not require additional reciprocal-space calculations. We tested this modified Hamiltonian in FEP calculations in several biomolecular systems. In all cases, the calculated free-energy changes with the current scheme are in good agreement with those from conventional FEP. The modified Hamiltonian in FEP greatly improves the computational performance, which is particularly marked for large biomolecular systems whose reciprocal-space calculations are the major bottleneck of total computational time.
    • Practical protocols for efficient sampling of kinase-inhibitor binding pathways using two-dimensional replica-exchange molecular dynamics

      Ai Shinobu, Suyong Re, and Yuji Sugita
      Molecular dynamics (MD) simulations are increasingly used to study various biological processes such as protein folding, conformational changes, and ligand binding. These processes generally involve slow dynamics that occur on the millisecond or longer timescale, which are difficult to simulate by conventional atomistic MD. Recently, we applied a two-dimensional (2D) replica-exchange MD (REMD) method, which combines the generalized replica exchange with solute tempering (gREST) with the replica-exchange umbrella sampling (REUS) in kinase-inhibitor binding simulations, and successfully observed multiple ligand binding/unbinding events. To efficiently apply the gREST/REUS method to other kinase-inhibitor systems, we establish modified, practical protocols with non-trivial simulation parameter tuning. The current gREST/REUS simulation protocols are tested for three kinase-inhibitor systems: c-Src kinase with PP1, c-Src kinase with Dasatinib, and c-Abl kinase with Imatinib. We optimized the definition of kinase-ligand distance as a collective variable (CV), the solute temperatures in gREST, and replica distributions and umbrella forces in the REUS simulations. Also, the initial structures of each replica in the 2D replica space were prepared carefully by pulling each ligand from and toward the protein binding sites for keeping stable kinase conformations. These optimizations were carried out individually in multiple short MD simulations. The current gREST/REUS simulation protocol ensures good random walks in 2D replica spaces, which are required for enhanced sampling of inhibitor dynamics around a target kinase.
    • Computational analysis on the allostery of tryptophan synthase: relationship between α/β-ligand binding and distal domain closure

      Shingo Ito, Kiyoshi Yagi, and Yuji Sugita
      Tryptophan synthase (TRPS) is a bifunctional enzyme consisting of α and β-subunits and catalyzes the last two steps of l-tryptophan (L-Trp) biosynthesis, namely, cleavage of 3-indole-d-glycerol-3′-phosphate (IGP) into indole and glyceraldehyde-3-phosphate (G3P) in the α-subunit, and a pyridoxal phosphate (PLP)-dependent reaction of indole and l-serine (L-Ser) to produce L-Trp in the β-subunit. Importantly, the IGP binding at the α-subunit affects the β-subunit conformation and its ligand-binding affinity, which, in turn, enhances the enzymatic reaction at the α-subunit. The intersubunit communications in TRPS have been investigated extensively for decades because of the fundamental and pharmaceutical importance, while it is still difficult to answer how TRPS allostery is regulated at the atomic detail. Here, we investigate the allosteric regulation of TRPS by all-atom classical molecular dynamics (MD) simulations and analyze the potential of mean-force (PMF) along conformational changes of the α- and β-subunits. The present simulation has revealed a widely opened conformation of the β-subunit, which provides a pathway for L-Ser to enter into the β-active site. The IGP binding closes the α-subunit and induces a wide opening of the β-subunit, thereby enhancing the binding affinity of L-Ser to the β-subunit. Structural analyses have identified critical hydrogen bonds (HBs) at the interface of the two subunits (αG181-βS178, αP57-βR175, etc.) and HBs between the β-subunit (βT110 – βH115) and a complex of PLP and L-Ser (an α-aminoacrylate intermediate). The former HBs regulate the allosteric, β-subunit opening, whereas the latter HBs are essential for closing the β-subunit in a later step. The proposed mechanism for how the interdomain communication in TRPS is realized with ligand bindings is consistent with the previous experimental data, giving a general idea to interpret the allosteric regulations in multidomain proteins.
    • Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations

      Cheng Tan, Jaewoon Jung, Chigusa Kobayashi, Diego Ugarte La Torre, Shoji Takada, and Yuji Sugita
      Residue-level coarse-grained (CG) models have become one of the most popular tools in biomolecular simulations in the trade-off between modeling accuracy and computational efficiency. To investigate large-scale biological phenomena in molecular dynamics (MD) simulations with CG models, unified treatments of proteins and nucleic acids, as well as efficient parallel computations, are indispensable. In the GENESIS MD software, we implement several residue-level CG models, covering structure-based and context-based potentials for both well-folded biomolecules and intrinsically disordered regions. An amino acid residue in protein is represented as a single CG particle centered at the Cα atom position, while a nucleotide in RNA or DNA is modeled with three beads. Then, a single CG particle represents around ten heavy atoms in both proteins and nucleic acids. The input data in CG MD simulations are treated as GROMACS-style input files generated from a newly developed toolbox, GENESIS-CG-tool. To optimize the performance in CG MD simulations, we utilize multiple neighbor lists, each of which is attached to a different nonbonded interaction potential in the cell-linked list method. We found that random number generations for Gaussian distributions in the Langevin thermostat are one of the bottlenecks in CG MD simulations. Therefore, we parallelize the computations with message-passing-interface (MPI) to improve the performance on PC clusters or supercomputers. We simulate Herpes simplex virus (HSV) type 2 B-capsid and chromatin models containing more than 1,000 nucleosomes in GENESIS as examples of large-scale biomolecular simulations with residue-level CG models. This framework extends accessible spatial and temporal scales by multi-scale simulations to study biologically relevant phenomena, such as genome-scale chromatin folding or phase-separated membrane-less condensations.
    • Protein assembly and crowding simulations

      Lim Heo, Yuji Sugita, and Michael Feig
      Proteins encounter frequent molecular interactions in biological environments. Computer simulations have become an increasingly important tool in providing mechanistic insights into how such interactions in vivo relate to their biological function. The review here focuses on simulations describing protein assembly and molecular crowding effects as two important aspects that are distinguished mainly by how specific and long-lived protein contacts are. On the topic of crowding, recent simulations have provided new insights into how crowding affects protein folding and stability, modulates enzyme activity, and affects diffusive properties. Recent studies of assembly processes focus on assembly pathways, especially for virus capsids, amyloid aggregation pathways, and the role of multivalent interactions leading to phase separation. Also, discussed are technical challenges in achieving increasingly realistic simulations of interactions in cellular environments.
    • The inherent flexibility of receptor binding domains in SARS-CoV-2 spike protein

      Hisham M Dokainish, Suyong Re, Takaharu Mori, Chigusa Kobayashi, Jaewoon Jung, and Yuji Sugita
      Spike (S) protein is the primary antigenic target for neutralization and vaccine development for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It decorates the virus surface and undergoes large motions of its receptor binding domains (RBDs) to enter the host cell. Here, we observe Down, one-Up, one-Open, and two-Up-like structures in enhanced molecular dynamics simulations, and characterize the transition pathways via inter-domain interactions. Transient salt-bridges between RBDA and RBDC and the interaction with glycan at N343B support RBDA motions from Down to one-Up. Reduced interactions between RBDA and RBDB in one-Up induce RBDB motions toward two-Up. The simulations overall agree with cryo-electron microscopy structure distributions and FRET experiments and provide hidden functional structures, namely, intermediates along Down-to-one-Up transition with druggable cryptic pockets as well as one-Open with a maximum exposed RBD. The inherent flexibility of S-protein thus provides essential information for antiviral drug rational design or vaccine development.
    • Weight average approaches for predicting dynamical properties of biomolecules

      Kiyoshi Yagi, Suyong Re, Takaharu Mori, and Yuji Sugita
      Recent advances in atomistic molecular dynamics (MD) simulations of biomolecules allow us to explore their conformational spaces widely, observing large-scale conformational fluctuations or transitions between distinct structures. To reproduce or refine experimental data using MD simulations, structure ensembles, which are characterized by multiple structures and their statistical weights on the rugged free-energy landscapes, are often used. Here, we summarize weight average approaches for various experimental measurements. Weight average approaches are now applied to hybrid quantum mechanics/molecular mechanics MD simulations to predict fast vibrational motions in a protein with a high accuracy for better understanding of molecular functions from atomic structures.