2022-2026 – Theoretical Molecular Science Laboratory

2025

- ColBuilder: Flexible structure generation of crosslinked collagen fibrils
  
  Debora Monego, Matthias Brosz, Johanna Buck, Vsevolod Viliuga, Paulius Greicius, Jaewoon Jung, Torsten Stuehn, Matthias Schmies, Yuji Sugita, Frauke Gräter
  
  Bioinformatics (2025), btaf278.
  
  Motivation: Collagen fibrils are fundamental building blocks of connective tissues, yet generating accurate molecular models of their structure remains challenging due to their hierarchical organization and complex crosslinking patterns. Results: ColBuilder has been developed to automate the generation of atomistic models of crosslinked collagen fibrils and facilitate the setup of molecular simulations. The tool integrates homology modeling, higher-order structure generation and optimization to build complete fibril structures with precise control over sequence composition, crosslinking patterns, and dimensions. Users can explore different collagen sequences, manipulate crosslink chemistry through mixed ratios and densities, and generate fibrils of varying diameter and length. All-atom molecular dynamics simulations of 335 nm-long fibrils validate the generated structures, showing excellent agreement with experimental measurements of D-band periodicity and force-extension behavior. Availability and Implementation: ColBuilder is available both as an open-source command-line application and through a web interface at https://colbuilder.mpip-mainz.mpg.de. Contact: monegod@mpip-mainz.mpg.de Supplementary Information: Supplementary data are available at Bioinformatics online.
- In silico nanoscope to study the interplay of genome organization and transcription regulation
  
  Soundhararajan Gopi, Giovanni B Brandani, Cheng Tan, Jaewoon Jung, Chenyang Gu, Azuki Mizutani, Hiroshi Ochiai, Yuji Sugita, Shoji Takada
  
  Nucleic Acids Research 53, gkaf189 (2025).
  
  In eukaryotic genomes, regulated access and communication between cis-regulatory elements (CREs) are necessary for enhancer-mediated transcription of genes. The molecular framework of the chromatin organization underlying such communication remains poorly understood. To better understand it, we develop a multiscale modeling pipeline to build near-atomistic models of the 200 kb Nanog gene locus in mouse embryonic stem cells comprising nucleosomes, transcription factors, co-activators, and RNA polymerase II–mediator complexes. By integrating diverse experimental data, including protein localization, genomic interaction frequencies, cryo-electron microscopy, and single-molecule fluorescence studies, our model offers novel insights into chromatin organization and its role in enhancer–promoter communication. The models equilibrated by high-performance molecular dynamics simulations span a scale of ∼350 nm, revealing an experimentally consistent local and global organization of chromatin and transcriptional machinery. Our models elucidate that the sequence-regulated chromatin accessibility facilitates the recruitment of transcription regulatory proteins exclusively at CREs, guided by the contrasting nucleosome organization compared to other regions. By constructing an experimentally consistent near-atomic model of chromatin in the cellular environment, our approach provides a robust framework for future studies on nuclear compartmentalization, chromatin organization, and transcription regulation.
- Heterogeneous condensates of transcription factors in embryonic stem cells: Molecular simulations
  
  Azuki Mizutani, Cheng Tan, Yuji Sugita, Shoji Takada
  
  Biophysical Journal (2025), in press.
  
  Biomolecular condensates formed via liquid-liquid phase separation are ubiquitous in cells, especially in the nucleus. While condensates containing one or two kinds of biomolecules have been relatively well characterized, those with more heterogeneous biomolecular components and interactions between biomolecules inside are largely unknown. This study used residue-resolution molecular dynamics simulations to investigate heterogeneous protein assemblies that include four master transcription factors in mammalian embryonic stem cells: Oct4, Sox2, Klf4, and Nanog. Molecular dynamics simulations of the mixture systems showed highly heterogeneous and dynamic behaviors; protein condensates mainly contain Sox2, Klf4, and Nanog, while most Oct4 are dissolved into the dilute phase. The condensate forms loosely interacting clusters where Klf4 is the most abundant, suggesting that Klf4 serves as a scaffold of the condensate, and Sox2 and Nanog are bound to Klf4 for stabilizing the condensate. Oct4 is moderately recruited to the condensate, serving as a client mainly via its interaction with Sox2. This study highlights the importance of intermolecular interaction between different transcription factors on the condensate formations with heterogeneous biomolecular components.
- High-performance QM/MM Enhanced Sampling Molecular Dynamics Simulations with GENESIS SPDYN and QSimulate-QM
  
  Kiyoshi Yagi, Klaas Gunst, Toru Shiozaki, Yuji Sugita
  
  J. Chem. Theory Comp. 21, 8, 4016–4029 (2025).
  
  A new module for quantum mechanical/molecular mechanical (QM/MM) calculations is implemented in a molecular dynamics (MD) program, SPDYN in GENESIS, interfaced with an electronic structure program, QSimulate-QM. The periodic boundary condition (PBC) in QM/MM simulation is incorporated via QM calculation in real space with duplicated MM charges and particle mesh Ewald (PME) calculation with QM and MM charges. A highly optimized code is implemented in QSimulate-QM, particularly for the density functional tight-binding (DFTB) method, where the interaction between the QM and MM regions is computed utilizing multipole expansions. Together with highly parallelized algorithms in SPDYN, the developed program performs MD simulations based on DFTB in the QM size of ∼100 atoms and MM of ∼100,000 atoms with a better performance than 1 ns/day using one computer node. This feature paves the way for QM/MM-MD enhanced sampling simulations. Various enhanced sampling methods in GENESIS, namely, generalized replica exchange solute tempering (gREST), replica-exchange umbrella sampling (REUS), and path sampling with the string method, are demonstrated at the QM/MM level to compute the Ramachandran plot of alanine dipeptide and the potential of mean force (PMF) of a proton transfer reaction in an enzyme.
- The need to implement FAIR principles in biomolecular simulations
  
  Rommie E Amaro, Johan Åqvist, Ivet Bahar, Federica Battistini, Adam Bellaiche, Daniel Beltran, Philip C Biggin, Massimiliano Bonomi, Gregory R Bowman, Richard A Bryce, Giovanni Bussi, Paolo Carloni, David A Case, Andrea Cavalli, Chia-En A Chang, Thomas E Cheatham III, Margaret S Cheung, Christophe Chipot, Lillian T Chong, Preeti Choudhary, G Andres Cisneros, Cecilia Clementi, Rosana Collepardo-Guevara, Peter Coveney, Roberto Covino, T Daniel Crawford, Matteo Dal Peraro, Bert L De Groot, Lucie Delemotte, Marco De Vivo, Jonathan W Essex, Franca Fraternali, Jiali Gao, Josep Ll Gelpí, Francesco L Gervasio, Fernando D González-Nilo, Helmut Grubmüller, Marina G Guenza, Horacio V Guzman, Sarah Harris, Teresa Head-Gordon, Rigoberto Hernandez, Adam Hospital, Niu Huang, Xuhui Huang, Gerhard Hummer, Javier Iglesias-Fernández, Jan H Jensen, Shantenu Jha, Wanting Jiao, William L Jorgensen, Shina CL Kamerlin, Syma Khalid, Charles Laughton, Michael Levitt, Vittorio Limongelli, Erik Lindahl, Kresten Lindorff-Larsen, Sharon Loverde, Magnus Lundborg, Yun L Luo, F Javier Luque, Charlotte I Lynch, Alexander D MacKerell Jr, Alessandra Magistrato, Siewert J Marrink, Hugh Martin, J Andrew McCammon, Kenneth Merz, Vicent Moliner, Adrian J Mulholland, Sohail Murad, Athi N Naganathan, Shikha Nangia, Frank Noe, Agnes Noy, Julianna Oláh, Megan L O’Mara, Mary Jo Ondrechen, Jose N Onuchic, Alexey Onufriev, Sílvia Osuna, Giulia Palermo, Anna R Panchenko, Sergio Pantano, Carol Parish, Michele Parrinello, Alberto Perez, Tomas Perez-Acle, Juan R Perilla, B Montgomery Pettitt, Adriana Pietropaolo, Jean-Philip Piquemal, Adolfo B Poma, Matej Praprotnik, Maria J Ramos, Pengyu Ren, Nathalie Reuter, Adrian Roitberg, Edina Rosta, Carme Rovira, Benoit Roux, Ursula Rothlisberger, Karissa Y Sanbonmatsu, Tamar Schlick, Alexey K Shaytan, Carlos Simmerling, Jeremy C Smith, Yuji Sugita, Katarzyna Świderek, Makoto Taiji, Peng Tao, D Peter Tieleman, Irina G Tikhonova, Julian Tirado-Rives, Iñaki Tuñón, Marc W van der Kamp, David van der Spoel, Sameer Velankar, Gregory A Voth, Rebecca Wade, Ariel Warshel, Valerie Vaissier Welborn, Stacey D Wetmore, Travis J Wheeler, Chung F Wong, Lee-Wei Yang, Martin Zacharias, Modesto Orozco
  
  Nature Methods 22, 641–645 (2025).
  
  In the Big Data era, a change of paradigm in the use of molecular dynamics is required. Trajectories should be stored under FAIR (findable, accessible, interoperable and reusable) requirements to favor their reuse by the community under an open science paradigm.
- Rational Design Principles for De Novo α-Helical Peptide Barrels with Dynamic Conductive Channels
  
  Ai Niitsu, Andrew R Thomson, Alistair J Scott, Jason T Sengel, Jaewoon Jung, Kozhinjampara R Mahendran, Mikiko Sodeoka, Hagan Bayley, Yuji Sugita, Derek N Woolfson, Mark I Wallace
  
  J. Am. Chem. Soc. 147, 14, 11741–11753 (2025).
  
  Despite advances in peptide and protein design, the rational design of membrane-spanning peptides that form conducting channels remains challenging due to our imperfect understanding of the sequence-to-structure relationships that drive membrane insertion, assembly, and conductance. Here, we describe the design and computational and experimental characterization of a series of coiled coil-based peptides that form transmembrane α-helical barrels with conductive channels. Through a combination of rational and computational design, we obtain barrels with 5 to 7 helices, as characterized in detergent micelles. In lipid bilayers, these peptide assemblies exhibit two conductance states with relative populations dependent on the applied potential: (i) low-conductance states that correlate with variations in the designed amino-acid sequences and modeled coiled-coil barrel geometries, indicating stable transmembrane α-helical barrels; and (ii) high-conductance states in which single channels change size in discrete steps. Notably, the high-conductance states are similar for all peptides in contrast to the low-conductance states. This indicates the formation of large, dynamic channels, as observed in natural barrel-stave peptide channels. These findings establish rational routes to design and tune functional membrane-spanning peptide channels with specific conductance and geometry.
- Langevin integration for isothermal–isobaric condition with a large time step
  
  Jaewoon Jung, Yuji Sugita
  
  J. Chem. Phys. 162, 104108 (2025).
  
  We propose an accurate method for evaluating temperature and pressure in Langevin integration, based on the approach by Leimkuhler and Matthews (J. Chem. Phys. 138, 174102). This method yields a higher-quality sampling of configuration space than other Langevin dynamics methods. However, it encounters issues in pressure evaluation due to inaccuracies in momentum space. In particular, the conventional approach for calculating kinetic temperature using the full-time step momentum introduces errors proportional to the square of the time step (Δt²), leading to unreliable results when employing a large time step under isothermal–isobaric conditions. By calculating kinetic energy using the half-time step momentum in pressure evaluation, we can reduce the numerical errors. We performed molecular dynamics (MD) simulations using our refined pressure evaluation and obtained improved accuracy and stability in the isothermal–isobaric MD simulations even with a long time step (Δt = 5 fs).
- Efficient Training of Neural Network Potentials for Chemical and Enzymatic Reactions by Continual Learning
  
  Yao-Kun Lei, Kiyoshi Yagi, Yuji Sugita
  
  J. Chem. Theory Comp. 21, 5, 2695–2711 (2025).
  
  Machine learning (ML) methods have emerged as an efficient surrogate for high-level electronic structure theory, offering precision and computational efficiency. However, the vast conformational and chemical space remains challenging when constructing a general force field. Training data sets typically cover only a limited region of this space, resulting in poor extrapolation performance. Traditional strategies must address this problem by training models from scratch using old and new data sets. In addition, model transferability is crucial for general force field construction. Existing ML force fields, designed for closed systems with no external environmental potential, exhibit limited transferability to complex condensed phase systems such as enzymatic reactions, resulting in inferior performance and high memory costs. Our ML/MM model, based on the Taylor expansion of the electrostatic operator, showed high transferability between reactions in several simple solvents. This work extends the strategy to enzymatic reactions to explore the transferability between more complex heterogeneous environments. In addition, we also apply continual learning strategies based on memory data sets to enable autonomous and on-the-fly training on a continuous stream of new data. By combining these two methods, we can efficiently construct a force field that can be applied to chemical reactions in various environmental media.
- Structural dynamics of a designed peptide pore under an external electric field
  
  Ai Niitsu, Jaewoon Jung, Yuji Sugita
  
  Biophys. Chem. 318, 107380 (2025).
  
  Membrane potential is essential in biological signaling and homeostasis maintained by voltage-sensitive membrane proteins. Molecular dynamics (MD) simulations incorporating membrane potentials have been extensively used to study the structures and functions of ion channels and protein pores. They can also be beneficial in designing and characterizing artificial ion channels and pores, which will guide further amino acid sequence optimization through comparison between the predicted models and experimental data. In this study, we implemented a uniform external electric field function in the GENESIS MD simulation package to investigate the conformational dynamics of de novo-designed peptide pores. Our simulations and single-channel current recording experiments demonstrate that both charged amino acid residues in the N-terminal sequence of the peptide and the membrane potential are crucial for the structural stability and dynamics of the peptide pores. This suggests that MD simulations with an external electric field enable more accurate screening of designed proteins functioning under membrane potentials, which will ultimately contribute to a deeper understanding of voltage-sensitive membrane proteins from a bottom-up synthetic biology perspective.
- Scaling molecular dynamics for large-scale simulation of biological systems on AMD CPU/GPU supercomputers: Lessons from LUMI: Optimizing GENESIS for maximizing the computational efficiency of CPU and GPU kernels on the LUMI supercomputer
  
  Diego Ugarte La Torre, Jaewoon Jung, Yuji Sugita
  
  HPCASIA ’25: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region 1-12 (2025).
  
  This paper presents optimizations of the GENESIS molecular dynamics (MD) software for LUMI. GENESIS is a state-of-the-art MD program capable of simulating large-scale biological systems, yet it requires significant optimizations and investigations to fully utilize the computational resources of the LUMI-G hardware partition of LUMI, a hybrid architecture featuring AMD EPYC processors and AMD Instinct GPUs. Our work focuses on (1) addressing CPU-GPU communication bottlenecks, (2) tuning electrostatic interaction calculations using the Particle Mesh Ewald (PME) method, and (3) adapting GENESIS kernels for the AMD GPUs. A significant improvement in the MD performance is achieved through the NUMA mapping on each LUMI-G node. This improvement is shown in benchmark results for different molecular systems comprising from 2×10⁴ to 1.6×10⁷ atoms. The reciprocal-space calculation in the PME method on LUMI-G is faster than that on FUGAKU. Since the reciprocal-space calculation is carried out using only CPUs on both LUMI-G and FUGAKU, the results highlight the superior network performance of the former. The overall performance of a large system containing 1.6×10⁷ atoms reaches 200.8 ns/day when using 1024 nodes on LUMI-G, which is a better performance than previously reported results using general-purpose supercomputers. We conclude by providing recommendations for the optimal configurations of GENESIS on LUMI-G, including guidelines for selecting PME schemes for different system sizes and computational requirements. This optimization allows us to study cellular systems’ stability, dynamics, and function relationships using atomistic MD simulations on a large scale.
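To put the throughput quoted in the LUMI-G entry above into more tangible units, the following minimal sketch converts a ns/day figure into wall-clock seconds per simulated nanosecond and into integration steps per second. The only number taken from the abstract is 200.8 ns/day; the 2 fs time step is an assumption made purely for illustration (the abstract does not state the time step used), and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def wall_seconds_per_ns(ns_per_day: float) -> float:
    """Wall-clock seconds needed to simulate one nanosecond."""
    return SECONDS_PER_DAY / ns_per_day


def steps_per_second(ns_per_day: float, timestep_fs: float) -> float:
    """Integration steps completed per wall-clock second.

    timestep_fs is an illustrative assumption, not a value from the paper.
    """
    steps_per_ns = 1.0e6 / timestep_fs  # 1 ns = 1e6 fs
    return ns_per_day * steps_per_ns / SECONDS_PER_DAY


reported_ns_per_day = 200.8  # figure quoted in the abstract
assumed_timestep_fs = 2.0    # ASSUMPTION: not given in the abstract

print(f"{wall_seconds_per_ns(reported_ns_per_day):.1f} s of wall time per ns")
print(f"{steps_per_second(reported_ns_per_day, assumed_timestep_fs):.0f} steps/s "
      f"at an assumed {assumed_timestep_fs} fs time step")
```

With a different time step the steps-per-second figure scales inversely, so the second number should be read only as an order-of-magnitude guide.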

2024

- Allosteric changes in the conformational landscape of Src kinase upon substrate binding
  
  Song-Ho Chong, Hiraku Oshima, Yuji Sugita
  
  J. Mol. Biol. (2024), in press.
  
  Precise regulation of protein kinase activity is crucial in cell functions, and its loss is implicated in various diseases. The kinase activity is regulated by interconverting active and inactive states in the conformational landscape. However, how protein kinases switch conformations in response to different signals such as the binding at distinct sites remains incompletely understood. Here, we predict the binding mode for the peptide substrate to Src tyrosine kinase using enhanced conformational sampling simulations (totaling 24 μs) and then investigate changes in the conformational landscape upon substrate binding by conducting unbiased molecular dynamics simulations (totaling 50 μs) initiated from the apo and substrate-bound forms. Unexpectedly, the peptide substrate binding significantly facilitates the transitions from active to inactive conformations in which the αC helix is directed outward, the regulatory spine is broken, and the ATP-binding domain is perturbed. We also explore an underlying residue-contact network responsible for the allosteric conformational changes. Our results are in accord with the recent experiments reporting the negative cooperativity between the peptide substrate and ATP binding to tyrosine kinases and will contribute to advancing our understanding of the regulation mechanisms for kinase activity.
- GENESIS 2.1: High-Performance Molecular Dynamics Software for Enhanced Sampling and Free-Energy Calculations for Atomistic, Coarse-Grained, and Quantum Mechanics/Molecular Mechanics Models
  
  Jaewoon Jung, Kiyoshi Yagi, Cheng Tan, Hiraku Oshima, Takaharu Mori, Isseki Yu, Yasuhiro Matsunaga, Chigusa Kobayashi, Shingo Ito, Diego Ugarte La Torre, Yuji Sugita
  
  J. Phys. Chem. B 128, 6028–6048 (2024)
  
- Learning QM/MM potential using equivariant multiscale model
  
  Yao-Kun Lei, Kiyoshi Yagi, Yuji Sugita
  
  J. Chem. Phys. 160, 214109 (2024)
  
  The machine learning (ML) method emerges as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses an equivariant model, which can generate a high-order tensor covariant with rotations as a base model. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
- GENESIS CGDYN: large-scale coarse-grained MD simulation with dynamic load balancing for heterogeneous biomolecular systems
  
  Jaewoon Jung, Cheng Tan, Yuji Sugita
  
  Nat. Commun. 15, 3370 (2024)
  
  Residue-level coarse-grained (CG) molecular dynamics (MD) simulation is widely used to investigate slow biological processes that involve multiple proteins, nucleic acids, and their complexes. Biomolecules in a large simulation system are distributed non-uniformly, limiting computational efficiency with conventional methods. Here, we develop a hierarchical domain decomposition scheme with dynamic load balancing for heterogeneous biomolecular systems to keep computational efficiency even after drastic changes in particle distribution. These schemes are applied to the dynamics of intrinsically disordered protein (IDP) droplets. During the fusion of two droplets, we find that the changes in droplet shape correlate with the mixing of IDP chains. Additionally, we simulate large systems with multiple IDP droplets, achieving simulation sizes comparable to those observed in microscopy. In our MD simulations, we directly observe Ostwald ripening, a phenomenon where small droplets dissolve and their molecules redeposit into larger droplets. These methods have been implemented in CGDYN of the GENESIS software, offering a tool for investigating mesoscopic biological processes using the residue-level CG models.
- Molecular dynamics in multidimensional space explains how mutations affect the association path of neomycin to a riboswitch
  
  Piotr Chyży, Marta Kulik, Ai Shinobu, Suyong Re, Yuji Sugita, Joanna Trylska
  
  Proc. Natl. Acad. Sci. U.S.A. 121(15), e2317197121 (2024)
  
  Riboswitches are messenger RNA (mRNA) fragments binding specific small molecules to regulate gene expression. A synthetic N1 riboswitch, inserted into yeast mRNA controls the translation of a reporter gene in response to neomycin. However, its regulatory activity is sensitive to single-point RNA mutations, even those distant from the neomycin binding site. While the association paths of neomycin to N1 and its variants remain unknown, recent fluorescence kinetic experiments indicate a two-step process driven by conformational selection. This raises the question of which step is affected by mutations. To address this, we performed all-atom two-dimensional replica-exchange molecular dynamics simulations for N1 and U14C, U14C, U15A, and A17G mutants, ensuring extensive conformational sampling of both RNA and neomycin. The obtained neomycin association and binding paths, along with multidimensional free-energy profiles, revealed a two-step binding mechanism, consisting of conformational selection and induced fit. Neomycin binds to a preformed N1 conformation upon identifying a stable upper stem and U-turn motif in the riboswitch hairpin. However, the positioning of neomycin in the binding site occurs at different RNA–neomycin distances for each mutant, which may explain their different regulatory activities. The subsequent induced fit arises from the interactions of the neomycin’s N3 amino group with RNA, causing the G9 backbone to rearrange. In the A17G mutant, the critical C6-A17/G17 stacking forms at a closer RNA–neomycin distance compared to N1. These findings together with estimated binding free energies coincide with experiments and elucidate why the A17G mutation decreases and U15A enhances N1 activity in response to neomycin.
- Free-energy landscapes of transmembrane homodimers by bias-exchange adaptively biased molecular dynamics
  
  Shingo Ito, Yuji Sugita
  
  Biophys. Chem. 307, 107190 (2024)
  
  Membrane proteins play essential roles in various biological functions within the cell. One of the most common functional regulations involves the dimerization of two single-pass transmembrane (TM) helices. Glycophorin A (GpA) and amyloid precursor protein (APP) form TM homodimers in the membrane, which have been investigated both experimentally and computationally. The homodimer structures are well characterized using only four collective variables (CVs) when each TM helix is stable. The CVs are the interhelical distance, the crossing angle, and the Crick angles for two TM helices. However, conformational sampling with multi-dimensional replica-exchange umbrella sampling (REUS) requires too many replicas to sample all the CVs for exploring the conformational landscapes. Here, we show that the bias-exchange adaptively biased molecular dynamics (BE-ABMD) with the four CVs effectively explores the free-energy landscapes of the TM helix dimers of GpA, wild-type APP and its mutants in the IMM1 implicit membrane. Compared to the original ABMD, the bias-exchange algorithm in BE-ABMD can provide a more rapidly converged conformational landscape. The BE-ABMD simulations could also reveal TM packing interfaces of the membrane proteins and the dependence of the free-energy landscapes on the membrane thickness. This approach is valuable for numerous other applications, including those involving explicit solvent and a lipid bilayer in all-atom force fields or Martini coarse-grained models, and enhances our understanding of protein-protein interactions in biological membranes.
- SPANA: Spatial decomposition analysis for cellular‐scale molecular dynamics simulations
  
  Isseki Yu, Takaharu Mori, Daisuke Matsuoka, Donatas Surblys, Yuji Sugita
  
  J. Comp. Chem. 45(8), 498-505 (2024)
  
  The rapid increase in computational power with the latest supercomputers has enabled atomistic molecular dynamics (MD) simulations of biomolecules in biological membranes, cytoplasm, and other cellular environments. These environments often contain a million or more atoms to be simulated simultaneously. Therefore, their trajectory analyses involve heavy computations that can become a bottleneck in the computational studies. Spatial decomposition analysis (SPANA) is a set of analysis tools in the Generalized-Ensemble Simulation System (GENESIS) software package that can carry out MD trajectory analyses of large-scale biological simulations using multiple CPU cores in parallel. SPANA applies the spatial decomposition of a large biological system to distribute structural and dynamical analyses across individual CPU cores, which significantly reduces the computational time and the memory size. SPANA opens new possibilities for detailed atomistic analyses of biomacromolecules as well as solvent water molecules, ions, and metabolites in MD simulation trajectories of very large biological systems containing millions of atoms in cellular environments.
- Deciphering the Multi-state Conformational Equilibrium of HDM2 in the Regulation of p53 Binding: Perspectives from Molecular Dynamics Simulation and NMR Analysis
  
  Kazuki Watanabe, Qingci Zhao, Ryosuke Iwatsuki, Ryota Fukui, Weitong Ren, Yuji Sugita, Noritaka Nishida
  
  J. Am. Chem. Soc. 146(14), 9790-9800 (2024)
  
  HDM2 negatively regulates the activity of the tumor suppressor p53. Previous NMR studies have shown that apo-HDM2 interconverts between an “open” state in which the N-terminal “lid” is disordered and a “closed” state in which the lid covers the p53-binding site in the core region. Molecular dynamics (MD) simulation studies have been performed to elucidate the conformational dynamics of HDM2, but the direct relevance of the experimental and computational analyses is unclear. In addition, how the phosphorylation of S17 in the lid contributes to the inhibition of p53 binding remains controversial. Here, we used both NMR and MD simulations to investigate the conformational dynamics of apo-HDM2. The NMR analysis revealed that apo-HDM2 exists in a fast-exchanging equilibrium within two closed states, closed 1 and closed 2, in addition to a previously demonstrated slow-exchanging “open-closed” equilibrium. MD simulations visualized two characteristic closed states, where the spatial orientation of the key residues corresponds well to the chemical shift changes of the NMR spectra. Furthermore, the phosphorylation of S17 induced an equilibrium shift toward closed 1, thereby suppressing the binding of p53 to HDM2. This study reveals a multi-state equilibrium of apo-HDM2 and provides new insights into the regulation mechanism of HDM2-p53 interactions.
- Pseudo-Luciferase Activity of the SARS-CoV-2 Spike Protein for Cypridina Luciferin
  
  Ryo Nishihara, Hisham M Dokainish, Yoshiki Kihara, Hiroki Ashiba, Yuji Sugita, Ryoji Kurita
  
  ACS Central Science 10(2), 283-290 (2024)
  
  Enzymatic reactions that involve a luminescent substrate (luciferin) and enzyme (luciferase) from luminous organisms enable a luminescence detection of target proteins and cells with high specificity, albeit that conventional assay design requires a prelabeling of target molecules with luciferase. Here, we report a luciferase-independent luminescence assay in which the target protein directly catalyzes the oxidative luminescence reaction of luciferin. The SARS-CoV-2 antigen (spike) protein catalyzes the light emission of Cypridina luciferin, whereas no such catalytic function was observed for salivary proteins. This selective luminescence reaction is due to the enzymatic recognition of the 3-(1-guanidino)propyl group in luciferin at the interfaces between the units of the spike protein, allowing a specific detection of the spike protein in human saliva without sample pretreatment. This method offers a novel platform to detect virus antigens simply and rapidly without genetic manipulation or antibodies.
- Similarity scores of vibrational spectra reveal the atomistic structure of pentapeptides in multiple basins
  
  H Otaki, S Ishiuchi, M Fujii, Y Sugita, K Yagi
  
  Phys. Chem. Chem. Phys. 26(13), 9906-9914 (2024)
  
  Vibrational spectroscopy combined with theoretical calculations is a powerful tool for analyzing the interaction and conformation of peptides at the atomistic level. Nonetheless, identifying the structure becomes increasingly difficult as the peptide size grows large. One example is acetyl-SIVSF-N-methylamide, a capped pentapeptide, whose atomistic structure has remained unknown since its first observation [T. Sekiguchi, M. Tamura, H. Oba, P. Çarçarbal, R. R. Lozada-Garcia, A. Zehnacker-Rentien, G. Grégoire, S. Ishiuchi and M. Fujii, Angew. Chem., Int. Ed., 2018, 57, 5626–5629]. Here, we propose a novel conformational search method, which exploits the structure-spectrum correlation using a similarity score that measures the agreement of theoretical and experimental spectra. Surprisingly, the two conformers have distinctly different energy and geometry. The second conformer is 25 kJ mol−1 higher in energy than the other, lowest-energy conformer. The result implies that there are multiple pathways in the early stage of the folding process: one to the global minimum and the other to a different basin. Once such a structure is established, the second conformer is unlikely to overcome the barrier to produce the most stable structure due to a vastly different hydrogen bond network of the backbone. Our proposed method can characterize the lowest-energy conformer and kinetically trapped, high-energy conformers of complex biomolecules.
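
The similarity-score idea in the last entry above can be illustrated with a minimal sketch. The score actually used in that paper is not given in the abstract, so the cosine similarity between two spectra on a common wavenumber grid is used here as a stand-in assumption. The function names (`broaden`, `similarity_score`, `rank_conformers`), the peak positions, and the Gaussian width are all hypothetical and chosen only for illustration.

```python
import math

def broaden(peaks, grid, width=10.0):
    """Turn stick peaks [(position, intensity), ...] into a Gaussian-broadened spectrum."""
    return [
        sum(h * math.exp(-((x - p) ** 2) / (2.0 * width ** 2)) for p, h in peaks)
        for x in grid
    ]

def similarity_score(spec_a, spec_b):
    """Cosine similarity of two spectra on the same grid (assumed stand-in score)."""
    dot = sum(a * b for a, b in zip(spec_a, spec_b))
    norm = math.sqrt(sum(a * a for a in spec_a)) * math.sqrt(sum(b * b for b in spec_b))
    return dot / norm if norm > 0.0 else 0.0

def rank_conformers(experimental, candidates):
    """Sort candidate names by decreasing similarity to the experimental spectrum."""
    scores = {name: similarity_score(experimental, spec) for name, spec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative data only: a 3200-3600 cm^-1 grid and made-up peak lists.
grid = [3200.0 + 2.0 * i for i in range(201)]
experimental = broaden([(3300.0, 1.0), (3450.0, 0.6)], grid)
candidates = {
    "conformer_A": broaden([(3305.0, 1.0), (3445.0, 0.6)], grid),
    "conformer_B": broaden([(3380.0, 1.0), (3550.0, 0.6)], grid),
}
print(rank_conformers(experimental, candidates))
```

In a conformational search of this kind, a score like this would be evaluated for each candidate structure's computed spectrum, and the best-scoring candidates would be carried forward for refinement.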

2023

- Allosteric regulation of β-reaction stage I in tryptophan synthase upon the α-ligand binding
  
  Shingo Ito, Kiyoshi Yagi, Yuji Sugita
  
  J. Chem. Phys. 158, 115101 (2023)
  
  Tryptophan synthase (TRPS) is a bifunctional enzyme consisting of α- and β-subunits that catalyzes the last two steps of L-tryptophan (L-Trp) biosynthesis. The first stage of the reaction at the β-subunit is called β-reaction stage I, which converts the β-ligand from an internal aldimine [E(Ain)] to an α-aminoacrylate [E(A-A)] intermediate. The activity is known to increase 3–10-fold upon the binding of 3-indole-D-glycerol-3′-phosphate (IGP) at the α-subunit. The effect of α-ligand binding on β-reaction stage I at the distal β-active site is not well understood despite the abundant structural information available for TRPS. Here, we investigate the β-reaction stage I by carrying out minimum-energy pathway searches based on a hybrid quantum mechanics/molecular mechanics (QM/MM) model. The free-energy differences along the pathway are also examined using QM/MM umbrella sampling simulations with QM calculations at the B3LYP-D3/aug-cc-pVDZ level of theory. Our simulations suggest that the sidechain orientation of βD305 near the β-ligand likely plays an essential role in the allosteric regulation: a hydrogen bond is formed between βD305 and the β-ligand in the absence of the α-ligand, prohibiting a smooth rotation of the hydroxyl group in the quinonoid intermediate, whereas the dihedral angle rotates smoothly after the hydrogen bond is switched from βD305-β-ligand to βD305-βR141. This switch could occur upon the IGP-binding at the α-subunit, as evidenced by the existing TRPS crystal structures.
- Highly Charged Proteins and Their Repulsive Interactions Antagonize Biomolecular Condensation
  
  Cheng Tan, Ai Niitsu, Yuji Sugita
  
  JACS Au, 3, 834-848 (2023)
  
  Biomolecular condensation is involved in various cellular processes; therefore, regulation of condensation is crucial to prevent deleterious protein aggregation and maintain a stable cellular environment. Recently, a class of highly charged proteins, known as heat-resistant obscure (Hero) proteins, was shown to protect other client proteins from pathological aggregation. However, the molecular mechanisms by which Hero proteins protect other proteins from aggregation remain unknown. In this study, we performed multiscale molecular dynamics (MD) simulations of Hero11, a Hero protein, and the C-terminal low-complexity domain (LCD) of the transactive response DNA-binding protein 43 (TDP-43), a client protein of Hero11, under various conditions to examine their interactions with each other. We found that Hero11 permeates into the condensate formed by the LCD of TDP-43 (TDP-43-LCD) and induces changes in conformation, intermolecular interactions, and dynamics of TDP-43-LCD. We also examined possible Hero11 structures in atomistic and coarse-grained MD simulations and found that Hero11 with a higher fraction of disordered region tends to assemble on the surface of the condensates. Based on the simulation results, we have proposed three possible mechanisms for Hero11’s regulatory function: (i) In the dense phase, TDP-43-LCD molecules reduce contact with each other and show faster diffusion and decondensation due to the repulsive Hero11–Hero11 interactions. (ii) In the dilute phase, the saturation concentration of TDP-43-LCD is increased, and its conformation is relatively more extended and variable, induced by the attractive Hero11–TDP-43-LCD interactions. (iii) Hero11 on the surface of small TDP-43-LCD condensates can contribute to avoiding their fusion due to repulsive interactions. The proposed mechanisms provide new insights into the regulation of biomolecular condensation in cells under various conditions.
- Towards de novo design of transmembrane α-helical assemblies using structural modelling and molecular dynamics simulation
  
  Ai Niitsu, Yuji Sugita
  
  Phys. Chem. Chem. Phys., 25, 3595-3606 (2023)
  
  Computational de novo protein design involves iterative processes consisting of amino acid sequence design, structural modelling and scoring, and design validation by synthesis and experimental characterisation. Recent advances in protein structure prediction and modelling methods have enabled the highly efficient and accurate design of water-soluble proteins. However, the design of membrane proteins remains a major challenge. To advance membrane protein design, considering the higher complexity of membrane protein folding, stability, and dynamic interactions between water, ions, lipids, and proteins is an important task. For introducing explicit solvents and membranes to these design methods, all-atom molecular dynamics (MD) simulations of designed proteins provide useful information that cannot be obtained experimentally. In this review, we first describe two major approaches to designing transmembrane α-helical assemblies, consensus and de novo design. We further illustrate recent MD studies of membrane protein folding related to protein design, as well as advanced treatments in molecular models and conformational sampling techniques in the simulations. Finally, we discuss the possibility to introduce MD simulations after the existing static modelling and screening of design decoys as an additional step for refinement of the design, which considers membrane protein folding dynamics and interactions with explicit membranes.
- Structural Effects of Spike Protein D614G Mutation in SARS-CoV-2
  
  Hisham Dokainish, Yuji Sugita
  
  Biophys. J., 122, 1-11 (2023)
  
  A single mutation from aspartate to glycine at position 614 has dominated all circulating variants of the severe acute respiratory syndrome coronavirus 2. The D614G mutation induces structural changes in the spike (S) protein that strengthen the virus infectivity. Here, we use molecular dynamics simulations to dissect the effects of mutation and 630-loop rigidification on S-protein structure. The introduction of the mutation orders the 630-loop structure and thereby induces global structural changes toward the cryoelectron microscopy structure of the D614G S-protein. The ordered 630-loop weakens local interactions between the 614th residue and others in contrast to disordered structures in the wild-type protein. The mutation allosterically alters global interactions between receptor-binding domains, forming an asymmetric and mobile down conformation and facilitating transitions toward the up conformation. The loss of the salt bridge between D614 and K854 upon the mutation generally stabilizes the S-protein protomer, including the fusion peptide proximal region that mediates membrane fusion. Understanding the molecular basis of the D614G mutation is crucial as it dominates in all variants of concern, including Delta and Omicron.

2022

- Development of hidden Markov modeling method for molecular orientations and structure estimation from high-speed atomic force microscopy time-series images
  
  Tomonori Ogane, Daisuke Noshiro, Toshio Ando, Atsuko Yamashita, Yuji Sugita, Yasuhiro Matsunaga
  
  PLoS Comput. Biol., 18, e1010384 (2022)
  
  High-speed atomic force microscopy (HS-AFM) is a powerful technique for capturing the time-resolved behavior of biomolecules. However, structural information in HS-AFM images is limited to the surface geometry of a sample molecule. Inferring latent three-dimensional structures from the surface geometry is thus important for getting more insights into conformational dynamics of a target biomolecule. Existing methods for estimating the structures are based on the rigid-body fitting of candidate structures to each frame of HS-AFM images. Here, we extend the existing frame-by-frame rigid-body fitting analysis to multiple frames to exploit orientational correlations of a sample molecule between adjacent frames in HS-AFM data due to the interaction with the stage. In the method, we treat HS-AFM data as time-series data and analyze them with hidden Markov modeling. Using simulated HS-AFM images of the taste receptor type 1 as a test case, the proposed method shows a more robust estimation of molecular orientations than the frame-by-frame analysis. The method is applicable in integrative modeling of conformational dynamics using HS-AFM data.
- Use of multistate Bennett acceptance ratio method for free-energy calculations from enhanced sampling and free-energy perturbation
  
  Yasuhiro Matsunaga, Motoshi Kamiya, Hiraku Oshima, Jaewoon Jung, Shingo Ito, and Yuji Sugita
  
  Biophys. Rev., 14, 1503–1512 (2022)
  
  Multistate Bennett acceptance ratio (MBAR) works as a method to analyze molecular dynamics (MD) simulation data after the simulations have been finished. It is widely used to estimate free-energy changes between different states and averaged properties at the states of interest. MBAR allows us to treat a wide range of states from those at different temperature/pressure to those with different model parameters. Due to the broad applicability, the MBAR equations are rather difficult to apply for free-energy calculations using different types of MD simulations including enhanced conformational sampling methods and free-energy perturbation. In this review, we first summarize the basic theory of the MBAR equations and categorize the representative usages into the following four: (i) perturbation, (ii) scaling, (iii) accumulation, and (iv) full potential energy. For each, we explain how to prepare input data using MD simulation trajectories for solving the MBAR equations. MBAR is also useful to estimate reliable free-energy differences using MD trajectories based on a semi-empirical quantum mechanics/molecular mechanics (QM/MM) model and ab initio QM/MM energy calculations on the MD snapshots. We also explain how to use the MBAR software in the GENESIS package, which we call mbar_analysis, for the four representative cases. The proposed estimations of free-energy changes and thermodynamic averages are effective and useful for various biomolecular systems.
- Multiple sub-state structures of SERCA2b reveal conformational overlap at transition steps during the catalytic cycle
  
  Yuxia Zhang, Chigusa Kobayashi, Xiaohan Cai, Satoshi Watanabe, Akihisa Tsutsumi, Masahide Kikkawa, Yuji Sugita, Kenji Inaba
  
  Cell Rep., 41, 111760 (2022)
  
  Sarco/endoplasmic reticulum Ca2+ ATPase (SERCA) pumps Ca2+ into the endoplasmic reticulum (ER). Herein, we present cryo-electron microscopy (EM) structures of three intermediates of SERCA2b: Ca2+-bound phosphorylated (E1P·2Ca2+) and Ca2+-unbound dephosphorylated (E2·Pi) intermediates and another between the E2P and E2·Pi states. Our cryo-EM analysis demonstrates that the E1P·2Ca2+ state exists in low abundance and preferentially transitions to an E2P-like structure by releasing Ca2+ and that the Ca2+ release gate subsequently undergoes stepwise closure during the dephosphorylation processes. Importantly, each intermediate adopts multiple sub-state structures including those like the next one in the catalytic series, indicating conformational overlap at transition steps, as further substantiated by atomistic molecular dynamics simulations of SERCA2b in a lipid bilayer. The present findings provide insight into how enzymes accelerate catalytic cycles.
- Formation of extramembrane β-strands controls dimerization of transmembrane helices in amyloid precursor protein C99
  
  George A. Pantelopulos, Daisuke Matsuoka, Yuji Sugita, John E. Straub
  
  Proc. Natl. Acad. Sci. U.S.A., 119, e2212207119 (2022)
  
  The 99-residue C-terminal domain of amyloid precursor protein (APP-C99), precursor to amyloid beta (Aβ), is a transmembrane (TM) protein containing intrinsically disordered N- and C-terminal extramembrane domains. Using molecular dynamics (MD) simulations, we show that the structural ensemble of the C99 monomer is best described in terms of thousands of states. The C99 monomer has a propensity to form β-strand in the C-terminal extramembrane domain, which explains the slow spin relaxation times observed in paramagnetic probe NMR experiments. Surprisingly, homodimerization of C99 not only narrows the conformational ensemble from thousands to a few states through the formation of metastable β-strands in extramembrane domains but also stabilizes extramembrane α-helices. The extramembrane domain structure is observed to dramatically impact the homodimerization motif, resulting in the modification of TM domain conformations. Our study provides an atomic-level structural basis for communication between the extramembrane domains of the C99 protein and TM homodimer formation. This finding could serve as a general model for understanding the influence of disordered extramembrane domains on TM protein structure.
- Modified protein-water interactions in CHARMM36m for thermodynamics and kinetics of proteins in dilute and crowded solutions
  
  Daiki Matsubara, Kento Kasahara, Hisham Dokainish, Hiraku Oshima, Yuji Sugita
  
  Molecules, 27, 5726 (2022)
  
  Proper balance between protein-protein and protein-water interactions is vital for atomistic molecular dynamics (MD) simulations of globular proteins as well as intrinsically disordered proteins (IDPs). The overestimation of protein-protein interactions tends to make IDPs more compact than those in experiments. Likewise, multiple proteins in crowded solutions aggregate with each other too strongly. To optimize the balance, Lennard-Jones (LJ) interactions between protein and water are often increased by about 10% (with a scaling parameter, λ = 1.1) from the existing force fields. Here, we explore the optimal scaling parameter of protein-water LJ interactions for CHARMM36m in conjunction with the modified TIP3P water model, by performing enhanced sampling MD simulations of several peptides in dilute solutions and conventional MD simulations of globular proteins in dilute and crowded solutions. In our simulations, a 10% increase of the protein-water LJ interaction for CHARMM36m cannot maintain the stability of a small helical peptide, (AAQAA)3, in a dilute solution, and only a small modification of the protein-water LJ interaction, up to a 3% increase (λ = 1.03), is allowed. The modified protein-water interactions are applicable to other peptides and globular proteins in dilute solutions without changing thermodynamic properties from the original CHARMM36m. However, the modification has a great impact on the diffusive properties of proteins in crowded solutions, avoiding the formation of overly sticky protein-protein interactions.
- General Design Strategy to Precisely Control the Emission of Fluorophores via a Twisted Intramolecular Charge Transfer (TICT) Process
  
  Kenjiro Hanaoka, Shimpei Iwaki, Kiyoshi Yagi, Takuya Myochin, Takayuki Ikeno, Hisashi Ohno, Eita Sasaki, Toru Komatsu, Tasuku Ueno, Motokazu Uchigashima, Takayasu Mikuni, Kazuki Tainaka, Shinya Tahara, Satoshi Takeuchi, Tahei Tahara, Masanobu Uchiyama, Tetsuo Nagano, and Yasuteru Urano
  
  J. Am. Chem. Soc., 144, 19778–19790 (2022)
  
  Fluorogenic probes for bioimaging have become essential tools for life science and medicine, and the key to their development is a precise understanding of the mechanisms available for fluorescence off/on control, such as photoinduced electron transfer (PeT) and Förster resonance energy transfer (FRET). Here we establish a new molecular design strategy to rationally develop activatable fluorescent probes, which exhibit a fluorescence off/on change in response to target biomolecules, by controlling the twisted intramolecular charge transfer (TICT) process. This approach was developed on the basis of a thorough investigation of the fluorescence quenching mechanism of N-phenyl rhodamine dyes (commercially available as the QSY series) by means of time-dependent density functional theory (TD-DFT) calculations and photophysical evaluation of their derivatives. To illustrate and validate this TICT-based design strategy, we employed it to develop practical fluorogenic probes for HaloTag and SNAP-tag. We further show that the TICT-controlled fluorescence off/on mechanism is generalizable by synthesizing a Si–rhodamine-based fluorogenic probe for HaloTag, thus providing a palette of chemical dyes that spans the visible and near-infrared range.
- Protein folding intermediates on the dimensionality reduced landscape with UMAP and native contact likelihood
  
  Mao Oide and Yuji Sugita
  
  J. Chem. Phys., 157, 075101 (2022)
  
  To understand protein folding mechanisms from molecular dynamics (MD) simulations, it is important to explore not only folded/unfolded states but also representative intermediate structures on the conformational landscape. Here, we propose a novel approach to construct the landscape using the uniform manifold approximation and projection (UMAP) method, which reduces the dimensionality without losing data-point proximity. In the approach, native contact likelihood is used as feature variables rather than the conventional Cartesian coordinates or dihedral angles of protein structures. We tested the performance of UMAP for coarse-grained MD simulation trajectories of B1 domain in protein G and observed on-pathway transient structures and other metastable states on the UMAP conformational landscape. In contrast, these structures were not clearly distinguished on the dimensionality reduced landscape using principal component analysis or time-lagged independent component analysis. This approach is also useful to obtain dynamical information through Markov state modeling and would be applicable to large-scale conformational changes in many other biomacromolecules.
- Crystal structure of the lipid flippase MurJ in a “squeezed” form distinct from its inward- and outward-facing forms
  
  Hidetaka Kohga, Takaharu Mori, Yoshiki Tanaka, Kunihito Yoshikaie, Katsuhide Taniguchi, Kei Fujimoto, Lisa Fritz, Tanja Schneider, and Tomoya Tsukazaki
  
  Structure, 30, 1088-1097.e3 (2022)
  
  The bacterial peptidoglycan enclosing the cytoplasmic membrane is a fundamental cellular architecture. The integral membrane protein MurJ plays an essential role in flipping the cell wall building block Lipid II across the cytoplasmic membrane for peptidoglycan biosynthesis. Previously reported crystal structures of MurJ have elucidated its V-shaped inward- or outward-facing forms with an internal cavity for substrate binding. MurJ transports Lipid II using its cavity through conformational transitions between these two forms. Here, we report two crystal structures of inward-facing forms from Arsenophonus endosymbiont MurJ and an unprecedented crystal structure of Escherichia coli MurJ in a “squeezed” form, which lacks a cavity to accommodate the substrate, mainly because of the increased proximity of transmembrane helices 2 and 8. Subsequent molecular dynamics simulations supported the hypothesis that the squeezed form is an intermediate conformation. This study fills a gap in our understanding of the Lipid II flipping mechanism.
- Modified Hamiltonian in FEP calculations for reducing the computational cost of electrostatic interactions
  
  Hiraku Oshima and Yuji Sugita
  
  J. Chem. Inf. Model., 62(11), 2846 (2022)
  
  The free-energy perturbation (FEP) method predicts relative and absolute free-energy changes of biomolecules in solvation and binding with other molecules. FEP is, therefore, one of the most essential tools in in-silico drug design. In conventional FEP, to smoothly connect two thermodynamic states, the potential energy is modified as a linear combination of the end state potential energies by introducing scaling factors. When the particle mesh Ewald is used for electrostatic calculations, conventional FEP requires two reciprocal-space calculations per time step, which largely decreases the computational performance. To overcome this problem, we propose a new FEP scheme by introducing a modified Hamiltonian instead of interpolation of the end state potential energies. The scheme introduces non-uniform scaling into the electrostatic potential as used in Replica-Exchange with Solute Tempering 2 (REST2) and does not require additional reciprocal-space calculations. We tested this modified Hamiltonian in FEP calculations in several biomolecular systems. In all cases, the calculated free-energy changes with the current scheme are in good agreement with those from conventional FEP. The modified Hamiltonian in FEP greatly improves the computational performance, which is particularly marked for large biomolecular systems whose reciprocal-space calculations are the major bottleneck of total computational time.
- Practical protocols for efficient sampling of kinase-inhibitor binding pathways using two-dimensional replica-exchange molecular dynamics
  
  Ai Shinobu, Suyong Re, and Yuji Sugita
  
  Front. Mol. Biosci., 9, 878830 (2022)
  
  Molecular dynamics (MD) simulations are increasingly used to study various biological processes such as protein folding, conformational changes, and ligand binding. These processes generally involve slow dynamics that occur on the millisecond or longer timescale, which are difficult to simulate by conventional atomistic MD. Recently, we applied a two-dimensional (2D) replica-exchange MD (REMD) method, which combines the generalized replica exchange with solute tempering (gREST) with the replica-exchange umbrella sampling (REUS) in kinase-inhibitor binding simulations, and successfully observed multiple ligand binding/unbinding events. To efficiently apply the gREST/REUS method to other kinase-inhibitor systems, we establish modified, practical protocols with non-trivial simulation parameter tuning. The current gREST/REUS simulation protocols are tested for three kinase-inhibitor systems: c-Src kinase with PP1, c-Src kinase with Dasatinib, and c-Abl kinase with Imatinib. We optimized the definition of kinase-ligand distance as a collective variable (CV), the solute temperatures in gREST, and replica distributions and umbrella forces in the REUS simulations. Also, the initial structures of each replica in the 2D replica space were prepared carefully by pulling each ligand from and toward the protein binding sites for keeping stable kinase conformations. These optimizations were carried out individually in multiple short MD simulations. The current gREST/REUS simulation protocol ensures good random walks in 2D replica spaces, which are required for enhanced sampling of inhibitor dynamics around a target kinase.
- Computational analysis on the allostery of tryptophan synthase: relationship between α/β-ligand binding and distal domain closure
  
  Shingo Ito, Kiyoshi Yagi, and Yuji Sugita
  
  J. Phys. Chem. B, 126(17), 3300-3308 (2022)
  
  Tryptophan synthase (TRPS) is a bifunctional enzyme consisting of α- and β-subunits and catalyzes the last two steps of L-tryptophan (L-Trp) biosynthesis, namely, cleavage of 3-indole-D-glycerol-3′-phosphate (IGP) into indole and glyceraldehyde-3-phosphate (G3P) in the α-subunit, and a pyridoxal phosphate (PLP)-dependent reaction of indole and L-serine (L-Ser) to produce L-Trp in the β-subunit. Importantly, the IGP binding at the α-subunit affects the β-subunit conformation and its ligand-binding affinity, which, in turn, enhances the enzymatic reaction at the α-subunit. The intersubunit communications in TRPS have been investigated extensively for decades because of their fundamental and pharmaceutical importance, yet it is still difficult to answer how TRPS allostery is regulated in atomic detail. Here, we investigate the allosteric regulation of TRPS by all-atom classical molecular dynamics (MD) simulations and analyze the potential of mean force (PMF) along conformational changes of the α- and β-subunits. The present simulation has revealed a widely opened conformation of the β-subunit, which provides a pathway for L-Ser to enter into the β-active site. The IGP binding closes the α-subunit and induces a wide opening of the β-subunit, thereby enhancing the binding affinity of L-Ser to the β-subunit. Structural analyses have identified critical hydrogen bonds (HBs) at the interface of the two subunits (αG181-βS178, αP57-βR175, etc.) and HBs between the β-subunit (βT110 – βH115) and a complex of PLP and L-Ser (an α-aminoacrylate intermediate). The former HBs regulate the allosteric β-subunit opening, whereas the latter HBs are essential for closing the β-subunit in a later step. The proposed mechanism for how the interdomain communication in TRPS is realized with ligand bindings is consistent with the previous experimental data, giving a general idea to interpret the allosteric regulations in multidomain proteins.
- Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations
  
  Cheng Tan, Jaewoon Jung, Chigusa Kobayashi, Diego Ugarte La Torre, Shoji Takada, and Yuji Sugita
  
  PLoS Comput. Biol., 18(4), e1009578 (2022)
  
  Residue-level coarse-grained (CG) models have become one of the most popular tools in biomolecular simulations in the trade-off between modeling accuracy and computational efficiency. To investigate large-scale biological phenomena in molecular dynamics (MD) simulations with CG models, unified treatments of proteins and nucleic acids, as well as efficient parallel computations, are indispensable. In the GENESIS MD software, we implement several residue-level CG models, covering structure-based and context-based potentials for both well-folded biomolecules and intrinsically disordered regions. An amino acid residue in a protein is represented as a single CG particle centered at the Cα atom position, while a nucleotide in RNA or DNA is modeled with three beads. Then, a single CG particle represents around ten heavy atoms in both proteins and nucleic acids. The input data in CG MD simulations are treated as GROMACS-style input files generated from a newly developed toolbox, GENESIS-CG-tool. To optimize the performance in CG MD simulations, we utilize multiple neighbor lists, each of which is attached to a different nonbonded interaction potential in the cell-linked list method. We found that random number generation for Gaussian distributions in the Langevin thermostat is one of the bottlenecks in CG MD simulations. Therefore, we parallelize the computations with message-passing-interface (MPI) to improve the performance on PC clusters or supercomputers. We simulate Herpes simplex virus (HSV) type 2 B-capsid and chromatin models containing more than 1,000 nucleosomes in GENESIS as examples of large-scale biomolecular simulations with residue-level CG models. This framework extends accessible spatial and temporal scales by multi-scale simulations to study biologically relevant phenomena, such as genome-scale chromatin folding or phase-separated membrane-less condensations.
- Protein assembly and crowding simulations
  
  Lim Heo, Yuji Sugita, and Michael Feig
  
  Curr. Opin. Struct. Biol., 73, 102340 (2022)
  
  Proteins encounter frequent molecular interactions in biological environments. Computer simulations have become an increasingly important tool in providing mechanistic insights into how such interactions in vivo relate to their biological function. The review here focuses on simulations describing protein assembly and molecular crowding effects as two important aspects that are distinguished mainly by how specific and long-lived protein contacts are. On the topic of crowding, recent simulations have provided new insights into how crowding affects protein folding and stability, modulates enzyme activity, and affects diffusive properties. Recent studies of assembly processes focus on assembly pathways, especially for virus capsids, amyloid aggregation pathways, and the role of multivalent interactions leading to phase separation. Also discussed are technical challenges in achieving increasingly realistic simulations of interactions in cellular environments.
- The inherent flexibility of receptor binding domains in SARS-CoV-2 spike protein
  
  Hisham M Dokainish, Suyong Re, Takaharu Mori, Chigusa Kobayashi, Jaewoon Jung, and Yuji Sugita
  
  eLife, 11, e75720 (2022)
  
  Spike (S) protein is the primary antigenic target for neutralization and vaccine development for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It decorates the virus surface and undergoes large motions of its receptor binding domains (RBDs) to enter the host cell. Here, we observe Down, one-Up, one-Open, and two-Up-like structures in enhanced molecular dynamics simulations, and characterize the transition pathways via inter-domain interactions. Transient salt-bridges between RBD_A and RBD_C and the interaction with glycan at N343_B support RBD_A motions from Down to one-Up. Reduced interactions between RBD_A and RBD_B in one-Up induce RBD_B motions toward two-Up. The simulations overall agree with cryo-electron microscopy structure distributions and FRET experiments and provide hidden functional structures, namely, intermediates along the Down-to-one-Up transition with druggable cryptic pockets as well as one-Open with a maximally exposed RBD. The inherent flexibility of S-protein thus provides essential information for rational antiviral drug design or vaccine development.
- Weight average approaches for predicting dynamical properties of biomolecules
  
  Kiyoshi Yagi, Suyong Re, Takaharu Mori, and Yuji Sugita
  
  Curr. Opin. Struct. Biol., 72, 88-94 (2022)
  
  Recent advances in atomistic molecular dynamics (MD) simulations of biomolecules allow us to explore their conformational spaces widely, observing large-scale conformational fluctuations or transitions between distinct structures. To reproduce or refine experimental data using MD simulations, structure ensembles, which are characterized by multiple structures and their statistical weights on the rugged free-energy landscapes, are often used. Here, we summarize weight average approaches for various experimental measurements. Weight average approaches are now applied to hybrid quantum mechanics/molecular mechanics MD simulations to predict fast vibrational motions in a protein with a high accuracy for better understanding of molecular functions from atomic structures.
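
The weight-average idea in the last entry above can be sketched in a few lines. This is only an illustration, not the procedure of the cited review: it assumes each representative structure has a relative free energy, that its statistical weight follows a Boltzmann distribution, and that the observable has already been computed for each structure. The function names and all numerical values are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def boltzmann_weights(free_energies, temperature=300.0):
    """Normalized weights w_i proportional to exp(-F_i / kT); F_i in kcal/mol."""
    beta = 1.0 / (K_B * temperature)
    f_min = min(free_energies)  # shift by the minimum for numerical stability
    unnormalized = [math.exp(-beta * (f - f_min)) for f in free_energies]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def weighted_average(values, weights):
    """Ensemble average <A> = sum_i w_i A_i / sum_i w_i."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Illustrative numbers only: three structures with made-up free energies
# and a made-up per-structure observable (e.g. a vibrational frequency).
free_energies = [0.0, 0.5, 1.2]          # kcal/mol
observable = [1650.0, 1662.0, 1675.0]    # cm^-1
weights = boltzmann_weights(free_energies)
print(weighted_average(observable, weights))
```

The same averaging applies regardless of how the weights are obtained, for example from clustering populations or from a reweighting analysis, as long as they describe the same ensemble as the per-structure values.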


2023

Allosteric regulation of β-reaction stage I in tryptophan synthase upon the α-ligand binding

Highly Charged Proteins and Their Repulsive Interactions Antagonize Biomolecular Condensation

Towards de novo design of transmembrane α-helical assemblies using structural modelling and molecular dynamics simulation

Structural Effects of Spike Protein D614G Mutation in SARS-CoV-2

2022

Development of hidden Markov modeling method for molecular orientations and structure estimation from high-speed atomic force microscopy time-series images

Use of multistate Bennett acceptance ratio method for free-energy calculations from enhanced sampling and free-energy perturbation

Multiple sub state structures of SERCA2b reveal conformational overlap at transition steps during the catalytic cycle

Formation of extramembrane β-strands controls dimerization of transmembrane helices in amyloid precursor protein C99

Modified protein-water interactions in CHARMM36m for thermodynamics and kinetics of proteins in dilute and crowded solutions

General Design Strategy to Precisely Control the Emission of Fluorophores via a Twisted Intramolecular Charge Transfer (TICT) Process

Protein folding intermediates on the dimensionality reduced landscape with UMAP and native contact likelihood

Crystal structure of the lipid flippase MurJ in a “squeezed” form distinct from its inward- and outward-facing forms

Modified Hamiltonian in FEP calculations for reducing the computational cost of electrostatic interactions

Practical protocols for efficient sampling of kinase-inhibitor binding pathways using two-dimensional replica-exchange molecular dynamics

Computational analysis on the allostery of tryptophan synthase: relationship between α/β-ligand binding and distal domain closure

Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations

Protein assembly and crowding simulations

The inherent flexibility of receptor binding domains in SARS-CoV-2 spike protein

Weight average approaches for predicting dynamical properties of biomolecules