structural information
Recently Published Documents





2022 ◽  
Vol 40 (3) ◽  
pp. 1-30
Zhiwen Xie ◽  
Runjie Zhu ◽  
Kunsong Zhao ◽  
Jin Liu ◽  
Guangyou Zhou ◽  

Cross-lingual entity alignment has attracted considerable attention in recent years. Past studies using conventional approaches to match entities share the common problem of missing important structural information beyond entities in the modeling process. This allows graph neural network models to step in. Most existing graph neural network approaches model individual knowledge graphs (KGs) separately, with a small number of pre-aligned entities serving as anchors to connect the different KG embedding spaces. However, this characteristic can cause several major problems, including limited performance due to the insufficiency of available seed alignments and neglect of pre-aligned links that carry useful contextual information between nodes. In this article, we propose DuGa-DIT, a dual gated graph attention network with dynamic iterative training, to address these problems in a unified model. The DuGa-DIT model captures neighborhood and cross-KG alignment features by using intra-KG attention and cross-KG attention layers. With the dynamic iterative process, we can dynamically update the cross-KG attention score matrices, which enables our model to capture more cross-KG information. We conduct extensive experiments on two benchmark datasets and a case study in cross-lingual personalized search. Our experimental results demonstrate that DuGa-DIT outperforms state-of-the-art methods.

2022 ◽  
Vol 16 (4) ◽  
pp. 1-16
Fereshteh Jafariakinabad ◽  
Kien A. Hua

The syntactic structure of sentences in a document is highly informative about its author's writing style. Sentence representation learning has been widely explored in recent years and has been shown to improve generalization on different downstream tasks across many domains. Even though probing methods used in several studies suggest that these learned contextual representations implicitly encode some amount of syntax, explicit syntactic information further improves the performance of deep neural models in the domain of authorship attribution. These observations have motivated us to investigate the explicit representation learning of the syntactic structure of sentences. In this article, we propose a self-supervised framework for learning structural representations of sentences. The self-supervised network contains two components: a lexical sub-network and a syntactic sub-network, which take the sequence of words and their corresponding structural labels as input, respectively. Due to the n-to-1 mapping of words to their structural labels, each word will be embedded into a vector representation which mainly carries structural information. We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task. Our experimental results indicate that the structural embeddings significantly improve performance on the classification tasks when concatenated with the existing pre-trained word embeddings.

Crystals ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 110
Amit Kumar ◽  
Xu Zhang ◽  
Oscar Vadas ◽  
Fisentzos A. Stylianou ◽  
Nicolas Dos Santos Pacheco ◽  

A model for parasitic motility has been proposed in which parasite filamentous actin (F-actin) is attached to surface adhesins by a large component of the glideosome, known as the glideosome-associated connector protein (GAC). This large 286 kDa protein interacts at the cytoplasmic face of the plasma membrane with the phosphatidic acid-enriched inner leaflet and cytosolic tails of surface adhesins to connect them to the parasite actomyosin system. GAC is initially observed at the conoid at the apical pole and re-localises with the glideosome to the basal pole in gliding parasites. GAC presumably functions in force transmission to surface adhesins in the plasma membrane and not in force generation. Proper connection between F-actin and the adhesins is as important for motility and invasion as motor operation itself. This notion highlights the need for new structural information on GAC interactions, which has eluded the field since its discovery. We have obtained crystals that diffracted to 2.6–2.9 Å for full-length GAC from Toxoplasma gondii in native and selenomethionine-labelled forms. These crystals belong to space group P2₁2₁2₁; cell dimensions are roughly a = 119 Å, b = 123 Å, c = 221 Å, α = 90°, β = 90° and γ = 90° with 1 molecule per asymmetric unit, suggesting a more compact conformation than previously proposed.
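As a rough plausibility check on the cell parameters quoted above, the Matthews coefficient and solvent fraction can be estimated from the cell volume and molecular mass. This is only a sketch, not the authors' analysis. It assumes four asymmetric units per cell (the standard multiplicity of the orthorhombic space group given in the abstract) and uses the conventional approximation solvent ≈ 1 − 1.23/V_M. The function name `matthews` is hypothetical.

```python
# Approximate values taken from the abstract.
A, B, C = 119.0, 123.0, 221.0   # cell edges in Å (all angles 90°)
MW_DA = 286_000.0               # molecular mass of full-length GAC in Da
Z = 4                           # asymmetric units per cell (assumed, orthorhombic)
N_MOL = 1                       # molecules per asymmetric unit


def matthews(a, b, c, mw_da, z, n_mol):
    """Return (cell volume in Å^3, Matthews coefficient in Å^3/Da, solvent fraction)."""
    volume = a * b * c                      # orthorhombic cell: V = a * b * c
    vm = volume / (z * n_mol * mw_da)       # crystal volume per dalton of protein
    solvent = 1.0 - 1.23 / vm               # conventional protein-density approximation
    return volume, vm, solvent


volume, vm, solvent = matthews(A, B, C, MW_DA, Z, N_MOL)
print(f"V = {volume:.0f} Å^3, V_M = {vm:.2f} Å^3/Da, solvent ≈ {solvent:.0%}")
```

Under these assumptions the estimate falls within the range commonly seen for protein crystals, which is consistent with one molecule per asymmetric unit.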

2022 ◽  
Fred Lee ◽  
Xinhao Shao ◽  
Yu Gao ◽  
Alexandra Naba

The extracellular matrix (ECM) is a complex and dynamic meshwork of proteins providing structural support to cells. It also provides biochemical signals governing cellular processes including proliferation and migration. Alterations of ECM structure and/or composition have been shown to lead to, or accompany, many pathological processes including cancer and fibrosis. To understand how the ECM contributes to diseases, we first need to obtain a comprehensive characterization of the ECM of tissues and of its changes during disease progression. Over the past decade, mass-spectrometry-based proteomics has become the state-of-the-art method to profile the protein composition of ECMs. However, existing methods do not fully capture the broad dynamic range of protein abundance in the ECM, nor do they achieve the high coverage needed to gain finer biochemical information, including the presence of isoforms or post-translational modifications. In addition, broadly adopted proteomic methods relying on extended trypsin digestion do not provide structural information on ECM proteins, yet gaining insights into ECM protein structure is critical to better understanding protein functions. Here, we present the optimization of a time-lapsed proteomic method using limited proteolysis of partially denatured samples and the sequential release of peptides to achieve superior sequence coverage as compared to the standard ECM proteomic workflow. Exploiting the spatio-temporal resolution of this method, we further demonstrate how 3-dimensional time-lapsed peptide mapping can identify protein regions differentially susceptible to trypsin and can thus identify sites of post-translational modifications, including protein-protein interactions. We further illustrate how this approach can be leveraged to gain insight into the role of the novel ECM protein SNED1 in ECM homeostasis. We found that the expression of SNED1 by mouse embryonic fibroblasts results in the alteration of overall ECM composition and the sequence coverage of certain ECM proteins, raising the possibility that SNED1 could modify accessibility to trypsin by engaging in protein-protein interactions.

2022 ◽  
Yuxuang Zhang ◽  
Qianqian Fang

Significance: Rapid advances in biophotonics techniques require quantitative, model-based computational approaches to obtain functional and structural information from increasingly complex and multi-scaled anatomies. The lack of efficient tools to accurately model tissue structures and subsequently perform quantitative multi-physics modeling greatly impedes the clinical translation of these modalities. Aim: While the mesh-based Monte Carlo (MMC) method expands our capabilities in simulating complex tissues by using tetrahedral meshes, the generation of such domains often requires specialized meshing tools such as Iso2Mesh. Creating a simplified and intuitive interface for tissue anatomical modeling and optical simulations is essential towards making these advanced modeling techniques broadly accessible to the user community. Approach: We responded to the above challenge by combining the powerful, open-source 3-D modeling software, Blender, with state-of-the-art 3-D mesh generation and MC simulation tools, utilizing the interactive graphical user interface (GUI) in Blender as the front-end to allow users to create complex tissue mesh models, and subsequently launch MMC light simulations. Results: We have developed a Python-based Blender add-on -- BlenderPhotonics -- to interface with Iso2Mesh and MMC, allowing users to create, configure and refine complex simulation domains and run hardware-accelerated 3-D light simulations with only a few clicks. In this tutorial, we provide a comprehensive introduction to this new tool and walk readers through 5 examples, ranging from simple shapes to sophisticated realistic tissue models. Conclusion: BlenderPhotonics is user-friendly and open-source, leveraging the vastly rich ecosystem of Blender. It wraps advanced modeling capabilities within an easy-to-use and interactive interface. The latest software can be downloaded at

Minerals ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 97
Georgy Alexandrovich Peshkov ◽  
Evgeny Mikhailovich Chekhonin ◽  
Dimitri Vladilenovich Pissarenko

Some of the simplifying assumptions frequently used in basin modelling may adversely impact the quality of the constructed models. One such common assumption consists of using a laterally homogeneous crustal basement, despite the fact that lateral variations in its properties may significantly affect the thermal evolution of the model. We propose a new method for the express evaluation of the impact of the basement’s heterogeneity on thermal history reconstruction and on the assessment of maturity of the source rock. The proposed method is based on reduced-rank inversion, aimed at a simultaneous reconstruction of the petrophysical properties of the heterogeneous basement and of its geometry. The method uses structural information taken from geological maps of the basement and gravity anomaly data. We applied our method to a data collection from Western Siberia and carried out a two-dimensional reconstruction of the evolution of the basin and of the lithosphere. We performed a sensitivity analysis of the reconstructed basin model to assess the effect of uncertainties in the basement’s density and its thermal conductivity on the model’s predictions. The proposed method can be used as an express evaluation tool to assess the necessity and relevance of laterally heterogeneous parametrisations prior to costly three-dimensional full-rank basin modelling. The method is generally applicable to extensional basins except for salt tectonic provinces.

Data ◽  
2022 ◽  
Vol 7 (1) ◽  
pp. 10
Davide Buffelli ◽  
Fabio Vandin

Graph Neural Networks (GNNs) rely on the graph structure to define an aggregation strategy where each node updates its representation by combining information from its neighbours. A known limitation of GNNs is that, as the number of layers increases, information gets smoothed and squashed and node embeddings become indistinguishable, negatively affecting performance. Therefore, practical GNN models employ few layers and only leverage the graph structure in terms of limited, small neighbourhoods around each node. Inevitably, practical GNNs do not capture information depending on the global structure of the graph. While there have been several works studying the limitations and expressivity of GNNs, the question of whether practical applications on graph structured data require global structural knowledge or not remains unanswered. In this work, we empirically address this question by giving access to global information to several GNN models, and observing the impact it has on downstream performance. Our results show that global information can in fact provide significant benefits for common graph-related tasks. We further identify a novel regularization strategy that leads to an average accuracy improvement of more than 5% on all considered tasks.
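The over-smoothing effect described above can be illustrated with a toy example. The sketch below is not the authors' model: it applies plain mean aggregation with self-loops (no learned weights, no non-linearity) to scalar node features on a six-node path graph, and tracks how the spread between node values shrinks as layers are stacked. The graph, the features, and all names are assumptions chosen for illustration.

```python
def mean_aggregate(features, neighbours):
    """One message-passing layer: each node averages itself and its neighbours."""
    return [
        (features[i] + sum(features[j] for j in nbrs)) / (len(nbrs) + 1)
        for i, nbrs in enumerate(neighbours)
    ]


def spread(features):
    """Gap between the largest and smallest node value."""
    return max(features) - min(features)


# Path graph 0-1-2-3-4-5, given as adjacency lists (hypothetical toy graph).
neighbours = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
# Two clearly separated groups of nodes.
features = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

spreads = [spread(features)]
for _ in range(64):                      # stack 64 aggregation layers
    features = mean_aggregate(features, neighbours)
    spreads.append(spread(features))

for depth in (0, 4, 16, 64):
    print(f"layers={depth:2d}  spread={spreads[depth]:.4f}")
```

With enough layers every node converges towards the same value, so the two groups can no longer be told apart from their embeddings. This is why practical GNNs stay shallow and therefore see only small neighbourhoods.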

2022 ◽  
Vol 105 (2) ◽  
Feng Zhang ◽  
Jiale Zhang ◽  
Hongmei Jing ◽  
Zhipeng Li ◽  
Dawei Wang ◽  

2022 ◽  
Ali A Kermani ◽  
Olive E. Burata ◽  
B Ben Koff ◽  
Akiko Koide ◽  
Shohei Koide ◽  

Proteins from the bacterial small multidrug resistance (SMR) family are proton-coupled exporters of diverse antiseptics and antimicrobials, including polyaromatic cations and quaternary ammonium compounds. The transport mechanism of the Escherichia coli transporter, EmrE, has been studied extensively, but a lack of high-resolution structural information has impeded a structural description of its molecular mechanism. Here we apply a novel approach, multipurpose crystallization chaperones, to solve several structures of EmrE, including a 2.9 Å structure at low pH without substrate. We report five additional structures in complex with structurally diverse transported substrates, including quaternary phosphonium, quaternary ammonium, and planar polyaromatic compounds. These structures show that binding site tryptophan and glutamate residues adopt different rotamers to conform to disparate structures without requiring major rearrangements of the backbone structure. Structural and functional comparison to Gdx-Clo, an SMR protein that transports a much narrower spectrum of substrates, suggests that in EmrE, a relatively sparse hydrogen bond network among binding site residues permits increased sidechain flexibility.

2022 ◽  
Vol 13 (1) ◽  
Jianxin Liu ◽  
Jiayi Tian ◽  
Christopher Perry ◽  
April L. Lukowski ◽  
Tzanko I. Doukov ◽  

Rieske oxygenases exploit the reactivity of iron to perform chemically challenging C–H bond functionalization reactions. Thus far, only a handful of Rieske oxygenases have been structurally characterized and remarkably little information exists regarding how these enzymes use a common architecture and set of metallocenters to facilitate a diverse range of reactions. Herein, we detail how two Rieske oxygenases SxtT and GxtA use different protein regions to influence the site-selectivity of their catalyzed monohydroxylation reactions. We present high resolution crystal structures of SxtT and GxtA with the native β-saxitoxinol and saxitoxin substrates bound in addition to a Xenon-pressurized structure of GxtA that reveals the location of a substrate access tunnel to the active site. Ultimately, this structural information allowed for the identification of six residues distributed between three regions of SxtT that together control the selectivity of the C–H hydroxylation event. Substitution of these residues produces a SxtT variant that is fully adapted to exhibit the non-native site-selectivity and substrate scope of GxtA. Importantly, we also found that these selectivity regions are conserved in other structurally characterized Rieske oxygenases, providing a framework for predictively repurposing and manipulating Rieske oxygenases as biocatalysts.
