Updated MS²PIP web server delivers fast and accurate MS² peak intensity prediction for multiple fragmentation methods, instruments and labeling techniques

2019 ◽  
Vol 47 (W1) ◽  
pp. W295-W299 ◽  
Author(s):  
Ralf Gabriels ◽  
Lennart Martens ◽  
Sven Degroeve

Abstract MS²PIP is a data-driven tool that accurately predicts peak intensities for a given peptide's fragmentation mass spectrum. Since the release of the MS²PIP web server in 2015, we have brought significant updates to both the tool and the web server. In addition to the original models for CID and HCD fragmentation, we have added specialized models for the TripleTOF 5600+ mass spectrometer, for TMT-labeled peptides, for iTRAQ-labeled peptides, and for iTRAQ-labeled phosphopeptides. Because the fragmentation pattern is heavily altered in each of these cases, these additional models greatly improve the prediction accuracy for their corresponding data types. We have also substantially reduced the computational resources required to run MS²PIP, and have completely rebuilt the web server, which now allows predictions of up to 100 000 peptide sequences in a single request. The MS²PIP web server is freely available at https://iomics.ugent.be/ms2pip/.


2021 ◽  
Author(s):  
I-Chun Sun ◽  
Renchi Cheng ◽  
Kuo-Shen Chen

Abstract The quality of machined products largely depends on the condition of the machine in various respects. Appropriate condition monitoring is therefore essential for both quality control and longevity assessment. Recently, with advances in artificial intelligence and computational power, data-driven status monitoring and prognosis have become more practical. However, unlike machine vision and image processing, where the data types are fixed and the performance indexes are already well defined, sensor selection and indexes for machine tools are varied and not yet standardized. Without appropriate domain knowledge to guide the selection of sensors and performance indexes, a purely data-driven approach may suffer from unsatisfactory prediction accuracy, a need for excessive training data, and the possibility of misjudgment. This is a key obstacle to promoting data-driven prognosis in the broader intelligent manufacturing field. In this work, status monitoring and prediction of cutter wear are investigated to address these concerns and demonstrate possible solutions, using a 5-axis machining center equipped with milling cutters at different wear levels. Transducers including accelerometers, microphones, a current transformer, and acoustic emission sensors are mounted on the spindle, fixture, and nearby structures to monitor the milling process. The collected data are processed to extract various signatures, and the dominant indexes are identified. Finally, three multilayer perceptron (MLP) artificial neural network models are established. These models, trained on different input features, are compared to examine the influence of the selected sensors and indexes on prediction accuracy. The results show that with appropriate sensors and signatures, the model can achieve better predictions even with less experimental data. Therefore, a proper selection of indexes, guided by physics-based experiments or theoretical investigation, is critical.
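As an illustration of the modeling stage described above, a minimal sketch of an MLP classifier trained on extracted signal signatures might look as follows. All feature values and wear-level labels here are synthetic placeholders, not the paper's data, and the layer sizes are arbitrary assumptions:

```python
# Sketch: classify cutter wear level from hand-picked signal signatures
# (e.g., RMS vibration, spectral peaks). Data below is synthetic.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Three wear levels; each sample is a vector of extracted signatures.
n_per_class, n_features = 60, 6
X = np.vstack([rng.normal(loc=level, scale=0.5, size=(n_per_class, n_features))
               for level in range(3)])
y = np.repeat([0, 1, 2], n_per_class)

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0),
)
model.fit(X, y)
print(model.score(X, y))  # training accuracy
```

Comparing several such models, each trained on a different subset of signatures, mirrors the paper's approach of examining how sensor and index choices affect prediction accuracy.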


2021 ◽  
Vol 8 ◽  
Author(s):  
Kazuyoshi Ikeda ◽  
Takuo Doi ◽  
Masami Ikeda ◽  
Kentaro Tomii

Despite abundant computational resources and the huge amount of compound–protein interaction (CPI) data, constructing appropriate datasets for training and evaluating CPI prediction models is not always easy. For this study, we have developed a web server that facilitates the development and evaluation of prediction models by providing an appropriate dataset for the task at hand. Our web server provides an environment that helps model developers and evaluators obtain a suitable dataset covering both proteins and compounds, along with the attributes necessary for deep learning. Through the web server interface, users can customize a CPI dataset derived from ChEMBL by setting positive and negative thresholds, adjusted according to the user's definitions. We have also implemented a function that graphically displays the distribution of activity values in the dataset as a histogram, so that appropriate thresholds for positive and negative examples can be set. These functions enable the effective development and evaluation of models. Furthermore, users can prepare task-specific datasets by selecting a set of target proteins based on various criteria, such as Pfam families, ChEMBL's classification, and sequence similarities. The accuracy and efficiency of in silico screening and drug design using machine learning, including deep learning, can therefore be improved by facilitating access to an appropriate dataset prepared using our web server (https://binds.lifematics.work/).
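The positive/negative thresholding idea described above can be sketched as follows. The threshold values, field names, and records are illustrative assumptions, not the server's actual defaults or schema:

```python
# Sketch: label compound-protein pairs as positive/negative examples from
# activity values (e.g., pChEMBL-like scores). Values here are made up.
records = [
    {"compound": "C1", "protein": "P1", "pchembl": 7.2},
    {"compound": "C2", "protein": "P1", "pchembl": 4.1},
    {"compound": "C3", "protein": "P2", "pchembl": 5.6},
]

POS_THRESHOLD = 6.5   # >= this -> positive (active)
NEG_THRESHOLD = 5.0   # <= this -> negative (inactive)

def label(rec):
    if rec["pchembl"] >= POS_THRESHOLD:
        return "positive"
    if rec["pchembl"] <= NEG_THRESHOLD:
        return "negative"
    return None  # ambiguous mid-range values are dropped

dataset = [(r["compound"], r["protein"], label(r)) for r in records if label(r)]
print(dataset)  # [('C1', 'P1', 'positive'), ('C2', 'P1', 'negative')]
```

Adjusting the two thresholds changes how strictly ambiguous mid-range activities are excluded, which is exactly the trade-off the histogram view is meant to inform.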


2016 ◽  
Vol 1 (1) ◽  
pp. 001
Author(s):  
Harry Setya Hadi

String searching is a common operation in computing, since text is the main form of data storage. Boyer-Moore, which matches the search string from right to left, is considered the most efficient method in practice, and it is among the algorithms with the best theoretical results for matching from a specified direction. A web server is a system connected to a computer network and accessed by multiple users in different places, with both good and bad intentions. Every activity performed by a user is stored in the web server logs. The log reports contained in the web server can help a web server administrator search for web request errors. A web server log is a record of the activities of a website, containing data associated with IP addresses, access times, pages opened, activities, and access methods. From the large amount of data contained in the resulting logs, useful information can be extracted.
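The right-to-left matching described above can be sketched with the Boyer-Moore-Horspool simplification of Boyer-Moore, which keeps the right-to-left comparison and the bad-character shift. This is a minimal illustration, not the paper's implementation:

```python
# Boyer-Moore-Horspool: compare the pattern right-to-left against the text,
# and on mismatch skip ahead using the last occurrence of the text character.
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return 0 if m == 0 else -1
    # Shift table: for each character in pattern (except the last position),
    # the distance from its last occurrence to the end of the pattern.
    shift = {c: m - i - 1 for i, c in enumerate(pattern[:-1])}
    i = m - 1  # index in text aligned with the last pattern character
    while i < n:
        j, k = m - 1, i
        while j >= 0 and text[k] == pattern[j]:  # compare right to left
            j -= 1
            k -= 1
        if j < 0:
            return k + 1  # full match
        i += shift.get(text[i], m)  # bad-character skip
    return -1

print(horspool_search("GET /index.html HTTP/1.1", "index"))  # 5
```

Searching a web server log line this way illustrates why the method is efficient in practice: on a mismatch it can skip up to a full pattern length at once.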


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shawn Gu ◽  
Tijana Milenković

Abstract Background Network alignment (NA) can transfer functional knowledge between species’ conserved biological network regions. Traditional NA assumes that it is topological similarity (isomorphic-like matching) between network regions that corresponds to the regions’ functional relatedness. However, we recently found that functionally unrelated proteins are as topologically similar as functionally related proteins. So, we redefined NA as a data-driven method called TARA, which learns from network and protein functional data what kind of topological relatedness (rather than similarity) between proteins corresponds to their functional relatedness. TARA used topological information (within each network) but not sequence information (between proteins across networks). Yet, TARA yielded higher protein functional prediction accuracy than existing NA methods, even those that used both topological and sequence information. Results Here, we propose TARA++, which is also data-driven, like TARA and unlike other existing methods, but which uses across-network sequence information on top of within-network topological information, unlike TARA. To handle this within-and-across-network analysis, we adapt social network embedding to the problem of biological NA. TARA++ outperforms existing methods in protein functional prediction accuracy. Conclusions As such, combining research knowledge from different domains is promising. Overall, improvements in protein functional prediction have biomedical implications, for example allowing researchers to better understand how cancer progresses or how humans age.


Information ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 259
Author(s):  
Ioannis Drivas ◽  
Dimitrios Kouis ◽  
Daphne Kyriaki-Manessi ◽  
Georgios Giannakopoulos

While the digitalization of cultural organizations is in full swing, it is common knowledge that websites can be used as a beacon to expand the awareness and consideration of their services on the Web. Nevertheless, recent research results indicate the managerial difficulties in deploying strategies for expanding the discoverability, visibility, and accessibility of these websites. In this paper, a three-stage data-driven Search Engine Optimization (SEO) schema is proposed to assess the performance of Libraries, Archives, and Museums (LAMs) websites, thus helping administrators expand their discoverability, visibility, and accessibility within the Web realm. To do so, the authors examine the performance of 341 related websites from all over the world based on three factors: Content Curation, Speed, and Security. In the first stage, a statistically reliable and consistent assessment schema for evaluating the SEO performance of LAMs websites through the integration of more than 30 variables is presented. In the second stage, a descriptive data summarization provides initial performance estimates of the examined websites for each factor. In the third stage, predictive regression models are developed to understand and compare the SEO performance of three different Content Management Systems, namely Drupal, WordPress, and custom approaches, that LAMs websites have adopted. The results of this study constitute a solid stepping-stone for both practitioners and researchers to adopt and improve such methods, which focus on end-users and boost organizational structures and cultures that rely on data-driven approaches for expanding the visibility of LAMs services.
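The third-stage regression comparing CMS types can be sketched as an ordinary least-squares fit on one-hot CMS indicators plus other performance factors. All data and effect sizes below are synthetic, purely to illustrate the setup, not the study's results:

```python
# Sketch: regress a synthetic SEO performance score on CMS type (dummy-coded,
# with "drupal" as baseline) and one continuous factor (a speed score).
import numpy as np

rng = np.random.default_rng(1)
n = 90
cms = rng.integers(0, 3, size=n)        # 0=drupal, 1=wordpress, 2=custom
speed = rng.normal(70, 10, size=n)      # e.g., a speed audit score
score = 20 + 5 * (cms == 1) + 0.3 * speed + rng.normal(0, 2, size=n)

# Design matrix: intercept, two CMS dummies, speed.
X = np.column_stack([np.ones(n), cms == 1, cms == 2, speed])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
print(dict(zip(["intercept", "wordpress", "custom", "speed"], coef.round(2))))
```

The fitted dummy coefficients estimate how much each CMS group differs from the baseline after controlling for the continuous factors, which is the comparison the abstract describes.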


2020 ◽  
Vol 4 (2) ◽  
pp. 5 ◽  
Author(s):  
Ioannis C. Drivas ◽  
Damianos P. Sakas ◽  
Georgios A. Giannakopoulos ◽  
Daphne Kyriaki-Manessi

In the Big Data era, search engine optimization deals with the encapsulation of datasets related to website performance in terms of architecture, content curation, and user behavior, with the purpose of converting them into actionable insights and improving visibility and findability on the Web. In this respect, big data analytics expands the opportunities for developing new methodological frameworks composed of valid, reliable, and consistent analytics that are practically useful for developing well-informed strategies for organic traffic optimization. In this paper, a novel methodology is implemented in order to increase organic search engine visits based on the impact of multiple SEO factors. To achieve this purpose, the authors examined 171 cultural heritage websites and the retrieved analytics about their performance and user experience. Massive Web-based collections are included and presented by cultural heritage organizations through their websites. Users interact with these collections, producing behavioral analytics in a variety of data types that come from multiple devices, at high velocity, and in large volumes. Nevertheless, prior research efforts indicate that these massive cultural collections are difficult to browse while exhibiting low visibility and findability in the semantic Web era. Against this backdrop, this paper proposes the computational development of a search engine optimization (SEO) strategy that utilizes the generated big cultural data analytics and improves the visibility of cultural heritage websites. Going one step further, the statistical results of the study are integrated into a predictive model composed of two stages. First, a fuzzy cognitive mapping process is generated as an aggregated macro-level descriptive model. Second, a micro-level data-driven agent-based model follows. The purpose of the model is to predict the most effective combinations of factors that achieve enhanced visibility and organic traffic on cultural heritage organizations’ websites. To this end, the study contributes to the knowledge expansion of researchers and practitioners in the big cultural analytics sector, with the purpose of implementing potential strategies for greater visibility and findability of cultural collections on the Web.


2009 ◽  
Vol 43 (1) ◽  
pp. 203-205 ◽  
Author(s):  
Chetan Kumar ◽  
K. Sekar

The identification of sequence (amino acid or nucleotide) motifs occurring in a particular order in biological sequences has proved to be of interest. This paper describes a computing server, SSMBS, which can locate and display the occurrences of user-defined biologically important sequence motifs (a maximum of five) present in a specific order in protein and nucleotide sequences. While the server can efficiently locate motifs specified using regular expressions, it can also find occurrences of long and complex motifs. The computation is carried out by an algorithm developed using the concepts of quantifiers in regular expressions. The web server is available to users around the clock at http://dicsoft1.physics.iisc.ernet.in/ssmbs/.
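The idea of locating motifs in a specified order using regular-expression quantifiers can be sketched as follows. This is a hypothetical illustration of the concept, not the SSMBS implementation, and the example sequence and motifs are made up:

```python
# Sketch: find user-defined motifs occurring in a fixed order by joining the
# motif patterns (which may themselves use quantifiers) with non-greedy gaps.
import re

def find_ordered_motifs(sequence, motifs):
    """Return (start, end) spans of each motif if all occur in order, else None."""
    pattern = ".*?".join(f"({m})" for m in motifs)
    match = re.search(pattern, sequence)
    if not match:
        return None
    return [match.span(i + 1) for i in range(len(motifs))]

# Example: two motifs, the second using a bounded quantifier (.{2}).
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
print(find_ordered_motifs(seq, ["KQ", "S.{2}K"]))  # [(7, 9), (12, 16)]
```

The non-greedy gaps (`.*?`) keep each motif anchored as early as possible after the previous one, which preserves the required ordering between motifs.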


Author(s):  
Saikou Y Diallo ◽  
Ross Gore ◽  
Jose J Padilla ◽  
Hamdi Kavak ◽  
Christopher J Lynch

The process of developing and running simulations needs to become simple and accessible to audiences ranging from middle school students in a learning environment to subject matter experts in order to make the benefits of modeling and simulation commonly available. However, current simulations are for the most part developed and run on platforms that are: (1) demanding in terms of computational resources, (2) difficult for general audiences to use owing to unintuitive interfaces mired in mathematical syntax, (3) expensive to acquire and maintain and (4) hard to interoperate and compose. The result is a four-dimensional expense that makes simulation inaccessible to the general public. In this paper we show that by embracing the web and its standards, the use and development of simulations can become democratized and be part of a Web of Simulation where people of all skill levels are able to build, upload, retrieve, rate, and connect simulations. We show how the Web of Simulation can be built using the three basic principles of service orientation, platform independence, and interoperability. Finally, we present strategies for implementing the Web of Simulation and discuss challenges and possible approaches.

