The LAILAPS Search Engine: Relevance Ranking in Life Science Databases

2010 ◽  
Vol 7 (2) ◽  
pp. 1-11 ◽  
Author(s):  
Matthias Lange ◽  
Karl Spies ◽  
Joachim Bargsten ◽  
Gregor Haberhauer ◽  
Matthias Klapperstück ◽  
...  

Summary: Search engines and retrieval systems are popular tools on the life science desktop. Manually inspecting hundreds of database entries that reflect a life science concept or fact is time-intensive daily work. What matters here is not the number of query results but their relevance. In this paper, we present the LAILAPS search engine for life science databases. The concept combines a novel feature model for relevance ranking, a machine learning approach to model user relevance profiles, ranking improvement by user feedback tracking, and an intuitive, slim web user interface that estimates relevance rank by tracking user interactions. Queries are formulated as simple keyword lists and are expanded by synonyms. Supporting a flexible text index and a simple data import format, LAILAPS can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. From a set of features extracted from each database hit, in combination with user relevance preferences, a neural network predicts user-specific relevance scores. Using expert knowledge as training data for a predefined neural network, or using users' own relevance training sets, a reliable relevance ranking of database hits has been implemented. In this paper, we present the LAILAPS system, its concepts, benchmarks and use cases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de

2010 ◽  
Vol 7 (3) ◽  
Author(s):  
Matthias Lange ◽  
Karl Spies ◽  
Christian Colmsee ◽  
Steffen Flemming ◽  
Matthias Klapperstück ◽  
...  

Summary: Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The rapid growth of life science databases into a vast network of interconnected information systems is as much a challenge as it is an opportunity for life science research. The knowledge found on the Web, in particular in life science databases, is a major resource. To bring it to the scientist's desktop, well-performing search engines are essential. Neither the response time nor the number of results is what matters most here. The most crucial factor for millions of query results is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observing how users inspect search engine results, we condensed a set of 9 relevance-discriminating features. These features are used intuitively by scientists who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable. Deriving a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that were trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de
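
The regression step described in this summary can be illustrated with a minimal sketch, assuming a tiny feed-forward network with fixed, made-up weights. The abstract does not list the 9 features or the trained parameters, so `W_HIDDEN`, `W_OUT`, the hit identifiers and the feature values below are hypothetical and are not the LAILAPS implementation.

```python
import math

N_FEATURES = 9  # the abstract reports 9 features; their definitions are not given there

# Hypothetical weights for a network with 2 hidden units (illustration only).
W_HIDDEN = [[0.5] * N_FEATURES, [-0.5] * N_FEATURES]
B_HIDDEN = [0.0, 0.0]
W_OUT = [2.0, -2.0]
B_OUT = 0.0


def sigmoid(x):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_relevance(features):
    """Forward pass: 9 feature values -> relevance score in (0, 1)."""
    if len(features) != N_FEATURES:
        raise ValueError("expected %d features" % N_FEATURES)
    hidden = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    return sigmoid(sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT)


def rank_hits(hits):
    """Return hit identifiers sorted by predicted relevance, best first."""
    return sorted(hits, key=lambda hit_id: predict_relevance(hits[hit_id]), reverse=True)


if __name__ == "__main__":
    # Hypothetical database hits with made-up feature vectors.
    example_hits = {"hit_a": [0.0] * 9, "hit_b": [1.0] * 9, "hit_c": [0.5] * 9}
    print(rank_hits(example_hits))
```

In a trained system the weights would come from fitting the network to a reference set of judged database entries; here they are fixed only so that the ranking step is easy to follow.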


2010 ◽  
Author(s):  
Matthias Lange ◽  
Karl Spies ◽  
Christian Colmsee ◽  
Steffen Flemming ◽  
Matthias Klapperstück ◽  
...  

Author(s):  
André Huppertz ◽  
Peter M. Flassig ◽  
Robert J. Flassig ◽  
Marius Swoboda

This paper presents a method to obtain optimized 2D blade sections using expert knowledge, a multi-criteria optimization approach and a neural network in an automated process. A special focus is put on neural networks, which are used to capture the complex correlations between aerodynamic and geometric parameters. The multi-criteria optimization is used to generate optimal training data for the neural network. The main objective of this investigation is to generate 2D blade sections from scratch including loss prediction using through flow quantities and a neural network approach without any CFD computations. First results are very promising in terms of computation time, model capacities and performance prediction of the neural network.


This paper discusses the idea of capturing an expert’s knowledge in the form of human understandable rules and then inserting these rules into a dynamic cell structure (DCS) neural network. The DCS is a form of self-organizing map that can be used for many purposes, including classification and prediction. This particular neural network is considered to be a topology preserving network that starts with no pre-structure, but assumes a structure once trained. The DCS has been used in mission and safety-critical applications, including adaptive flight control and health-monitoring in aerial vehicles. The approach is to insert expert knowledge into the DCS before training. Rules are translated into a pre-structure and then training data are presented. This idea has been demonstrated using the well-known Iris data set and it has been shown that inserting the pre-structure results in better accuracy with the same training.


Author(s):  
Tomomasa Ohkubo ◽  
Ei-ichi Matsunaga ◽  
Junji Kawanaka ◽  
Takahisa Jitsuno ◽  
Shinji Motokoshi ◽  
...  

Optical devices often achieve their maximum effectiveness by using dielectric mirrors; however, their design techniques depend on expert knowledge in specifying the mirror properties. This expertise can also be acquired by machine learning, although it is not clear what kind of neural network would be effective for learning about dielectric mirrors. In this paper, we show that the recurrent neural network (RNN) is an effective approach to machine learning of dielectric mirror properties. The relation between the thickness distribution of the mirror’s multiple film layers and the average reflectivity in the target wavelength region is used as the indicator in this study. Reflection from the dielectric multilayer film results from the sequence of interfering reflections from the boundaries between film layers. Therefore, the RNN, which is usually used for sequential data, is well suited to learning the relationship between average reflectivity and the thickness of individual film layers in a dielectric mirror. We found that an RNN can predict the average reflectivity with a mean squared error (MSE) of less than 10⁻⁴ from representative thickness distribution data (10 layers with alternating refractive indexes 2.3 and 1.4). Furthermore, we found that randomly generated training data sets lead to over-learning. It is necessary to generate training data sets from larger data sets so that the histogram of reflectivity becomes a flat distribution. In the future, we plan to apply this knowledge to design dielectric mirrors using neural network approaches such as generative adversarial networks, which do not require the know-how of experts.
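
The flat-histogram requirement mentioned in this abstract can be sketched as follows, assuming a simple bin-and-subsample scheme. The abstract does not say how the authors balanced their data, so the function name `flatten_histogram`, the bin count and the sampling rule are assumptions for illustration only.

```python
import random


def flatten_histogram(samples, n_bins=10, seed=0):
    """Subsample (thicknesses, reflectivity) pairs so that every populated
    reflectivity bin contributes the same number of samples.

    Reflectivity is assumed to lie in [0, 1]. Bins that are empty in the
    input stay empty, so the result is flat only across populated bins.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    bins = [[] for _ in range(n_bins)]
    for sample in samples:
        reflectivity = sample[1]
        if not 0.0 <= reflectivity <= 1.0:
            raise ValueError("reflectivity must lie in [0, 1]")
        index = min(int(reflectivity * n_bins), n_bins - 1)
        bins[index].append(sample)

    populated = [b for b in bins if b]
    if not populated:
        return []

    # Every populated bin is cut down to the size of the smallest one.
    per_bin = min(len(b) for b in populated)
    balanced = []
    for b in populated:
        balanced.extend(rng.sample(b, per_bin))
    return balanced


if __name__ == "__main__":
    # Hypothetical pool: many low-reflectivity stacks, few high-reflectivity ones.
    pool = [([100.0] * 10, 0.05)] * 8 + [([150.0] * 10, 0.95)] * 2
    print(len(flatten_histogram(pool)))
```

Drawing from a larger pool matters because the smallest populated bin sets the size of the balanced set; a small random pool would leave very few samples in the rare reflectivity ranges.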


Geophysics ◽  
2021 ◽  
pp. 1-48
Author(s):  
Jan-Willem Vrolijk ◽  
Gerrit Blacquiere

It is well known that source deghosting can best be applied to common-receiver gathers, while receiver deghosting can best be applied to common-shot records. The source-ghost wavefield observed in the common-shot domain contains the imprint of the subsurface, which complicates source deghosting in the common-shot domain, in particular when the subsurface is complex. Unfortunately, the alternative, i.e., the common-receiver domain, is often coarsely sampled, which complicates source deghosting in this domain as well. To solve the latter issue, we propose to train a convolutional neural network to apply source deghosting in this domain. We subsample all shot records with and without the receiver ghost wavefield to obtain the training data. Due to reciprocity, this training data is a representative data set for source deghosting in the coarse common-receiver domain. We validate the machine-learning approach on simulated data and on field data. The machine-learning approach gives a significant uplift on the simulated data compared to conventional source deghosting. The field-data results confirm that the proposed machine-learning approach is able to remove the source-ghost wavefield from the coarsely sampled common-receiver gathers.


2020 ◽  
Vol 2020 (8) ◽  
pp. 188-1-188-7
Author(s):  
Xiaoyu Xiang ◽  
Yang Cheng ◽  
Jianhang Chen ◽  
Qian Lin ◽  
Jan Allebach

Image aesthetic assessment has always been regarded as a challenging task because of the variability of subjective preference. In addition, the assessment of a photo is related to its style, semantic content, etc. Conventionally, the estimation of aesthetic score and of style for an image are treated as separate problems. In this paper, we explore the inter-relatedness between aesthetics and image style, and design a neural network that can jointly categorize images by style and give an aesthetic score distribution. To this end, we propose a multi-task network (MTNet) with an aesthetic column serving as a score predictor and a style column serving as a style classifier. The angular-softmax loss is applied in training primary style classifiers to maximize the margin among classes in single-label training data; the semi-supervised method is applied to improve the network’s generalization ability iteratively. We combine the regression loss and classification loss when training the aesthetic score predictor. Experiments on the AVA dataset show the superiority of our network in both image attribute classification and aesthetic ranking tasks.


2018 ◽  
Author(s):  
Roman Zubatyuk ◽  
Justin S. Smith ◽  
Jerzy Leszczynski ◽  
Olexandr Isayev

Atomic and molecular properties could be evaluated from the fundamental Schrödinger equation and therefore represent different modalities of the same quantum phenomena. Here we present AIMNet, a modular and chemically inspired deep neural network potential. We used AIMNet with multitarget training to learn multiple modalities of the state of the atom in a molecular system. On several benchmark datasets, the resulting model shows state-of-the-art accuracy, comparable to the results of DFT methods that are orders of magnitude more expensive. It can simultaneously predict several atomic and molecular properties without an increase in computational cost. With AIMNet we show a new dimension of transferability: the ability to learn new targets utilizing multimodal information from previous training. The model can learn implicit solvation energy (like SMD) utilizing only a fraction of the original training data, and achieve a MAD error of 1.1 kcal/mol compared to experimental solvation free energies in the MNSol database.


1992 ◽  
Vol 26 (9-11) ◽  
pp. 2461-2464 ◽  
Author(s):  
R. D. Tyagi ◽  
Y. G. Du

A steady-state mathematical model of an activated sludge process with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for the operational prediction and determination.


2019 ◽  
Vol 2019 ◽  
pp. 1-9 ◽  
Author(s):  
Michał Klimont ◽  
Mateusz Flieger ◽  
Jacek Rzeszutek ◽  
Joanna Stachera ◽  
Aleksandra Zakrzewska ◽  
...  

Hydrocephalus is a common neurological condition that can have traumatic ramifications and can be lethal without treatment. Nowadays, during therapy radiologists have to spend a vast amount of time assessing the volume of cerebrospinal fluid (CSF) by manual segmentation on Computed Tomography (CT) images. Further, some of the segmentations are prone to radiologist bias and high intraobserver variability. To improve this, researchers are exploring methods to automate the process, which would enable faster and more unbiased results. In this study, we propose the application of the U-Net convolutional neural network to automatically segment CT brain scans and locate CSF. U-Net is a neural network that has proven to be successful for various interdisciplinary segmentation tasks. We optimised training using state-of-the-art methods, including the “1cycle” learning rate policy, transfer learning, a generalized dice loss function, mixed float precision, self-attention, and data augmentation. Even though the study was performed using a limited amount of data (80 CT images), our experiment has shown near human-level performance. We managed to achieve a 0.917 mean dice score with 0.0352 standard deviation on cross validation across the training data and a 0.9506 mean dice score on a separate test set. To our knowledge, these results are better than any known method for CSF segmentation in hydrocephalic patients, and thus they are promising for potential practical applications.
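
The dice score reported in this abstract can be sketched for binary masks as follows. This is a minimal illustration of the standard definition, 2|A ∩ B| / (|A| + |B|), and not the authors' evaluation code; the flat-list mask layout and the convention for two empty masks are assumptions.

```python
from statistics import mean


def dice_score(predicted, reference):
    """Dice coefficient between two binary masks given as flat sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return 2.0 * overlap / total


def mean_dice(mask_pairs):
    """Average dice score over (predicted, reference) mask pairs."""
    return mean(dice_score(p, r) for p, r in mask_pairs)


if __name__ == "__main__":
    # Hypothetical 4-pixel masks: one overlapping pixel, three labelled pixels in total.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the predicted and reference segmentations coincide exactly, and 0.0 means they share no labelled pixel.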

