scite: a smart citation index that displays the context of citations and classifies their intent using deep learning

2021 ◽  
pp. 1-38
Author(s):  
Josh M. Nicholson ◽  
Milo Mordaunt ◽  
Patrice Lopez ◽  
Ashish Uppala ◽  
Domenic Rosati ◽  
...  

Abstract Citation indices are tools used by the academic community for research and research evaluation which aggregate scientific literature output and measure impact by collating citation counts. Citation indices help measure the interconnections between scientific papers but fall short because they fail to communicate contextual information about a citation. The usage of citations in research evaluation without consideration of context can be problematic, because a citation that presents contrasting evidence to a paper is treated the same as a citation that presents supporting evidence. To solve this problem, we have used machine learning, traditional document ingestion methods, and a network of researchers to develop a “smart citation index” called scite, which categorizes citations based on context. Scite shows how a citation was used by displaying the surrounding textual context from the citing paper and a classification from our deep learning model that indicates whether the statement provides supporting or contrasting evidence for a referenced work, or simply mentions it. Scite has been developed by analyzing over 25 million full-text scientific articles and currently has a database of more than 880 million classified citation statements. Here we describe how scite works and how it can be used to further research and research evaluation. Peer Review https://publons.com/publon/10.1162/qss_a_00146
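As a minimal sketch of the idea in the abstract above, the following Python snippet tallies hypothetical citation statements by class and compares the result with a raw citation count. The records, the DOIs, and the `tally` helper are illustrative assumptions, not scite's data model or API; only the three class names come from the abstract.

```python
from collections import Counter

# The three classes named in the abstract.
LABELS = ("supporting", "contrasting", "mentioning")

# Hypothetical citation statements pointing at one cited paper.
citations = [
    {"citing_doi": "10.0000/example.1", "label": "supporting"},
    {"citing_doi": "10.0000/example.2", "label": "mentioning"},
    {"citing_doi": "10.0000/example.3", "label": "contrasting"},
    {"citing_doi": "10.0000/example.4", "label": "mentioning"},
    {"citing_doi": "10.0000/example.5", "label": "supporting"},
]


def tally(statements):
    """Count citation statements per class, reporting zero for absent classes."""
    counts = Counter(s["label"] for s in statements)
    return {label: counts.get(label, 0) for label in LABELS}


summary = tally(citations)
raw_count = sum(summary.values())  # what a plain citation count would report

print(raw_count)
print(summary)
```

The raw count treats all five statements alike, whereas the per-class tally keeps the one contrasting citation visible.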

2021 ◽  
Author(s):  
Joshua M Nicholson ◽  
Milo Mordaunt ◽  
Patrice Lopez ◽  
Ashish Uppala ◽  
Domenic Rosati ◽  
...  

Citation indices are tools used by the academic community for research and research evaluation which aggregate scientific literature output and measure scientific impact by collating citation counts. Citation indices help measure the interconnections between scientific papers but fall short because they only display paper titles, authors, and the date of publications, and fail to communicate contextual information about why a citation was made. The usage of citations in research evaluation without due consideration to context can be problematic, if only because a citation that disputes a paper is treated the same as a citation that supports it. To solve this problem, we have used machine learning and other techniques to develop a "smart citation index" called scite, which categorizes citations based on context. Scite shows how a citation was used by displaying the surrounding textual context from the citing paper, and a classification from our deep learning model that indicates whether the statement provides supporting or disputing evidence for a referenced work, or simply mentions it. Scite has been developed by analyzing over 23 million full-text scientific articles and currently has a database of more than 800 million classified citation statements. Here we describe how scite works and how it can be used to further research and research evaluation.


2021 ◽  
Vol 53 (2) ◽  
Author(s):  
Sen Yang ◽  
Yaping Zhang ◽  
Siu-Yeung Cho ◽  
Ricardo Correia ◽  
Stephen P. Morgan

Abstract Conventional blood pressure (BP) measurement methods have various drawbacks, such as being invasive, cuff-based or requiring manual operation. There is significant interest in the development of non-invasive, cuff-less and continual BP measurement based on physiological measurement. However, in these methods, extracting features from signals is challenging in the presence of noise or signal distortion. When using machine learning, errors in feature extraction result in errors in BP estimation; therefore, this study explores the use of raw signals as a direct input to a deep learning model. To enable comparison with the traditional machine learning models which use features from the photoplethysmogram and electrocardiogram, a hybrid deep learning model that utilises both raw signals and physical characteristics (age, height, weight and gender) is developed. This hybrid model performs best in terms of both diastolic BP (DBP) and systolic BP (SBP), with mean absolute errors of 3.23 ± 4.75 mmHg and 4.43 ± 6.09 mmHg respectively. The DBP and SBP estimates meet the Grade A and Grade B performance requirements of the British Hypertension Society respectively.
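The evaluation metrics named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the readings are made up, the helper names are hypothetical, and the cumulative-percentage thresholds (60/85/95% for Grade A, 50/75/90% for Grade B, 40/65/85% for Grade C of readings within 5, 10 and 15 mmHg) are the commonly cited British Hypertension Society values and should be checked against the protocol itself.

```python
from statistics import mean

# Assumed BHS thresholds: minimum percentage of readings whose absolute
# error is within 5, 10 and 15 mmHg, listed from best grade to worst.
BHS_GRADES = [
    ("A", (60, 85, 95)),
    ("B", (50, 75, 90)),
    ("C", (40, 65, 85)),
]


def absolute_errors(reference, predicted):
    """Absolute difference between each reference and predicted reading."""
    return [abs(r - p) for r, p in zip(reference, predicted)]


def mean_absolute_error(reference, predicted):
    """Mean absolute error in the unit of the inputs (mmHg here)."""
    return mean(absolute_errors(reference, predicted))


def bhs_grade(reference, predicted):
    """Return the best grade whose three thresholds are all met, else 'D'."""
    errors = absolute_errors(reference, predicted)
    within = tuple(
        100 * sum(e <= limit for e in errors) / len(errors)
        for limit in (5, 10, 15)
    )
    for grade, thresholds in BHS_GRADES:
        if all(w >= t for w, t in zip(within, thresholds)):
            return grade
    return "D"


# Hypothetical diastolic readings in mmHg.
reference = [80, 75, 90, 85, 70]
predicted = [82, 74, 96, 85, 73]

mae = mean_absolute_error(reference, predicted)
grade = bhs_grade(reference, predicted)
print(mae, grade)
```

A grade is assigned only when all three cumulative percentages meet the thresholds for that grade, which is why a model can have a low mean absolute error and still miss Grade A.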


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Rajat Garg ◽  
Anil Kumar ◽  
Nikunj Bansal ◽  
Manish Prateek ◽  
Shashi Kumar

Abstract Urban area mapping is an important application of remote sensing which aims at estimating both land cover and its change in urban areas. A major challenge in analyzing Synthetic Aperture Radar (SAR) based remote sensing data is that highly vegetated urban areas and oriented urban targets closely resemble actual vegetation. This similarity leads to misclassification of urban areas as forest cover. The present work is a precursor study for the dual-frequency L and S-band NASA-ISRO Synthetic Aperture Radar (NISAR) mission and aims at minimizing the misclassification of such highly vegetated and oriented urban targets into the vegetation class with the help of deep learning. In this study, three machine learning algorithms, Random Forest (RF), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), have been implemented along with a deep learning model, DeepLabv3+, for semantic segmentation of Polarimetric SAR (PolSAR) data. It is generally assumed that a large dataset is required for the successful implementation of any deep learning model, but in the field of SAR based remote sensing, a major issue is the unavailability of a large benchmark labeled dataset for training deep learning algorithms from scratch. The current work shows that, using transfer learning, a pre-trained DeepLabv3+ model outperforms the machine learning algorithms on the land use and land cover (LULC) classification task even with a small dataset. DeepLabv3+ achieved the highest pixel accuracy of 87.78% and an overall pixel accuracy of 85.65%. Random Forest performs best among the machine learning algorithms with an overall pixel accuracy of 77.91%, while SVM and KNN trail with overall accuracies of 77.01% and 76.47% respectively. The highest precision of 0.9228 is recorded for the urban class in the semantic segmentation task with DeepLabv3+, while the machine learning algorithms SVM and RF gave comparable results with precisions of 0.8977 and 0.8958 respectively.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
William Greig Mitchell ◽  
Edward Christopher Dee ◽  
Leo Anthony Celi

Abstract Cho et al. report deep learning model accuracy for tilted myopic disc detection in a South Korean population. Here we explore the importance of generalisability of machine learning (ML) in healthcare, and we emphasise that recurrent underrepresentation of data-poor regions may inadvertently perpetuate global health inequity. Creating meaningful ML systems is contingent on understanding how, when, and why different ML models work in different settings. While we echo the need for the diversification of ML datasets, such a worthy effort would take time and does not obviate uses of presently available datasets if conclusions are validated and re-calibrated for different groups prior to implementation. The importance of external ML model validation on diverse populations should be highlighted where possible – especially for models built with single-centre data.


Electronics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 39
Author(s):  
Zhiyuan Xie ◽  
Shichang Du ◽  
Jun Lv ◽  
Yafei Deng ◽  
Shiyao Jia

Remaining Useful Life (RUL) prediction is significant in indicating the health status of sophisticated equipment, and it requires historical data because of its complexity. The number and complexity of environmental parameters such as vibration and temperature can cause non-linear states of data, making prediction tremendously difficult. Conventional machine learning models such as support vector machine (SVM), random forest, and back propagation neural network (BPNN), however, have limited capacity to predict accurately. In this paper, a two-phase deep learning model, the attention-convolutional forget-gate recurrent network (AM-ConvFGRNET), is proposed for RUL prediction. The first phase, the forget-gate convolutional recurrent network (ConvFGRNET), is based on a one-dimensional analog long short-term memory (LSTM), which removes all the gates except the forget gate and uses chrono-initialized biases. The second phase is the attention mechanism, which enables the model to extract more specific features for generating an output, compensating for the black-box nature of the FGRNET and improving interpretability. The performance and effectiveness of AM-ConvFGRNET for RUL prediction are validated by comparing it with other machine learning and deep learning methods on the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) dataset and a dataset from a ball screw experiment.


2021 ◽  
Author(s):  
Lukman Ismael ◽  
Pejman Rasti ◽  
Florian Bernard ◽  
Philippe Menei ◽  
Aram Ter Minassian ◽  
...  

BACKGROUND Functional MRI (fMRI) is an essential tool for the presurgical planning of brain tumor removal, allowing the identification of functional brain networks in order to preserve the patient’s neurological functions. One fMRI technique used to identify functional brain networks is resting-state fMRI (rsfMRI). However, this technique is not routinely used because an expert reviewer is needed to manually identify each functional network. OBJECTIVE We aimed to automate the detection of functional brain networks in rsfMRI data using deep learning and machine learning algorithms. METHODS We used the rsfMRI data of 82 healthy patients to test the diagnostic performance of our proposed end-to-end deep learning model against the reference functional networks identified manually by 2 expert reviewers. RESULTS The proposed deep learning architecture achieved the best performance, a correct recognition rate of 86%, outperforming the other machine learning algorithms also tested on this classification task. CONCLUSIONS The proposed end-to-end deep learning model was the best-performing machine learning algorithm. Using this model to automate functional network detection in rsfMRI may broaden the use of rsfMRI, allowing the presurgical identification of these networks and thus helping to preserve the patient’s neurological status. CLINICALTRIAL Comité de protection des personnes Ouest II (decision reference CPP 2012-25)


2019 ◽  
Author(s):  
Mojtaba Haghighatlari ◽  
Gaurav Vishwakarma ◽  
Mohammad Atif Faiz Afzal ◽  
Johannes Hachmann

We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high-RI for applications in opto-electronics.


Symmetry ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2293
Author(s):  
Zixiang Yue ◽  
Youliang Ding ◽  
Hanwei Zhao ◽  
Zhiwen Wang

A cable-stayed bridge is a typical symmetrical structure, and symmetry affects the deformation characteristics of such bridges. The main girder of a cable-stayed bridge deflects noticeably under temperature changes. A regression model of temperature-induced deflection is intended to provide a comparison value for bridge evaluation. It is therefore meaningful to establish a correlation model between temperature and temperature-induced deflection from the temperature and deflection data obtained by a bridge's health monitoring system. It is difficult to build a high-quality model from the girder temperature alone. In this paper, temperature features based on prior knowledge of the mechanical mechanism are used as the input information. At the same time, to strengthen the nonlinear ability of the model, this paper selects an independent recurrent neural network (IndRNN) for modeling. The deep learning neural network is compared with machine learning neural networks to demonstrate the advantage of deep learning. When only the average temperature of the main girder is input, the calculation accuracy is not high regardless of whether the deep learning network or the machine learning network is used. When the temperature information extracted using prior knowledge is input, the average error of the IndRNN model is only 2.53%, less than those of the BPNN model and the traditional RNN. Combining knowledge with deep learning is undoubtedly the best modeling scheme. The deep learning model can provide a comparison value of bridge deformation for bridge management.


Water ◽  
2021 ◽  
Vol 13 (19) ◽  
pp. 2664
Author(s):  
Sunil Saha ◽  
Jagabandhu Roy ◽  
Tusar Kanti Hembram ◽  
Biswajeet Pradhan ◽  
Abhirup Dikshit ◽  
...  

The efficiency of deep learning and tree-based machine learning approaches has gained immense popularity in various fields. One deep learning model, viz. a convolutional neural network (CNN), an artificial neural network (ANN) and four tree-based machine learning models, namely, alternative decision tree (ADTree), classification and regression tree (CART), functional tree (FTree) and logistic model tree (LMT), were used for landslide susceptibility mapping in the East Sikkim Himalaya region of India, and the results were compared. Landslide areas were delimited and mapped as a landslide inventory (LIM) after gathering information from historical records and periodic field investigations. In the LIM, 91 landslides were plotted and randomly divided into training (64 landslides) and testing (27 landslides) subsets to train and validate the models. A total of 21 landslide conditioning factors (LCFs) were considered as model inputs, and the results of each model were categorised under five susceptibility classes. The receiver operating characteristics curve and 21 statistical measures were used to evaluate and prioritise the models. The CNN deep learning model achieved priority rank 1, with areas under the curve of 0.918 and 0.933 on the training and testing data respectively, and classified 23.02% and 14.40% of the area as very highly and highly susceptible respectively, followed by the ANN, ADTree, CART, FTree and LMT models. This research might be useful in landslide studies, especially in locations with comparable geophysical and climatological characteristics, to aid in decision making for land use planning.
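The random division of the inventory described above (91 landslides into 64 for training and 27 for testing) can be sketched as follows. The integer identifiers, the seed, and the `split_inventory` helper are illustrative assumptions; the abstract does not say how the authors drew their split.

```python
import random


def split_inventory(landslide_ids, n_train, seed=0):
    """Shuffle the inventory reproducibly, then cut it into two subsets."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    shuffled = list(landslide_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 91 mapped landslides.
inventory = range(91)
train_ids, test_ids = split_inventory(inventory, n_train=64)

print(len(train_ids), len(test_ids))
```

Fixing the seed makes the split repeatable, so every model in a comparison is trained and validated on the same two subsets.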


2021 ◽  
Author(s):  
Yew Kee Wong

Deep learning is a type of machine learning that trains a computer to perform human-like tasks, such as recognizing speech, identifying images or making predictions. Instead of organizing data to run through predefined equations, deep learning sets up basic parameters about the data and trains the computer to learn on its own by recognizing patterns using many layers of processing. This paper aims to illustrate some of the different deep learning algorithms and methods which can be applied to artificial intelligence analysis, as well as the opportunities provided by the application in various decision making domains.

