Machine Learning-Assisted Sampling of SERS Substrates Improves Data Collection Efficiency

2021 ◽  
pp. 000370282110345
Author(s):  
Tatu Rojalin ◽  
Dexter Antonio ◽  
Ambarish Kulkarni ◽  
Randy P. Carney

Surface-enhanced Raman scattering (SERS) is a powerful technique for sensitive, label-free analysis of chemical and biological samples. While much recent work has established sophisticated automation routines using machine learning and related artificial intelligence methods, these efforts have largely focused on downstream processing (e.g., classification tasks) of previously collected data. While fully automated analysis pipelines are desirable, current progress is limited by cumbersome and manually intensive sample preparation and data collection steps. Specifically, a typical lab-scale SERS experiment requires the user to evaluate the quality and reliability of the measurement (i.e., the spectra) as the data are being collected. This need for expert user intuition is a major bottleneck that limits the applicability of SERS-based diagnostics for point-of-care clinical applications, where trained spectroscopists are likely unavailable. While application-agnostic numerical approaches (e.g., signal-to-noise thresholding) are useful, there is an urgent need for algorithms that leverage expert user intuition and domain knowledge to simplify and accelerate data collection. To address this challenge, we introduce a machine learning-assisted method at the acquisition stage. We evaluated six common algorithms to identify the best performer for spectral quality judgment. For adoption into future automation platforms, we developed an open-source Python package tailored for rapid expert annotation to train machine learning algorithms. We expect that this new approach of using machine learning to assist data acquisition can serve as a useful building block for point-of-care SERS diagnostic platforms.
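The acquisition-stage quality judgment this abstract describes amounts to a supervised classification problem: expert-annotated spectra train a model that flags good versus bad measurements. The snippet below is a minimal sketch, not the authors' pipeline; the synthetic spectra, the Gaussian peak model, and the choice of a linear SVM (one of several candidate algorithms) are all illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 spectra, 500 wavenumber bins each.
# "Good" spectra carry a synthetic Raman-like peak; "bad" spectra are noise.
n_spectra, n_bins = 200, 500
labels = rng.integers(0, 2, size=n_spectra)          # 1 = expert judged "good"
peak = np.exp(-0.5 * ((np.arange(n_bins) - 250) / 10.0) ** 2)
spectra = rng.normal(0.0, 1.0, (n_spectra, n_bins)) + 5.0 * peak * labels[:, None]

# One candidate classifier for spectral quality judgment, scored by
# cross-validation as a proxy for expert evaluation during acquisition.
clf = SVC(kernel="linear")
scores = cross_val_score(clf, spectra, labels, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

In a real workflow, annotations collected through the paper's expert-labeling package would replace the synthetic labels here.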

2019 ◽  
Vol 14 (5) ◽  
pp. 406-421 ◽  
Author(s):  
Ting-He Zhang ◽  
Shao-Wu Zhang

Background: Revealing the subcellular location of a newly discovered protein can bring insight into its function and guide research at the cellular level. The experimental methods currently used to identify protein subcellular locations are both time-consuming and expensive. Thus, it is highly desirable to develop computational methods for efficiently and effectively identifying protein subcellular locations. In particular, the rapidly increasing number of protein sequences entering genome databases has called for the development of automated analysis methods. Methods: In this review, we describe recent advances in predicting protein subcellular locations with machine learning from the following aspects: i) benchmark dataset construction, ii) protein feature representation and feature descriptors, iii) common machine learning algorithms, iv) cross-validation test methods and assessment metrics, and v) web servers. Result & Conclusion: Concomitant with the large number of protein sequences generated by high-throughput technologies, four future directions for predicting protein subcellular locations with machine learning deserve attention. One is the selection of novel and effective features (e.g., statistical, physicochemical, evolutionary) from protein sequences and structures. Another is the feature fusion strategy. The third is the design of a powerful predictor, and the fourth is the prediction of proteins with multiple location sites.


mSphere ◽  
2019 ◽  
Vol 4 (3) ◽  
Author(s):  
Artur Yakimovich

ABSTRACT Artur Yakimovich works in the field of computational virology and applies machine learning algorithms to study host-pathogen interactions. In this mSphere of Influence article, he reflects on two papers “Holographic Deep Learning for Rapid Optical Screening of Anthrax Spores” by Jo et al. (Y. Jo, S. Park, J. Jung, J. Yoon, et al., Sci Adv 3:e1700606, 2017, https://doi.org/10.1126/sciadv.1700606) and “Bacterial Colony Counting with Convolutional Neural Networks in Digital Microbiology Imaging” by Ferrari and colleagues (A. Ferrari, S. Lombardi, and A. Signoroni, Pattern Recognition 61:629–640, 2017, https://doi.org/10.1016/j.patcog.2016.07.016). Here he discusses how these papers made an impact on him by showcasing that artificial intelligence algorithms can be equally applicable to both classical infection biology techniques and cutting-edge label-free imaging of pathogens.


Author(s):  
A. Khanwalkar ◽  
R. Soni

Purpose: Diabetes is a chronic disease that accounts for a large proportion of the nation's healthcare expenditure, since people with diabetes require continuous medical care. Several complications will occur if the polygenic disorder goes untreated or unrecognized, and diagnosis typically requires a visit to a diagnostic center and a consulting physician. An essential real-world task is therefore to detect the polygenic disorder at its earliest phase. This work presents a survey analyzed across several parameters of polygenic disorder diagnosis, showing that classification algorithms play an important role in automating polygenic disorder analysis, alongside other machine learning algorithms. Design/methodology/approach: This paper provides an extensive survey of the different approaches that have been used for the analysis of medical data for the purpose of early detection of polygenic disorder. It takes into consideration methods such as J48, CART, SVM, and kNN, conducts a formal survey of all the studies, and provides a conclusion at the end. Findings: The survey was analyzed across several parameters of polygenic disorder diagnosis. It shows that classification algorithms play an important role in automating polygenic disorder analysis, alongside other machine learning algorithms. Practical implications: This paper will help future researchers in the field of healthcare, specifically in the domain of diabetes, to understand the differences between classification algorithms. Originality/value: This paper will help in comparing machine learning algorithms by reviewing results and selecting the appropriate approach based on requirements.
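The classifier families this survey compares can be benchmarked in a few lines of scikit-learn. The sketch below uses synthetic data (the surveyed studies' diabetes datasets are not reproduced here), and J48 is approximated by an entropy-criterion decision tree, since the C4.5 algorithm itself is not available in scikit-learn; both substitutions are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a diabetes dataset: 8 clinical-style features.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           random_state=0)

models = {
    "CART": DecisionTreeClassifier(random_state=0),
    "J48-like": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "SVM": SVC(),
    "kNN": KNeighborsClassifier(),
}
# Five-fold cross-validated accuracy for each surveyed algorithm family.
results = {name: cross_val_score(m, X, y, cv=5).mean()
           for name, m in models.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

On real clinical data, the ranking would of course depend on the dataset; this is the comparison scaffold, not the survey's findings.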


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Pratik Doshi ◽  
John Tanaka ◽  
Jedrek Wosik ◽  
Natalia M Gil ◽  
Martin Bertran ◽  
...  

Introduction: There is a need for innovative solutions to better screen and diagnose the 7 million patients with chronic heart failure. A key component of assessing these patients is monitoring fluid status by evaluating the presence and height of jugular venous distension (JVD). We hypothesize that video analysis of a patient’s neck using machine learning algorithms and image recognition can identify the amount of JVD. We propose the use of high-fidelity video recordings taken with a mobile device camera to determine the presence or absence of JVD, which we will use to develop a point-of-care testing tool for early detection of acute exacerbation of heart failure. Methods: In this feasibility study, patients in the Duke cardiac catheterization lab undergoing right heart catheterization (RHC) were enrolled. RGB and infrared videos of the patient’s neck were captured to detect JVD and correlated with right atrial (RA) pressure from the heart catheterization. We designed an adaptive filter based on biological priors that enhances spatially consistent frequency anomalies and detects jugular vein distension, implemented in Python. Results: We captured and analyzed footage for six patients using our model. Four of these six patients shared a similar strong signal outlier within the frequency band of 95–200 bpm when using a conservative threshold, indicating the presence of JVD. We did not perform statistical analysis given the small size of our cohort, but in the patients with a detected positive JVD signal, the mean RA pressure was 20.25 mmHg and the mean pulmonary capillary wedge pressure (PCWP) was 24.3 mmHg. Conclusions: We have demonstrated the ability to evaluate for JVD via infrared video and found a relationship with RHC values. Our project is innovative because it uses video recognition and allows for novel patient interactions using a non-invasive screening technique for heart failure. This tool can become a non-invasive standard both to screen for and to help manage heart failure.
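The in-band frequency anomaly described in the Results can be illustrated with a simple spectral peak search. The sketch below is a toy under stated assumptions, not the study's adaptive filter: it generates a synthetic pixel-intensity trace with a 120 bpm pulsation, assumes a 30 fps camera, and scans the 95-200 bpm band the authors report.

```python
import numpy as np

fs = 30.0                        # assumed video frame rate, frames/s
t = np.arange(0, 10, 1 / fs)     # a 10 s clip
rng = np.random.default_rng(1)

# Hypothetical pixel-intensity trace over the neck: a 2 Hz (120 bpm)
# venous pulsation buried in sensor noise.
trace = 0.5 * np.sin(2 * np.pi * 2.0 * t) + rng.normal(0.0, 0.3, t.size)

# Magnitude spectrum, restricted to the study's band of interest:
# 95-200 bpm, i.e. roughly 1.58-3.33 Hz.
spectrum = np.abs(np.fft.rfft(trace))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
band = (freqs >= 95 / 60) & (freqs <= 200 / 60)

peak_bpm = 60 * freqs[band][np.argmax(spectrum[band])]
print(f"dominant in-band frequency: {peak_bpm:.0f} bpm")
```

A spatially consistent version of this test, applied per pixel region, is closer in spirit to the adaptive filter the authors describe.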


2021 ◽  
Author(s):  
Thitaree Lertliangchai ◽  
Birol Dindoruk ◽  
Ligang Lu ◽  
Xi Yang

Abstract Dew point pressure (DPP) is a key variable that may be needed to predict the condensate-to-gas ratio behavior of a reservoir, to address some production/completion-related issues, and to calibrate/constrain EOS models for integrated modeling. However, DPP is a challenging property in terms of its predictability. Recognizing these complexities, we present a state-of-the-art method for DPP prediction using advanced machine learning (ML) techniques. We compare the outcomes of our methodology with those of published empirical correlation-based approaches on two datasets of small size and different inputs. Our ML method noticeably outperforms the correlation-based predictors while also showing its flexibility and robustness even with small training datasets, provided various classes of fluids are represented within the datasets. We collected condensate PVT data from public-domain resources and the GeoMark RFDBASE, containing dew point pressure (the target variable) along with the compositional data (mole percentage of each component), temperature, molecular weight (MW), and the MW and specific gravity (SG) of the heptane-plus fraction as input variables. Using domain knowledge, before embarking on the study, we extensively checked the measurement quality and the outcomes using statistical techniques. We then applied advanced ML techniques to train predictive models with cross-validation to avoid overfitting the models to the small datasets. We compared our models against the best published DPP predictors based on empirical correlations. For fair comparison, the correlation-based predictors were also trained using the underlying datasets. To improve the outcomes and generalize the input data, pseudo-critical properties and artificial proxy features were also employed.
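The cross-validated training the abstract outlines can be sketched briefly. Everything concrete below is an illustrative assumption: the random features, the pseudo-DPP target, and the choice of gradient boosting stand in for the paper's actual inputs (compositions, temperature, heptane-plus MW/SG) and its undisclosed ML method.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Small synthetic dataset standing in for condensate PVT samples:
# six generic input features and a pseudo dew point pressure target (psia).
n_samples = 120
X = rng.uniform(size=(n_samples, 6))
y = 2000 + 1500 * X[:, 0] - 800 * X[:, 1] + 100 * rng.normal(size=n_samples)

# Cross-validation guards against overfitting on the small dataset,
# as emphasized in the abstract.
model = GradientBoostingRegressor(random_state=0)
r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
print(f"mean CV R^2: {r2:.2f}")
```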


2021 ◽  
Author(s):  
Meng Ji ◽  
Yanmeng Liu ◽  
Tianyong Hao

BACKGROUND Much current research on health information understandability uses medical readability formulas (MRF) to assess the cognitive difficulty of health education resources. This rests on an implicit assumption that medical domain knowledge, represented by uncommon words or jargon, forms the sole barrier to health information access among the public. Our study challenged this by showing that for readers from non-English-speaking backgrounds with higher education attainment, semantic features of English health texts, rather than medical jargon, can explain the lack of cognitive access to health materials among readers with a good understanding of health terms yet limited exposure to English health education materials. OBJECTIVE Our study explored combining MRF and multidimensional semantic features (MSF) to develop machine learning algorithms that predict the actual level of cognitive accessibility of English health materials on health risks and diseases for specific populations. We compared algorithms to evaluate the cognitive accessibility of specialised health information for non-native English speakers with advanced education levels yet very limited exposure to English health education environments. METHODS We used 108 semantic features to measure the content complexity and accessibility of original English resources. Using 1000 English health texts collected from international health organization websites and rated by international tertiary students, we compared machine learning classifiers (decision tree, SVM, discriminant analysis, ensemble tree, and logistic regression) after automatic hyperparameter optimization (grid search for the combination of hyperparameters with minimal classification error). We applied 10-fold cross-validation on the whole dataset for model training and testing, and calculated the AUC, sensitivity, specificity, and accuracy as measures of model performance.
RESULTS Using two sets of predictor features, the widely tested MRF and the MSF proposed in our study, we developed and compared three sets of machine learning algorithms: the first used MRF as predictors only, the second used MSF as predictors only, and the last used both MRF and MSF as integrated models. The results showed that the integrated models outperformed the others in terms of AUC, sensitivity, accuracy, and specificity. CONCLUSIONS Our study showed that the cognitive accessibility of English health texts is not determined solely by the word length and sentence length conventionally measured by MRF. We compared machine learning algorithms combining MRF and MSF to explore the cognitive accessibility of health information from syntactic and semantic perspectives. The results showed the strength of integrated models in terms of statistically increased AUC, sensitivity, and accuracy in predicting health resource accessibility for the target readership, indicating that both MRF and MSF contribute to the comprehension of health information, and that for readers with advanced education, semantic features outweigh syntax and domain knowledge.
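The training protocol in the Methods (hyperparameter grid search scored inside 10-fold cross-validation, evaluated by AUC) corresponds to a nested cross-validation scheme, sketched below. The synthetic feature matrix, the SVM, and the specific grid are all assumptions for illustration; the study compared five classifier families on 108 semantic features plus readability scores.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in: pretend the columns mix readability-formula (MRF)
# scores with multidimensional semantic features (MSF).
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           random_state=0)

# Inner loop: grid search for the hyperparameter combination with
# minimal classification error, as in the study's automatic optimization.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=5)

# Outer loop: 10-fold cross-validation scored by AUC.
auc = cross_val_score(grid, X, y, cv=10, scoring="roc_auc").mean()
print(f"mean AUC: {auc:.2f}")
```

Nesting the grid search inside the outer folds keeps the AUC estimate honest: hyperparameters are never tuned on the data used to score them.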


Author(s):  
Anitha Elavarasi S. ◽  
Jayanthi J.

Machine learning enables systems to learn automatically, without human intervention, and to improve their performance with the help of previous experience; a system can access data and use it to learn by itself. Even though many algorithms have been developed to solve machine learning problems, it is difficult to handle all kinds of input data in order to arrive at accurate decisions. Domain knowledge of statistics, probability, logic, mathematical optimization, reinforcement learning, and control theory plays a major role in developing machine learning-based algorithms. The key considerations in selecting a suitable programming language for implementing a machine learning algorithm include performance, concurrency, application development support, and learning curve. This chapter deals with a few of the top programming languages used for developing machine learning applications: Python, R, and Java. The top three programming languages preferred by data scientists are (1) Python, used by more than 57%, (2) R, used by more than 31%, and (3) Java, used by 17% of data scientists.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Thomas Kurmann ◽  
Siqing Yu ◽  
Pablo Márquez-Neila ◽  
Andreas Ebneter ◽  
Martin Zinkernagel ◽  
...  

Abstract In ophthalmology, retinal biological markers, or biomarkers, play a critical role in the management of chronic eye conditions and in the development of new therapeutics. While many imaging technologies used today can visualize these biomarkers, Optical Coherence Tomography (OCT) is often the tool of choice due to its ability to image retinal structures in three dimensions at micrometer resolution. But with widespread use in clinical routine, and the growing prevalence of chronic retinal conditions, the quantity of scans acquired worldwide is surpassing the capacity of retinal specialists to inspect them in meaningful ways. Instead, automated analysis of scans using machine learning algorithms provides a cost-effective and reliable alternative to assist ophthalmologists in clinical routine and research. We present a machine learning method capable of consistently identifying a wide range of common retinal biomarkers from OCT scans. Our approach avoids the need for costly segmentation annotations and allows scans to be characterized by biomarker distributions, which can then be used to classify scans based on their underlying pathology in a device-independent way.


2021 ◽  
Vol 12 (1) ◽  
pp. 297
Author(s):  
Tamás Orosz ◽  
Renátó Vági ◽  
Gergely Márk Csányi ◽  
Dániel Nagy ◽  
István Üveges ◽  
...  

Many machine learning-based document processing applications have been published in recent years. Applying these methodologies can reduce the cost of labor-intensive tasks and induce changes in a company’s structure. An artificial intelligence-based application can take over the work of trainees and free up the time of experts, which can increase innovation inside the company by letting them be involved in tasks with greater added value. However, the development cost of these methodologies can be high, and development is usually not a straightforward task. This paper presents the results of a survey in which a machine learning-based legal text labeler competed with multiple people with different levels of legal domain knowledge. The machine learning-based application used binary SVM-based classifiers to resolve the multi-label classification problem. The methods were encapsulated and deployed as a digital twin in a production environment. The results show that machine learning algorithms can be effectively utilized for monotonous but domain knowledge- and attention-demanding tasks. The results also suggest that embracing the machine learning-based solution can increase discoverability and enrich the value of data. The test confirmed that the accuracy of a machine learning-based system matches the long-term accuracy of legal experts, which makes it suitable for automating the working process.
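Resolving a multi-label problem with one binary SVM per label, as this paper describes, corresponds to a one-vs-rest scheme. The sketch below uses synthetic features in place of the legal corpus; the feature generator, the train/test split, and the micro-averaged F1 metric are illustrative assumptions, not the paper's evaluation.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in for vectorized legal documents with 5 candidate labels.
X, Y = make_multilabel_classification(n_samples=400, n_features=50,
                                      n_classes=5, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# One binary SVM classifier per label resolves the multi-label problem.
clf = OneVsRestClassifier(LinearSVC(max_iter=5000))
clf.fit(X_tr, Y_tr)
micro_f1 = f1_score(Y_te, clf.predict(X_te), average="micro")
print(f"micro-averaged F1: {micro_f1:.2f}")
```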


2018 ◽  
Vol 37 (6) ◽  
pp. 451-461 ◽  
Author(s):  
Zhen Wang ◽  
Haibin Di ◽  
Muhammad Amir Shafiq ◽  
Yazeed Alaudah ◽  
Ghassan AlRegib

As a process that identifies geologic structures of interest, such as faults, salt domes, or elements of petroleum systems in general, seismic structural interpretation depends heavily on the domain knowledge and experience of interpreters, as well as on visual cues of geologic structures such as texture and geometry. With the dramatic increase in the size of seismic data acquired for hydrocarbon exploration, structural interpretation has become more time-consuming and labor-intensive. By treating seismic data as images rather than signal traces, researchers have been able to utilize advanced image-processing and machine-learning techniques to assist interpretation directly. In this paper, we focus mainly on the interpretation of two important geologic structures, faults and salt domes, and summarize interpretation workflows based on typical or advanced image-processing and machine-learning algorithms. In recent years, increasing computational power and the massive amount of available data have led to the rise of deep learning. Deep-learning models that simulate the biological neural networks of the human brain can achieve state-of-the-art accuracy and even exceed human-level performance in numerous applications. The convolutional neural network, a form of deep-learning model that is effective in analyzing visual imagery, has been applied in fault and salt dome interpretation. At the end of this review, we provide insight and discussion on the future of structural interpretation.

