Crowdsourced feature tagging for scalable and privacy-preserved autism diagnosis

Author(s):  
Peter Washington ◽  
Qandeel Tariq ◽  
Emilie Leblanc ◽  
Brianna Chrisman ◽  
Kaitlyn Dunlap ◽  
...  

ABSTRACT Standard medical diagnosis of mental health conditions often requires licensed experts who are increasingly outnumbered by those at risk, limiting reach. We test the hypothesis that a trustworthy crowd of non-experts can efficiently label features needed for accurate machine learning detection of the common childhood developmental disorder autism. We implement a novel process for creating a trustworthy distributed workforce for video feature extraction, selecting a workforce of 102 workers from a pool of 1,107. Two previously validated binary autism logistic regression classifiers were used to evaluate the quality of the curated crowd’s ratings on unstructured home videos. A clinically representative balanced sample of N=50 videos was evaluated with and without face box and pitch shift privacy alterations, with AUROC and AUPRC scores >0.98. With both privacy-preserving modifications, sensitivity is preserved (96.0%) while maintaining specificity (80.0%) and accuracy (88.0%) at levels that exceed classification methods without alterations. We find that machine learning classification from features extracted by a curated nonexpert crowd achieves clinical performance for pediatric autism videos and maintains acceptable performance when privacy-preserving mechanisms are applied. These results suggest that privacy-based crowdsourcing of short videos can be leveraged for rapid and mobile assessment of behavioral health.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Peter Washington ◽  
Qandeel Tariq ◽  
Emilie Leblanc ◽  
Brianna Chrisman ◽  
Kaitlyn Dunlap ◽  
...  

Abstract: Standard medical diagnosis of mental health conditions requires licensed experts who are increasingly outnumbered by those at risk, limiting reach. We test the hypothesis that a trustworthy crowd of non-experts can efficiently annotate behavioral features needed for accurate machine learning detection of the common childhood developmental disorder Autism Spectrum Disorder (ASD) for children under 8 years old. We implement a novel process for identifying and certifying a trustworthy distributed workforce for video feature extraction, selecting a workforce of 102 workers from a pool of 1,107. Two previously validated ASD logistic regression classifiers, evaluated against parent-reported diagnoses, were used to assess the accuracy of the trusted crowd’s ratings of unstructured home videos. A representative balanced sample of N = 50 videos was evaluated with and without face box and pitch shift privacy alterations, with AUROC and AUPRC scores > 0.98. With both privacy-preserving modifications, sensitivity is preserved (96.0%) while maintaining specificity (80.0%) and accuracy (88.0%) at levels comparable to prior classification methods without alterations. We find that machine learning classification from features extracted by a certified nonexpert crowd achieves high performance for ASD detection from natural home videos of the child at risk and maintains high sensitivity when privacy-preserving mechanisms are applied. These results suggest that privacy-safeguarded crowdsourced analysis of short home videos can help enable rapid and mobile machine-learning detection of developmental delays in children.
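
The reported sensitivity, specificity, and accuracy can be checked against one another with a small confusion-matrix calculation. The sketch below is illustrative only: the abstract does not give the confusion-matrix counts, so it assumes that the balanced sample of 50 videos means 25 ASD and 25 non-ASD videos, and the counts passed in are hypothetical values chosen because they reproduce the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # share of positive cases detected
    specificity = tn / (tn + fp)            # share of negative cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (an assumption, not taken from the paper):
# 25 ASD videos with 24 detected, 25 non-ASD videos with 20 correctly cleared.
sens, spec, acc = classification_metrics(tp=24, fn=1, tn=20, fp=5)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts, the three reported figures are mutually consistent for a balanced sample of 50.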


Atmosphere ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 251 ◽  
Author(s):  
Wael Ghada ◽  
Nicole Estrella ◽  
Annette Menzel

Rain microstructure parameters assessed by disdrometers are commonly used to classify rain into convective and stratiform. However, different types of disdrometer result in different values for these parameters. This in turn potentially deteriorates the quality of rain type classifications. Thies disdrometer measurements at two sites in Bavaria in southern Germany were combined with cloud observations to construct a set of clear convective and stratiform intervals. This reference dataset was used to study the performance of classification methods from the literature based on the rain microstructure. We also explored the possibility of improving the performance of these methods by tuning the decision boundary. We further identified highly discriminant rain microstructure parameters and used these parameters in five machine-learning classification models. Our results confirm the potential of achieving high classification performance by applying the concepts of machine learning compared to already available methods. Machine-learning classification methods provide a concrete and flexible procedure that is applicable regardless of the geographical location or the device. The suggested procedure for classifying rain types is recommended prior to studying rain microstructure variability or any attempts at improving radar estimations of rain intensity.
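
Tuning a decision boundary, as mentioned above, can be pictured as sweeping a threshold over one parameter and keeping the value that best separates the two classes on a labeled reference set. The following is a minimal sketch under stated assumptions: the feature values and labels are toy numbers invented for illustration, not disdrometer measurements, and `best_threshold` is a hypothetical helper, not a method from the study.

```python
def best_threshold(values, labels):
    """Return (threshold, accuracy) for the rule `value >= threshold -> class 1`
    that maximizes accuracy; ties keep the smallest threshold."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(values)):
        correct = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        acc = correct / len(values)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Toy values of a single hypothetical rain parameter;
# label 1 = convective interval, label 0 = stratiform interval.
values = [0.2, 0.5, 0.9, 1.4, 1.1, 2.3, 2.8, 3.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, accuracy = best_threshold(values, labels)
print(threshold, accuracy)
```

A real tuning procedure would evaluate the chosen threshold on held-out intervals instead of the data used to select it.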


Author(s):  
Soo Min Kwon ◽  
Anand D. Sarwate

Statistical machine learning algorithms often involve learning a linear relationship between dependent and independent variables. This relationship is modeled as a vector of numerical values, commonly referred to as weights or predictors. These weights allow us to make predictions, and the quality of these weights influences the accuracy of our predictions. However, when the dependent variable inherently possesses a more complex, multidimensional structure, it becomes increasingly difficult to model the relationship with a vector. In this paper, we address this issue by investigating machine learning classification algorithms with multidimensional (tensor) structure. By imposing tensor factorizations on the predictors, we can better model the relationship, as the predictors would take the form of the data in question. We empirically show that our approach works more efficiently than the traditional machine learning method when the data possesses both an exact and an approximate tensor structure. Additionally, we show that estimating predictors with these factorizations also allows us to solve for fewer parameters, making computation more feasible for multidimensional data.
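
The claim that factorized predictors need fewer parameters can be made concrete with a parameter count. The sketch below assumes a CP-style (sum of outer products) factorization purely for illustration, since the abstract does not say which factorization is used. The tensor shape, the rank, and all function names are hypothetical.

```python
from math import prod


def full_param_count(shape):
    """Number of weights when the predictor is one unconstrained tensor."""
    return prod(shape)


def cp_param_count(shape, rank):
    """Number of weights when the predictor is a sum of `rank` outer products,
    with one factor vector per mode for each rank-one term."""
    return rank * sum(shape)


def low_rank_score(x, a_factors, b_factors):
    """Inner product <W, x> for a matrix predictor W = sum_r outer(a_r, b_r),
    computed term by term without ever forming W."""
    score = 0.0
    for a, b in zip(a_factors, b_factors):
        for i, a_i in enumerate(a):
            for j, b_j in enumerate(b):
                score += a_i * b_j * x[i][j]
    return score


shape, rank = (10, 12, 8), 3               # hypothetical data shape and rank
full = full_param_count(shape)             # 10 * 12 * 8
compact = cp_param_count(shape, rank)      # 3 * (10 + 12 + 8)
print(full, compact)

# Two-mode example: W = outer([1, 0], [0, 1]) picks out the entry x[0][1].
x = [[1.0, 2.0], [3.0, 4.0]]
score = low_rank_score(x, a_factors=[[1.0, 0.0]], b_factors=[[0.0, 1.0]])
```

The gap between the two counts widens as the number of modes or the mode sizes grow, which is why the factorized form is easier to estimate on multidimensional data.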


Molecules ◽  
2019 ◽  
Vol 24 (15) ◽  
pp. 2811 ◽  
Author(s):  
Rácz ◽  
Bajusz ◽  
Héberger

Machine learning classification algorithms are widely used for the prediction and classification of the different properties of molecules such as toxicity or biological activity. The prediction of toxic vs. non-toxic molecules is important due to testing on living animals, which has ethical and cost drawbacks as well. The quality of classification models can be determined with several performance parameters, which often give conflicting results. In this study, we performed a multi-level comparison with the use of different performance metrics and machine learning classification methods. Well-established and standardized protocols for the machine learning tasks were used in each case. The comparison was applied to three datasets (acute and aquatic toxicities) and the robust, yet sensitive, sum of ranking differences (SRD) and analysis of variance (ANOVA) were applied for evaluation. The effect of dataset composition (balanced vs. imbalanced) and 2-class vs. multiclass classification scenarios was also studied. Most of the performance metrics are sensitive to dataset composition, especially in 2-class classification problems. The optimal machine learning algorithm also depends significantly on the composition of the dataset.
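
A small worked example shows how performance metrics can give conflicting pictures on an imbalanced 2-class dataset. The sketch below is a generic illustration, not the study's protocol: the confusion-matrix counts are invented, and the three metrics shown (accuracy, balanced accuracy, Matthews correlation coefficient) are common choices that may not match the set compared in the paper.

```python
from math import sqrt


def accuracy(tp, fn, tn, fp):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity, so each class counts equally."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def matthews_corrcoef(tp, fn, tn, fp):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced test set: 5 toxic and 95 non-toxic molecules.
# The model finds only 1 of the 5 toxic molecules.
counts = dict(tp=1, fn=4, tn=94, fp=1)
acc = accuracy(**counts)
bal = balanced_accuracy(**counts)
mcc = matthews_corrcoef(**counts)
print(f"accuracy={acc:.3f} balanced_accuracy={bal:.3f} mcc={mcc:.3f}")
```

With these made-up counts, plain accuracy looks strong because the majority class dominates, while the two class-aware metrics expose the poor detection of the minority class.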


2018 ◽  
Vol 21 ◽  
pp. 45-48
Author(s):  
Shilpa Balan ◽  
Sanchita Gawand ◽  
Priyanka Purushu

Cybersecurity plays a vital role in protecting the privacy and data of people. In recent times, there have been several issues relating to cyber fraud, data breaches, and cyber theft. Many people in the United States have been victims of identity theft. Thus, an understanding of cybersecurity plays an important role in protecting their information and devices. As the adoption of smart devices and social networking increases, cybersecurity awareness needs to be spread. This research aims to build a machine learning classification algorithm to determine the cybersecurity awareness of the general public in the United States. We were able to attain a good F-measure score when evaluating the performance of the classification model built for this study.


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3422 ◽  
Author(s):  
Chiang ◽  
Chao ◽  
Tu ◽  
Kao ◽  
Yang ◽  
...  

This work presents a support vector machine (SVM) classifier for assessing the quality of arteriovenous fistulae (AVFs) in hemodialysis (HD) patients using a new photoplethysmography (PPG) sensor device. In clinical practice, there are two important indices for assessing the quality of an AVF: the blood flow volume (BFV) and the degree of stenosis (DOS). In hospitals, the BFV and DOS of AVFs are currently assessed using an ultrasound Doppler machine, which is bulky, expensive, hard to use, and time consuming. In this study, a newly developed PPG sensor device was utilized to provide patients and doctors with an inexpensive and small-sized solution for ubiquitous AVF assessment. The sensor readout was custom-designed to increase the signal-to-noise ratio (SNR) and reduce environmental interference by making full use of the dynamic range of the measured PPG signal entering the analog–digital converter (ADC) and by applying effective filtering techniques. With quality PPG measurements obtained, machine learning classifiers including SVM were adopted to assess AVF quality, where the input features are determined based on the optical Beer–Lambert law and a hemodynamic model, to ensure all the necessary features are considered. Finally, the clinical experiment results showed that the proposed PPG sensor device achieved an accuracy of 87.84% based on SVM analysis in assessing DOS at the AVF, while an accuracy of 88.61% was achieved for assessing BFV at the AVF.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Zhihong Wang ◽  
Yongbiao Li ◽  
Dingcheng Li ◽  
Ming Li ◽  
Bincheng Zhang ◽  
...  

With the rapid development of vehicular crowdsensing, it becomes easier and more efficient for mobile devices to sense, compute, and measure various data. However, ensuring fair quality evaluation between the platform and participants while preserving the privacy of solutions is still a challenge. In this work, we present a fairness-aware and privacy-preserving scheme for worker quality evaluation by leveraging blockchain, trusted execution environment (TEE), and machine learning technologies. Specifically, we build our framework atop a decentralized blockchain, which can resist a single point of failure or compromise. The smart contract paradigm in blockchain enforces correct and automatic program execution for task processing. In addition, machine learning and TEE are utilized to evaluate the quality of data collected by the sensors in a privacy-preserving and fair way, eliminating subjective human judgement of the sensing solutions. Finally, a prototype of the proposed scheme is implemented to verify its feasibility and efficiency with a benchmark dataset.


Author(s):  
Pooja Sharma ◽  
SK Pahuja ◽  
Karan Veer

Objective: Parkinson’s disease (PD) is a pervasive neurological disorder that affects people's quality of life throughout the world. The unsatisfactory results of clinical rating scales open the door for more research. PD treatment using current biomarkers seems a difficult task, so automatic evaluation at an early stage may improve both the quality and the length of life. Methods: The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) and Population, Intervention, Comparison, and Outcome (PICO) search methodology schemes were followed to search the data and eligible studies for this survey. Approximately 1,500 articles were extracted using related search strings. After the stepwise mapping and elimination of studies, 94 papers were found suitable for the present review. Results: After the quality assessment of the extracted studies, nine inhibitors to analyzing the gait of people with Parkinson’s disease were identified, four of which are critical. This review also differentiates the various machine learning classification techniques by their PD analysis characteristics in previous studies. The extracted research gaps are described as future perspectives. The results can help practitioners understand PD gait as a valuable biomarker for detection, quantification, and classification. Conclusion: Because gait is inexpensive and easy to record, gait-based techniques are becoming popular in PD detection. By summarizing the gait-based studies, this review gives in-depth knowledge of PD and of the different measures that affect gait detection and classification.

