supervised machine learning
Recently Published Documents





2022 ◽  
Vol 127 ◽  
pp. 108548
Edward J. Camp ◽  
Robert J. Quon ◽  
Martha Sajatovic ◽  
Farren Briggs ◽  
Brittany Brownrigg ◽  

Narathep Phruksahiran

A critical problem in spectrum sensing is designing a detection algorithm and its test statistics. Existing approaches rely on the energy level of each channel of interest. However, this feature alone cannot accurately characterize real-world use of public amateur radio, where the transmitted signal is not continuous and may consist only of a carrier frequency without information. This paper proposes a novel energy detection and waveform feature classification (EDWC) algorithm that detects speech signals in public frequency bands using energy detection and supervised machine learning. The energy level, descriptive statistics, and spectral measurements of radio channels are used as feature vectors for classifiers that determine whether the signal is speech or noise. The algorithm is validated using actual frequency modulation (FM) broadcasting and public amateur signals. Its performance is evaluated in terms of training duration, classification time, and receiver operating characteristic. The simulation and experimental outcomes show that EDWC can distinguish and classify waveform characteristics for spectrum sensing purposes, particularly for the public amateur use case. The results indicate that public radio frequency signals can be detected and classified as either voice signals for speech communication or noise, which is useful for security applications.
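The abstract above does not specify how EDWC computes its features, so the following is only a minimal sketch of the general idea: plain energy detection, plus a couple of simple frame statistics that a classifier could use when energy alone is ambiguous (for example, a bare carrier). The function names, the chosen statistics, and the threshold value are assumptions for illustration, not the authors' method.

```python
import math

def frame_features(samples):
    """Return simple energy and descriptive statistics for one frame of real samples."""
    n = len(samples)
    mean = sum(samples) / n
    energy = sum(x * x for x in samples) / n  # mean power of the frame
    variance = sum((x - mean) ** 2 for x in samples) / n
    # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (n - 1)
    return {"energy": energy, "variance": variance, "zcr": zcr}

def channel_occupied(samples, threshold):
    """Plain energy detection: flag the channel when mean power exceeds a threshold."""
    return frame_features(samples)["energy"] > threshold

# Hypothetical frames: a unit-amplitude tone and a near-silent alternating signal.
tone = [math.sin(2 * math.pi * k / 8) for k in range(64)]
quiet = [0.01 * ((-1) ** k) for k in range(64)]

print(channel_occupied(tone, threshold=0.1))   # the tone exceeds the threshold
print(channel_occupied(quiet, threshold=0.1))  # the quiet frame does not
```

Note that the tone passes the energy test even though it carries no speech, which is the limitation the abstract describes; the extra features are what a downstream classifier would use to separate such cases.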

Soma Das ◽  
Pooja Rai ◽  
Sanjay Chatterji

The rapid growth of misinformation in news articles poses a potential threat to society. Hence, detecting misinformation in news data has become an appealing research area. Annotating and detecting distorted sentences in news articles is an immediate need in this research direction. Therefore, an attempt has been made to formulate a legitimacy annotation guideline, followed by annotation and detection of legitimacy in Bengali e-papers. Sentence-level manual annotation of Bengali news has been carried out at two levels, namely “Level-1 Shallow Level Classification” and “Level-2 Deep Level Classification”, based on semantic properties of Bengali sentences. A total of 1,300 anonymous Bengali e-paper sentences have been tagged using the guideline-based tags for both levels. The annotation guideline has been validated by applying benchmark supervised machine learning algorithms using lexical, syntactic, domain-specific, and Level-2 specific features at both levels. The classifiers are evaluated in terms of Accuracy, Precision, Recall, and F-Measure. At both levels, Support Vector Machine outperforms the other benchmark classifiers, with accuracies of 72% and 65% at Level-1 and Level-2, respectively.
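For readers unfamiliar with the four evaluation measures named above, here is a minimal sketch of how they are computed for one class of interest. The label names and the eight example sentences are hypothetical and are not taken from the study's data.

```python
def classification_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical gold and predicted tags for eight sentences.
gold = ["distorted", "distorted", "distorted", "legit", "legit", "legit", "legit", "legit"]
pred = ["distorted", "distorted", "legit", "distorted", "legit", "legit", "legit", "legit"]

metrics = classification_metrics(gold, pred, positive="distorted")
print(metrics)
```

With these made-up tags, six of eight predictions are correct, and precision and recall for the "distorted" class are both two out of three.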

2022 ◽  
Vol 31 (1) ◽  
pp. 1-46
Chao Liu ◽  
Cuiyun Gao ◽  
Xin Xia ◽  
David Lo ◽  
John Grundy ◽  

Context: Deep learning (DL) techniques have gained significant popularity among software engineering (SE) researchers in recent years. This is because they can often solve many SE challenges without enormous manual feature engineering effort and complex domain knowledge. Objective: Although many DL studies have reported substantial advantages over other state-of-the-art models on effectiveness, they often ignore two factors: (1) reproducibility: whether the reported experimental results can be obtained by other researchers using the authors’ artifacts (i.e., source code and datasets) with the same experimental setup; and (2) replicability: whether the reported experimental results can be obtained by other researchers using their re-implemented artifacts with a different experimental setup. We observed that DL studies commonly overlook these two factors and declare them as minor threats or leave them for future work. This is mainly due to high model complexity with many manually set parameters and the time-consuming optimization process, unlike classical supervised machine learning (ML) methods (e.g., random forest). This study aims to investigate the urgency and importance of reproducibility and replicability for DL studies on SE tasks. Method: In this study, we conducted a literature review on 147 DL studies recently published in 20 SE venues and 20 AI (Artificial Intelligence) venues to investigate these issues. We also re-ran four representative DL models in SE to investigate important factors that may strongly affect the reproducibility and replicability of a study. Results: Our statistics show the urgency of investigating these two factors in SE, where only 10.2% of the studies investigate any research question to show that their models can address at least one issue of replicability and/or reproducibility. More than 62.6% of the studies do not even share high-quality source code or complete data to support the reproducibility of their complex models. Meanwhile, our experimental results show the importance of reproducibility and replicability: the reported performance of a DL model could not be reproduced because of an unstable optimization process. Replicability could be substantially compromised if the model training is not convergent, or if performance is sensitive to the size of the vocabulary and testing data. Conclusion: It is urgent for the SE community to provide a long-lasting link to a high-quality reproduction package, enhance DL-based solution stability and convergence, and avoid performance sensitivity on different sampled data.

2022 ◽  
Cameron I. Cooper

Nationally, more than one-third of students enrolling in introductory computer science programming courses (CS101) do not succeed. To improve student success rates, this research team used supervised machine learning to identify students who are “at-risk” of not succeeding in CS101 at a two-year public college. The resultant predictive model accurately identifies approximately 99% of “at-risk” students in an out-of-sample test data set. The programming instructor piloted the use of the model’s predictive factors as early alert triggers to intervene with individualized outreach and support across three course sections of CS101 in fall 2020. The outcome of this pilot study was a 23% increase in student success and a 7.3 percentage point decrease in the DFW rate. More importantly, this study identified academic early alert triggers for CS101. Specifically, the first two graded programs are of paramount importance for student success in the course.

Materials ◽  
2022 ◽  
Vol 15 (2) ◽  
pp. 647
Meijun Shang ◽  
Hejun Li ◽  
Ayaz Ahmad ◽  
Waqas Ahmad ◽  
Krzysztof Adam Ostrowski ◽  

Environment-friendly concrete is gaining popularity because it consumes less energy and causes less damage to the environment. Rapid increases in population and in demand for construction throughout the world are significantly depleting natural resources. Meanwhile, construction waste continues to grow at a high rate as older buildings are demolished. As a result, the use of recycled materials may contribute to improving the quality of life and preventing environmental damage. In particular, the application of recycled coarse aggregate (RCA) in concrete is essential for minimizing environmental issues. The compressive strength (CS) and splitting tensile strength (STS) of concrete containing RCA are predicted in this article using decision tree (DT) and AdaBoost machine learning (ML) techniques. A total of 344 data points with nine input variables (water, cement, fine aggregate, natural coarse aggregate, RCA, superplasticizers, water absorption of RCA, maximum size of RCA, and density of RCA) were used to run the models. The models were validated using k-fold cross-validation and evaluated with the coefficient of determination (R²), mean square error (MSE), mean absolute error (MAE), and root mean square error (RMSE). The models’ performance was also assessed using statistical checks. Additionally, sensitivity analysis was used to determine the impact of each variable on the forecasting of mechanical properties.
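The validation measures named above can be sketched in a few lines. This is a minimal illustration, assuming that R2 denotes the coefficient of determination and that the folds are contiguous and unshuffled; the abstract does not state the number of folds or the splitting scheme, and the strength values below are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, MAE and RMSE for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot  # coefficient of determination
    return {"r2": r2, "mse": mse, "mae": mae, "rmse": math.sqrt(mse)}

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Hypothetical measured and predicted compressive strengths (MPa).
measured = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 51.0, 59.0]
scores = regression_metrics(measured, predicted)
print(scores)

# Splitting 344 samples into an assumed 10 folds.
folds = list(k_fold_indices(344, 10))
print([len(test) for _, test in folds])
```

In a full cross-validation run, a model would be fitted on each training index set and `regression_metrics` applied to its predictions on the matching test set, with the per-fold scores then averaged.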

2022 ◽  
Vol 15 ◽  
Andrzej Z. Wasilczuk ◽  
Qing Cheng Meng ◽  
Andrew R. McKinstry-Wu

Previous studies have demonstrated that the brain has an intrinsic resistance to changes in arousal state. This resistance is most easily measured at the population level in the setting of general anesthesia and has been termed neural inertia. To date, no study has attempted to determine neural inertia in individuals. We hypothesize that individuals with markedly increased or decreased neural inertia might be at increased risk for complications related to state transitions, from awareness under anesthesia, to delayed emergence or confusion/impairment after emergence. Hence, an improved theoretical and practical understanding of neural inertia may have the potential to identify individuals at increased risk for these complications. This study was designed to explicitly measure neural inertia in individuals and empirically test the stochastic model of neural inertia using spectral analysis of the murine EEG. EEG was measured after induction of and emergence from isoflurane administered near the EC50 dose for loss of righting in genetically inbred mice on a timescale that minimizes pharmacokinetic confounds. Neural inertia was assessed by employing classifiers constructed using linear discriminant or supervised machine learning methods to determine if features of EEG spectra reliably demonstrate path dependence at steady-state anesthesia. We also report the existence of neural inertia at the individual level, as well as the population level, and that neural inertia decreases over time, providing direct empirical evidence supporting the predictions of the stochastic model of neural inertia.

2022 ◽  
Gabriela Garcia ◽  
Tharanga Kariyawasam ◽  
Anton Lord ◽  
Cristiano Costa ◽  
Lana Chaves ◽  

We describe the first application of the near-infrared spectroscopy (NIRS) technique to detect Plasmodium falciparum and P. vivax malaria parasites through the skin of malaria positive and negative human subjects. NIRS is a rapid, non-invasive and reagent-free technique in which a beam of light interacts with a biological sample to produce diagnostic signatures in seconds. We used a handheld, miniaturized spectrometer to shine NIRS light on the ear, arm and finger of P. falciparum (n=7) and P. vivax (n=20) positive people and malaria negative individuals (n=33) in a malaria endemic setting in Brazil. Supervised machine learning algorithms were applied to predict malaria infection status in independent individuals (n=12). Separate machine learning algorithms for differentiating P. falciparum from P. vivax infected subjects were developed using spectra from the arm and ear of P. falciparum and P. vivax (n=108), and the resultant model predicted infection in spectra of their fingers (n=54). NIRS non-invasively detected malaria positive and negative individuals that were excluded from the model with 100% sensitivity, 83% specificity and 92% accuracy (n=12) with spectra collected from the arm. Moreover, NIRS also correctly differentiated P. vivax from P. falciparum positive individuals with a predictive accuracy of 93% (n=54). These findings are promising, but further work on a larger scale is needed to address several gaps in knowledge and establish the full capacity of NIRS as a non-invasive diagnostic tool for malaria. It is recommended that the tool be further evaluated in multiple epidemiological and demographic settings where other factors such as age, mixed infection and skin colour can be incorporated into predictive algorithms to produce more robust models for universal diagnosis of malaria.
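The abstract reports sensitivity, specificity and accuracy for the 12 held-out individuals but not the underlying counts. As a worked check, the sketch below uses one split that is arithmetically consistent with those percentages (6 positives all detected, 5 of 6 negatives correctly identified). These counts are inferred for illustration and are not data reported by the authors.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of infected individuals detected
    specificity = tn / (tn + fp)   # share of uninfected individuals cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Inferred, illustrative counts for n=12 (not taken from the paper).
sens, spec, acc = diagnostic_rates(tp=6, fn=0, tn=5, fp=1)
print(f"{sens:.0%} {spec:.0%} {acc:.0%}")  # prints "100% 83% 92%"
```

With only 12 individuals, a single misclassification moves accuracy by about 8 percentage points, which illustrates why the authors call for evaluation on a larger scale.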

2022 ◽  
Vol 19 ◽  
pp. 474-480
Nevila Baci ◽  
Kreshnik Vukatana ◽  
Marius Baci

Small and medium enterprises (SMEs) account for a large percentage of the economy in many countries, but they often lack cyber security. The present study examines different supervised machine learning methods with a focus on intrusion detection systems (IDSs) that can help improve SMEs’ security. The algorithms tested on a real dataset are Naïve Bayes, Sequential Minimal Optimization (SMO), C4.5 decision tree, and Random Forest. The experiments were run using the Waikato Environment for Knowledge Analysis (WEKA) 3.8.4 tools, and the metrics used to evaluate the results were accuracy, false-positive rate (FPR), and total time to train and build a classification model. The results obtained from the original dataset with 130 features show high accuracy, but the computation time to build the classification model was notably high for C4.5 (1 hr. and 20 mins) and SMO (4 hrs. and 20 mins). The Information Gain (IG) method was then used to select features, which reduced the time needed to train the model to a few minutes while accuracy remained high (above 95%). In the end, challenges that SMEs can face when choosing an IDS, such as lack of scalability and autonomic self-adaptation, can be solved by using a correct methodology with machine learning techniques.
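Information Gain scores a feature by how much knowing its value reduces the entropy of the class label. The following is a minimal sketch for discrete features only; it is not WEKA's implementation, and the feature names and four example records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical connection records: one informative and one uninformative feature.
labels = ["attack", "attack", "normal", "normal"]
features = {
    "protocol": ["tcp", "tcp", "udp", "udp"],  # perfectly separates the classes
    "flag": ["a", "b", "a", "b"],              # tells us nothing about the class
}

# Rank features by information gain, highest first.
ranked = sorted(features, key=lambda f: information_gain(features[f], labels), reverse=True)
print(ranked)
```

Keeping only the top-ranked features shrinks the input the classifier has to process, which is consistent with the drop in training time described above.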
