Efficient selection of quasar candidates based on optical and infrared photometric data using machine learning

2019 ◽  
Vol 485 (4) ◽  
pp. 4539-4549 ◽  
Author(s):  
Xin Jin ◽  
Yanxia Zhang ◽  
Jingyi Zhang ◽  
Yongheng Zhao ◽  
Xue-bing Wu ◽  
...  

ABSTRACT We aim to select quasar candidates based on two large survey databases, Pan-STARRS and AllWISE. Exploring the distribution of quasars and stars in colour spaces, we find that the combination of infrared and optical photometry is better suited to selecting quasar candidates. Two new colour criteria (yW1W2 and iW1zW2) are constructed to distinguish quasars from stars efficiently. With iW1zW2, 98.30 per cent of star contamination is eliminated while 99.50 per cent of quasars are retained, at least down to the magnitude limit of our training set of stars. Based on the optical and infrared colour features, we put forward an efficient scheme to select quasar candidates and high-redshift quasar candidates, in which two machine learning algorithms (XGBoost and SVM) are implemented. The XGBoost and SVM classifiers prove very effective, with an accuracy of 99.46 per cent when 8Color is used as the input pattern with default model parameters. Applying the two optimal classifiers to the unknown Pan-STARRS and AllWISE cross-matched data set, a total of 2 006 632 intersected sources are predicted to be quasar candidates, given a quasar probability larger than 0.5 (i.e. PQSO > 0.5). Among them, 1 201 211 have high probability (PQSO > 0.95). For these newly predicted quasar candidates, a regressor is constructed to estimate their redshifts. Finally, 7402 z > 3.5 quasars are obtained. Given the magnitude limitation and the site of the LAMOST telescope, part of these candidates will be used as the input catalogue of the LAMOST telescope for follow-up observation, and the rest may be observed by other telescopes.
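A colour criterion of the kind described above amounts to a simple cut in a two-colour space. The sketch below illustrates the idea with hypothetical threshold values; the actual cuts derived in the paper for yW1W2 and iW1zW2 are not reproduced here.

```python
# Minimal sketch of a two-colour quasar/star cut. The thresholds
# y_w1_min and w1_w2_min are illustrative placeholders, not the
# criteria fitted in the paper.

def is_quasar_candidate(y_mag, w1_mag, w2_mag,
                        y_w1_min=2.0, w1_w2_min=0.5):
    """Flag a source whose y-W1 and W1-W2 colours are both red enough."""
    return (y_mag - w1_mag) > y_w1_min and (w1_mag - w2_mag) > w1_w2_min

# Toy catalogue: (y, W1, W2) magnitudes for three sources.
catalogue = [
    (19.5, 16.8, 15.9),  # red in both colours -> candidate
    (18.0, 17.5, 17.4),  # star-like colours   -> rejected
    (20.1, 17.9, 17.0),  # red in both colours -> candidate
]
candidates = [src for src in catalogue if is_quasar_candidate(*src)]
print(len(candidates))  # 2
```

In practice such cuts are applied before (or alongside) the XGBoost/SVM classifiers to reduce the stellar contamination in the input sample.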

Author(s):  
Jakub Gęca

The consequences of failures and unscheduled maintenance are the reasons why engineers have been trying to increase the reliability of industrial equipment for years. In modern solutions, predictive maintenance is a frequently used method. It makes it possible to forecast failures and raise alerts before they occur. This paper presents a summary of the machine learning algorithms that can be used in predictive maintenance and a comparison of their performance. The analysis was made on the basis of a data set from the Microsoft Azure AI Gallery. The paper presents a comprehensive approach to the issue, including feature engineering, preprocessing, dimensionality reduction techniques, as well as tuning of model parameters in order to obtain the highest possible performance. The research led to the conclusion that, in the analysed case, the best algorithm achieved 99.92% accuracy on over 122 thousand test data records. In conclusion, predictive maintenance based on machine learning represents the future of machine reliability in industry.
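The feature-engineering step mentioned above typically turns raw sensor telemetry into rolling statistics that a classifier can consume. A minimal sketch, with a made-up window size and alert threshold rather than anything taken from the Azure data set:

```python
# Hedged sketch of predictive-maintenance feature engineering: a
# rolling mean of a sensor reading with a simple threshold rule
# flagging imminent failure. Window and threshold are illustrative.

from collections import deque

def rolling_mean(values, window=3):
    buf, out = deque(maxlen=window), []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out

# Synthetic vibration readings drifting upward before a failure.
vibration = [1.0, 1.1, 0.9, 1.0, 1.8, 2.5, 3.1]
smoothed = rolling_mean(vibration)
alerts = [m > 2.0 for m in smoothed]
print(alerts)  # only the final reading trips the alert
```

Real pipelines feed dozens of such rolling features (means, standard deviations, time since last maintenance) into the tuned classifier.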


Information ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 109 ◽  
Author(s):  
Iman Rahimi ◽  
Amir H. Gandomi ◽  
Panagiotis G. Asteris ◽  
Fang Chen

The novel coronavirus disease, also known as COVID-19, is a disease outbreak that was first identified in Wuhan, a city in central China. In this report, a short analysis focusing on Australia, Italy, and the UK is conducted. The analysis includes confirmed and recovered cases and deaths, the growth rate in Australia compared with that in Italy and the UK, and the trend of the disease in different Australian regions. Mathematical approaches based on susceptible, infected, and recovered (SIR) cases and susceptible, exposed, infected, quarantined, and recovered (SEIQR) cases models are proposed to predict epidemiology in the above-mentioned countries. Since the performance of the classic forms of SIR and SEIQR depends on parameter settings, some optimization algorithms, namely Broyden–Fletcher–Goldfarb–Shanno (BFGS), conjugate gradients (CG), limited memory bound constrained BFGS (L-BFGS-B), and Nelder–Mead, are proposed to optimize the parameters and the predictive capabilities of the SIR and SEIQR models. The results of the optimized SIR and SEIQR models were compared with those of two well-known machine learning algorithms, i.e., the Prophet algorithm and the logistic function. The results demonstrate the different behaviors of these algorithms in different countries as well as the better performance of the improved SIR and SEIQR models. Moreover, the Prophet algorithm was found to provide better prediction performance than the logistic function, as well as better prediction performance for the Italian and UK cases than for the Australian cases. Therefore, it seems that the Prophet algorithm is suitable for data with an increasing trend in the context of a pandemic. Optimization of the SIR and SEIQR model parameters yielded a significant improvement in the prediction accuracy of the models. Despite the availability of several algorithms for trend prediction in this pandemic, no single algorithm is optimal for all cases.
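The SIR dynamics being fitted can be sketched with a discrete-time integration. The beta and gamma values below are illustrative; in the study they are the quantities tuned by optimizers such as L-BFGS-B or Nelder–Mead against reported case counts.

```python
# Minimal discrete-time SIR sketch. Parameters are illustrative
# placeholders, not fitted values from the paper.

def sir_step(s, i, r, beta, gamma, n):
    new_inf = beta * s * i / n   # new infections this step
    new_rec = gamma * i          # new recoveries this step
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def simulate(n=1_000_000, i0=100, beta=0.3, gamma=0.1, days=60):
    s, i, r = n - i0, i0, 0
    history = [(s, i, r)]
    for _ in range(days):
        s, i, r = sir_step(s, i, r, beta, gamma, n)
        history.append((s, i, r))
    return history

traj = simulate()
# Population is conserved at every step (up to float rounding).
assert all(abs(sum(state) - 1_000_000) < 1e-3 for state in traj)
```

Fitting then amounts to minimizing the mismatch between the simulated infected curve and the observed case counts over (beta, gamma).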


2021 ◽  
Vol 30 (1) ◽  
pp. 460-469
Author(s):  
Yinying Cai ◽  
Amit Sharma

Abstract In agricultural development and growth, efficient machinery and equipment play an important role. Numerous research studies and patents support smart agriculture, and machine learning technologies provide strong support for this growth. To explore machine learning technology and algorithms, most of the applications studied here are based on swarm intelligence optimization. An optimized V3CFOA-RF model is built through V3CFOA. The algorithm is tested on a data set collected on rice pests, then analysed and compared in detail with other existing algorithms. The results show that the proposed model and algorithm are not only more accurate in recognition and prediction but also alleviate the time-lag problem to a degree. The model and algorithm achieve higher accuracy in crop pest prediction, which ensures a more stable and higher output of rice. They can therefore be employed as an important decision-making instrument in the agricultural production sector.
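The swarm-intelligence component can be illustrated generically: a population of candidates explores around the current best solution, with fitness standing in for the random forest's validation accuracy. This is only a sketch of the optimization loop; V3CFOA's specific update rules are defined in the paper.

```python
import random

# Generic swarm-style search in the spirit of fruit-fly optimization:
# candidates wander around the current best, and the best "smell"
# (fitness) is kept. The toy objective stands in for model accuracy.

random.seed(0)

def fitness(x):
    # Hypothetical objective: peaks at x = 3.0.
    return -(x - 3.0) ** 2

def swarm_search(iters=50, swarm=20, radius=1.0):
    best_x, best_f = 0.0, fitness(0.0)
    for _ in range(iters):
        for _ in range(swarm):
            x = best_x + random.uniform(-radius, radius)
            f = fitness(x)
            if f > best_f:
                best_x, best_f = x, f
    return best_x

x_star = swarm_search()  # converges near the optimum at 3.0
```

In a V3CFOA-RF setting, `x` would be a vector of random-forest hyperparameters rather than a scalar.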


Author(s):  
Aska E. Mehyadin ◽  
Adnan Mohsin Abdulazeez ◽  
Dathar Abas Hasan ◽  
Jwan N. Saeed

The bird classifier is a system equipped with machine learning technology that stores and classifies bird calls. Bird species can be identified from a recording of the bird's sound alone, which makes the system easier to manage. The system also provides species classification resources that allow automated species detection from observations, teaching a machine to recognize and classify species. Undesirable noises are filtered out and the recordings are sorted into data sets: each sound is run through a noise suppression filter and a separate classification procedure so that the most useful data can be processed easily. Mel-frequency cepstral coefficients (MFCCs) are used as features and tested with different algorithms, namely Naïve Bayes, J4.8, and Multilayer Perceptron (MLP), to classify bird species. J4.8 achieved the highest accuracy (78.40%), with an elapsed time of 39.4 seconds.
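Of the three classifiers compared, Naïve Bayes is the most compact to sketch. The example below trains a Gaussian Naïve Bayes on two-dimensional MFCC-style vectors; the "data set" is synthetic, and real MFCC vectors would have around 13 coefficients per frame.

```python
import math

# Tiny Gaussian Naive Bayes sketch for MFCC-style feature vectors.
# Training data below is synthetic, for illustration only.

train = {
    "species_a": [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3]],
    "species_b": [[3.0, 1.0], [2.8, 1.2], [3.2, 0.9]],
}

def gaussian_params(samples):
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / len(samples) for d in range(dims)]
    vars_ = [max(sum((s[d] - means[d]) ** 2 for s in samples) / len(samples),
                 1e-6)
             for d in range(dims)]
    return means, vars_

def log_likelihood(x, means, vars_):
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, means, vars_))

params = {label: gaussian_params(samples) for label, samples in train.items()}

def classify(x):
    # Uniform class priors assumed, so the likelihood alone decides.
    return max(params, key=lambda label: log_likelihood(x, *params[label]))

print(classify([1.1, 0.2]))  # species_a
print(classify([2.9, 1.1]))  # species_b
```

J4.8 (a decision-tree learner) and MLP follow the same train/classify interface but with very different internal models.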


2020 ◽  
Vol 9 (3) ◽  
pp. 34
Author(s):  
Giovanna Sannino ◽  
Ivanoe De Falco ◽  
Giuseppe De Pietro

One of the most important physiological parameters of the cardiovascular circulatory system is blood pressure. Several diseases are related to long-term abnormal blood pressure, i.e., hypertension; therefore, the early detection and assessment of this condition are crucial. The identification of hypertension, and even more so the evaluation of its risk stratification, by using wearable monitoring devices is now more realistic thanks to advancements in the Internet of Things, increasingly miniaturized digital sensors, and the development of new signal processing and machine learning algorithms. In this scenario, a suitable biomedical signal is the PhotoPlethysmoGraphy (PPG) signal. It can be acquired with a simple, cheap, and wearable device, and can be used to evaluate several aspects of the cardiovascular system, e.g., abnormal heart rate, respiration rate, blood pressure, and oxygen saturation. In this paper, we consider the Cuff-Less Blood Pressure Estimation Data Set, which contains, among other data, PPG signals from a set of subjects as well as their blood pressure values, i.e., their hypertension levels. Our aim is to investigate whether or not machine learning methods applied to these PPG signals can provide better results for the non-invasive classification and evaluation of subjects’ hypertension levels. To this end, we applied a wide set of machine learning algorithms, based on different learning mechanisms, and compared their results in terms of the effectiveness of the classification obtained.
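One PPG-derived feature that such classifiers commonly use is heart rate, estimated from the spacing of waveform peaks. The sketch below uses a synthetic sinusoidal "pulse" and an assumed sampling rate; real PPG processing adds band-pass filtering and beat-quality checks.

```python
import math

# Sketch of heart-rate extraction from a PPG-like signal.
# Sampling rate and waveform are illustrative assumptions.

fs = 100  # samples per second (assumed)
# Two seconds of a 1.2 Hz "pulse" (72 beats per minute).
signal = [math.sin(2 * math.pi * 1.2 * t / fs) for t in range(2 * fs)]

# Naive peak picking: a sample larger than both neighbours.
peaks = [i for i in range(1, len(signal) - 1)
         if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]]

# Mean peak-to-peak interval -> beats per minute.
intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
bpm = 60 / (sum(intervals) / len(intervals))
print(round(bpm))  # 72
```

Features like this (plus pulse-wave shape descriptors) form the input vectors on which the hypertension classifiers are trained.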


2020 ◽  
Vol 27 (6) ◽  
pp. 929-933
Author(s):  
George Demiris ◽  
Kristin L Corey Magan ◽  
Debra Parker Oliver ◽  
Karla T Washington ◽  
Chad Chadwick ◽  
...  

Abstract Objective The goal of this study was to explore whether features of recorded and transcribed audio communication data extracted by machine learning algorithms can be used to train a classifier for anxiety. Materials and Methods We used a secondary data set generated by a clinical trial examining problem-solving therapy for hospice caregivers, consisting of 140 transcripts of multiple, sequential conversations between an interviewer and a family caregiver, along with standardized assessments of anxiety prior to each session; 98 of these transcripts (70%) served as the training set, with the remaining 30% of the data held out for evaluation. Results A classifier for anxiety was developed relying on language-based features. Precision of 86%, recall of 78%, accuracy of 81%, and specificity of 84% were achieved with the trained classifiers. High anxiety inflections were found among recently bereaved caregivers and were usually connected to issues related to transitioning out of the caregiving role. This analysis highlighted the impact of lowering anxiety by increasing reciprocity between interviewers and caregivers. Conclusion Verbal communication can provide a platform for machine learning tools to highlight and predict behavioral health indicators and trends.
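The four metrics reported above all derive from a single confusion matrix. A minimal sketch with made-up counts (not the study's actual data):

```python
# Confusion-matrix metrics for a binary "anxious / not anxious"
# classifier. The counts are hypothetical, for illustration only.

tp, fn, tn, fp = 39, 11, 42, 8

precision   = tp / (tp + fp)                  # of predicted anxious, correct
recall      = tp / (tp + fn)                  # of truly anxious, found
specificity = tn / (tn + fp)                  # of truly non-anxious, kept
accuracy    = (tp + tn) / (tp + fn + tn + fp) # overall correct fraction

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"accuracy={accuracy:.2f} specificity={specificity:.2f}")
```

Reporting all four together, as the study does, guards against the misleading picture any single metric can give on imbalanced data.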


Geophysics ◽  
2020 ◽  
Vol 85 (4) ◽  
pp. WA41-WA52 ◽  
Author(s):  
Dario Grana ◽  
Leonardo Azevedo ◽  
Mingliang Liu

Among the large variety of mathematical and computational methods for estimating reservoir properties such as facies and petrophysical variables from geophysical data, deep machine-learning algorithms have gained significant popularity for their ability to obtain accurate solutions for geophysical inverse problems in which the physical models are partially unknown. Solutions of classification and inversion problems are generally not unique, and uncertainty quantification studies are required to quantify the uncertainty in the model predictions and determine the precision of the results. Probabilistic methods, such as Monte Carlo approaches, provide a reliable approach for capturing the variability of the set of possible models that match the measured data. Here, we focused on the classification of facies from seismic data and benchmarked the performance of three different algorithms: recurrent neural network, Monte Carlo acceptance/rejection sampling, and Markov chain Monte Carlo. We tested and validated these approaches at the well locations by comparing classification predictions to the reference facies profile. The accuracy of the classification results is defined as the mismatch between the predictions and the log facies profile. Our study found that when the training data set of the neural network is large enough and the prior information about the transition probabilities of the facies in the Monte Carlo approach is not informative, machine-learning methods lead to more accurate solutions; however, the uncertainty of the solution might be underestimated. When some prior knowledge of the facies model is available, for example, from nearby wells, Monte Carlo methods provide solutions with similar accuracy to the neural network and allow a more robust quantification of the uncertainty of the solution.
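Of the three benchmarked algorithms, acceptance/rejection sampling is the simplest to sketch: draw facies profiles from the prior, push each through a forward model, and keep only those whose predicted data sit close to the observations. The forward model and tolerance below are toy stand-ins for the elastic forward modelling used in practice.

```python
import random

# Sketch of Monte Carlo acceptance/rejection sampling for a 1-D
# facies profile. Responses and tolerance are illustrative.

random.seed(1)
FACIES_RESPONSE = {"shale": 1.0, "sand": 2.0}  # assumed mean responses

def forward(profile):
    return [FACIES_RESPONSE[f] for f in profile]

def misfit(pred, obs):
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)

observed = [1.1, 1.9, 2.1, 0.9]  # noisy "data" along the well

def rejection_sample(n_draws=5000, tol=0.05):
    accepted = []
    for _ in range(n_draws):
        profile = [random.choice(["shale", "sand"]) for _ in observed]
        if misfit(forward(profile), observed) < tol:
            accepted.append(profile)
    return accepted

posterior = rejection_sample()
print(posterior[0])  # the accepted interpretation of the well data
```

The spread of accepted profiles is what supplies the uncertainty quantification that a single neural-network prediction lacks; informative transition-probability priors enter by replacing the uniform `random.choice` draw.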


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Qi Zhu ◽  
Ning Yuan ◽  
Donghai Guan

In recent years, self-paced learning (SPL) has attracted much attention due to the improvements it brings to machine learning algorithms based on nonconvex optimization. As a methodology inspired by human learning, SPL dynamically evaluates the learning difficulty of each sample and weights the learning model to counter the negative effects of hard-to-learn samples. In this study, we propose a cognitively driven SPL method, retrospective robust self-paced learning (R2SPL), which is inspired by two aspects of the human learning process: misclassified samples are more impressive in subsequent learning, and a follow-up learning model based on a large number of samples can reduce the risk of poor generalization in the initial learning phase. We simultaneously estimate the degrees of learning difficulty and misclassification at each step of SPL and propose a framework that constructs a multilevel SPL to improve the robustness of the initial learning phase. The proposed method can be viewed as a multilayer model in which the output of the previous layer guides the construction of a robust initialization model for the next layer. The experimental results show that R2SPL outperforms conventional self-paced learning models in classification tasks.
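The sample-weighting mechanism underlying SPL can be sketched with the classic hard-threshold rule: a sample receives weight 1 only if its current loss falls below an "age" parameter that grows each round, so harder samples are phased in gradually. The losses below are synthetic; R2SPL's retrospective refinements build on top of this basic rule.

```python
# Classic SPL hard-weighting rule: include a sample only once its
# loss drops below the growing age parameter lam. Losses are toy.

def spl_weights(losses, lam):
    return [1 if loss < lam else 0 for loss in losses]

losses = [0.1, 0.4, 0.9, 2.5]   # per-sample losses, easy -> hard
for lam in (0.5, 1.0, 3.0):
    print(lam, spl_weights(losses, lam))
# As lam grows, progressively harder samples join the training set.
```

A full SPL loop alternates between fitting the model on the currently weighted samples and recomputing these weights as lam increases.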

