Low/High Redshift Classification of Emission Line Galaxies in the HETDEX survey

Abstract: We discuss different methods to separate high- from low-redshift galaxies based on a combination of spectroscopic and photometric observations. Our baseline scenario is the Hobby-Eberly Telescope Dark Energy eXperiment (HETDEX) survey, which will observe several hundred thousand Lyman Alpha Emitting (LAE) galaxies at 1.9 < z < 3.5, and for which the main source of contamination is [OII]-emitting galaxies at z < 0.5. Additional information useful for the separation comes from empirical knowledge of LAE and [OII] luminosity functions and equivalent width distributions as a function of redshift. We consider three separation techniques: a simple cut in equivalent width, a Bayesian separation method, and machine learning algorithms, including support vector machines. These methods can be easily applied to other surveys and used on simulated data in the framework of survey planning.
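
The simplest of the three techniques, a cut in equivalent width (EW), can be sketched as follows. This is a minimal illustration rather than the survey's actual pipeline: the 20 Å rest-frame threshold is an assumed value (the abstract does not give one), and the function names are hypothetical.

```python
LYA_REST_ANGSTROM = 1215.67  # rest-frame wavelength of Lyman-alpha
EW_CUT_ANGSTROM = 20.0       # assumed threshold; not stated in the abstract


def rest_frame_ew_if_lya(obs_wavelength, obs_ew):
    """Rest-frame EW under the hypothesis that the detected line is Lyman-alpha."""
    one_plus_z = obs_wavelength / LYA_REST_ANGSTROM
    return obs_ew / one_plus_z


def classify_by_ew(obs_wavelength, obs_ew, cut=EW_CUT_ANGSTROM):
    """Label a line 'LAE' if its Lya-hypothesis rest-frame EW exceeds the cut, else '[OII]'."""
    if rest_frame_ew_if_lya(obs_wavelength, obs_ew) > cut:
        return "LAE"
    return "[OII]"
```

A fixed cut of this kind uses only one observable; the Bayesian and machine learning methods named in the abstract additionally draw on luminosity functions and EW distributions.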

Modeling galaxy evolution at high-redshift in highly overdense and normal regions

Proceedings of the International Astronomical Union ◽

10.1017/s1743921319004721 ◽

2019 ◽

Vol 15 (S341) ◽

pp. 299-301

Author(s):

Raphael Sadoun ◽

Emilio Romano-Díaz ◽

Isaac Shlosman ◽

Zheng Zheng

Keyword(s):

Monte Carlo ◽

High Resolution ◽

Galaxy Evolution ◽

Equivalent Width ◽

High Redshift ◽

Post Processing ◽

High Redshift Galaxies ◽

Massive Galaxies ◽

Luminosity Functions ◽

Galactic Outflows

Abstract: We present results from high-resolution, zoom-in cosmological simulations to study the effect of feedback from galactic outflows on the physical and Lyα properties of high-redshift galaxies in highly overdense and normal environments at z ≳ 6. The Lyα properties have been obtained by post-processing the simulations with a Monte-Carlo radiative transfer (RT) code. Our results demonstrate that galactic outflows play an important role in regulating the growth of massive galaxies in overdense regions as well as the temperature and metallicity of the intergalactic medium. In particular, we find that galactic outflows are necessary to reproduce the observed Lyα luminosity functions as well as the apparent Lyα luminosity, line width and equivalent width distributions of luminous Lyα emitters at z ∼ 6.

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). Treating the close price, open price, daily low, daily high, adjusted close price, and trade volume as predictors in these methods is shown to improve prediction accuracy.

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may achieve different accuracies, so it is imperative to use the method that provides the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.

Synthesizing Conjunctive & Disjunctive Linear Invariants by K-means++ and SVM

The International Arab Journal of Information Technology ◽

10.34028/iajit/17/6/3 ◽

2020 ◽

Vol 17 (6) ◽

pp. 847-856

Author(s):

Shengbing Ren ◽

Xiang Zhang

Keyword(s):

Software Verification ◽

State Of The Art ◽

Positive Sample ◽

Machine Learning Algorithms ◽

Support Vector ◽

Hoare Logic ◽

Excellent Performance ◽

Automated Software ◽

Inductive Invariants ◽

Linear Invariants

The problem of synthesizing adequate inductive invariants lies at the heart of automated software verification. State-of-the-art machine learning algorithms for synthesizing invariants have gradually shown their excellent performance. However, synthesizing disjunctive invariants is a difficult task. In this paper, we propose k++SVM, a method that integrates k-means++ and the Support Vector Machine (SVM) to synthesize conjunctive and disjunctive invariants. First, given a program, we execute it to collect program states. Next, k++SVM uses k-means++ to cluster the positive samples and then applies SVM to distinguish each positive sample cluster from all negative samples, which yields the candidate invariants. Finally, a set of theories founded on Hoare logic is used to check whether the candidate invariants are true invariants. If the candidate invariants fail the check, we sample more states and repeat the algorithm. The experimental results show that k++SVM is compatible with the algorithms for Intersection Of Half-space (IOH) and more efficient than the tool Interproc. Furthermore, it is shown that our method can synthesize conjunctive and disjunctive invariants automatically.

Drug Target Group Prediction with Multiple Drug Networks

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666190702103927 ◽

2020 ◽

Vol 23 (4) ◽

pp. 274-284 ◽

Cited By ~ 12

Author(s):

Jingang Che ◽

Lei Chen ◽

Zi-Han Guo ◽

Shuaiqun Wang ◽

Aorigele

Keyword(s):

Drug Target ◽

Low Cost ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector ◽

Multiple Drug ◽

Property A ◽

Multiple Networks ◽

Proposed Model ◽

The One

Background: Identification of drug-target interaction is essential in drug discovery. It is beneficial to predict unexpected therapeutic or adverse side effects of drugs. To date, several computational methods have been proposed to predict drug-target interactions because they are prompt and low-cost compared with traditional wet experiments. Methods: In this study, we investigated this problem in a different way. According to KEGG, drugs were classified into several groups based on their target proteins. A multi-label classification model was presented to assign drugs into correct target groups. To make full use of the known drug properties, five networks were constructed, each of which represented drug associations in one property. A powerful network embedding method, Mashup, was adopted to extract drug features from above-mentioned networks, based on which several machine learning algorithms, including RAndom k-labELsets (RAKEL) algorithm, Label Powerset (LP) algorithm and Support Vector Machine (SVM), were used to build the classification model. Results and Conclusion: Tenfold cross-validation yielded the accuracy of 0.839, exact match of 0.816 and hamming loss of 0.037, indicating good performance of the model. The contribution of each network was also analyzed. Furthermore, the network model with multiple networks was found to be superior to the one with a single network and classic model, indicating the superiority of the proposed model.
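
The two multi-label metrics reported above, exact match and Hamming loss, can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code: the function names are hypothetical and the toy label matrices are made up.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def exact_match(y_true, y_pred):
    """Fraction of samples whose whole label vector is predicted correctly."""
    hits = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy data (made up for illustration): 4 drugs, 3 target groups each.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
```

Exact match penalizes a sample for any single wrong label, whereas Hamming loss counts each label slot separately, so the two can diverge when a model makes a few isolated label errors.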

Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Military Medicine ◽

10.1093/milmed/usaa418 ◽

2021 ◽

Vol 186 (Supplement_1) ◽

pp. 445-451

Author(s):

Yifei Sun ◽

Navid Rashedi ◽

Vikrant Vaze ◽

Parikshit Shah ◽

Ryan Halter ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Real World ◽

Short Term Memory ◽

Model Performance ◽

Learning Technologies ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Continuous Map

Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods: Five classification methods, namely K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory, are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
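
The evaluation steps described above (root mean square error on continuous MAP predictions, conversion to binary AHE predictions, then precision and recall) can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's code: the 60 mmHg cutoff and the point-wise thresholding are assumptions (the abstract does not define an AHE), the function names are hypothetical, and the MAP values are made up.

```python
import math

AHE_MAP_THRESHOLD_MMHG = 60.0  # assumed cutoff; the abstract does not define AHE


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def to_ahe_flags(map_values, threshold=AHE_MAP_THRESHOLD_MMHG):
    """Point-wise conversion of MAP values into binary AHE flags (a simplification)."""
    return [value < threshold for value in map_values]


def precision_recall(actual, predicted):
    """Precision and recall for two equal-length lists of booleans."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up MAP values in mmHg, for illustration only.
observed_map = [72.0, 58.0, 55.0, 80.0, 59.0]
predicted_map = [70.0, 57.0, 62.0, 58.0, 56.0]

precision, recall = precision_recall(
    to_ahe_flags(observed_map), to_ahe_flags(predicted_map)
)
```

Keeping the continuous prediction and thresholding it afterwards preserves the timing and severity information that a direct binary classifier discards.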

Is It Possible to Forecast the Price of Bitcoin?

Forecasting ◽

10.3390/forecast3020024 ◽

2021 ◽

Vol 3 (2) ◽

pp. 377-420

Author(s):

Julien Chevallier ◽

Dominique Guégan ◽

Stéphane Goutte

Keyword(s):

Data Analytics ◽

Daily Variation ◽

A Priori ◽

Machine Learning Algorithms ◽

Support Vector ◽

Market Growth ◽

Network Support ◽

Market Participants ◽

Nearest Neighbours ◽

Stationary Behavior

This paper focuses on forecasting the price of Bitcoin, motivated by its market growth and the recent interest of market participants and academics. We deploy six machine learning algorithms (Artificial Neural Network, Support Vector Machine, Random Forest, k-Nearest Neighbours, AdaBoost, and Ridge regression), without deciding a priori which one is the ‘best’ model. The main contribution is to use these data analytics techniques with great caution in the parameterization, instead of classical parametric models (AR), to disentangle the non-stationary behavior of the data. Because Bitcoin is also used for diversification in portfolios, we need to investigate its interactions with stocks, bonds, foreign exchange, and commodities. We find that other cryptocurrencies convey enough information to explain the daily variation of Bitcoin’s spot and futures prices. Forecasting results point to the segmentation of Bitcoin from alternative assets. Finally, trading strategies are implemented.

Implementation of Machine Learning Algorithms in Spectral Analysis of Surface Waves (SASW) Inversion

Applied Sciences ◽

10.3390/app11062557 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2557

Author(s):

Sadia Mannan Mitu ◽

Norinah Abd. Rahman ◽

Khairul Anuar Mohd Nayan ◽

Mohd Asyraf Zulkifley ◽

Sri Atmaja P. Rosyidi

Keyword(s):

Spectral Analysis ◽

Surface Waves ◽

Soil Profile ◽

Field Tests ◽

Iteration Process ◽

Machine Learning Algorithms ◽

Support Vector ◽

Complex Processes ◽

Inversion Procedure ◽

Inversion Analysis

One of the complex processes in spectral analysis of surface waves (SASW) data analysis is the inversion procedure. An initial soil profile needs to be assumed at the beginning of the inversion analysis, which involves calculating the theoretical dispersion curve. If the assumed starting soil profile is not reasonably close, the iteration process might fail to converge or take too long to converge. Automating the inversion procedure will allow us to evaluate the soil stiffness properties conveniently and rapidly by means of the SASW method. Multilayer perceptron (MLP), random forest (RF), support vector regression (SVR), and linear regression (LR) algorithms were implemented in order to automate the inversion. For this purpose, the dispersion curves obtained from 50 field tests were used as input data for all of the algorithms. The results illustrated that SVR algorithms could potentially be used to estimate the shear wave velocity of soil.

Efficient detection of hacker community based on twitter data using complex networks and machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210458 ◽

2021 ◽

pp. 1-17

Author(s):

Ahmed Al-Tarawneh ◽

Ja’afer Al-Saraireh

Keyword(s):

Machine Learning ◽

Complex Networks ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

Efficient Detection ◽

Suggested Keywords

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, so they are limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection for the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users, together with their followers, who share posts with similar interests from a hackers’ community on Twitter. The list is built from a set of suggested keywords that hackers commonly use in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is fed into a machine learning process that applies different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a risk to institutions and individuals, providing early warning of possible attacks.
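
One of the network measures named above, closeness, can be sketched for an unweighted graph as follows. This is a minimal illustration of the standard definition, not the authors' implementation: the function name is hypothetical, the follower graph is made up, and edges are treated as undirected for simplicity.

```python
from collections import deque


def closeness_centrality(graph, node):
    """Closeness of `node`: number of reachable nodes divided by total hop distance to them."""
    dist = {node: 0}
    queue = deque([node])
    # Breadth-first search gives shortest hop counts in an unweighted graph.
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Toy graph (made up): a path a - b - c stored as adjacency lists.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
```

In this toy graph the middle user `b` scores highest, which matches the intuition that users with short paths to everyone else are the most influential candidates.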

New Results on Radioactive Mixture Identification and Relative Count Contribution Estimation

Sensors ◽

10.3390/s21124155 ◽

2021 ◽

Vol 21 (12) ◽

pp. 4155

Author(s):

Bulent Ayhan ◽

Chiman Kwan

Keyword(s):

Deep Learning ◽

Noise Source ◽

Simulated Data ◽

Nuclear Material ◽

Machine Learning Algorithms ◽

Detector Response ◽

Nuclear Materials ◽

Ratio Estimation ◽

Analysis Software ◽

Detector Distance

Detecting nuclear materials in mixtures is challenging due to low concentration, environmental factors, sensor noise, source-detector distance variations, and other factors. This paper presents new results on nuclear material identification and relative count contribution (also known as mixing ratio) estimation for mixtures of materials in which multiple isotopes are present. Conventional and deep-learning-based machine learning algorithms were compared. Realistic simulated data generated with the Gamma Detector Response and Analysis Software (GADRAS) were used in our comparative studies. It was observed that a deep learning approach is highly promising.
