Audiogmenter: a MATLAB toolbox for audio data augmentation

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Gianluca Maguolo ◽  
Michelangelo Paci ◽  
Loris Nanni ◽  
Ludovico Bonan

Purpose Create and share a MATLAB library that performs data augmentation algorithms for audio data. This study aims to help machine learning researchers improve their models using the algorithms proposed by the authors. Design/methodology/approach The authors structured the library into methods to augment raw audio data and methods to augment spectrograms. In the paper, the authors describe the structure of the library and give a brief explanation of how every function works. The authors then perform experiments to show that the library is effective. Findings The authors show that the library is efficient using a competitive dataset. The authors try multiple data augmentation approaches that they propose and show that these improve performance. Originality/value A MATLAB library specifically designed for data augmentation was not available before. The authors are the first to provide an efficient and parallel implementation of a large number of algorithms.
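As a rough illustration of what raw-audio augmentation involves, the Python sketch below applies two common transformations to a list of samples: additive Gaussian noise at a chosen signal-to-noise ratio, and a circular time shift. It is not Audiogmenter's API. The toolbox is written in MATLAB and the abstract does not list its functions, so the names `add_noise` and `time_shift`, and the choice of these two transformations, are assumptions made for illustration.

```python
import math
import random

def add_noise(samples, snr_db, rng=None):
    """Return a copy of `samples` with Gaussian noise added at roughly `snr_db` dB SNR."""
    rng = rng or random.Random()
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]

def time_shift(samples, shift):
    """Circularly shift `samples` right by `shift` positions (left if negative)."""
    if not samples:
        return []
    k = shift % len(samples)
    return samples[-k:] + samples[:-k] if k else list(samples)

# Toy signal: one second of a 5 Hz sine sampled at 100 Hz (hypothetical data).
signal = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noisy = add_noise(signal, snr_db=20, rng=random.Random(0))
shifted = time_shift(signal, 10)
```

Each call returns a new list, so one original recording can yield several augmented training examples.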

2015 ◽  
Vol 22 (5) ◽  
pp. 573-590 ◽  
Author(s):  
Mojtaba Maghrebi ◽  
Claude Sammut ◽  
S. Travis Waller

Purpose – The purpose of this paper is to study the implementation of machine learning (ML) techniques in order to automatically measure the feasibility of performing ready mixed concrete (RMC) dispatching jobs. Design/methodology/approach – Six ML techniques were selected and tested on data that were extracted from a developed simulation model and answered by a human expert. Findings – The results show that most of the selected algorithms performed similarly, achieving an accuracy of around 80 per cent for the examined cases. Practical implications – This approach can be applied in practice to match experts’ decisions. Originality/value – In this paper the feasibility of handling complex concrete delivery problems by ML techniques is studied. Currently, most of the concrete mixing process is done by machines. However, RMC dispatching still relies on human resources to complete many tasks. In this paper the authors address the reconstruction of experts’ decisions as the only practical solution.


2020 ◽  
Vol 34 (1) ◽  
pp. 30-47 ◽  
Author(s):  
Mohamed Zaki ◽  
Janet R. McColl-Kennedy

Purpose The purpose of this paper is to offer a step-by-step text mining analysis roadmap (TMAR) for service researchers. The paper provides guidance on how to choose between alternative tools, using illustrative examples from a range of business contexts. Design/methodology/approach The authors provide a six-stage TMAR on how to use text mining methods in practice. At each stage, the authors provide a guiding question, articulate the aim, identify a range of methods and demonstrate how machine learning and linguistic techniques can be used in practice with illustrative examples drawn from business, from an array of data types, services and contexts. Findings At each of the six stages, this paper demonstrates useful insights that result from the text mining techniques to provide an in-depth understanding of the phenomenon and actionable insights for research and practice. Originality/value There is little research to guide scholars and practitioners on how to gain insights from the extensive “big data” that arises from the different data sources. In a first, this paper addresses this important gap highlighting the advantages of using text mining to gain useful insights for theory testing and practice in different service contexts.


2021 ◽  
Vol 55 (4) ◽  
pp. 586-608
Author(s):  
Gabriela Montenegro Montenegro de Barros ◽  
Valdecy Pereira ◽  
Marcos Costa Roboredo

Purpose This paper presents an algorithm that can elicit (infer) all or any combination of elimination and choice expressing reality (ELECTRE) Tri-B parameters. For example, a decision maker can maintain the values for indifference, preference and veto thresholds, and the study’s algorithm can find the criteria weights, reference profiles and the lambda cutting level. The study’s approach is inspired by a machine learning ensemble technique, the random forest, and for that reason the authors named it the ELECTRE tree algorithm. Design/methodology/approach First, the authors generate a set of ELECTRE Tri-B models, where each model solves a random sample of criteria and alternatives. Each sample is made with replacement, having at least two criteria and between 10% and 25% of the alternatives. Each model has its parameters optimized by a genetic algorithm (GA) that can use an ordered cluster or an assignment example as a reference for the optimization. Finally, after the optimization phase, two procedures can be performed: the first merges all models, thereby finding the elicited parameters; in the second, each alternative is classified (voted on) by each separate model, and the majority vote decides the final class. Findings The authors have noted that the voting procedure generates nonlinear decision boundaries, which can be suitable for analyzing problems of the same nature. In contrast, the merged model generates linear decision boundaries. Originality/value The elicitation of ELECTRE Tri-B parameters is made by an ensemble technique that is composed of a set of multicriteria models that are engaged in generating robust solutions.
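The sampling and voting steps described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the genetic-algorithm optimization and the ELECTRE Tri-B classification itself are omitted, the function names `draw_sample` and `majority_vote` are hypothetical, and criteria are drawn here as a distinct subset (so that at least two different criteria are always present) while alternatives are drawn with replacement.

```python
import random
from collections import Counter

def draw_sample(criteria, alternatives, rng):
    """Pick a random subset of criteria and a bootstrap sample of alternatives."""
    n_criteria = rng.randint(2, len(criteria))           # at least two criteria
    chosen_criteria = rng.sample(criteria, n_criteria)   # distinct criteria (assumption)
    low = max(1, round(0.10 * len(alternatives)))
    high = max(low, round(0.25 * len(alternatives)))
    n_alternatives = rng.randint(low, high)              # 10% to 25% of the alternatives
    chosen_alternatives = rng.choices(alternatives, k=n_alternatives)  # with replacement
    return chosen_criteria, chosen_alternatives

def majority_vote(votes):
    """Return the most frequent class label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical data: four criteria and forty alternatives.
rng = random.Random(42)
criteria = ["g1", "g2", "g3", "g4"]
alternatives = [f"a{i}" for i in range(40)]
sample_criteria, sample_alternatives = draw_sample(criteria, alternatives, rng)

# Class labels that five separate models might assign to one alternative.
final_class = majority_vote(["B", "A", "B", "C", "B"])
```

In the voting procedure, `majority_vote` would be called once per alternative with the labels produced by every optimized model.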


2018 ◽  
Vol 26 (5) ◽  
pp. 613-636 ◽  
Author(s):  
Gunikhan Sonowal ◽  
KS Kuppusamy

Purpose This paper aims to propose a model entitled MMSPhiD (multidimensional similarity metrics model for screen reader user to phishing detection) that amalgamates multiple approaches to detect phishing URLs. Design/methodology/approach The model consists of three major components: a machine learning-based approach, a typosquatting-based approach and a phoneme-based approach. The major objectives of the proposed model are detecting phishing URLs, typosquatting domains and phoneme-based domains, and suggesting the legitimate domain that is targeted by attackers. Findings The result of the experiment shows that the MMSPhiD model can successfully detect phishing with 99.03 per cent accuracy. In addition, this paper has analyzed 20 leading domains from Alexa and identified 1,861 registered typosquatting and 543 phoneme-based domains. Research limitations/implications The proposed model uses machine learning with a list-based approach. Building and maintaining the list is a limitation. Practical implications The results of the experiments demonstrate that the model achieved higher performance due to the incorporation of multi-dimensional filters. Social implications In addition, this paper has incorporated the accessibility needs of persons with visual impairments and provides an accessible anti-phishing approach. Originality/value This paper assists persons with visual impairments in detecting phoneme-based phishing domains.
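The abstract does not say how typosquatting candidates are matched to the legitimate domain they imitate. One common technique, assumed here purely for illustration, is to compare domains by Levenshtein edit distance and suggest the closest entry in a list of legitimate domains. The function names, the threshold of 2 and the example domains below are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def suggest_legitimate(domain, legitimate_domains, max_distance=2):
    """Return the closest legitimate domain if `domain` is a near-miss of it, else None."""
    best = min(legitimate_domains, key=lambda d: edit_distance(domain, d))
    distance = edit_distance(domain, best)
    # Distance 0 means the domain is itself legitimate, so nothing is suggested.
    return best if 0 < distance <= max_distance else None

# Hypothetical list of legitimate domains and a look-alike candidate.
legitimate = ["example.com", "example.org"]
suggestion = suggest_legitimate("examp1e.com", legitimate)
```

A list-based check like this inherits the limitation the abstract mentions: the list of legitimate domains has to be built and kept up to date.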


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ema Utami ◽  
Irwan Oyong ◽  
Suwanto Raharjo ◽  
Anggit Dwi Hartanto ◽  
Sumarni Adi

Purpose Gathering knowledge regarding personality traits has long been of interest to academics and researchers in psychology and computer science. Analyzing profile data from personal social media accounts reduces data collection time, as this method does not require users to fill in any questionnaires. A pure natural language processing (NLP) approach can give decent results, and its reliability can be improved by combining it with machine learning (as shown by previous studies). Design/methodology/approach In this study, cleaning the dataset and extracting relevant potential features (as assessed by psychological experts) are essential, as Indonesians tend to mix formal words, non-formal words, slang and abbreviations when writing social media posts. Raw data were derived from a predefined dominance, influence, stability and conscientiousness (DISC) quiz website, returning 316,967 tweets from 1,244 Twitter accounts (filtered to include only personal and Indonesian-language accounts). Using a combination of NLP techniques and machine learning, the authors aim to develop a better approach and a more robust model, especially for the Indonesian language. Findings The authors find that employing a SMOTETomek re-sampling technique and hyperparameter tuning boosts the model’s performance on formalized datasets by 57% (as measured through the F1-score). Originality/value The process of cleaning the dataset and extracting relevant potential features assessed by psychological experts is essential because Indonesian people tend to mix formal words, non-formal words, slang words and abbreviations when writing tweets. Organic data derived from a predefined DISC quiz website resulted in 1,244 Twitter account records and 316,967 tweets.
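For readers unfamiliar with the metric behind the reported improvement, the F1-score is the harmonic mean of precision and recall. The Python sketch below computes it for one class at a time. The abstract does not state how scores were averaged across classes, and the labels used here are made-up examples, not data from the study.

```python
def f1_score(y_true, y_pred, positive):
    """F1-score for the class `positive`, treating every other label as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions, so precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up DISC labels for five accounts (hypothetical, not the study's data).
y_true = ["D", "I", "D", "S", "D"]
y_pred = ["D", "D", "D", "S", "I"]
f1_d = f1_score(y_true, y_pred, positive="D")
```

Because F1 penalizes both missed and spurious predictions for a class, it is a common choice when classes are imbalanced, which is also the situation re-sampling techniques such as SMOTETomek are meant to address.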


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Paolo Dello Vicario ◽  
Valentina Tortolini

Purpose The purpose of this paper is to define a methodology to analyze links between programming topics and libraries starting from GitHub data. Design/methodology/approach This paper analyzes machine learning repositories on GitHub, finding communities of repositories and studying the anatomy of collaboration around a popular topic such as machine learning. Findings This analysis indicates the significant importance of programming languages and technologies such as Python and Jupyter Notebook. It also shows the rise of deep learning and of specific libraries such as TensorFlow from Google. Originality/value No existing survey or analysis examines how developers influence each other on specific topics. Other researchers have focused their analysis on the collaborative structure and social impact instead of topic impact. Using this methodology to analyze programming topics is important not just for machine learning but also for other topics.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
A. Prakash ◽  
A. Shyam Joseph ◽  
R. Shanmugasundaram ◽  
C.S. Ravichandran

Purpose This paper aims to propose a machine learning-based approach to power theft detection using Garra Rufa Fish (GRF) optimization. Analyzing power theft is important for reducing financial loss and protecting the electricity supply from fraudulent users. Design/methodology/approach A new method is implemented to reduce power theft in transmission lines and utility grids. Reliable detection of power theft using smart meters can be achieved with the help of the GRF algorithm. Findings Power loss due to non-technical losses is small when the proposed algorithm is used. It provides benefits such as increased predictive capacity, lower complexity, high speed and highly reliable output. The result is analyzed using the MATLAB/Simulink platform and compared with an existing method. According to the comparison, the proposed method performs better than the existing method. Originality/value The proposed method compares favorably with other techniques and is able to overcome the associated problems.


2017 ◽  
Vol 45 (6) ◽  
pp. 50-54 ◽  
Author(s):  
Prashant Shukla ◽  
H. James Wilson ◽  
Allan Alter ◽  
David Lavieri

Purpose The authors explore the potential of machine learning, in which computers employ an algorithm to sort data, make decisions and then continuously assess and improve their functionality. They suggest that it be used to power a radical redesign of company processes that they call machine reengineering. Design/methodology/approach The authors interpret a survey of more than a thousand corporate and public agency IT professionals on their use of artificial intelligence and machine learning. Findings Companies that embrace machine learning find that it adds value to the work product of their employees and provides companies with new capabilities. Practical implications Working together with an intelligent machine, workers become custodians of powerfully smart tools, tools that personalize work to maximize their most productive ways of working. Originality/value A guide to establishing a culture that empowers employees to thrive alongside intelligent machines.


2021 ◽  
Vol 17 (1) ◽  
pp. 45-53
Author(s):  
Le Hong Trang ◽  
Tran Duong Huy ◽  
Anh Ngoc Le

Purpose Pricing on online booking systems is a difficult task for hosts. The systems usually set prices lower than the premises and their quality generally warrant, which benefits only the system by easily attracting customers to use the service. The price of a new accommodation is often set based on location, the number of beds, type of house and so on. The main problem is to predict the most reasonable price for the host. This paper aims to study the use of machine learning and sentiment analysis for predicting prices on online booking systems. Design/methodology/approach An empirical study is first performed on some well-known classification models for the problem. The authors then propose applying k-means, a clustering technique, together with Gradient Boost and XGBoost models to improve the prediction performance. Experiments are conducted on real Airbnb data sets collected in London. Findings Experimental results are given and compared to show that the authors’ method outperforms a recent method. Originality/value The authors use k-means and sampling together with Gradient Boost and XGBoost models to improve the prediction performance.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Pankaj Kumar ◽  
Bhavna Bajpai ◽  
Deepak Omprakash Gupta ◽  
Dinesh C. Jain ◽  
S. Vimal

Purpose The purpose of this paper is to detect COVID-19 from patient images with the help of the DarkCovidNet architecture. Design/methodology/approach The authors used machine learning techniques with a convolutional neural network. Findings The approach detects COVID-19 symptoms from patient CT scan images. Originality/value This paper contains a new architecture for detecting COVID-19 symptoms from patient computed tomography scan images.

