Modelling Timbral Hardness

2019, Vol 9 (3), pp. 466
Author(s): Andy Pearce, Tim Brookes, Russell Mason

Hardness is the most commonly searched timbral attribute on freesound.org, a widely used online sound effect repository. A perceptual model of hardness was developed to enable the automatic generation of metadata to facilitate hardness-based filtering or sorting of search results. A training dataset of 202 stimuli covering 32 sound source types was collected, and perceived hardness was assessed by a panel of listeners. A multilinear regression model was developed on six features: maximum bandwidth, attack centroid, midband level, percussive-to-harmonic ratio, onset strength, and log attack time. This model predicted the hardness of the training data with R² = 0.76. It predicted hardness within a new dataset with R² = 0.57, and predicted the rank order of individual sources perfectly after accounting for the subjective variance of the ratings. Its performance exceeded that of human listeners.
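For readers unfamiliar with the modelling approach described above, the following is a minimal sketch of a six-feature multilinear regression in Python (scikit-learn). The feature matrix and ratings here are randomly generated stand-ins, not the authors' feature extractors or listening-test data.

```python
# Hypothetical sketch of a six-feature multilinear hardness model (not the authors' code).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

feature_names = ["max_bandwidth", "attack_centroid", "midband_level",
                 "percussive_to_harmonic_ratio", "onset_strength", "log_attack_time"]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(202, 6))                                   # stand-in for the 202-stimulus training set
y_train = X_train @ rng.normal(size=6) + rng.normal(scale=0.5, size=202)  # stand-in hardness ratings

model = LinearRegression().fit(X_train, y_train)
print(dict(zip(feature_names, model.coef_.round(3))))
print("training R^2:", round(r2_score(y_train, model.predict(X_train)), 3))
```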

2020, Vol 12 (3), pp. 1030
Author(s): Sabrina Hempel, Julian Adolphs, Niels Landwehr, David Janke, Thomas Amon

Environmental protection efforts can only be effective in the long term with a reliable quantification of pollutant gas emissions as a first step towards mitigation. Measurement and analysis strategies must permit the accurate extrapolation of emission values. We systematically analyzed the added value of applying modern machine learning methods in the process of monitoring emissions from naturally ventilated livestock buildings to the atmosphere. We considered almost 40 weeks of hourly emission values from a naturally ventilated dairy cattle barn in Northern Germany. We compared model predictions using 27 different scenarios of temporal sampling, multiple measures of model accuracy, and eight different regression approaches. The error of the predicted emission values with the tested measurement protocols was, on average, well below 20%. The prediction was most sensitive to the choice of training dataset when ordinary multilinear regression was used. Gradient boosting and random forests provided the most accurate and robust emission value predictions, accompanied by the second-smallest model errors. Most of the highly ranked scenarios involved six measurement periods, while the scenario with the best overall performance was one measurement period in summer and three in the transition periods, each lasting 14 days.
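As a rough illustration of the kind of regressor comparison reported above, the sketch below pits ordinary multilinear regression against random forests and gradient boosting under cross-validation. The predictors, targets, and error measure are synthetic placeholders, not the barn measurements or the paper's 27 sampling scenarios.

```python
# Illustrative comparison of regression approaches on synthetic data (not the barn dataset).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(6000, 8))                  # stand-in hourly predictors (weather, barn state, ...)
y = np.sin(X[:, 0]) + 0.3 * X[:, 1] ** 2 + rng.normal(scale=0.2, size=6000)  # stand-in emission values

models = {
    "multilinear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=1),
    "gradient boosting": GradientBoostingRegressor(random_state=1),
}
for name, model in models.items():
    rmse = -cross_val_score(model, X, y, cv=5, scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: cross-validated RMSE = {rmse:.3f}")
```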


2020, Vol 27
Author(s): Zaheer Ullah Khan, Dechang Pi

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid formation) is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. To complement existing wet-lab methods, several computational models have been developed for the prediction of sulfenylation cysteine (SC) sites. However, the performance of these models has not been satisfactory, owing to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature scheme; the resampling technique known as synthetic minority oversampling (SMOTE) is then employed to cope with the severe imbalance between SC sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was trained and validated with a rigorous 10-fold jackknife cross-validation technique. Results: Following the proposed framework, the strong discrete representation of the feature space, the machine learning engine, and the unbiased presentation of the underlying training data yielded an excellent model that outperforms all existing studies. The proposed approach improves MCC by 6% over the previous best method; on an independent dataset, the previous best study did not report sufficient details for comparison. Compared with the second-best method, the model obtained increases of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp, and 13.12% in MCC on the training data, and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset. These empirical analyses show the superior performance of the proposed model on both the training and independent datasets in comparison with existing studies. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC sites, called DeepSSPred. The empirical simulation outcomes on a training dataset and an independent validation dataset have revealed the efficacy of the proposed model. The good performance of DeepSSPred is due to several factors, such as the novel discriminative feature encoding schemes, the SMOTE technique, and the careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our work will provide potential insight into the further prediction of S-sulfenylation characteristics and functionalities, and we hope that the developed predictor will be significantly helpful for large-scale discrimination of unknown SC sites in particular and for designing new pharmaceutical drugs in general.
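The imbalance-handling step is the part of the pipeline that translates most directly into code. Below is a hedged sketch of SMOTE-based rebalancing with the imbalanced-learn library on synthetic features; a simple MLP stands in for the tuned 2D-CNN, and none of DeepSSPred's actual encoding schemes are reproduced.

```python
# Sketch of the class-rebalancing step on synthetic data (not the paper's features or classifier).
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 64))            # stand-in encoded peptide-window features
y = (rng.random(2000) < 0.1).astype(int)   # ~10% minority class, mimicking rare SC sites

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)
X_bal, y_bal = SMOTE(random_state=2).fit_resample(X_tr, y_tr)   # oversample minority class (training split only)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=2).fit(X_bal, y_bal)
print("class counts after SMOTE:", np.bincount(y_bal))
print("held-out accuracy:", round(clf.score(X_te, y_te), 3))
```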


2020, Vol 12 (9), pp. 1418
Author(s): Runmin Dong, Cong Li, Haohuan Fu, Jie Wang, Weijia Li, ...

Substantial progress has been made in the field of large-area land cover mapping as the spatial resolution of remotely sensed data increases. However, a significant amount of human labor is still required to label images for training and testing purposes, especially in high-resolution (e.g., 3-m) land cover mapping. In this research, we propose a solution that can produce 3-m resolution land cover maps on a national scale without human labeling effort. First, using public 10-m resolution land cover maps as an imperfect training dataset, we propose a deep-learning-based approach that can effectively transfer the existing knowledge. Then, we improve the efficiency of our method through a network pruning process for national-scale land cover mapping. Our proposed method takes the state-of-the-art 10-m resolution land cover maps (with an accuracy of 81.24% for China) as training data, enables a transfer learning process that produces 3-m resolution land cover maps, and further improves the overall accuracy (OA) to 86.34% for China. We present detailed results obtained over three megacities in China to demonstrate the effectiveness of our proposed approach for 3-m resolution large-area land cover mapping.
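The pruning step used to make national-scale inference efficient can be pictured with PyTorch's built-in pruning utilities. The snippet below is a generic magnitude-pruning example on a stand-in convolution layer, not the authors' network or their specific pruning procedure.

```python
# Generic magnitude-based pruning example (illustrative only; not the paper's network).
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)       # stand-in layer from a segmentation backbone
prune.l1_unstructured(conv, name="weight", amount=0.5)   # zero out the 50% smallest-magnitude weights
prune.remove(conv, "weight")                             # make the pruning permanent
sparsity = (conv.weight == 0).float().mean().item()
print(f"weight sparsity after pruning: {sparsity:.2f}")
```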


2015, Vol 32 (7), pp. 1341-1355
Author(s): S. J. Rennie, M. Curtis, J. Peter, A. W. Seed, P. J. Steinle, ...

Abstract. The Australian Bureau of Meteorology’s operational weather radar network comprises a heterogeneous radar collection covering diverse geography and climate. A naïve Bayes classifier has been developed to identify a range of common echo types observed with these radars. The success of the classifier has been evaluated against its training dataset and by routine monitoring. The training data indicate that more than 90% of precipitation may be identified correctly. The echo types most difficult to distinguish from rainfall are smoke, chaff, and anomalous propagation ground and sea clutter. Their impact depends on their climatological frequency. Small quantities of frequently misclassified persistent echo (like permanent ground clutter or insects) can also cause quality control issues. The Bayes classifier is demonstrated to perform better than a simple threshold method, particularly for reducing misclassification of clutter as precipitation. However, the result depends on finding a balance between excluding precipitation and including erroneous echo. Unlike many single-polarization classifiers that are only intended to extract precipitation echo, the Bayes classifier also discriminates types of nonprecipitation echo. Therefore, the classifier provides the means to utilize clear air echo for applications like data assimilation, and the class information will permit separate data handling of different echo types.
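To make the classification idea concrete, here is a toy naïve Bayes echo-type classifier with Gaussian likelihoods on made-up features; the Bureau's actual feature set, class list, and class-conditional distributions are not reproduced.

```python
# Toy naive Bayes echo-type classifier on synthetic features (not the Bureau's training data).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(3)
classes = ["precipitation", "ground_clutter", "smoke", "insects"]   # illustrative echo types
# Stand-in per-gate features (e.g., reflectivity, Doppler velocity, spatial texture).
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(500, 3)) for i in range(len(classes))])
y = np.repeat(classes, 500)

clf = GaussianNB().fit(X, y)
print(clf.predict([[0.2, 0.1, -0.3]]))   # class with the highest posterior probability
```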


2006, Vol 36 (5), pp. 1129-1138
Author(s): Jennifer L. Rooker Jensen, Karen S. Humes, Tamara Conner, Christopher J. Williams, John DeGroot

Although lidar data are widely available from commercial contractors, operational use in North America is still limited by both cost and the uncertainty of large-scale application and associated model accuracy issues. We analyzed whether small-footprint lidar data obtained from five noncontiguous geographic areas with varying species and structural composition, silvicultural practices, and topography could be used in a single regression model to produce accurate estimates of commonly obtained forest inventory attributes on the Nez Perce Reservation in northern Idaho, USA. Lidar-derived height metrics were used as predictor variables in a best-subset multiple linear regression procedure to determine whether a suite of stand inventory variables could be accurately estimated. Empirical relationships between lidar-derived height metrics and field-measured dependent variables were developed with training data, and acceptable models were validated with an independent subset. Models were then fit with all data, resulting in coefficients of determination and root mean square errors (respectively) for seven biophysical characteristics, including maximum canopy height (0.91, 3.03 m), mean canopy height (0.79, 2.64 m), quadratic mean DBH (0.61, 6.31 cm), total basal area (0.91, 2.99 m²/ha), ellipsoidal crown closure (0.80, 0.08%), total wood volume (0.93, 24.65 m³/ha), and large saw-wood volume (0.75, 28.76 m³/ha). Although these regression models cannot be generalized to other sites without additional testing, the results obtained in this study suggest that for these types of mixed-conifer forests, some biophysical characteristics can be adequately estimated using a single regression model over stands with highly variable structural characteristics and topography.
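A best-subset linear regression of the kind used here can be sketched as an exhaustive search over predictor combinations scored by cross-validation. The lidar metric names and plot-level data below are invented placeholders, not the Nez Perce inventory data or the authors' exact selection criterion.

```python
# Illustrative best-subset search over hypothetical lidar height metrics (synthetic data).
from itertools import combinations

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
metrics = ["h_max", "h_mean", "h_p25", "h_p75", "canopy_density"]    # hypothetical predictors
X = rng.normal(size=(120, len(metrics)))                             # stand-in plot-level metrics
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=1.0, size=120)  # stand-in field-measured attribute

def cv_r2(cols):
    """Mean cross-validated R^2 for a linear model on the given column subset."""
    return cross_val_score(LinearRegression(), X[:, list(cols)], y, cv=5, scoring="r2").mean()

subsets = [c for r in range(1, len(metrics) + 1) for c in combinations(range(len(metrics)), r)]
best = max(subsets, key=cv_r2)
print("selected metrics:", [metrics[i] for i in best], "cross-validated R^2:", round(cv_r2(best), 3))
```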


2020, Vol 10 (6), pp. 2104
Author(s): Michał Tomaszewski, Paweł Michalski, Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images when only a limited quantity of input data is available. Using a limited set of learning data was made possible by developing a detailed scenario of the task, which strictly defined the operating conditions of the convolutional-neural-network detector in the considered case. The described solution utilizes known deep neural network architectures for learning and object detection. The article compares detection results from the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines, and the object detector was built for power insulators. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not trivial. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, the number of input samples had a significantly smaller influence on its results than for the other CNN models, which, in the authors’ assessment, is a desirable feature when the training set is limited.
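As an example of how a pre-trained detector can be adapted to a single object class with few labeled frames, here is a hedged torchvision sketch. The class count, image size, and dummy annotation are placeholders, and this is not the authors' training configuration.

```python
# Sketch of fine-tuning a pre-trained Faster R-CNN for one foreground class (hypothetical setup).
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 2                                   # background + "insulator"
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)   # replace the detection head

optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                            lr=0.005, momentum=0.9, weight_decay=5e-4)
model.train()
# One illustrative training step on a dummy frame with one annotated box; a real run
# would loop over the small set of labeled inspection frames.
images = [torch.rand(3, 480, 640)]
targets = [{"boxes": torch.tensor([[100.0, 120.0, 220.0, 260.0]]), "labels": torch.tensor([1])}]
loss_dict = model(images, targets)
sum(loss_dict.values()).backward()
optimizer.step()
print({k: round(v.item(), 3) for k, v in loss_dict.items()})
```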


Author(s): M. Kölle, V. Walter, S. Schmohl, U. Soergel

Abstract. Automated semantic interpretation of 3D point clouds is crucial for many tasks in the domain of geospatial data analysis. For this purpose, labeled training data is required, which often has to be provided manually by experts. One approach to minimizing the cost of human interaction is Active Learning (AL). The aim is to process only the subset of an unlabeled dataset that is particularly helpful with respect to class separation: a machine identifies informative instances, which are then labeled by humans, thereby increasing the performance of the machine. In order to completely avoid involvement of an expert, this time-consuming annotation can be resolved via crowdsourcing. Therefore, we propose an approach combining AL with paid crowdsourcing. Although incorporating human interaction, our method can run fully automatically, so that only an unlabeled dataset and a fixed financial budget for the payment of the crowdworkers need to be provided. We conduct multiple iteration steps of the AL process on the ISPRS Vaihingen 3D Semantic Labeling benchmark dataset (V3D) and especially evaluate the performance of the crowd when labeling 3D points. We prove our concept by using labels derived from our crowd-based AL method for classifying the test dataset. The analysis shows that, with the crowd labeling only 0.4% of the training dataset and spending less than $145, both our trained Random Forest and our sparse 3D CNN classifier differ in Overall Accuracy by less than 3 percentage points compared to the same classifiers trained on the complete V3D training set.
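The core AL loop (train, pick the most uncertain points, query a human oracle, retrain) can be sketched as follows. Here the crowd is simulated by ground-truth labels, a Random Forest stands in for the paper's classifiers, and the features, batch size, and budget are invented.

```python
# Minimal pool-based active learning loop with uncertainty sampling (synthetic data;
# the crowd oracle is simulated by the hidden ground-truth labels).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X_pool = rng.normal(size=(5000, 10))                     # stand-in per-point features
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)   # hidden labels the "crowd" would supply

labeled = list(rng.choice(len(X_pool), size=20, replace=False))   # small seed set
for _ in range(10):                                               # AL iterations, limited by budget
    clf = RandomForestClassifier(n_estimators=100, random_state=5)
    clf.fit(X_pool[labeled], y_pool[labeled])
    proba = clf.predict_proba(X_pool)
    margin = np.abs(proba[:, 0] - proba[:, 1])                    # small margin = informative point
    queries = [i for i in np.argsort(margin) if i not in labeled][:50]
    labeled.extend(queries)                                       # these points would go to the crowd
print("points labeled:", len(labeled), "pool accuracy:", round(clf.score(X_pool, y_pool), 3))
```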


Author(s): C. Koetsier, T. Peters, M. Sester

Abstract. Estimating vehicle poses is crucial for generating precise movement trajectories from (surveillance) camera data. Additionally, for real-time applications this task has to be solved efficiently. In this paper we introduce a deep convolutional neural network for pose estimation of vehicles from image patches. For a given 2D image patch, our approach estimates the 2D image coordinates of the exact center ground point (cx, cy) and the orientation of the vehicle, represented by the elevation angle (e) of the camera with respect to the vehicle’s center ground point and the azimuth rotation (a) of the vehicle with respect to the camera. To train an accurate model, a large and diverse training dataset is needed. Collecting and labeling such a large amount of data is very time-consuming and expensive. Given the lack of a sufficient amount of real training data, we furthermore show that rendered 3D vehicle models with artificially generated textures are nearly adequate for training.
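The described input/output structure maps onto a straightforward regression network. The toy PyTorch model below is a hypothetical stand-in for the paper's (much larger) architecture, with placeholder patch size and targets.

```python
# Hypothetical pose-regression sketch: a tiny CNN mapping an image patch to (cx, cy, e, a).
import torch
import torch.nn as nn

class VehiclePoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 4)   # cx, cy, e, a (angles are often encoded as sin/cos in practice)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

patches = torch.rand(8, 3, 64, 64)     # batch of real or rendered image patches
targets = torch.zeros(8, 4)            # placeholder pose targets
loss = nn.functional.mse_loss(VehiclePoseNet()(patches), targets)
print(loss.item())
```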


Author(s): Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, Ting Liu

Spoken language is fundamentally different from written language in that it contains frequent disfluencies, or parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection rely heavily on human-annotated data, which is scarce and expensive to obtain in practice. To tackle the training data bottleneck, in this work we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled data and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly pre-train a neural network, which is subsequently fine-tuned using human-annotated disfluency detection training data. The self-supervised learning method can capture task-specific knowledge for disfluency detection and achieves better performance than other supervised methods when fine-tuned on a small annotated dataset. However, because the pseudo training data are generated with simple heuristics and cannot fully cover all disfluency patterns, there is still a performance gap compared to supervised models trained on the full training dataset. We further explore how to bridge this gap by integrating active learning during the fine-tuning process. Active learning strives to reduce annotation costs by choosing the most critical examples to label, and can address the weakness of self-supervised learning with a small annotated dataset. We show that by combining self-supervised learning with active learning, our model is able to match state-of-the-art performance with only about 10% of the original training data on both the commonly used English Switchboard test set and a set of in-house annotated Chinese data.
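The pseudo-data construction described above is simple enough to sketch directly. The snippet below builds a tagging example by randomly inserting and deleting words; the filler vocabulary, probabilities, and example sentence are invented, and the sentence-classification task and the pre-training itself are not shown.

```python
# Sketch of pseudo training data for the tagging task: randomly insert words (tag 1) and delete words.
import random

random.seed(6)
filler_vocab = ["well", "so", "you", "know", "like", "right"]   # hypothetical noise words

def make_pseudo_example(tokens, p_add=0.15, p_del=0.1):
    """Return a noised token sequence and per-token tags (1 = artificially added word)."""
    out_tokens, tags = [], []
    for tok in tokens:
        if random.random() < p_add:                  # insert a noisy word before this token
            out_tokens.append(random.choice(filler_vocab))
            tags.append(1)
        if random.random() < p_del:                  # simulate a random deletion
            continue
        out_tokens.append(tok)
        tags.append(0)                               # 0 = original word
    return out_tokens, tags

print(make_pseudo_example("i want a flight to boston".split()))
```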

