Optimizing Prediction of YouTube Video Popularity Using XGBoost

Electronics ◽  
2021 ◽  
Vol 10 (23) ◽  
pp. 2962
Author(s):  
Meher UN Nisa ◽  
Danish Mahmood ◽  
Ghufran Ahmed ◽  
Suleman Khan ◽  
Mazin Abed Mohammed ◽  
...  

YouTube is a source of income for many people, so keeping a video’s popularity high becomes the top priority for sustaining a steady income. Analysts and researchers use different algorithms and models to predict the maximum viewership of popular videos. This study predicts the popularity of such videos using the XGBoost model, considering feature selection, fusion, min-max normalization and some precision parameters such as gamma, eta, learning_rate etc. XGBoost gives 86% accuracy and 64% precision. Moreover, the tuned XGBoost also shows enhanced accuracy and precision. We have also analyzed the classification of unpopular videos for a comparison with our results. Finally, cross-validation methods are also used to evaluate certain combinations of parameter values to validate our claims. Based on the obtained results, the proposed models and techniques are useful for predicting the popularity of YouTube videos accurately and precisely.
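
The abstract above mentions min-max normalization and reports accuracy and precision as separate figures. The following is a minimal sketch of how those quantities are commonly computed, not the authors' pipeline; the function names and the sample view counts are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def accuracy_and_precision(y_true, y_pred, positive=1):
    """Return (accuracy, precision) for binary labels.

    accuracy  = correct predictions / all predictions
    precision = true positives / all predicted positives
    """
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Hypothetical view counts, used only to illustrate the rescaling.
views = [120, 4500, 980, 30000]
scaled_views = min_max_normalize(views)
```

Accuracy can be much higher than precision when the "popular" class is rare, because correctly labelled unpopular videos raise accuracy but do not affect precision.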

Author(s):  
Desi Dwi Natalia ◽  
Fajar Subekti ◽  
Ni Ketut Mirahayuni

This article reports on two separate studies, Natalia (2019) and Subekti (2019), on communication mechanisms in political debates. Specifically, these studies focus on turn-taking strategies adopted in political debates by political figures during their campaign for presidency or in dealing with specific issues. Both studies adopted Stenstrom’s (1994) classification of turn-taking strategies, which includes three main strategies: taking the turn, holding the turn, and yielding the turn, each of which was further divided into more specific strategies. The data were two YouTube videos: first, Trump and Clinton First Presidential Debate 2016 (36 minutes 22 seconds [Natalia, 2019]) and second, BBC World Debate “Why Poverty”, November 30, 2012 (47 minutes 16 seconds [Subekti, 2019]). Employing a descriptive qualitative method to analyze the turn-taking strategies adopted in the debates, both studies found three points of interest: first, Stenstrom’s three strategies appeared in the debates; second, taking the turn was the dominant strategy, followed by holding the turn, and the least used was yielding the turn; and third, interruption, a specific type of taking the turn, seems to be used most often in the debaters’ attempts to maintain the turn, present their points, and thus dominate the debate.


2021 ◽  
Vol 14 (5) ◽  
pp. 440
Author(s):  
Eirini Siozou ◽  
Vasilios Sakkas ◽  
Nikolaos Kourkoumelis

A new methodology, based on Fourier transform infrared spectroscopy equipped with an attenuated total reflectance accessory (ATR FT-IR), was developed for the determination of diclofenac sodium (DS) in dispersed commercially available tablets using chemometric tools such as partial least squares (PLS) coupled with discriminant analysis (PLS-DA). The results of PLS-DA depicted a perfect classification of the tablets into three different groups based on their DS concentrations, while the developed model with PLS had a sufficiently low root mean square error (RMSE) for the prediction of the samples’ concentration (~5%) and therefore can be practically used for any tablet with an unknown concentration of DS. Comparison with ultraviolet/visible (UV/Vis) spectrophotometry as the reference method revealed no significant difference between the two methods. The proposed methodology exhibited satisfactory results in terms of both accuracy and precision while being rapid, simple and of low cost.
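
The abstract above judges the PLS model by its root mean square error (RMSE). As a minimal sketch of that metric only (it does not reproduce the PLS model or the study's data), the function below computes RMSE from reference and predicted values; the name `rmse` is an assumption for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between reference and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large prediction errors raise RMSE more than many small ones.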


Processes ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 196
Author(s):  
Araz Soltani Nazarloo ◽  
Vali Rasooli Sharabiani ◽  
Yousef Abbaspour Gilandeh ◽  
Ebrahim Taghinezhad ◽  
Mariusz Szymanek ◽  
...  

The purpose of this work was to investigate the detection of pesticide residue (profenofos) in tomatoes by using visible/near-infrared spectroscopy. The experiments were performed on 180 tomato samples with different percentages of profenofos pesticide (higher and lower values than the maximum residue limit (MRL)) as compared to the control (no pesticide). VIS/near infrared (NIR) spectral data from pesticide solution, from non-pesticide tomato samples (used as control treatment), and from samples impregnated with different concentrations of pesticide were recorded by a spectrometer in the range of 400 to 1050 nm. For classification of tomatoes with pesticide content below and above the MRL as healthy and unhealthy samples, we used different spectral pre-processing methods with partial least squares discriminant analysis (PLS-DA) models. The Smoothing Moving Average pre-processing method, with a standard error of cross validation (SECV) of 4.2767, was selected as the best model for this study. In addition, in the calibration and prediction sets, the percentages of total correctly classified samples were 90 and 91.66%, respectively. Therefore, it can be concluded that reflective spectroscopy (VIS/NIR) can be used as a non-destructive, low-cost, and rapid technique to control the health of tomatoes impregnated with profenofos pesticide.
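
The abstract above selects moving-average smoothing as the spectral pre-processing step. The following is a minimal sketch of a centred moving average, assuming an odd window width and a shrinking window at the spectrum edges; the abstract does not state the window size or edge handling, so both are assumptions.

```python
def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centred moving average.

    Assumption: near the edges the window shrinks to the samples that
    exist, so the output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        start = max(0, i - half)
        stop = min(len(signal), i + half + 1)
        segment = signal[start:stop]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```

A wider window suppresses more noise but also flattens narrow absorption features, so the width is usually chosen by cross-validation.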


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 3044-3044
Author(s):  
David Haan ◽  
Anna Bergamaschi ◽  
Yuhong Ning ◽  
William Gibb ◽  
Michael Kesling ◽  
...  

3044 Background: Epigenomics assays have recently become popular tools for identification of molecular biomarkers, both in tissue and in plasma. In particular, the 5-hydroxymethyl-cytosine (5hmC) method has been shown to enable the epigenomic regulation of gene expression and subsequent gene activity, with different patterns, across several tumor and normal tissue types. In this study we show that 5hmC profiles enable discrete classification of tumor and normal tissue for breast, colorectal, lung, ovary and pancreas. Such classification was also recapitulated in cfDNA from patients with breast, colorectal, lung, ovarian and pancreatic cancers. Methods: DNA was isolated from 176 fresh frozen tissues from breast, colorectal, lung, ovary and pancreas (44 per tumor per tissue type and up to 11 tumor tissues for each stage (I-IV)) and up to 10 normal tissues per tissue type. cfDNA was isolated from plasma from 783 non-cancer individuals and 569 cancer patients. Plasma-isolated cfDNA and tumor genomic DNA were enriched for the 5hmC fraction using chemical labelling, sequenced, and aligned to a reference genome to construct feature sets of 5hmC patterns. Results: 5hmC multinomial logistic regression analysis was employed across tumor and normal tissues and identified a set of specific and discrete tumor and normal tissue gene-based features. This indicates that we can classify samples regardless of source, with a high degree of accuracy, based on tissue of origin and also distinguish between normal and tumor status. Next, we applied a stacked ensemble machine learning algorithm combining multiple logistic regression models across diverse feature sets to the cfDNA dataset composed of 783 non-cancers and 569 cancers comprising 67 breast, 118 colorectal, 210 lung, 71 ovarian and 100 pancreatic cancers. 
We identified a genomic signature that enables the classification of non-cancer versus cancers with an outer fold cross validation sensitivity of 49% (CI 45%-53%) at 99% specificity. Further, individual cancer outer fold cross validation sensitivity at 99% specificity was measured as follows: breast 30% (CI 19%-42%); colorectal 41% (CI 32%-50%); lung 49% (CI 42%-56%); ovarian 72% (CI 60%-82%); pancreatic 56% (CI 46%-66%). Conclusions: This study demonstrates that 5hmC profiles can distinguish cancer and normal tissues based on their origin. Further, 5hmC changes in cfDNA enable detection of several cancer types: breast, colorectal, lung, ovarian and pancreatic cancers. Our technology provides a non-invasive tool for cancer detection with low-risk sample collection, enabling better compliance than current screening methods. Among other utilities, we believe our technology could be applied to asymptomatic high-risk individuals, thus enabling enrichment for those subjects that most need a diagnostic imaging follow-up.
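
The abstract above reports sensitivity at a fixed 99% specificity. The following is a minimal sketch of how such a figure can be derived from classifier scores, assuming higher scores mean "more likely cancer"; the function name and the thresholding rule are illustrative assumptions, not the study's procedure.

```python
import math


def sensitivity_at_specificity(neg_scores, pos_scores, specificity=0.99):
    """Return (sensitivity, threshold) at a target specificity.

    The threshold is the smallest non-cancer score such that at least
    `specificity` of the non-cancer samples score at or below it.
    A sample is called positive only if its score exceeds the threshold.
    """
    if not 0.0 < specificity <= 1.0:
        raise ValueError("specificity must be in (0, 1]")
    ranked = sorted(neg_scores)
    # Number of non-cancer samples that must be called negative.
    k = min(len(ranked), math.ceil(specificity * len(ranked)))
    threshold = ranked[k - 1]
    detected = sum(1 for s in pos_scores if s > threshold)
    return detected / len(pos_scores), threshold
```

Fixing specificity first keeps the false-positive rate comparable across cancer types, so the per-cancer sensitivities can be read against the same operating point.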


Author(s):  
Elena E. Abramkina

Forensic authorship analysis is a frequently used technique to identify the real author of a disputed document. Interrogation minutes are often the texts under study. This kind of text is difficult to examine because of its stylistic and genre characteristics: formal phrases and structure, as well as the fact that the author and the compiler of the document are different people. These features restrict the use of some levels of language analysis. This issue, however, is poorly covered in specialist literature, with only a few articles related to it. The current paper describes the main discursive features of interrogation minutes used in authorship expertise. First we look at conventional techniques of authorship expertise and discuss their limitations. Special attention is given to the analysis of the interrogation minutes genre characteristics and their influence on the whole set of identifiers. The analysis of several conventional interrogation minutes techniques singled out two central tendencies in the authorship attribution: an identification features selection with new identifiers being added. The aim of the article is to propose a solution to the problem. Our technique is based on the methods of The Federal Ministry of the Interior, but it also takes into account genre characteristics of the interrogation minutes. A new classification of identifiers has been developed. Additional features are offered to improve the attribution accuracy. These are clarifications, which are classified according to the semantic type of the object. In the article clarifications are divided into six types and a few subtypes and are also divided into low and high informative ones. The analysis of clarification is illustrated with the example of three different interrogation minutes. The concluding part of the article is concerned with the techniques of the interrogation minutes used in authorship expertise description, materials requirements and the steps of the analysis.


2020 ◽  
Author(s):  
Eleonora De Filippi ◽  
Mara Wolter ◽  
Bruno Melo ◽  
Carlos J. Tierra-Criollo ◽  
Tiago Bortolini ◽  
...  

Abstract: During the last decades, neurofeedback training for emotional self-regulation has received significant attention from both the scientific and clinical communities. However, most studies have focused on broader emotional states such as “negative vs. positive”, primarily due to our poor understanding of the functional anatomy of more complex emotions at the electrophysiological level. Our proof-of-concept study aims at investigating the feasibility of classifying two complex emotions that have been implicated in mental health, namely tenderness and anguish, using features extracted from the electroencephalogram (EEG) signal in healthy participants. Electrophysiological data were recorded from fourteen participants during a block-designed experiment consisting of emotional self-induction trials combined with a multimodal virtual scenario. For the within-subject classification, the linear Support Vector Machine was trained with two sets of samples: 1) random cross-validation of the sliding windows of all trials; and 2) strategic cross-validation, assigning all the windows of one trial to the same fold. Spectral features, together with the frontal-alpha asymmetry, were extracted using Complex Morlet Wavelet analysis. Classification results with these features showed an accuracy of 79.3% on average when doing random cross-validation, and 73.3% when applying strategic cross-validation. We extracted a second set of features from the amplitude time-series correlation analysis, which significantly enhanced random cross-validation accuracy while showing similar performance to spectral features when doing strategic cross-validation. 
These results suggest that complex emotions show distinct electrophysiological correlates, which paves the way for future EEG-based, real-time neurofeedback training of complex emotional states. Significance statement: There is still little understanding about the correlates of high-order emotions (i.e., anguish and tenderness) in the physiological signals recorded with the EEG. Most studies have investigated emotions using functional magnetic resonance imaging (fMRI), including the real-time application in neurofeedback training. However, concerning the therapeutic application, EEG is a more suitable tool with regards to costs and practicability. Therefore, our proof-of-concept study aims at establishing a method for classifying complex emotions that can be later used for EEG-based neurofeedback on emotion regulation. We recorded EEG signals during a multimodal, near-immersive emotion-elicitation experiment. Results demonstrate that intraindividual classification of discrete emotions with features extracted from the EEG is feasible and may be implemented in real-time to enable neurofeedback.
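
The abstract above contrasts random cross-validation over sliding windows with a "strategic" scheme that keeps all windows of one trial in the same fold. The following is a minimal sketch of the strategic fold assignment only, assuming each window carries a trial identifier; the round-robin rule and the function name are illustrative assumptions.

```python
def trial_wise_folds(trial_ids, n_folds):
    """Assign each window to a fold so that windows sharing a trial
    never end up in different folds.

    trial_ids: one trial identifier per sliding window.
    Returns a list of fold indices aligned with trial_ids.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    unique_trials = sorted(set(trial_ids))
    # Round-robin over trials (an assumption; any trial-level split works).
    fold_of_trial = {t: i % n_folds for i, t in enumerate(unique_trials)}
    return [fold_of_trial[t] for t in trial_ids]
```

Overlapping windows from the same trial are highly correlated, so splitting them across training and test folds can inflate accuracy; this is consistent with the lower figure the abstract reports for the strategic scheme.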


2021 ◽  
Vol 29 (2) ◽  
pp. 44-54
Author(s):  
Dukhayel Aldukhayel

Chapelle (2003) proposed three general types of input enhancement that help L2 learners “acquire features of the linguistic input that they are exposed to during the course reading or listening for meaning” (p. 40): input salience, input modification, and input elaboration. In 2010, Cárdenas-Claros and Gruba argued that Chapelle’s different types of input enhancement “can be and have been operationalized through help options” primarily utilized in the teaching of reading, listening, writing, grammar, and vocabulary, such as glossed words, video/audio control features, captions, subtitles, and grammar explanations (p. 79). As understood from Cárdenas-Claros and Gruba’s classification of help options, input enhancement can only be accomplished through one process: salience, modification, or elaboration. In this article, we argue that (1) YouTube comments have the potential to be a help option that facilitates both listening comprehension of the videos and vocabulary learning and that (2) input enhancement accomplished by comments can be achieved by a combination of different types of input enhancement. Put another way, the aural input of a YouTube video can be salient, modified, and elaborated, thanks to the various types of comments YouTube videos often receive.


SUAR BETANG ◽  
2020 ◽  
Vol 15 (2) ◽  
pp. 219-231
Author(s):  
Muhammad Rizqi

This paper concerns the use of humor as a pragmatic device in academic discourse. Previous studies in this area have shown that, though it may seem unlikely, humor is commonly used in academic discourse, both written and spoken. Among the many aspects analyzed in studies of academic discourse, some are related to academic cultures. With deliberate consideration of this existing body of literature, this research aims to contribute to this area by examining the use of humor in a specific academic environment, Indonesia. The data analyzed are selected transcripts from chosen YouTube videos of studium generale lectures by three Indonesian political figures. The usage of humor is identified and analyzed pragmatically, and further classified in a table based on the classification of humor by Martin, Puhlik-Doris, Larsen, Gray, and Weir (2003). The findings of this study show that in Indonesian studium generale lectures, all four types of humor in the theory occurred, and the most frequently used type of humor is aggressive humor, to which offensive jokes belong.


2020 ◽  
Author(s):  
Raquel Candido ◽  
Rafael Lama ◽  
Natália Chiari ◽  
Marcello Nogueira-Barbosa ◽  
Paulo Azevedo Marques ◽  
...  

Non-traumatic Vertebral Compression Fractures (VCFs) are generally caused by osteoporosis (benign VCFs) or metastatic cancer (malignant VCFs), and the success of the medical treatment strongly depends on a fast and correct classification of VCFs. Recently, methods for computer-aided diagnosis (CAD) based on machine learning have been proposed for classifying VCFs. In this work, we investigate the problem of clustering images of VCFs and the impact of feature selection by genetic algorithms, comparing the clustering i) with all features and ii) with feature selection through the purity results. The analysis of the clusters helps to understand the results of classifiers and the difficulties an expert faces in differentiating images of different classes. The results indicate that feature selection improved the separability of clusters and purity. Feature selection also helps to understand which attributes are most important for analysing the images of vertebral bodies.
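
The abstract above compares clusterings by their purity. The following is a minimal sketch of the usual purity measure, assuming each image has a cluster assignment and a known class label; the function name and the example labels in the note are hypothetical.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    if len(cluster_ids) != len(class_labels) or not cluster_ids:
        raise ValueError("inputs must be non-empty and of equal length")
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster].append(label)
    # For each cluster, count how many samples carry its most common label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(class_labels)
```

A purity of 1.0 means every cluster contains a single class, for example only benign or only malignant cases. Purity rises trivially as the number of clusters grows, so comparisons are most meaningful at a fixed cluster count.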

