multinomial data
Recently Published Documents



2021 ◽  
Vol 69 (2) ◽  
pp. 96-100
Author(s):  
Farzana Afroz

Traditionally, the overdispersion parameter ϕ is estimated using Pearson’s lack-of-fit statistic X² or the deviance statistic D, neither of which performs well on sparse data. This paper focuses on an estimator ϕnew of the overdispersion parameter that was proposed for sparse multinomial data. The estimator was derived from an assumption on the 3rd cumulant of the response variable. When the data come from the Dirichlet-multinomial distribution, ϕnew is known to have the lowest root mean squared error compared with the other three estimators. In this paper the 1st to 3rd order raw moments of the finite mixture of Dirichlet-multinomial distributions are derived, which results in complicated mathematical expressions. Furthermore, it is found that the 3rd cumulant of this mixture does not satisfy the assumption used in the derivation of ϕnew. Dhaka Univ. J. Sci. 69(2): 96-100, 2021 (July)
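As a rough illustration of the traditional approach mentioned in this abstract, the sketch below computes the Pearson-based estimate ϕ̂ = X² / df for a set of multinomial count vectors. It is not the ϕnew estimator, which the abstract does not define. The function name `pearson_overdispersion`, the assumption that all rows share one probability vector estimated from the pooled counts, and the resulting degrees of freedom (n − 1)(k − 1) are illustrative choices rather than details taken from the paper.

```python
def pearson_overdispersion(counts):
    """Pearson estimate of overdispersion, phi_hat = X^2 / df.

    counts: list of n rows, each a list of k category counts.
    Assumes (hypothetically) that every row follows the same
    probability vector, estimated from the pooled counts.
    """
    n = len(counts)
    k = len(counts[0])
    totals = [sum(row) for row in counts]
    grand_total = sum(totals)

    # Fitted category probabilities under the common-probability model.
    probs = [sum(row[j] for row in counts) / grand_total for j in range(k)]
    if any(p == 0.0 for p in probs):
        raise ValueError("every category must be observed at least once")

    # Pearson lack-of-fit statistic X^2 = sum (observed - expected)^2 / expected.
    x2 = 0.0
    for row, m in zip(counts, totals):
        for y, p in zip(row, probs):
            expected = m * p
            x2 += (y - expected) ** 2 / expected

    # n(k - 1) free cells minus (k - 1) estimated probabilities.
    df = (n - 1) * (k - 1)
    return x2 / df
```

Values well above 1 suggest more variation than the multinomial model allows. With sparse data many expected counts are small, so individual terms of X² become unstable, which is the weakness the abstract points to.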



2021 ◽  
Author(s):  
Sergei Bazylik ◽  
Magne Mogstad ◽  
Joseph Romano ◽  
Azeem Shaikh ◽  
Daniel Wilhelm


Algorithms ◽  
2021 ◽  
Vol 14 (10) ◽  
pp. 296
Author(s):  
Lucy Blondell ◽  
Mark Z. Kos ◽  
John Blangero ◽  
Harald H. H. Göring

Statistical analysis of multinomial data in complex datasets often requires estimation of the multivariate normal (mvn) distribution for models in which the dimensionality can easily reach 10–1000 and higher. Few algorithms for estimating the mvn distribution offer robust and efficient performance over such a range of dimensions. We report a simulation-based comparison of two algorithms for the mvn that are widely used in statistical genetic applications. The venerable Mendell-Elston approximation is fast, but execution time increases rapidly with the number of dimensions, estimates are generally biased, and an error bound is lacking. The correlation between variables significantly affects absolute error but not overall execution time. The Monte Carlo-based approach described by Genz returns unbiased and error-bounded estimates, but execution time is more sensitive to the correlation between variables. For ultra-high-dimensional problems, however, the Genz algorithm exhibits better scaling characteristics and greater time-weighted efficiency of estimation.
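To make the idea of an unbiased, error-bounded Monte Carlo estimate concrete, here is a minimal sketch that estimates a multivariate normal probability P(X ≤ upper) by plain sampling. It is deliberately not the Genz algorithm, which transforms the integral before sampling; it only shows why a simulation estimate comes with a standard error. The names `cholesky` and `mvn_cdf_mc`, the zero mean vector, and the default number of draws are assumptions made for this example.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def mvn_cdf_mc(upper, cov, n_draws=100_000, seed=0):
    """Crude Monte Carlo estimate of P(X_1 <= upper_1, ..., X_d <= upper_d)
    for X ~ N(0, cov). Returns (estimate, standard_error)."""
    rng = random.Random(seed)
    L = cholesky(cov)
    d = len(upper)
    hits = 0
    for _ in range(n_draws):
        # Independent standard normals, correlated through the Cholesky factor.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        inside = True
        for i in range(d):
            x_i = sum(L[i][k] * z[k] for k in range(i + 1))
            if x_i > upper[i]:
                inside = False
                break
        hits += inside
    p = hits / n_draws
    # Binomial standard error of the sample proportion.
    se = math.sqrt(p * (1.0 - p) / n_draws)
    return p, se
```

For two standard normal variables with correlation 0.5, the exact probability that both are at most zero is 1/4 + arcsin(0.5)/(2π) = 1/3, which gives a simple check on `mvn_cdf_mc([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]])`. The cost of this naive sampler grows with both the dimension and the number of draws needed to resolve small probabilities, which is the kind of trade-off the comparison in the abstract addresses.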



Algorithms ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 128
Author(s):  
George Odongo ◽  
Richard Musabe ◽  
Damien Hanyurwimfura

This study investigates the use of machine-learning approaches to interpret Dissolved Gas Analysis (DGA) data in order to detect incipient faults early in oil-impregnated transformers. Transformers are critical pieces of equipment in transmitting and distributing electrical energy. The failure of a single unit disturbs a huge number of consumers and suppresses economic activities in the vicinity. Because of this, it is important that power utility companies accord high priority to condition monitoring of critical assets. The analysis of dissolved gases is a technique widely used for monitoring the condition of oil-immersed transformers. The interpretation of DGA data is, however, inconclusive as far as the determination of incipient faults is concerned, and it depends largely on the expertise of technical personnel. To obtain a coherent, accurate, and clear interpretation of DGA, this study proposes a novel multinomial classification model, christened KosaNet, that is based on decision trees. Actual DGA data with 2912 entries were used to compare the performance of KosaNet against other algorithms with multiclass classification ability, namely the decision tree, k-NN, Random Forest, Naïve Bayes, and Gradient Boost. The results show that KosaNet demonstrated an improved DGA classification ability, particularly when classifying multinomial data.



2021 ◽  
Vol 12 (2) ◽  
pp. 339-357
Author(s):  
U. Sangeetha ◽  
M. Subbiah ◽  
M.R. Srinivasan ◽  
B. Nandram


2021 ◽  
Author(s):  
Sergei Bazylik ◽  
Magne Mogstad ◽  
Joseph P. Romano ◽  
Azeem Shaikh


2021 ◽  
Author(s):  
Sergei Bazylik ◽  
Magne Mogstad ◽  
Joseph P. Romano ◽  
Azeem Shaikh ◽  
Daniel Wilhelm


METIK JURNAL ◽  
2020 ◽  
Vol 4 (2) ◽  
pp. 49-54
Author(s):  
Nanda Arista Rizki ◽  
Petrus Fendiyanto ◽  
Ainun Jariah

The COVID-19 pandemic, which has threatened every corner of the world, led to the abolition of the National Examination (Ujian Nasional). As a result, the system for assigning students to majors at SMAN 2 Samarinda relies only on school examination scores and students' stated interests. The assignment results produced by the school can be modeled using classification methods. This study aims to compare the accuracy of major-assignment predictions from discriminant analysis and multinomial logistic regression. The data used in this study are the school examination scores and the major assignments of grade X students at SMAN 2 Samarinda. The stages of analysis are descriptive statistics, construction of the discriminant analysis model, construction of the multinomial logistic regression model, selection of the best model, and interpretation of the best model. The proportions of training to testing data applied in this study are 60:40, 70:30, 80:20, and 90:10, with bootstrap resampling using B = 1000. Based on the analysis, the quadratic discriminant model is the best model for describing the major assignments of students at SMAN 2 Samarinda, with the highest prediction accuracy of 0.8077. The quadratic form of this model is visible in the boundary curves between classification regions, which are second-degree polynomial functions.



Author(s):  
Timothy L Kennel ◽  
Richard Valliant

Estimators based on linear models are the standard in finite population estimation. However, many items collected in surveys are better described by nonlinear models; these include variables that have binary, binomial, or multinomial distributions. We extend previous work on generalized difference, model-calibrated, and pseudo-empirical likelihood estimators to two-stage cluster sampling and derive their theoretical properties with particular emphasis on multinomial data. We present asymptotic theory for both the point estimators of totals and their variance estimators. The alternatives are tested via simulation using artificial and real populations. The two real populations are one of educational institutions and degrees awarded and one of owned and rented housing units.


