The Development of Topic Model Based on Beta-Negative Binomial Process

2013 ◽  
Vol 427-429 ◽  
pp. 1597-1600
Author(s):  
Ya Shu Liu ◽  
Han Bing Yan

Topic modeling is an important subfield of data mining that has developed quickly and has been applied in many fields in recent years. Many researchers have been engaged in this field. In this paper, we introduce the BNB process, based on the beta and negative binomial distributions, which uses a hierarchical distribution in place of the Dirichlet distribution in LDA. We give the parameter estimation expressions used in Gibbs sampling. The BNB process is then applied to text topic classification. We design experiments to determine the number of topics and to compare the BNB process with LDA. Experimental results show that the BNB process outperforms LDA on the English dataset, but the two give almost the same results in Chinese micro-blog topic classification. Finally, we analyze this problem and outline ideas for further research.
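As a rough illustration of the hierarchical idea in this abstract, the sketch below draws counts from a beta-negative binomial compound: a success probability is drawn from a beta distribution, then a negative binomial count is drawn given that probability. It is not the authors' Gibbs sampler or topic model; the function names and the parameter values `r=3`, `a=5.0`, `b=2.0` are assumptions chosen only for the example.

```python
import math
import random

def sample_geometric_failures(rng, p):
    """Failures before the first success, for success probability p in (0, 1]."""
    if p >= 1.0:
        return 0
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log1p(-p))

def sample_beta_negative_binomial(rng, r, a, b):
    """Draw p ~ Beta(a, b), then count failures before r successes at probability p."""
    p = rng.betavariate(a, b)
    # A negative binomial count with integer r is a sum of r geometric counts.
    return sum(sample_geometric_failures(rng, p) for _ in range(r))

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_beta_negative_binomial(rng, r=3, a=5.0, b=2.0) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
theoretical_mean = 3 * 2.0 / (5.0 - 1.0)  # r * b / (a - 1), valid for a > 1
print(sample_mean, theoretical_mean)
```

The sample mean should land close to the theoretical mean, which is a quick sanity check that the two-stage draw behaves as a beta-negative binomial.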

1996 ◽  
Vol 79 (4) ◽  
pp. 981-988 ◽  
Author(s):  
Thomas Whitaker ◽  
Francis Giesbrecht ◽  
Jeremy Wu

Abstract The acceptability of 10 theoretical distributions to simulate the observed distribution of sample aflatoxin test results was evaluated by using 2 parameter estimation methods and 3 goodness of fit (GOF) tests. All theoretical distributions were compared with 120 observed distributions of aflatoxin test results of farmers' stock peanuts. For a given parameter estimation method and GOF test, the negative binomial distribution had the highest percentage of statistically acceptable fits. The log normal and Poisson-gamma (gamma shape parameter = 0.5) distributions had slightly fewer but an almost equal percentage of acceptable fits. Among the 3 most acceptable statistical models, the negative binomial had the greatest percentage of best or closest fits. Both the parameter estimation method and the GOF test influenced which theoretical distribution had the largest number of acceptable fits. All theoretical distributions, except the negative binomial distribution, had more acceptable fits when model parameters were determined by the maximum likelihood method. The negative binomial had slightly more acceptable fits when model parameters were estimated by the method of moments. The results also demonstrated the importance of using the same GOF test for comparing the acceptability of several theoretical distributions.
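The method of moments mentioned in this abstract can be sketched for the negative binomial as follows. The parametrization (count of failures before `r` successes with success probability `p`), the function names, and the five observations are assumptions made for the example; they are not data or code from the study.

```python
import math
from statistics import mean, variance

def nb_method_of_moments(data):
    """Match the sample mean m and variance v: r = m^2 / (v - m), p = m / v."""
    m = mean(data)
    v = variance(data)  # sample variance with the n - 1 denominator
    if v <= m:
        raise ValueError("a negative binomial fit needs variance > mean")
    return m * m / (v - m), m / v

def nb_log_pmf(k, r, p):
    """Log-probability of k failures before r successes (r may be non-integer)."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))

observations = [0, 1, 2, 3, 10]  # made-up values, for illustration only
r_hat, p_hat = nb_method_of_moments(observations)
log_likelihood = sum(nb_log_pmf(k, r_hat, p_hat) for k in observations)
print(r_hat, p_hat, log_likelihood)
```

A maximum likelihood fit would instead search for the `r` and `p` that maximize `log_likelihood`, which is why the two estimation methods can rank candidate distributions differently.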


2019 ◽  
Vol 53 (5) ◽  
pp. 417-422
Author(s):  
P. De los Ríos ◽  
E. Ibáñez Arancibia

Abstract The coastal marine ecosystems of Easter Island have been poorly studied, and the main studies are isolated species records from scientific expeditions. The aim of the present study is to apply a spatial distribution analysis and a niche-sharing null model to published data on intertidal marine gastropods and decapods of the rocky shore of Easter Island, based on field work in 2010 and published information from the CIMAR cruise in 2004. The field data revealed the presence of the decapods Planes minutus (Linnaeus, 1758) and Leptograpsus variegatus (Fabricius, 1793) and the gastropods Nodilittorina pyramidalis pascua Rosewater, 1970 and Nerita morio (G. B. Sowerby I., 1833). The available information revealed more species in the data collected in 2004 than in the data collected in 2010, with one species markedly dominant over the others. The spatial distribution of the species reported in the field work revealed that P. minutus and N. morio have an aggregated pattern and a negative binomial distribution, L. variegatus has a uniform pattern with a binomial distribution, and N. pyramidalis pascua, despite its aggregated distribution pattern, does not have a negative binomial distribution. Finally, the null model results revealed that the species reported do not share an ecological niche, owing to the absence of competition. The results are consistent with other similar information on the littoral and sub-littoral fauna of Easter Island.
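One common way to separate aggregated, random, and uniform spatial patterns is the variance-to-mean ratio of counts per sampling unit. The sketch below shows that idea only; the abstract does not state which index the authors used, and the function names, the `tolerance` cutoff, and the quadrat counts are hypothetical.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Variance-to-mean ratio of counts per sampling unit."""
    return variance(counts) / mean(counts)

def classify_pattern(counts, tolerance=0.2):
    """Label a pattern from its dispersion index; tolerance is an arbitrary cutoff."""
    ratio = dispersion_index(counts)
    if ratio > 1 + tolerance:
        return "aggregated"  # variance exceeds mean, as in a negative binomial
    if ratio < 1 - tolerance:
        return "uniform"     # variance below mean, as in a binomial
    return "random"          # variance close to mean, as in a Poisson

clumped_counts = [0, 0, 0, 20, 0]  # made-up quadrat counts
even_counts = [5, 5, 5, 5, 6]      # made-up quadrat counts
print(classify_pattern(clumped_counts), classify_pattern(even_counts))
```

In practice a formal goodness of fit test would follow, since an aggregated pattern does not guarantee that the negative binomial fits, as the result for N. pyramidalis pascua shows.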


2011 ◽  
Vol 10 (2) ◽  
pp. 1
Author(s):  
Y. ARBI ◽  
R. BUDIARTI ◽  
I G. P. PURNABA

Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes or from external problems. Insurance companies, as financial institutions, also face this risk. Operating losses in insurance companies have not been recorded properly, so data on operational losses are limited. In this work, operational loss data are taken from claim payments. In general, the number of insurance claims can be modelled using the Poisson distribution, for which the expected number of claims equals the variance, or the negative binomial distribution, for which the expected value is always less than the variance. The analysis tool used to measure the potential loss is the loss distribution approach with the aggregate method. In the aggregate method, loss data are grouped into a frequency distribution and a severity distribution. Each of 10,000 simulation runs yields a total claim loss, the sum of the individual claims in that run. The potential loss (OpVar) is then determined from these results at a given confidence level.
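The aggregate method described above can be sketched as a small Monte Carlo simulation. This is an illustration under stated assumptions, not the authors' model: the Poisson frequency with rate `lam`, the lognormal severity with parameters `mu` and `sigma`, the parameter values, and all function names are hypothetical.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_aggregate_losses(lam, mu, sigma, n_sims=10_000, seed=0):
    """Total loss per run: a random number of claims, each with a random severity."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n_claims = sample_poisson(rng, lam)
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_claims)))
    return totals

def op_var(totals, confidence=0.99):
    """Empirical quantile of the simulated total losses at the given confidence level."""
    ordered = sorted(totals)
    index = min(len(ordered) - 1, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]

totals = simulate_aggregate_losses(lam=5.0, mu=0.0, sigma=0.5)
print(op_var(totals, 0.95), op_var(totals, 0.99))
```

Replacing `sample_poisson` with a negative binomial sampler would give the overdispersed frequency model that the abstract contrasts with the Poisson case.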


Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1571
Author(s):  
Irina Shevtsova ◽  
Mikhail Tselishchev

We investigate the proximity in terms of zeta-structured metrics of generalized negative binomial random sums to generalized gamma distribution with the corresponding parameters, extending thus the zeta-structured estimates of the rate of convergence in the Rényi theorem. In particular, we derive upper bounds for the Kantorovich and the Kolmogorov metrics in the law of large numbers for negative binomial random sums of i.i.d. random variables with nonzero first moments and finite second moments. Our method is based on the representation of the generalized negative binomial distribution with the shape and exponent power parameters no greater than one as a mixed geometric law and the infinite divisibility of the negative binomial distribution.
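The classical Rényi theorem, which this abstract generalizes, says that a geometric random sum of i.i.d. nonnegative variables, scaled by the geometric parameter `p` and the summand mean, approaches the standard exponential law as `p` tends to zero. The simulation below illustrates only that special case; it does not reproduce the paper's bounds, and the Uniform(0, 2) summands, the sample size, and the function names are assumptions for the example.

```python
import math
import random

def sample_geometric(rng, p):
    """N on {1, 2, ...} with P(N = n) = p * (1 - p)**(n - 1)."""
    if p >= 1.0:
        return 1
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log1p(-p)) + 1

def normalized_geometric_sums(p, n_sims=2000, seed=0):
    """Samples of p * (X_1 + ... + X_N) with X_i ~ Uniform(0, 2), so E[X_i] = 1."""
    rng = random.Random(seed)
    return [p * sum(rng.uniform(0.0, 2.0) for _ in range(sample_geometric(rng, p)))
            for _ in range(n_sims)]

def kolmogorov_distance_to_exp1(sample):
    """Largest gap between the empirical CDF and the Exp(1) CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered):
        cdf = 1.0 - math.exp(-x)
        # Check the gap just before and just after each jump of the empirical CDF.
        distance = max(distance, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return distance

distance = kolmogorov_distance_to_exp1(normalized_geometric_sums(p=0.01))
print(distance)
```

The printed value mixes the true distance with sampling error from the finite number of runs, so it indicates closeness rather than measuring a convergence rate.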

