statistical data mining
Recently Published Documents


Total documents: 52 (five years: 10)
H-index: 8 (five years: 1)

Author(s):  
Michal Kaźmierczak ◽  
Ewa Patyk-Kaźmierczak

The Cambridge Structural Database (CSD) is the largest repository of crystal structures of organic and metal–organic compounds, containing over 1.1 million entries. Over 3300 of the deposits are structures determined under high pressure, a number strongly affected by the experimental requirements of high-pressure techniques. Nevertheless, this population is still sufficiently representative for statistical data mining. This work presents an in-depth analysis of the population, showing where contributors of high-pressure depositions come from and which journals high-pressure structures are published in. It also describes some trends in high-pressure crystallography, and how they have changed over the years, as elucidated from data collected in the CSD. The ultimate goal of this article is to bring the high-pressure crystallography content in the CSD to a wider audience of scientists.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jiao He ◽  
Qian Zhang ◽  
Cuiying Ma ◽  
Gabriel I. Giancaspro ◽  
Kaishun Bi ◽  
...  

C. morifolium flower and C. indicum flower are two closely related herbal species with similar morphological and microscopic characteristics, but they are used differently for edible and medicinal purposes. However, there is no effective approach to distinguish the two herbs. A novel workflow for quickly differentiating C. morifolium flower and C. indicum flower was developed. First, the difference in anti-inflammatory effects between C. morifolium flower and C. indicum flower was characterized using lipopolysaccharide-treated rats. Then HPLC fingerprint analysis of 53 batches of C. morifolium flower and 33 batches of C. indicum flower was carried out to profile the chemical components in depth. The preliminary markers were screened out by OPLS-DA, identified by HPLC-ESI-QTOF-MS, and quantified by the improved SSDMC (single reference standard to determine multiple compounds) approach. Finally, multiple statistical data mining analyses were performed to confirm the markers, and a binary logistic regression equation was built that successfully differentiated C. morifolium flower from C. indicum flower. Overall, the established workflow was rapid, effective and highly feasible, and could provide a powerful tool for herb identification.
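The final classification step described above can be illustrated with a minimal sketch of binary logistic regression. The abstract does not report the authors' equation, markers, or data, so everything below is an assumption made for illustration: the single feature (a hypothetical marker content), the toy values, the class coding, the learning rate, and the function names.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log-loss w.r.t. the score
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return class 1 when the fitted probability is at least 0.5, else class 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Hypothetical marker contents (arbitrary units), invented for this sketch only.
# Class 0 and class 1 stand in for the two herbs; which is which is arbitrary.
marker = [0.2, 0.4, 0.5, 0.7, 1.6, 1.8, 2.0, 2.3]
label = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(marker, label)
print("fitted equation: logit(p) = %.2f * marker + %.2f" % (w, b))
print("predicted classes:", [predict(w, b, x) for x in marker])
```

In practice a fitted equation of this kind would use several confirmed markers as predictors and would be checked on batches not used for fitting; the one-feature version here only shows the shape of the calculation.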


2020 ◽  
Author(s):  
P. A. S. O. Silva

Pain analysis in newborns has become a relevant subject of study over the last few decades, given the inability to objectively identify the source and intensity of pain in newborn babies. Over the last few years, several methods for pain detection and evaluation have been able to classify pain levels from the facial expressions of newborn babies, using statistical models, machine learning and deep learning. In this context, health professionals are increasingly interested in having computerized tools at their disposal. These tools would be able not only to rank the newborn's potential pain level accurately, but also to identify the facial regions of greatest relevance for a particular pain phenomenon. This dissertation's main objective is to develop a computer framework capable of recognizing and interpreting patterns in facial expressions for an automated evaluation of pain levels in term babies. Specifically, this dissertation focuses on the investigation, implementation and integration of a series of techniques, including image detection and segmentation, spatial normalization and, ultimately, the classification of facial expressions based on information obtained through statistical data mining. Finally, the framework developed here, evaluated with an accuracy (upper limit) of approximately 96% for the COPE database and 77% for the UNIFESP database, reveals that it is possible not only to rank pain levels statistically from images of facial expressions, but also to identify key facial regions for certain pain phenomena, thereby assisting in the creation of more general and accurate pediatric pain scales.


2019 ◽  
Vol 22 (4) ◽  
pp. 753-763
Author(s):  
Mark Eshwar Lokanan

Purpose: The purpose of this paper is to use statistical techniques to mine and analyze suspicious transactions. With the increase in money laundering activities across various sectors in some of the world's leading democracies, the ability to detect such transactions is becoming increasingly urgent. Regulators and practitioners have been calling for an approach that can mine the large volume of unstructured data from suspicious money laundering transactions to inform public policies.
Design/methodology/approach: Drawing on the results of empirical studies in the field of money laundering detection, this paper presents an overview of data mining technology for detecting suspicious transactions.
Findings: After chronicling the data mining process, the paper delves into an analysis of the statistical approaches that can be used to differentiate between legitimate and suspicious money laundering transactions. The different stages of the data mining process are carefully explained in relation to their application to anti-money laundering compliance. The results indicate that statistical data mining methodology is a very efficient and useful technique to detect suspicious transactions.
Practical implications: The paper is of relevance to regulators and the financial service sector. A discussion of how data can be mined to facilitate statistical analysis can be used to inform regulatory policies on the detection and prevention of money laundering activities in the financial service sector.
Originality/value: The paper discusses approaches that illustrate how analysts can use statistical techniques to analyze data for suspicious money laundering transactions.


Author(s):  
Tazeem Zainab ◽  
Zahid Ashraf Wani

Various methods are used to quantify science and to handle scientific information. Researchers and scientists apply varied techniques to fundamental concepts, and these techniques are more or less complementary and overlap to a certain extent in their applications. Scientometrics, in this context, is a novel scientific field joining science and technology with information science and employing numerous mathematical, statistical, and data mining techniques and procedures to measure and quantify scientific information. The focus of scientometrics as a discipline is the literature of science and technology. The chapter thus aims to discuss the concept of scientometrics and the indicators that are employed to assess the quality of scholarly content. Further, the chapter also discusses the pros and cons of prominent scientometric indicators that are currently employed in assessing the performance of an individual researcher, institution, or country.
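As one concrete instance of a scientometric indicator, the sketch below computes the h-index: the largest h such that at least h publications have at least h citations each. The abstract does not say which indicators the chapter covers, so the choice of the h-index, the function name, and the citation counts are assumptions made for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h

# Hypothetical citation counts for five papers, invented for this sketch.
print(h_index([10, 8, 5, 4, 3]))
```

The sketch also hints at a commonly noted limitation of this indicator: a single very highly cited paper barely changes the result, since `h_index([25, 8, 5, 3, 3])` and `h_index([2500, 8, 5, 3, 3])` give the same value.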

