Entropic Ranks: A Methodology for Enhanced, Threshold-Free, Information-Rich Data Partition and Interpretation

Author(s):  
Hector-Xavier de Lastic ◽  
Irene Liampa ◽  
Alexandros G. Georgakilas ◽  
Michalis Zervakis ◽  
Aristotelis Chatziioannou

Background: Traditional omic analysis relies on p-value and fold change as selection criteria. There is an ongoing debate on their effectiveness in delivering systemic and robust interpretation, due to their dependence on assumptions of conformity with various parametric distributions. Here, we propose a threshold-free selection method based on robust, non-parametric statistics, ensuring independence from the statistical distribution properties and broad applicability. Such methods could adapt to different initial data distributions, contrary to statistical techniques based on fixed thresholds. Methods: Our work extends the Rank Products (RP) methodology with a neutral selection method of high information-extraction capacity. We introduce the calculation of the RP distribution’s entropy to isolate the features of interest by their contribution to the distribution’s information content. The aim is a methodology performing threshold-free identification of the differentially expressed features, which are highly informative about the phenomenon under scrutiny. Conclusions: Applying the proposed method on microarray (transcriptomic and DNA methylation) and RNAseq count data of varying sizes and noise presence, we observe robust convergence for the different parameterisations to stable cutoff points. Functional analysis through BioInfoMiner and EnrichR was used to evaluate the information potency of the resulting feature lists. Overall, the derived functional terms provide a systemic description highly compatible with the results of traditional statistical hypothesis testing techniques. The methodology behaves consistently across different data types. The feature lists are compact and information-rich, indicating phenotypic aspects specific to the tissue and biological phenomenon investigated. Selection by information content measures efficiently addresses problems emerging from arbitrary thresholding, thus facilitating the full automation of the analysis.
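The abstract above names two ingredients, rank products and the entropy of their distribution, without spelling out the procedure. The following is a minimal sketch of those two quantities only, not the authors' selection algorithm: the function names, the toy fold-change table, and the choice to compute Shannon entropy over normalised values are all assumptions made for illustration.

```python
import math

def rank_products(table):
    """Geometric mean of per-replicate ranks for each feature.

    table maps a feature name to one value per replicate (hypothetical
    fold changes here). Rank 1 is the largest value in a replicate;
    ties are broken by insertion order, which is a simplification.
    """
    features = list(table)
    n_reps = len(next(iter(table.values())))
    log_sum = {f: 0.0 for f in features}
    for r in range(n_reps):
        ordered = sorted(features, key=lambda f: table[f][r], reverse=True)
        for rank, f in enumerate(ordered, start=1):
            log_sum[f] += math.log(rank)
    return {f: math.exp(log_sum[f] / n_reps) for f in features}

def shannon_entropy(values):
    """Shannon entropy (in bits) of positive values normalised to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical toy data: four features measured in three replicates.
toy = {
    "geneA": [3.0, 2.5, 2.8],
    "geneB": [1.1, 0.9, 1.4],
    "geneC": [0.2, 0.4, 0.1],
    "geneD": [2.0, 2.9, 1.0],
}
rp = rank_products(toy)
entropy_bits = shannon_entropy(list(rp.values()))
```

A small rank product marks a feature that ranks consistently high across replicates. How the entropy is then used to place a cutoff is specific to the paper and is not reproduced here.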

2020 ◽  
Vol 10 (20) ◽  
pp. 7077
Author(s):  
Hector-Xavier de Lastic ◽  
Irene Liampa ◽  
Alexandros G. Georgakilas ◽  
Michalis Zervakis ◽  
Aristotelis Chatziioannou

Background: Here, we propose a threshold-free selection method for the identification of differentially expressed features based on robust, non-parametric statistics, ensuring independence from the statistical distribution properties and broad applicability. Such methods could adapt to different initial data distributions, contrary to statistical techniques based on fixed thresholds. This work aims to propose a methodology that automates and standardizes the statistical selection through established measures such as entropy, already used in information retrieval from large biomedical datasets. It thus departs from classical fixed-threshold methods, which rely on arbitrary p-value and fold change values as selection criteria and whose efficacy also depends on the degree of conformity to parametric distributions. Methods: Our work extends the rank product (RP) methodology with a neutral selection method of high information-extraction capacity. We introduce the calculation of the entropy of the RP distribution to isolate the features of interest by their contribution to its information content. The goal is a methodology for threshold-free identification of the differentially expressed features, which are highly informative about the phenomenon under study. Conclusions: Applying the proposed method on microarray (transcriptomic and DNA methylation) and RNAseq count data of varying sizes and noise presence, we observe robust convergence for the different parameterizations to stable cutoff points. Functional analysis through BioInfoMiner and EnrichR was used to evaluate the information potency of the resulting feature lists. Overall, the derived functional terms provide a systemic description highly compatible with the results of traditional statistical hypothesis testing techniques. The methodology behaves consistently across different data types. The feature lists are compact and rich in information, indicating phenotypic aspects specific to the tissue and biological phenomenon investigated. Selection by information content measures efficiently addresses problems emerging from arbitrary thresholding, thus facilitating the full automation of the analysis.


2021 ◽  
Author(s):  
Lingfei Wang

Single-cell RNA sequencing (scRNA-seq) provides unprecedented technical and statistical potential to study gene regulation but is subject to technical variations and sparsity. Here we present Normalisr, a linear-model-based normalization and statistical hypothesis testing framework that unifies single-cell differential expression, co-expression, and CRISPR scRNA-seq screen analyses. By systematically detecting and removing nonlinear confounding from library size, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased P-value estimation. We use Normalisr to reconstruct robust gene regulatory networks from trans-effects of gRNAs in large-scale CRISPRi scRNA-seq screens and gene-level co-expression networks from conventional scRNA-seq.


2021 ◽  
Vol 3 (2) ◽  
pp. 41-51
Author(s):  
Sri Hidayat ◽  
Syafri Syafri ◽  
Syahriar Tato

The corridor of the Hertasning-Tun Abdul Razak road section is a peri-urban area experiencing high dynamics due to the need for new housing and activity facilities. This triggers a spatial transformation, which in turn increases anthropogenic activities that can change the urban climate. The increase in anthropogenic activity is indicated by differences in land use and traffic performance along the corridor. This study uses a quantitative method to determine the relationship of the land use and traffic performance variables to urban climatic conditions, with data analysis using SEM PLS. Statistical hypothesis testing of the effect of each independent variable on the dependent variable led to the conclusion that land use had a significant effect on climatic conditions, with a T-statistic of 2.752 > 1.96 or a P value of 0.040 < 0.05. Meanwhile, traffic performance had no significant effect on urban climatic conditions, with a T-statistic of 1.071 < 1.96 or a P value of 0.285 > 0.05. These results also indicate that land use in the Hertasning-Tun Abdul Razak road corridor can cause an increase in urban temperatures in the area. However, the increase in urban temperature in the area is driven more by anthropogenic activities associated with its land use and is not influenced by the extent of the built-up area.
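The decision rule quoted in this abstract (T-statistic against 1.96, P value against 0.05) can be written out as a short sketch. The function name and the choice to require both criteria at once are assumptions; the T-statistics and P values are taken as reported and are not recomputed here.

```python
def is_significant(t_stat, p_value, t_crit=1.96, alpha=0.05):
    """Apply the two reported criteria together: |T| above the critical
    value and P below the significance level (both required, by assumption)."""
    return abs(t_stat) > t_crit and p_value < alpha

# (T-statistic, P value) pairs exactly as reported in the abstract.
reported = {
    "land use": (2.752, 0.040),
    "traffic performance": (1.071, 0.285),
}
decisions = {name: is_significant(t, p) for name, (t, p) in reported.items()}
```

With the reported figures, the rule flags land use as significant and traffic performance as not significant, matching the abstract's conclusion.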


Author(s):  
Helena Kraemer

“As ye sow, so shall ye reap”: For almost 100 years, researchers have been taught that the be-all and end-all in data-based research is the p-value. The resulting problems have now generated concern, often among those of us who have long taught researchers exactly that. We must bear a major responsibility for the present situation and must alter our teachings. Despite the fact that the Zhang and Hughes paper is titled “Beyond p-value”, the total focus remains on statistical hypothesis testing studies (HTS) and p-values (1). Instead, I would propose that there are three distinct, necessary, and important phases of research: 1) Hypothesis Generation Studies (HGS) or Exploratory Research (2-4); 2) Hypothesis Testing Studies (HTS); 3) Replication and Application of Results. Of these, HTS is undoubtedly the most important, but without HGS, HTS is often weak and wasteful, and without Replication and Application, the results of HTS are often misleading.


Author(s):  
Dyah Wulandari ◽  
Siti Maria Ulfa ◽  
Arfiyan Ridwan

The objective of this research was to compare the Zimmer Twins website, used as a digital storytelling tool, with non-digital media for teaching narrative text writing, as measured by the writing ability of eleventh-grade students at MA Yayasan Sirojul Islam Sukolilo in the 2018/2019 academic year. Zimmer Twins is a web-based animated movie maker with which students can turn their short stories into movies, with a range of emotions and other elements. The sample was drawn from the eleventh grade of MA YASI: class XI-1 as the experimental class and class XI-2 as the control class, consisting of 20 students. The research used a quantitative method with a quasi-experimental design, and the instrument was a test. Non-random sampling was used. The research followed these procedures: giving a pre-test, applying treatments, and giving a post-test. The data were analyzed with ANCOVA in SPSS 23. The mean post-test score was 76.55 in the experimental class and 70.55 in the control class. The statistical hypothesis test reported a p-value of 0.000, which is lower than the significance level of 0.05. Because the p-value was ≤ 0.05, H₁ was accepted and H₀ was rejected. In conclusion, Zimmer Twins can be an effective medium for teaching narrative text writing to eleventh-grade students of MA Yayasan Sirojul Islam Sukolilo.


Author(s):  
Riko Kelter

The Full Bayesian Significance Test (FBST) and the Bayesian evidence value recently have received increasing attention across a variety of sciences including psychology. Ly and Wagenmakers (2021) have provided a critical evaluation of the method and concluded that it suffers from four problems which are mostly attributed to the asymptotic relationship of the Bayesian evidence value to the frequentist p-value. While Ly and Wagenmakers (2021) tackle an important question about the best way of statistical hypothesis testing in the cognitive sciences, it is shown in this paper that their arguments are based on a specific measure-theoretic premise. The identified problems hold only under a specific class of prior distributions which are required only when adopting a Bayes factor test. However, the FBST explicitly avoids this premise, which resolves the problems in practical data analysis. In summary, the analysis leads to the more important question whether precise point null hypotheses are realistic for scientific research, and a shift towards the Hodges-Lehmann paradigm may be an appealing solution when there is doubt on the appropriateness of a precise hypothesis.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Lukas Vlcek ◽  
Shize Yang ◽  
Yongji Gong ◽  
Pulickel Ajayan ◽  
Wu Zhou ◽  
...  

Exploration of structure-property relationships as a function of dopant concentration is commonly based on mean field theories for solid solutions. However, such theories that work well for semiconductors tend to fail in materials with strong correlations, either in electronic behavior or chemical segregation. In these cases, the details of atomic arrangements are generally not explored and analyzed. The knowledge of the generative physics and chemistry of the material can obviate this problem, since defect configuration libraries as stochastic representation of atomic level structures can be generated, or parameters of mesoscopic thermodynamic models can be derived. To obtain such information for improved predictions, we use data from atomically resolved microscopic images that visualize complex structural correlations within the system and translate them into statistical mechanical models of structure formation. Given the significant uncertainties about the microscopic aspects of the material’s processing history along with the limited number of available images, we combine model optimization techniques with the principles of statistical hypothesis testing. We demonstrate the approach on data from a series of atomically resolved scanning transmission electron microscopy images of MoₓRe₁₋ₓS₂ at varying ratios of Mo/Re stoichiometries, for which we propose an effective interaction model that is then used to generate atomic configurations and make testable predictions at a range of concentrations and formation temperatures.


Cancers ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 1153
Author(s):  
Elysia Racanelli ◽  
Abdulhadi Jfri ◽  
Amnah Gefri ◽  
Elizabeth O’Brien ◽  
Ivan Litvinov ◽  
...  

Background: Cutaneous squamous cell carcinoma (cSCC) is a rare complication of hidradenitis suppurativa (HS). Objectives: To conduct a systematic review and an individual patient data (IPD) meta-analysis to describe the clinical characteristics of HS patients developing cSCC and determine predictors of poor outcome. Methods: Medline/PubMed, Embase, and Web of Science were searched for studies reporting cSCC arising in patients with HS from inception to December 2019. A routine descriptive analysis, statistical hypothesis testing, and Kaplan–Meier survival curves/Cox proportional hazards regression models were performed. Results: A total of 34 case reports and series including 138 patients were included in the study. The majority of patients were males (81.6%), White (83.3%), and smokers (n = 22/27 reported) with a mean age of 53.5 years. Most patients had gluteal (87.8%), Hurley stage 3 HS (88.6%). The mean time from the diagnosis of HS to the development of cSCC was 24.7 years. Human papillomavirus was identified in 12/38 patients tested. Almost 50% of individuals had nodal metastasis and 31.3% had distant metastases. Half of the patients succumbed to their disease. Conclusions: cSCC is a rare but life-threatening complication seen in HS patients, mainly occurring in White males who are smokers with severe, long-standing gluteal HS. Regular clinical examination and biopsy of any suspicious lesions in high-risk patients should be considered. The use of HPV vaccination as a preventive and possibly curative method needs to be explored.
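The Kaplan–Meier survival curves mentioned in the Methods can be illustrated with a minimal sketch of the product-limit estimator. The function name and the follow-up data below are hypothetical and are not taken from the study; the sketch also omits confidence intervals and the Cox regression step.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up duration per patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns (time, survival probability) pairs at each distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Hypothetical follow-up times (in years) and event indicators.
toy_times = [2, 3, 3, 5, 8, 8]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Each step multiplies the running survival probability by the fraction of at-risk patients who did not have the event at that time, which is why censored patients lower the risk set without lowering the curve.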

