Elastic Correlation Adjusted Regression (ECAR) scores for high dimensional variable importance measuring

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yuan Zhou ◽  
Botao Fa ◽  
Ting Wei ◽  
Jianle Sun ◽  
Zhangsheng Yu ◽  
...  

Abstract: Investigation of the genetic basis of traits or clinical outcomes relies heavily on identifying relevant variables in molecular data. However, the high dimensionality and complex correlation structures of these data hinder the development of suitable methods and lead to both false positives and false negatives. We developed a variable importance measure, termed the ECAR scores, that evaluates the importance of each variable in a dataset. Based on these scores, variables can be ranked and selected simultaneously. Unlike most current approaches, the ECAR scores aim to rank the influential variables as high as possible while maintaining the grouping property, instead of selecting variables that are merely predictive. The performance of the ECAR scores is tested and compared to other methods on simulated, semi-synthetic, and real datasets. Results showed that the ECAR scores improve on the CAR scores in terms of the accuracy of variable selection and the predictive power of high-ranked variables. They also outperform other classic methods, such as lasso and stability selection, when there is a high degree of correlation among influential variables. As an application, we used the ECAR scores to analyze genes associated with forced expiratory volume in the first second in patients with lung cancer and reported six associated genes.
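
The abstract does not define the ECAR scores themselves, so the sketch below illustrates only the plain CAR scores that serve as the baseline. It assumes the standard definition, in which the marginal correlations between predictors and response are decorrelated by the inverse square root of the predictor correlation matrix, and it uses empirical correlations, so it needs more samples than variables. The function name `car_scores` and the toy data are hypothetical; a high-dimensional setting would need a regularized correlation estimate instead.

```python
import numpy as np


def car_scores(X, y):
    """Plain CAR scores: omega = P^(-1/2) @ rho, using empirical correlations.

    Assumes n > p and a full-rank predictor correlation matrix P.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Standardize predictors and response (population standard deviation).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()

    P = Xs.T @ Xs / n    # correlation matrix among predictors
    rho = Xs.T @ ys / n  # marginal correlations with the response

    # Symmetric inverse square root of P via its eigendecomposition.
    evals, evecs = np.linalg.eigh(P)
    P_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return P_inv_sqrt @ rho


# Toy data (illustrative only): two strongly correlated predictors plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)
y = X[:, 0] + X[:, 1] + rng.normal(size=200)

omega = car_scores(X, y)
ranking = np.argsort(-omega ** 2)  # variables ordered by squared CAR score
```

In this formulation the squared CAR scores sum to the R² of the ordinary least-squares fit, which is why they can be read as a decomposition of explained variance across variables.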

2018 ◽  
Vol 34 (7) ◽  
Author(s):  
Manuel Lozano ◽  
Lara Manyes ◽  
Juanjo Peiró ◽  
Adina Iftimi ◽  
José María Ramada

Multidisciplinary research in public health draws on methods from many scientific disciplines. One of the main characteristics of this type of research is that it deals with large data sets. Classic statistical variable selection methods, known as “screen and clean” and used in a single step, select the variables with the greatest explanatory weight in the model. These methods, commonly used in public health research, may induce masking and multicollinearity, excluding variables that are relevant to the experts in each discipline and skewing the result. Specific techniques such as penalized regression and Bayesian statistics are used to address this problem; they offer more balanced results among subsets of variables, but with less restrictive selection thresholds. Using a combination of classical methods, a three-step procedure is proposed in this manuscript that captures the relevant variables of each scientific discipline, minimizes the number of variables selected in each of them, and obtains a balanced distribution that explains most of the variability. This procedure was applied to a dataset from a public health research study. Compared with the single-step methods, the proposed method shows a greater reduction in the number of variables, as well as a balanced distribution among the scientific disciplines associated with the response variable. We propose an innovative procedure for variable selection and apply it to our dataset. Furthermore, we compare the new method with the classic single-step procedures.


2017 ◽  
Vol 14 (27) ◽  
pp. 30-38
Author(s):  
Filipe ALBANO ◽  
Carla ten CATEN ◽  
Michel ANZANELLO

Proficiency Tests (PT) based on interlaboratory comparisons are activities aimed at assessing the technical competence of laboratories in carrying out specific measurements. The analysis of the homogeneity and stability of prepared samples is an important step in ensuring the reliability of the comparison rounds, since improper selection of the parameter used for this evaluation can influence the comparison. This paper proposes a method for selecting the most relevant variables, aimed at improving homogeneity and stability tests in PT. To that end, the approach relies on a variable importance index derived from Principal Components Analysis (PCA) parameters. The proposed method was applied to three different PT schemes (beverage, water and coal) in Brazil. Results indicate that PCA was adequate to support variable selection for homogeneity and stability tests in PT schemes. The selected subset of variables was corroborated by experts in the PT schemes analyzed.


Author(s):  
Manish M. Kayasth ◽  
Bharat C. Patel

A complete character recognition system is logically divided into sections such as Scanning, Pre-processing, Classification, Processing, and Post-processing. In the targeted system, the scanned image is first passed through pre-processing modules and then through feature extraction and classification in order to achieve a high recognition rate. This paper focuses mainly on feature extraction and classification techniques. These methodologies play an important role in identifying offline handwritten characters, specifically in the Gujarati language. Feature extraction provides methods by which characters can be identified uniquely and with a high degree of accuracy. Feature extraction helps to find the shape contained in the pattern. Several techniques are available for feature extraction and classification; however, the selection of a technique appropriate to its input determines the degree of recognition accuracy.


Author(s):  
Behnam Jahangiri ◽  
Punyaslok Rath ◽  
Hamed Majidifard ◽  
William G. Buttlar

Various agencies have begun to research and introduce performance-related specifications (PRS) for the design of modern asphalt paving mixtures. The focus of most recent studies has been directed toward simplified cracking test development and evaluation. In some cases, development and validation of PRS has been performed, building on these new tests, often by comparison of test values to accelerated pavement test studies and/or to limited field data. This study describes the findings of a comprehensive research project conducted at Illinois Tollway, leading to a PRS for the design of mainline and shoulder asphalt mixtures. A novel approach was developed, involving the systematic establishment of specification requirements based on: 1) selection of baseline values based on minimally acceptable field performance thresholds; 2) elevation of thresholds to account for differences between short-term lab aging and expected long-term field aging; 3) further elevation of thresholds to account for variability in lab testing, plus variability in the testing of field cores; and 4) final adjustment and rounding of thresholds based on a consensus process. After a thorough evaluation of different candidate cracking tests in the course of the project, the Disk-shaped Compact Tension—DC(T)—test was chosen to be retained in the Illinois Tollway PRS and to be presented in this study for the design of crack-resistant mixtures. The DC(T) test was selected because of its high degree of correlation with field results and its excellent repeatability. Tailored Hamburg rut depth and stripping inflection point thresholds were also established for mainline and shoulder mixes.


2021 ◽  
Vol 54 (3) ◽  
pp. 1-36
Author(s):  
Syed Wasif Abbas Hamdani ◽  
Haider Abbas ◽  
Abdul Rehman Janjua ◽  
Waleed Bin Shahid ◽  
Muhammad Faisal Amjad ◽  
...  

Cyber threats have been growing tremendously in recent years. There are significant advancements in the threat space that have led towards an essential need for the strengthening of digital infrastructure security. Better security can be achieved by fine-tuning system parameters to the best and optimized security levels. For the protection of infrastructure and information systems, several guidelines have been provided by well-known organizations in the form of cybersecurity standards. Since security vulnerabilities incur a very high degree of financial, reputational, informational, and organizational security compromise, it is imperative that a baseline for standard compliance be established. The selection of security standards and extracting requirements from those standards in an organizational context is a tedious task. This article presents a detailed literature review, a comprehensive analysis of various cybersecurity standards, and statistics of cyber-attacks related to operating systems (OS). In addition to that, an explicit comparison between the frameworks, tools, and software available for OS compliance testing is provided. An in-depth analysis of the most common software solutions ensuring compliance with certain cybersecurity standards is also presented. Finally, based on the cybersecurity standards under consideration, a comprehensive set of minimum requirements is proposed for OS hardening and a few open research challenges are discussed.


1971 ◽  
Vol 8 (3) ◽  
pp. 340-347 ◽  
Author(s):  
George S. Day ◽  
Roger M. Heeler

When the selection of a sample of stores or cities requires a high degree of similarity among the test units in order to ensure a sensitive experiment, the sample may no longer represent the market. These conflicting requirements can be satisfied by choosing the sample from clusters displayed in a reduced space representation of the market.


2017 ◽  
Vol 22 (4) ◽  
pp. 467
Author(s):  
Mahran Zeity ◽  
Nagappa Srinivas ◽  
Chinnamade Channegowde Gowda

A study of the morphological characters of Tetranychus macfarlanei Baker & Pritchard and Tetranychus malaysiensis Ehara revealed high similarity when all the important characters were compared, in addition to the characters pointed out by Ehara to separate the two species. A molecular phylogeny of seven Indian populations of T. macfarlanei and one population of T. malaysiensis from the Philippines, along with a few distantly related species of Tetranychus, was attempted. A high degree of similarity between these two species was evident at the mitochondrial COI gene (96%) as well as in the ITS2 (rDNA) region (96–99%). On the basis of both morphological features and molecular data, T. malaysiensis is proposed as a junior synonym of T. macfarlanei under the ICZN’s law of priority. Additional female characters are also proposed in this study to clearly discriminate T. macfarlanei from the species it most closely resembles, Tetranychus ludeni Zacher. Tetranychus macfarlanei has emerged as a pest of several cultivated crop plants in India.


PEDIATRICS ◽  
1964 ◽  
Vol 34 (5) ◽  
pp. 705-707
Author(s):  
WILLIAM D. DONALD

In vitro sensitivities of 70 shigella strains isolated over a recent 18-month period are reported. The high degree of sulfadiazine resistance casts some doubt on the selection of this agent as the drug of choice in the treatment of shigellosis, at least in this community. Some of the other agents, although inhibiting the growth of the organisms in vitro, have disadvantages such as toxicity or failure of absorption from the gastrointestinal tract. Tetracycline resistance was found in only 7% of the organisms tested, but from this and other reports we may anticipate the occurrence of more organisms resistant to this agent. The results of the sensitivities to ampicillin are encouraging and further studies including clinical trials of this agent are in order.


2021 ◽  
Vol 1037 ◽  
pp. 486-493
Author(s):  
Sergey Y. Zhachkin ◽  
Anatoly I. Zavrazhnov ◽  
Nikita A. Penkov ◽  
George V. Kudryavtsev ◽  
Paul V. Tsisarenko

One of the fundamental tasks in restoring the operability of cylinder liners is the application of a composite coating with a predetermined microhardness value. The authors have developed a technology for applying iron-based composite coatings to cylindrical surfaces, which makes it possible to vary the physical, mechanical and operational parameters of the resulting iron-containing coating through planned selection of the deposition parameters. This eliminates the need for mechanical treatment of the applied coating, which is the reason for the high rejection rate of parts that undergo the iron-on operation. Contact interaction of the working tool with the formed layer of the composite coating has a positive effect on its roughness.

