normalization methods
Recently Published Documents


TOTAL DOCUMENTS: 366 (FIVE YEARS: 120)
H-INDEX: 36 (FIVE YEARS: 6)

2021 ◽  
Author(s):  
Huaxu Yu ◽  
Tao Huan

Sample normalization is a critical step in metabolomics that removes differences in total sample amount or metabolite concentration between biological samples. Here, we present MAFFIN, an accurate and robust post-acquisition sample normalization workflow that works universally for metabolomics data collected on mass spectrometry (MS)-based platforms. The core design of MAFFIN is the calculation of the normalization factor from the maximal density fold change (MDFC) value, computed by a kernel density-based approach. MDFC is more accurate than traditional median-FC-based normalization, especially when the numbers of up- and down-regulated metabolic features differ. In addition, we showcase two essential steps that are overlooked by conventional normalization methods and incorporate them into MAFFIN. First, instead of using all detected metabolic features, MAFFIN automatically extracts and uses only high-quality features to calculate FCs and determine the normalization factor; multiple orthogonal criteria are proposed to select these features. Second, to guarantee the accuracy of the FCs, the MS signal intensities of the high-quality features are corrected using serial quality control (QC) samples. Using simulated data and urine metabolomics datasets, we demonstrate the critical need for high-quality feature selection, MS signal correction, and MDFC, and we show the superior performance of MAFFIN over other commonly used post-acquisition sample normalization methods. Finally, application to a human saliva metabolomics study shows that MAFFIN provides robust sample normalization, leading to better data separation in principal component analysis (PCA) and the identification of more significantly altered metabolic features.
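
The abstract describes MDFC only at a high level; the sketch below (illustrative Python, not the MAFFIN implementation; the function names, grid settings, and simulated data are assumptions) shows the core idea: estimate the density of per-feature log fold changes with a kernel density estimator and take the fold change at the density peak as the normalization factor.

```python
import numpy as np
from scipy.stats import gaussian_kde

def mdfc_factor(sample, reference):
    """Normalization factor for `sample` against `reference`.

    Both arrays hold intensities of the same high-quality features;
    zero or missing values are assumed to be filtered out beforehand.
    """
    log_fc = np.log2(sample / reference)      # per-feature log2 fold changes
    kde = gaussian_kde(log_fc)                # kernel density estimate
    grid = np.linspace(log_fc.min(), log_fc.max(), 2048)
    mode = grid[np.argmax(kde(grid))]         # fold change of maximal density
    return 2.0 ** mode                        # divide the sample by this factor

# Simulated check: 40% of features are truly up-regulated, yet the
# density mode still recovers the underlying 2x dilution factor.
rng = np.random.default_rng(0)
ref = rng.lognormal(10, 1, size=500)
smp = 2.0 * ref * rng.lognormal(0, 0.05, size=500)
smp[:200] *= rng.uniform(2, 6, size=200)
print(round(mdfc_factor(smp, ref), 2))        # ~2.0
```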


2021 ◽  
pp. 1-18
Author(s):  
Satish Kumar ◽  
Sunanda Gupta ◽  
Sakshi Arora

Network intrusion detection systems (NIDS) detect malicious and intrusive information in computer networks. Commercial NIDS now rely on machine learning approaches whose complex algorithms improve intrusion detection efficiency and efficacy. These machine learning-based NIDS operate on high-dimensional network traffic data, which must be preprocessed and normalized to make it suitable for machine learning tools; a machine learning approach with appropriate normalization and preprocessing performs measurably better. This paper presents an empirical study of various normalization methods applied to a benchmark network traffic dataset, KDD Cup'99, used to evaluate the NIDS model. The study shows that decimal normalization yields better prediction performance than non-normalized traffic data when classifying traffic into 'normal' and 'intrusive' classes.
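
For readers unfamiliar with the best-performing method, decimal normalization divides each attribute by a power of ten so that every scaled value has absolute magnitude below one. A minimal sketch, assuming column-wise application to a numeric feature matrix (the paper's exact preprocessing is not given):

```python
import numpy as np

def decimal_normalize(X):
    """Scale each column of X into (-1, 1) by the smallest power of ten."""
    X = np.asarray(X, dtype=float)
    max_abs = np.abs(X).max(axis=0)
    j = np.floor(np.log10(np.maximum(max_abs, 1e-12))) + 1  # digits per column
    return X / (10.0 ** j)

X = np.array([[491.0, 0.12],
              [1024.0, 0.98],
              [58.0, 0.05]])
print(decimal_normalize(X))   # columns scaled by 10^4 and 10^0
```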


2021 ◽  
Author(s):  
Jeroen Langeveld ◽  
Remy Schilperoort ◽  
Leo Heijnen ◽  
Goffe Elsinga ◽  
Claudia Schapendonk ◽  
...  

Over the course of the COVID-19 pandemic in 2020-2021, monitoring of SARS-CoV-2 RNA in wastewater has rapidly evolved into a supplementary surveillance instrument for public health. Short-term trends (two weeks) are used as a basis for policy and decision making on measures for dealing with the pandemic. Normalization is required to account for the varying dilution of the domestic wastewater that contains the shed viral RNA; the dilution rate varies with runoff, industrial discharges, and extraneous waters. Three normalization methods, based on flow, conductivity, and CrAssphage, were investigated at nine monitoring locations between September 2020 and August 2021, yielding 1071 24-hour flow-proportional samples. In addition, 221 stool samples were analyzed to determine the daily CrAssphage load per person. The results show that flow normalization, supported by a quality check using conductivity monitoring, is the advocated normalization method wherever flow monitoring is available or can be made available. Although CrAssphage shedding rates per person vary greatly, CrAssphage loads were very consistent over time and space, and direct CrAssphage-based normalization can be applied reliably for populations of 5600 and above.
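
The flow-normalization idea can be sketched as follows: multiply the measured RNA concentration by the daily flow to obtain a load, then divide by the catchment population so that locations of different size are comparable. The variable names, units, and the conductivity plausibility check below are illustrative assumptions, not the authors' procedure.

```python
def normalized_load(conc_gc_per_ml, flow_m3_per_day, population,
                    conductivity_mS_cm=None, expected_range=(0.5, 2.5)):
    """Viral load in gene copies per person per day."""
    gc_per_m3 = conc_gc_per_ml * 1e6       # 1 m3 = 1e6 mL
    load = gc_per_m3 * flow_m3_per_day     # gene copies per day
    if conductivity_mS_cm is not None:     # quality check on the flow record
        lo, hi = expected_range
        if not lo <= conductivity_mS_cm <= hi:
            print("warning: conductivity outside expected range; "
                  "flow measurement may be unreliable")
    return load / population

# hypothetical sample: 1.2e3 gc/mL, 40,000 m3/day, 150,000 inhabitants
print(f"{normalized_load(1.2e3, 40_000, 150_000, conductivity_mS_cm=1.1):.2e}")
```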


Energies ◽  
2021 ◽  
Vol 14 (22) ◽  
pp. 7470
Author(s):  
Gabriel Zsembinszki ◽  
Boniface Dominick Mselle ◽  
David Vérez ◽  
Emiliano Borri ◽  
Andreas Strehlow ◽  
...  

A clear gap was identified in the literature regarding the in-depth evaluation of scaling up thermal energy storage components. To cover this gap, a new methodological approach was developed and applied to a novel latent thermal energy storage module. The purpose of this paper is to identify key aspects to consider when scaling the module up from lab-scale to full-scale, using different performance indicators calculated for both charge and discharge. Different normalization methods were applied to allow an appropriate comparison of the results at both scales. As a result of the scaling up, and using mass and volume normalization respectively, the theoretical energy storage capacity increases by 52% and 145%, the average charging power increases by 21% and 94%, and the average discharging power decreases by 16% but increases by 36%. When normalization by the heat transfer surface area is used, all of the above performance indicators decrease, especially the average discharging power, which drops by 49%. Moreover, the energy performance in charge and discharge decreases by 17% and 15%, respectively. However, the charging, discharging, and round-trip efficiencies are practically unaffected by the scaling up.
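
As a rough illustration of how such comparisons work, each raw indicator is divided by the chosen basis quantity (mass, volume, or heat transfer area) before computing the lab-to-full-scale change. The masses and powers below are placeholders chosen only to reproduce the reported +21% mass-normalized charging power change.

```python
def pct_change(lab, full):
    """Percent change from lab-scale to full-scale."""
    return 100.0 * (full - lab) / lab

def normalized_change(indicator_lab, indicator_full, basis_lab, basis_full):
    """Percent change of an indicator normalized by a chosen basis
    (e.g., mass in kg, volume in m3, heat transfer area in m2)."""
    return pct_change(indicator_lab / basis_lab, indicator_full / basis_full)

# hypothetical average charging powers (kW) and module masses (kg)
print(f"{normalized_change(1.0, 9.7, 8.0, 64.0):+.0f}% per kg")   # +21%
```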


2021 ◽  
Author(s):  
Jennifer L. Mamrosh ◽  
Jing Li ◽  
David J. Sherman ◽  
Annie Moradian ◽  
Michael J. Sweredoski ◽  
...  

Protein degradation products are constitutively presented as peptide antigens by MHC Class I. While hypervariability of Class I genes is known to tremendously impact antigen presentation, whether differential function of protein degradation pathways (comprising >1000 genes) could alter antigen generation remains poorly understood apart from a few model substrates. In this study, we introduce normalization methods for quantitative antigen mass spectrometry and confirm that most Class I antigens are dependent on ubiquitination and proteasomal degradation. Remarkably, many antigens derived from mitochondrial inner membrane proteins are not. Additionally, we find that atypical antigens can arise from compensatory protein degradation pathways, such as an increase in mitochondrial and membrane protein antigen presentation upon proteasome inhibition. Notably, incomplete inhibition of protein degradation pathways may have clinical utility in cancer immunotherapy, as evidenced by appearance of novel antigens upon partial proteasome inhibition.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12233
Author(s):  
Diem-Trang Tran ◽  
Matthew Might

Normalization of RNA-seq data has been an active area of research since the problem was first recognized a decade ago. Despite the active development of new normalizers, their performance measures have received little attention. To evaluate normalizers, researchers have relied on ad hoc measures, most of which are qualitative, potentially biased, or easily confounded by the parametric choices made in downstream analysis. We propose a metric called condition-number based deviation, or cdev, to quantify normalization success. cdev measures how much one expression matrix differs from another; if a ground-truth normalization is given, cdev can then be used to evaluate the performance of normalizers. To establish an experimental ground truth, we compiled an extensive set of public RNA-seq assays with external spike-ins. This data collection, together with cdev, provides a valuable toolset for benchmarking new and existing normalization methods.
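
The abstract defines cdev only informally, so the sketch below is an illustrative assumption rather than the paper's formula: fit the linear map that best sends one expression matrix onto another and report its condition number, which is 1 for a pure rescaling and grows as the two matrices diverge.

```python
import numpy as np

def cdev_sketch(A, B):
    """Condition number of the least-squares map T with A @ T ~= B.

    NOTE: an assumed stand-in for cdev, not the published definition.
    """
    T, *_ = np.linalg.lstsq(A, B, rcond=None)
    return np.linalg.cond(T)

rng = np.random.default_rng(1)
X = rng.lognormal(5, 1, size=(200, 10))                 # genes x samples
print(round(cdev_sketch(X, 3.0 * X), 2))                # pure rescaling -> 1.0
print(round(cdev_sketch(X, X * rng.uniform(0.5, 2, size=10)), 2))  # > 1.0
```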


Author(s):  
John Graf ◽  
Sanghee Cho ◽  
Elizabeth McDonough ◽  
Alex Corwin ◽  
Anup Sood ◽  
...  

Motivation: Multiplexed immunofluorescence bioimaging of single cells and their spatial organization in tissue holds great promise for the development of future precision diagnostics and therapeutics. Current multiplexing pipelines typically involve multiple rounds of immunofluorescence staining across multiple tissue slides, which introduces experimental batch effects that can hide the underlying biological signal. Robust algorithms are needed that correct for batch effects without introducing biases into the data. Because the performance of data normalization methods can vary among assay pipelines, evaluating them requires a ground truth dataset representative of the assay.

Results: A new immunoFLuorescence Image NOrmalization (FLINO) method is presented and evaluated against alternative methods and workflows. Multi-round staining of the same tissue with the nuclear dye DAPI was used to provide virtual slides and a ground truth: DAPI was re-stained on a given tissue slide, producing multiple images of the same underlying structure that had each undergone representative tissue-handling steps. This ground truth dataset was used to evaluate and compare normalization methods including median, quantile, smooth quantile, median ratio normalization (MRN), and trimmed mean of M-values (TMM), applied in both an unbiased grid-object and a segmented cell-object workflow to 24 multiplexed biomarkers. Upper-quartile normalization of grid objects in log space achieved nearly the same performance as directly normalizing segmented cell objects by the middle quantile. The grid-based technique was then evaluated with on-slide controls: five or fewer controls per slide can introduce biases into the data, whereas ten or more robustly corrected for batch effects.

Supplementary information: Supplementary data are available at Bioinformatics online.
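
A minimal sketch of the best-performing grid-object workflow named above (the tile size, simulated images, and use of NumPy are assumptions; the FLINO implementation itself is not shown in the abstract):

```python
import numpy as np

def grid_object_means(img, tile=64):
    """Mean intensity of each tile-by-tile grid object."""
    h, w = (img.shape[0] // tile) * tile, (img.shape[1] // tile) * tile
    tiles = img[:h, :w].reshape(h // tile, tile, w // tile, tile)
    return tiles.mean(axis=(1, 3)).ravel()

def upper_quartile_normalize(img, tile=64):
    """Divide a slide image by the upper quartile of its grid-object
    means, computed in log space as the abstract describes (for a pure
    quantile this matches the linear-space value, since log is monotone)."""
    log_means = np.log1p(grid_object_means(img, tile))
    uq = np.expm1(np.quantile(log_means, 0.75))
    return img / uq

rng = np.random.default_rng(2)
slide_a = rng.gamma(2.0, 50.0, size=(512, 512))
slide_b = 1.8 * slide_a                      # simulated staining batch effect
a = upper_quartile_normalize(slide_a)
b = upper_quartile_normalize(slide_b)
print(round(b.mean() / a.mean(), 2))         # ~1.0 after normalization
```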


2021 ◽  
Author(s):  
Alvin Subakti ◽  
Hendri Murfi ◽  
Nora Hariadi

Text clustering is the task of grouping a set of texts so that texts in the same group are more similar to one another than to those in other groups. Grouping texts manually requires significant time and labor, so automation using machine learning is necessary. The standard method for representing textual data is Term Frequency-Inverse Document Frequency (TFIDF); however, TFIDF cannot consider the position and context of a word in a sentence. The Bidirectional Encoder Representations from Transformers (BERT) model produces text representations that incorporate both. This research analyzes the performance of the BERT model as a data representation for text, with various feature extraction and normalization methods applied on top of it. To examine the performance of BERT, we use four clustering algorithms: k-means clustering, eigenspace-based fuzzy c-means, deep embedded clustering, and improved deep embedded clustering. Our simulations show that BERT outperforms the standard TFIDF method in 28 out of 36 metrics. Furthermore, different feature extraction and normalization methods produce varied performance, so their choice should be adapted to the text clustering algorithm used.
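
A minimal sketch of such a pipeline, assuming the Hugging Face transformers library and scikit-learn (the paper does not specify its tooling); mean pooling and L2 normalization stand in for the feature extraction and normalization variants the study compares.

```python
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

texts = ["the cat sat on the mat", "dogs are loyal pets",
         "stocks fell sharply today", "markets rallied after the report"]

with torch.no_grad():
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state        # (n, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)     # ignore padding tokens
    emb = (hidden * mask).sum(1) / mask.sum(1)       # mean pooling
emb = emb / emb.norm(dim=1, keepdim=True)            # L2 normalization

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb.numpy())
print(labels)   # likely separates the animal texts from the finance texts
```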

