EVOLUTION OF INFLUENZA A NUCLEOTIDE SEGMENTS THROUGH THE LENS OF DIFFERENT COMPLEXITY MEASURES
The evolution of influenza viruses is a highly complex process that is still poorly understood. Multiyear persistence of similar variants and accumulating evidence for the existence of multigenic traits indicate that influenza viruses operate as integrated units and not only as sets of distinct genes. However, there is still no consensus on whether this is the case, and to what extent. One of the main problems is the lack of a framework for analyzing and interpreting the large body of available high-dimensional genomic, clinical and epidemiological data. By reducing the dimensionality of the data, we aim to show whether, in addition to gene-centric selective pressure, the evolution of influenza RNA segments is also shaped by their mutual interactions. To this end, we analyze how different complexity/entropy measures (Shannon entropy, topological entropy and Lempel–Ziv complexity) can be used to study the evolution of nucleotide segments of different influenza subtypes while reducing data dimensionality. We show that, at the nucleotide level, multiyear clusters of genome-wide entropy/complexity correlations emerged during the H1N1 pandemic in 2009. Our data are the first empirical results that indirectly support the suggestion that a component of influenza evolutionary dynamics involves correlation between RNA segments. Of all the complexity/entropy measures used, Shannon entropy shows the best correlation with epidemiological data.
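As an illustration of how a nucleotide segment can be reduced to a single number, the following minimal sketch computes two of the measures named above. It is not the authors' pipeline: the function names, the k-mer length `k`, and the choice of the 1976 Lempel–Ziv phrase-counting variant are assumptions made here for illustration, and topological entropy is omitted.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the k-mer frequency distribution of seq.

    The k-mer length is an illustrative assumption; k=1 uses single nucleotides.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    # H = sum over k-mers of p * log2(1 / p)
    return sum((c / total) * log2(total / c) for c in counts.values())


def lz76_complexity(seq: str) -> int:
    """Number of phrases in the Lempel-Ziv (1976) exhaustive parsing of seq.

    Each phrase is the shortest substring starting at the current position
    that has not already occurred earlier (overlap with the phrase allowed).
    """
    i, phrases, n = 0, 0, len(seq)
    while i < n:
        length = 1
        # Extend the phrase while it can still be copied from earlier text.
        while i + length <= n and seq[i:i + length] in seq[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


if __name__ == "__main__":
    # Hypothetical toy fragment, not real influenza data.
    segment = "ATGGAGAGAATAAAAGAACTGAGAGATCTAATG"
    print(shannon_entropy(segment), lz76_complexity(segment))
```

Applied to each of the eight segments of many genomes, such functions would yield one value per segment per genome, and those values could then be correlated across segments and over time. Raw phrase counts grow with sequence length, so comparing segments of different lengths would presumably require some normalization, which this sketch does not attempt.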