Searching for the shadows of giants: characterizing protoclusters with line of sight Lyman-α absorption

2019 ◽  
Vol 489 (4) ◽  
pp. 5381-5397 ◽  
Author(s):  
Joel S A Miller ◽  
James S Bolton ◽  
Nina Hatch

ABSTRACT We use state-of-the-art hydrodynamical simulations from the Sherwood, EAGLE, and Illustris projects to examine the signature of $M_{z=0} \simeq 10^{14}\,\mathrm{M}_\odot$ protoclusters observed in Ly α absorption at z ≃ 2.4. We find that there is a weak correlation between the mass overdensity, δm, and the Ly α effective optical depth relative to the mean, $\delta _{\tau _\textrm{eff}}$, averaged over $15~h^{-1}\, \textrm{cMpc}$ scales, although scatter in the δm–$\delta _{\tau _\textrm{eff}}$ plane means it is not possible to uniquely identify large-scale overdensities with strong Ly α absorption. Although all protoclusters are associated with large-scale mass overdensities, most sightlines through protoclusters in a $\sim 10^{6}\,\rm cMpc^{3}$ volume probe the low column density Ly α forest. A small subset of sightlines that pass through protoclusters exhibits coherent, strong Ly α absorption on $15~h^{-1}\,\rm cMpc$ scales, although these correspond to a wide range in mass overdensity. Assuming perfect removal of contamination by Ly α absorbers with damping wings, more than half of the remaining sightlines with $\delta _{\tau _{\rm eff}}\gt 3.5$ trace protoclusters. It is furthermore possible to identify a model-dependent $\delta _{\tau _{\rm eff}}$ threshold that selects only protoclusters. However, such regions are rare: excluding absorption caused by damped systems, less than 0.1 per cent of sightlines that pass through a protocluster have $\delta _{\tau _{\rm eff}}\gt 3.5$, meaning that any protocluster sample selected in this manner will also be highly incomplete. On the other hand, coherent regions of Ly α absorption also provide a promising route for identifying and studying filamentary environments at high redshift.
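The statistic above can be illustrated with a short sketch: over each window, the effective optical depth is τ_eff = −ln⟨F⟩, where ⟨F⟩ is the mean transmitted flux, and one common convention for its fluctuation relative to the mean is δ_τeff = τ_eff/⟨τ_eff⟩ − 1. The mock skewer and all numbers below are illustrative stand-ins, not the paper's simulation pipeline:

```python
import numpy as np

def delta_tau_eff(flux, pixels_per_window, mean_tau_eff):
    """delta_tau_eff = tau_eff / <tau_eff> - 1 over non-overlapping
    windows of a normalized flux skewer, with tau_eff = -ln<F>."""
    n = (len(flux) // pixels_per_window) * pixels_per_window
    windows = flux[:n].reshape(-1, pixels_per_window)
    tau_eff = -np.log(windows.mean(axis=1))  # per-window effective optical depth
    return tau_eff / mean_tau_eff - 1.0

# Mock skewer with transmitted flux near the z ~ 2.4 mean <F> ~ 0.8;
# the window size stands in for the 15 cMpc/h averaging scale.
rng = np.random.default_rng(0)
flux = np.clip(rng.normal(0.8, 0.1, 1500), 1e-3, 1.0)
d = delta_tau_eff(flux, 100, -np.log(0.8))
```

Sightlines with unusually strong coherent absorption would then show up as windows with large positive values of `d`.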

Author(s):  
Marta B. Silva ◽  
Ely D. Kovetz ◽  
Garrett K. Keating ◽  
Azadeh Moradinezhad Dizgah ◽  
Matthieu Bethermin ◽  
...  

Abstract This paper outlines the science case for line-intensity mapping with a space-borne instrument targeting the sub-millimeter (microwaves) to the far-infrared (FIR) wavelength range. Our goal is to observe and characterize the large-scale structure in the Universe from present times to the high redshift Epoch of Reionization. This is essential to constrain the cosmology of our Universe and to develop a better understanding of the various mechanisms that drive galaxy formation and evolution. The proposed frequency range would make it possible to probe important metal cooling lines such as [CII] up to very high redshift as well as a large number of rotational lines of the CO molecule. These can be used to trace molecular gas and dust evolution and constrain the buildup in both the cosmic star formation rate density and the cosmic infrared background (CIB). Moreover, surveys at the highest frequencies will detect FIR lines which are used as diagnostics of galaxies and AGN. Tomography of these lines over a wide redshift range will enable invaluable measurements of the cosmic expansion history at epochs inaccessible to other methods, competitive constraints on the parameters of the standard model of cosmology, and numerous tests of dark matter, dark energy, modified gravity and inflation. To reach these goals, large-scale structure must be mapped over a wide range in frequency to trace its time evolution, and the surveyed area needs to be very large to beat cosmic variance. Only a space-borne mission can properly meet these requirements.


2020 ◽  
Vol 36 (10) ◽  
pp. 3011-3017 ◽  
Author(s):  
Olga Mineeva ◽  
Mateo Rojas-Carulla ◽  
Ruth E Ley ◽  
Bernhard Schölkopf ◽  
Nicholas D Youngblut

Abstract Motivation Methodological advances in metagenome assembly are rapidly increasing the number of published metagenome assemblies. However, identifying misassemblies is challenging due to a lack of closely related reference genomes that can act as pseudo ground truth. Existing reference-free methods are no longer maintained, can make strong assumptions that may not hold across a diversity of research projects, and have not been validated on large-scale metagenome assemblies. Results We present DeepMAsED, a deep learning approach for identifying misassembled contigs without the need for reference genomes. Moreover, we provide an in silico pipeline for generating large-scale, realistic metagenome assemblies for comprehensive model training and testing. DeepMAsED accuracy substantially exceeds the state of the art when applied to large and complex metagenome assemblies. Our model estimates a 1% contig misassembly rate in two recent large-scale metagenome assembly publications. Conclusions DeepMAsED accurately identifies misassemblies in metagenome-assembled contigs from a broad diversity of bacteria and archaea without the need for reference genomes or strong modeling assumptions. Running DeepMAsED is straightforward, as is model re-training with our dataset generation pipeline. Therefore, DeepMAsED is a flexible misassembly classifier that can be applied to a wide range of metagenome assembly projects. Availability and implementation DeepMAsED is available from GitHub at https://github.com/leylabmpi/DeepMAsED. Supplementary information Supplementary data are available at Bioinformatics online.
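As a purely illustrative sketch of the window-and-classify structure such a tool implies (not DeepMAsED's actual network, features, or trained weights), one can score sliding windows of per-position mapping features with a fixed toy logistic model and flag a contig if any window looks suspect:

```python
import numpy as np

def window_features(coverage, discordant, size=100, step=50):
    # Summarize each sliding window by mean coverage, coverage spread,
    # and discordant-pair rate (toy feature set, not DeepMAsED's).
    feats = []
    for start in range(0, len(coverage) - size + 1, step):
        c = coverage[start:start + size]
        d = discordant[start:start + size]
        feats.append([c.mean(), c.std(), d.mean()])
    return np.array(feats)

def misassembly_score(feats, w=np.array([-0.05, 0.1, 5.0]), b=-1.0):
    # Fixed toy logistic scorer; a real tool would learn these weights.
    z = feats @ w + b
    return 1.0 / (1.0 + np.exp(-z))  # per-window probability in (0, 1)

rng = np.random.default_rng(1)
cov = rng.poisson(30, 1000).astype(float)      # mock per-base coverage
disc = rng.binomial(1, 0.02, 1000).astype(float)  # mock discordant flags
scores = misassembly_score(window_features(cov, disc))
contig_flag = scores.max() > 0.5  # flag the contig on its worst window
```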


Author(s):  
Hengyi Cai ◽  
Hongshen Chen ◽  
Yonghao Song ◽  
Xiaofang Zhao ◽  
Dawei Yin

Humans benefit from previous experiences when taking actions. Similarly, related examples from the training data also provide exemplary information for neural dialogue models when responding to a given input message. However, effectively fusing such exemplary information into dialogue generation is non-trivial: useful exemplars must be not only literally similar to, but also topically related to, the given context. Noisy exemplars impair the neural dialogue models' understanding of the conversation topics and can even corrupt response generation. To address these issues, we propose an exemplar-guided neural dialogue generation model in which exemplar responses are retrieved in terms of both text similarity and topic proximity through a two-stage exemplar retrieval model. In the first stage, a small subset of conversations is retrieved from the training set given a dialogue context. These candidate exemplars are then finely ranked by topical proximity to choose the best-matched exemplar response. To further induce the neural dialogue generation model to consult the exemplar response and the conversation topics more faithfully, we introduce a multi-source sampling mechanism that provides the dialogue model with both local exemplary semantics and global topical guidance during decoding. Empirical evaluations on a large-scale conversation dataset show that the proposed approach significantly outperforms the state of the art in terms of both quantitative metrics and human evaluations.
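The two-stage retrieval described above can be sketched as follows; the scoring functions, toy contexts, and topic vectors are illustrative stand-ins, not the paper's model:

```python
import numpy as np

def lexical_score(query, doc):
    # Stage-1 stand-in for text similarity: Jaccard overlap of tokens.
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / max(len(q | d), 1)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_exemplar(query, query_topic, contexts, topics, k=3):
    # Stage 1: coarse shortlist of k candidates by lexical similarity.
    shortlist = sorted(range(len(contexts)),
                       key=lambda i: lexical_score(query, contexts[i]),
                       reverse=True)[:k]
    # Stage 2: fine re-ranking of the shortlist by topic proximity.
    return max(shortlist, key=lambda i: cosine(query_topic, topics[i]))

contexts = ["how do i train a model", "what is the weather today",
            "how do i tune a model", "recommend a good movie"]
topics = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.2, 0.8]])
best = retrieve_exemplar("how to train my model", np.array([1.0, 0.0]),
                         contexts, topics)
```

In a real system the shortlist would come from a retrieval index and the topic vectors from a learned topic model, but the coarse-then-fine ranking structure is the same.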


2021 ◽  
Author(s):  
Grigory Zasko ◽  
Andrey Glazunov ◽  
Evgeny Mortikov ◽  
Yuri Nechepurenko ◽  
Pavel Perezhogin

In this report, we try to explain the emergence of large-scale organized structures in stably stratified turbulent flows using optimal disturbances of the mean turbulent flow. These structures have recently been obtained in numerical simulations of stably stratified turbulent flows [1] (Ekman layer, LES) and [2] (plane Couette flow, DNS and LES) and indirectly confirmed by field measurements in the stable boundary layer of the atmosphere [1, 2]. In instantaneous temperature fields they manifest themselves as irregular inclined thin layers with large gradients (fronts), spaced from each other by distances comparable to the height of the entire turbulent layer, and separated by regions with weak stratification.

Optimal disturbances of a stably stratified turbulent plane Couette flow are investigated over a wide range of Reynolds and Richardson numbers. These disturbances were computed from a simplified linearized system of equations in which the turbulent Reynolds stresses and heat fluxes were approximated by isotropic viscosity and diffusion with coefficients obtained from DNS results. It was shown [3] that the spatial scales and configurations of the inclined structures extracted from DNS data coincide with those obtained from optimal disturbances of the mean turbulent flow.

A critical value of the stability parameter is found, above which the optimal disturbances resemble the inclined structures. The physical mechanisms that determine the evolution, energetics, and spatial configuration of these optimal disturbances are discussed, and the effects due to the presence of stable stratification are highlighted.

Numerical experiments with optimal disturbances were supported by the RSF (grant No. 17-71-20149). Direct numerical simulation of stratified turbulent Couette flow was supported by the RFBR (grant No. 20-05-00776).

References:

[1] P.P. Sullivan, J.C. Weil, E.G. Patton, H.J. Jonker, D.V. Mironov. Turbulent winds and temperature fronts in large-eddy simulations of the stable atmospheric boundary layer // J. Atmos. Sci., 2016, V. 73, P. 1815-1840.

[2] A.V. Glazunov, E.V. Mortikov, K.V. Barskov, E.V. Kadantsev, S.S. Zilitinkevich. Layered structure of stably stratified turbulent shear flows // Izv. Atmos. Ocean. Phys., 2019, V. 55, P. 312-323.

[3] G.V. Zasko, A.V. Glazunov, E.V. Mortikov, Yu.M. Nechepurenko. Large-scale structures in stratified turbulent Couette flow and optimal disturbances // Russ. J. Num. Anal. Math. Model., 2020, V. 35, P. 35-53.
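For a generic linearized system dx/dt = Ax, optimal disturbances and their maximum energy growth at a fixed horizon can be read off the singular value decomposition of the propagator; the sketch below uses a toy non-normal operator, not the actual stratified Couette operator or the paper's numerical method:

```python
import numpy as np
from scipy.linalg import expm, svd

# For dx/dt = A x, the optimal disturbance at horizon T is the leading
# right singular vector of the propagator expm(A*T), and the maximum
# energy amplification is the leading singular value squared.
A = np.array([[-0.1, 1.0],    # toy non-normal operator: both eigenvalues
              [0.0, -0.2]])   # are stable, yet transient growth occurs
T = 5.0
U, s, Vt = svd(expm(A * T))
optimal_growth = s[0] ** 2    # max energy amplification at time T
optimal_init = Vt[0]          # optimal initial disturbance (unit norm)
```

Even with stable eigenvalues, the non-normal coupling yields `optimal_growth` well above 1, which is the generic mechanism by which linear optimal disturbances can seed large-amplitude structures.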


2019 ◽  
Vol 11 (9) ◽  
pp. 190 ◽  
Author(s):  
Jamal ◽  
Xianqiao ◽  
Aldabbas

Emotion detection in social media is an effective way to measure the mood of people about a specific topic, news item, or product. It has a wide range of applications, including identifying psychological conditions such as anxiety or depression in users. However, it is a challenging task to distinguish useful emotion features in a large corpus of text because emotions are subjective, with fuzzy boundaries, and may be expressed in different terminologies and perceptions. To tackle this issue, this paper presents a hybrid deep learning approach based on TensorFlow with Keras for emotion detection on a large scale of imbalanced tweet data. First, preprocessing steps are used to extract useful features from raw tweets and remove noisy data. Second, an entropy weighting method is used to compute the importance of each feature. Third, a class balancer is applied to balance each class. Fourth, Principal Component Analysis (PCA) is applied to transform highly correlated features into normalized forms. Finally, TensorFlow-based deep learning with Keras is used to predict high-quality features for emotion classification. The proposed methodology is evaluated on a dataset of 1,600,000 tweets collected from Kaggle. The proposed approach is compared with other state-of-the-art techniques at different training ratios and is shown to outperform them.
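Two of the middle steps above, entropy weighting and PCA, can be sketched with toy data as follows; the deep model itself and the real tweet features are not reproduced here:

```python
import numpy as np

def entropy_weights(X):
    # X must be strictly positive: each column is normalized into a
    # distribution, its entropy computed, and features weighted by
    # (1 - normalized entropy), renormalized to sum to 1. Low-entropy
    # (more discriminative) features receive larger weights.
    P = X / X.sum(axis=0, keepdims=True)
    e = -(P * np.log(P)).sum(axis=0) / np.log(len(X))
    w = (1.0 - e) / (1.0 - e).sum()
    return w

def pca(X, n_components):
    # Project centred data onto its leading principal directions.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(2)
X = rng.random((200, 10)) + 0.01   # toy non-negative feature matrix
Xw = X * entropy_weights(X)        # importance-weighted features
Z = pca(Xw, 3)                     # decorrelated, reduced representation
```

The reduced matrix `Z` would then feed the downstream classifier.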


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Sebastian Spänig ◽  
Siba Mohsen ◽  
Georges Hattab ◽  
Anne-Christin Hauschild ◽  
Dominik Heider

Abstract Owing to the great variety of distinct peptide encodings, choosing one for a biomedical classification task at hand is challenging. Researchers have to determine encodings capable of representing underlying patterns as numerical input for the subsequent machine learning. A general guideline is lacking in the literature; thus, we present here the first large-scale comprehensive study to investigate the performance of a wide range of encodings on multiple datasets from different biomedical domains. For the sake of completeness, we added additional sequence- and structure-based encodings. In particular, we collected 50 biomedical datasets and defined a fixed parameter space for 48 encoding groups, leading to a total of 397,700 encoded datasets. Our results demonstrate that none of the encodings is superior across all biomedical domains. Nevertheless, some encodings often outperform others, thus substantially reducing the initial encoding selection. Our work enables researchers to objectively compare novel encodings with the state of the art. Our findings pave the way for more sophisticated encoding optimization, for example, as part of automated machine learning pipelines. The work presented here is implemented as a large-scale, end-to-end workflow designed for easy reproducibility and extensibility. All standardized datasets and results are available for download to comply with FAIR standards.


Polymers ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 1310 ◽  
Author(s):  
Liang Wang ◽  
Yu-Ke Wu ◽  
Fang-Fang Ai ◽  
Jie Fan ◽  
Zhao-Peng Xia ◽  
...  

Porous polymer materials have received great interest in both academic and industrial fields due to their wide range of applications. In this work, a porous polyamide 6 (PA6) material was prepared by a facile solution-foaming strategy. In this approach, a sodium carbonate (SC) aqueous solution acted as the foaming agent that reacted with formic acid (FA), generating CO2 and causing phase separation of the polyamide (PA). The influences of the PA/FA solution concentration and the Na2CO3 concentration on the microstructures and physical properties of the prepared PA foams were investigated. The PA foams showed a hierarchical porous structure along the foaming direction, with mean pore dimensions ranging from hundreds of nanometers to several microns. Low amounts of the sodium salt generated by the neutralization reaction played an important role in heterogeneous nucleation, which increased the degree of crystallinity of the PA foams. The porous PA materials exhibited low thermal conductivity, high crystallinity, and good mechanical properties. The novel strategy in this work could produce PA foams on a large scale for potential engineering applications.


2014 ◽  
Vol 11 (S308) ◽  
pp. 161-166
Author(s):  
Maret Einasto

Abstract We study the cosmic web at redshifts 1.0 ≤ z ≤ 1.8 using quasar systems based on quasar data from the SDSS DR7 QSO catalogue. Quasar systems were determined with a friend-of-friend (FoF) algorithm at a series of linking lengths. At linking lengths l ≤ 30 h⁻¹ Mpc the diameters of quasar systems are smaller than the diameters of random systems and are comparable to the sizes of galaxy superclusters in the local Universe. The mean space density of quasar systems is close to the mean space density of local rich superclusters. At larger linking lengths the diameters of quasar systems are comparable with the sizes of supercluster complexes in our cosmic neighbourhood. The richest quasar systems have diameters exceeding 500 h⁻¹ Mpc. Very rich systems can also be found in random distributions, but the percolating system that penetrates the whole sample volume appears in the quasar sample at a smaller linking length than in the random samples, showing that the large-scale distribution of quasar systems differs from a random distribution. The quasar system catalogues at our web pages (http://www.aai.ee/maret/QSOsystems.html) serve as a database to search for superclusters of galaxies and to trace the cosmic web at high redshifts.
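A friends-of-friends system finder of the kind used above can be sketched as follows (a generic implementation, not the catalogue code): any two points closer than the linking length are "friends", and systems are the connected components of the resulting graph:

```python
import numpy as np
from scipy.spatial import cKDTree

def fof(points, linking_length):
    # Find all pairs within the linking length via a k-d tree, then
    # merge them into connected components with union-find.
    pairs = cKDTree(points).query_pairs(linking_length)
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    for i, j in pairs:
        parent[find(i)] = find(j)
    labels = np.array([find(i) for i in range(len(points))])
    _, labels = np.unique(labels, return_inverse=True)
    return labels  # system id per point, relabelled 0..n_systems-1

# Toy 2D positions: three close points form one system, two form another.
pts = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
                [10.0, 0.0], [10.4, 0.0]])
labels = fof(pts, 0.6)
```

Raising the linking length merges systems, which is exactly the behaviour probed by the series of linking lengths in the study.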


1980 ◽  
Vol 102 (2) ◽  
pp. 334-342 ◽  
Author(s):  
S. Fleeter ◽  
W. A. Bennett ◽  
R. L. Jay

An experimental investigation was conducted to quantitatively determine the validity and applicability of state-of-the-art transverse gust cascade analyses. This was accomplished by obtaining fundamental time-variant forced response data at realistic values of key parameters in a large-scale, low-speed, single-stage research compressor. The forcing function, the velocity defect created by the rotor blade wakes, was measured with a crossed hot-wire probe. The resulting time-variant aerodynamic response was measured by means of flush mounted high response pressure transducers on both flat plate and cambered airfoil stator vane rows over a wide range of incidence angles. These dynamic data were then analyzed to determine the chordwise variation of the unsteady pressure difference in terms of a dimensionless dynamic pressure coefficient and an aerodynamic phase lag referenced to a transverse gust at the leading edge of the vanes. These dimensionless pressure difference data were all correlated with predictions obtained from a state-of-the-art compressible transverse gust, flat plate cascade analysis. Correlation of the classical flat plate unsteady data with the predictions permitted the range of validity of the analysis to be assessed in terms of incidence angle. Correlation of the cambered vane unsteady data with those for the flat plate and with the predictions allowed the effects of airfoil camber as well as the applicability of the flat plate prediction to realistic cambered airfoils to be quantitatively determined.


2020 ◽  
Vol 643 ◽  
pp. A36 ◽  
Author(s):  
E. Bellomi ◽  
B. Godard ◽  
P. Hennebelle ◽  
V. Valdivia ◽  
G. Pineau des Forêts ◽  
...  

Context. The amount of data collected by spectrometers from radio to ultraviolet (UV) wavelengths opens a new era where the statistical and chemical information contained in the observations can be used concomitantly to investigate the thermodynamical state and the evolution of the interstellar medium (ISM). Aims. In this paper, we study the statistical properties of the HI-to-H2 transition observed in absorption in the local diffuse and multiphase ISM. Our goal is to identify the physical processes that control the probability of occurrence of any line of sight and the origins of the variations of the integrated molecular fraction from one line of sight to another. Methods. The turbulent diffuse ISM is modeled using the RAMSES code, which includes detailed treatments of the magnetohydrodynamics, the thermal evolution of the gas, and the chemistry of H2. The impacts of the UV radiation field, the mean density, the turbulent forcing, the integral scale, the magnetic field, and the gravity on the molecular content of the gas are explored through a parametric study that covers a wide range of physical conditions. The statistics of the HI-to-H2 transition are interpreted through analytical prescriptions and compared with the observations using a modified and robust version of the Kolmogorov-Smirnov test. Results. The analysis of the observed background sources shows that the lengths of the lines of sight follow a flat distribution in logarithmic scale from ~100 pc to ~3 kpc. Without taking into account any variation of the parameters along a line of sight or from one line of sight to another, the results of one simulation, convolved with the distribution of distances of the observational sample, are able to simultaneously explain the position, the width, the dispersion, and most of the statistical properties of the HI-to-H2 transition observed in the local ISM. 
The tightest agreement is obtained for a neutral diffuse gas modeled over ~200 pc, with a mean density n̄_H = 1–2 cm⁻³, illuminated by the standard interstellar UV radiation field, and stirred up by a large-scale compressive turbulent forcing. Within this configuration, the 2D probability histogram of the column densities of H and H2, poetically called the kingfisher diagram, is remarkably stable and is almost unaltered by gravity, the strength of the turbulent forcing, the resolution of the simulation, or the strength of the magnetic field B_x, as long as B_x < 4 μG. The weak effect of the resolution and our analytical prescription suggest that the column densities of HI are likely built up in large-scale warm neutral medium and cold neutral medium (CNM) structures correlated in density over ~20 pc and ~10 pc, respectively, while those of H2 are built up in CNM structures between ~3 and ~10 pc. Conclusions. Combining the chemical and statistical information contained in the observations of HI and H2 sheds new light on the study of the diffuse matter. Applying this new tool to several atomic and molecular species is a promising avenue for understanding the effects of turbulence, magnetic field, thermal instability, and gravity on the formation and evolution of molecular clouds.
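The simulation-to-observation comparison step can be illustrated with the standard two-sample Kolmogorov-Smirnov test (the paper uses a modified, robust variant of it); the samples below are mock stand-ins, not the paper's data:

```python
import numpy as np
from scipy.stats import ks_2samp

# Compare a simulated distribution of log column densities against a
# smaller "observed" sample; a small KS statistic (and large p-value)
# indicates the two samples are consistent with one distribution.
rng = np.random.default_rng(3)
sim = rng.normal(20.5, 0.5, 500)   # mock log10 N(HI) from a simulation
obs = rng.normal(20.6, 0.5, 120)   # mock observed sightlines
stat, pvalue = ks_2samp(sim, obs)
```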

