An ensemble learning approach to auto-annotation for whole-brain C. elegans imaging

2017
Author(s):
S. Wu
Y. Toyoshima
M.S. Jang
M. Kanamori
T. Teramoto
...  

Abstract: Shifting from individual-neuron analysis to whole-brain neural network analysis opens up new research opportunities for Caenorhabditis elegans (C. elegans). An automated data-processing pipeline, including neuron detection, segmentation, tracking, and annotation, will significantly improve the efficiency of analyzing whole-brain C. elegans imaging. The resulting large data sets may motivate new scientific discoveries by exploiting the many promising analysis tools for big data. In this study, we focus on the development of an automated annotation procedure. Although the central nervous system of C. elegans contains only around 180 neurons, annotating each individual neuron remains a major challenge because of the neurons' high spatial density, their similarity in shape, unpredictable distortion of the worm's head during motion, intrinsic variations during worm development, and other factors. We use an ensemble learning approach to achieve around 25% error on a test based on real experimental data. We also demonstrate the importance of exploring sources of information for annotation beyond neuron positions.
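The abstract does not specify the ensemble's components, so the following is only a minimal sketch of a soft-voting ensemble over per-neuron feature vectors; the features, base learners, and label set are hypothetical stand-ins, not the authors' method.

```python
# Minimal sketch of a soft-voting ensemble annotator. The feature layout
# (x, y, z position plus one extra feature such as intensity) and the base
# learners are illustrative assumptions, not the paper's actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))               # one row per detected neuron (toy)
y = rng.integers(0, 180, size=1000)          # one of ~180 candidate names (toy)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",                           # average predicted class probabilities
)
ensemble.fit(X_train, y_train)
print("held-out error:", 1 - ensemble.score(X_test, y_test))
```

On random toy data the error figure is meaningless; the point is only the structure: several weak annotators vote, and disagreement among them can also flag neurons whose identity deserves manual review.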

2017
Vol 12 (7)
pp. 851-855
Author(s):
Louis Passfield
James G. Hopker

This paper explores the notion that the availability and analysis of large data sets can improve practice and change the nature of science in the sport and exercise setting. The increasing use of data and information technology in sport is giving rise to this change. Web sites hold large data repositories, and the development of wearable technology, mobile phone applications, and related instruments for monitoring physical activity, training, and competition provides large data sets of extensive and detailed measurements. Innovative approaches conceived to more fully exploit these large data sets could provide a basis for more objective evaluation of coaching strategies and new approaches to how science is conducted. An emerging discipline, sports analytics, could help overcome some of the challenges involved in obtaining knowledge and wisdom from these large data sets. Examples in which large data sets have been analyzed to evaluate the career development of elite cyclists and to characterize and optimize the training load of well-trained runners are discussed. Careful verification of large data sets is time consuming but imperative before useful conclusions can be drawn. Consequently, it is recommended that prospective studies be preferred over retrospective analyses of data. It is concluded that rigorous analysis of large data sets could enhance our knowledge in the sport and exercise sciences, inform competitive strategies, and allow innovative new research and findings.
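The verification step the authors call imperative can be partly automated. A minimal sketch, assuming a pandas DataFrame of wearable-device sessions (the file name, columns, and physiological bounds are all hypothetical):

```python
# Hypothetical sanity checks on a training log before any analysis; the file
# name, column names, and thresholds are illustrative assumptions.
import pandas as pd

df = pd.read_csv("training_log.csv")

checks = {
    "duplicate sessions":     df.duplicated(subset=["athlete_id", "date"]),
    "negative durations":     df["duration_min"] < 0,
    "implausible heart rate": ~df["mean_hr"].between(30, 220),
    "missing power data":     df["mean_power_w"].isna(),
}
for name, mask in checks.items():
    print(f"{name}: {mask.sum()} rows flagged")

# Keep only rows that pass every check before drawing conclusions.
clean = df[~pd.concat(list(checks.values()), axis=1).any(axis=1)]
```

Automating such checks does not replace careful verification, but it makes the retrospective data sets the authors caution about at least auditable.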


Neurology
2020
Vol 94 (12)
pp. 526-537
Author(s):
Codrin Lungu
Laurie Ozelius
David Standaert
Mark Hallett
Beth-Anne Sieber
...  

Objective: Dystonia is a complex movement disorder. Research progress has been difficult, particularly in developing widely effective therapies. This is a review of the current state of knowledge, research gaps, and proposed research priorities. Methods: The NIH convened leaders in the field for a 2-day workshop. The participants addressed the natural history of the disease, the underlying etiology, the pathophysiology, relevant research technologies, research resources, and therapeutic approaches, and attempted to prioritize dystonia research recommendations. Results: The heterogeneity of dystonia poses challenges to research and therapy development. Much can be learned from specific genetic subtypes, and the disorder can be conceptualized along clinical, etiologic, and pathophysiologic axes. Advances in research technology and pooled resources can accelerate progress. Although etiologically based therapies would be optimal, a focus on circuit abnormalities can provide a convergent common target for symptomatic therapies across dystonia subtypes. The discussions have been integrated into a comprehensive review of all aspects of dystonia. Conclusion: Overall research priorities include the generation and integration of high-quality phenotypic and genotypic data, reproducing key features of both basic cellular mechanisms and phenotypes in cellular and animal models, leveraging new research technologies, and targeting circuit-level dysfunction with therapeutic interventions. Collaboration is necessary both for the collection of large data sets and for the integration of different research methods.


eLife
2021
Vol 10
Author(s):
Shivesh Chaudhary
Sol Ah Lee
Yueyi Li
Dhaval S Patel
Hang Lu

Although identifying cell names in dense image stacks is critical for analyzing functional whole-brain data and enabling comparison across experiments, unbiased identification is very difficult and relies heavily on researchers' experience. Here we present a probabilistic-graphical-model framework, CRF_ID, based on Conditional Random Fields, for unbiased and automated cell identification. CRF_ID focuses on maximizing intrinsic similarity between shapes. Compared to existing methods, CRF_ID achieves higher accuracy on simulated and ground-truth experimental datasets and better robustness against the challenging noise conditions common in experimental data. CRF_ID can further boost accuracy by building atlases from annotated data in a highly computationally efficient manner and by easily adding new features (e.g., from new strains). We demonstrate cell annotation in C. elegans images across strains, animal orientations, and tasks, including gene-expression localization and multi-cellular and whole-brain functional imaging experiments. Together, these successes demonstrate that unbiased cell annotation can facilitate biological discovery, and this approach may be valuable for annotation tasks in other systems.
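As a toy illustration of the core CRF intuition (labels are chosen jointly so that pairwise relationships among cells match an atlas), consider the brute-force sketch below; the atlas, the single positional feature, and the exhaustive search are drastic simplifications of the paper's actual model and inference:

```python
# Toy version of the CRF idea: pick the joint label assignment whose pairwise
# anterior-posterior orderings best agree with an atlas. Real CRF_ID uses
# richer features and efficient inference; this brute force is illustrative.
import itertools
import numpy as np

atlas = {"A": 0.0, "B": 1.0, "C": 2.0}     # atlas x-positions per cell name (toy)
cells = np.array([0.1, 2.2, 0.9])          # detected x-positions in one image

def pairwise_agreement(assignment):
    """Count cell pairs whose left/right ordering matches the atlas."""
    score = 0
    for (i, a), (j, b) in itertools.combinations(enumerate(assignment), 2):
        score += (cells[i] < cells[j]) == (atlas[a] < atlas[b])
    return score

best = max(itertools.permutations(atlas), key=pairwise_agreement)
print(dict(enumerate(best)))               # {0: 'A', 1: 'C', 2: 'B'}
```

Because the objective scores relationships between cells rather than absolute positions, it stays meaningful under the global deformations (orientation, posture) that the paper highlights.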


Molecules
2019
Vol 24 (23)
pp. 4292
Author(s):
Daniel Midkiff
Adriana San-Miguel

The nematode Caenorhabditis elegans is a powerful model organism that has been widely used to study molecular biology, cell development, neurobiology, and aging. Despite their use for the past several decades, the conventional techniques for growth, imaging, and behavioral analysis of C. elegans can be cumbersome, and acquiring large data sets in a high-throughput manner can be challenging. Developments in microfluidic “lab-on-a-chip” technologies have improved studies of C. elegans by increasing experimental control and throughput. Microfluidic features such as on-chip control layers, immobilization channels, and chamber arrays have been incorporated to develop increasingly complex platforms that make experimental techniques more powerful. Genetic and chemical screens are performed on C. elegans to determine gene function and phenotypic outcomes of perturbations, to test the effect that chemicals have on health and behavior, and to find drug candidates. In this review, we will discuss microfluidic technologies that have been used to increase the throughput of genetic and chemical screens in C. elegans. We will discuss screens for neurobiology, aging, development, behavior, and many other biological processes. We will also discuss robotic technologies that assist in microfluidic screens, as well as alternate platforms that perform functions similar to microfluidics.


2021
Vol 8 (1)
pp. 205395172110207
Author(s):
Simon Aagaard Enni
Maja Bak Herrie

Machine learning (ML) systems have shown great potential for performing or supporting inferential reasoning by analyzing large data sets, thereby potentially facilitating more informed decision-making. However, a hindrance to such use of ML systems is that the predictive models created through ML are often complex, opaque, and poorly understood, even when the programs "learning" the models are simple, transparent, and well understood. ML models become difficult to trust, since laypeople, specialists, and even researchers have difficulty gauging the reasonableness, correctness, and reliability of the inferences performed. In this article, we argue that bridging this gap in the understanding of ML models and their reasonableness requires a focus on developing an improved methodology for their creation. This process has been likened to "alchemy" and criticized for involving a large degree of "black art," owing to its reliance on poorly understood "best practices." We soften this critique and argue that the seeming arbitrariness is often the result of a lack of explicit hypothesizing, stemming from an empiricist and myopic focus on optimizing for predictive performance, rather than of an occult or mystical process. We present some of the problems that result from an excessive focus on optimizing generalization performance at the cost of hypothesizing about the selection of data and its biases. We suggest embedding ML in a general logic of scientific discovery similar to the one presented by Charles Sanders Peirce, and we present a recontextualized version of Peirce's scientific hypothesis adjusted to ML.
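The article's central worry can be made concrete with a small synthetic experiment: a model trained and validated on a biased sample scores almost perfectly on held-out data from that same sample, yet its performance rests on a confound rather than on the hypothesized signal. The data and numbers below are entirely synthetic and illustrative:

```python
# Synthetic demonstration: high held-out accuracy on biased data can mask
# reliance on a confound that disappears at deployment time.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_data(n, biased):
    signal = rng.normal(size=n)                       # weak genuine signal
    y = (signal + rng.normal(scale=2.0, size=n)) > 0
    if biased:
        confound = y + rng.normal(scale=0.1, size=n)  # tracks the label closely
    else:
        confound = rng.normal(size=n)                 # tracks nothing
    return np.column_stack([signal, confound]), y

X_train, y_train = make_data(4000, biased=True)
X_test, y_test = make_data(1000, biased=True)         # held out, same bias
X_deploy, y_deploy = make_data(1000, biased=False)    # bias absent

model = LogisticRegression().fit(X_train, y_train)
print("held-out (same bias):", model.score(X_test, y_test))       # near 1.0
print("deployment (no bias):", model.score(X_deploy, y_deploy))   # near chance
```

Optimizing and reporting only the first number is precisely the empiricist myopia the authors describe; hypothesizing explicitly about how the sample was selected is what would expose the confound.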


Author(s):  
Sven Dorkenwald
Claire McKellar
Thomas Macrina
Nico Kemnitz
Kisuk Lee
...  

Abstract: Due to advances in automated image acquisition and analysis, new whole-brain connectomes beyond C. elegans are finally on the horizon. Proofreading of whole-brain automated reconstructions will require many person-years of effort because of the huge volumes of data involved. Here we present FlyWire, an online community for proofreading neural circuits in a fly brain, and explain how its computational and social structures are organized to scale up to whole-brain connectomics. Browser-based 3D interactive segmentation by collaborative editing of a spatially chunked supervoxel graph makes it possible to distribute proofreading to individuals located virtually anywhere in the world. Information in the edit history is programmatically accessible for a variety of uses, such as estimating proofreading accuracy or building incentive systems. An open community accelerates proofreading by recruiting more participants, and accelerates scientific discovery by requiring information sharing. We demonstrate how FlyWire enables circuit analysis by reconstructing and analyzing the connectome of mechanosensory neurons.
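As a rough sketch of the kind of data structure involved (far simpler than FlyWire's actual chunked graph, and omitting the spatial chunking that keeps edits local), consider a supervoxel graph whose merges are logged in a programmatically accessible history:

```python
# Toy supervoxel graph with an append-only edit log; the real system adds
# spatial chunking, splits, versioned IDs, and concurrency control.
import time
from dataclasses import dataclass, field

@dataclass
class SupervoxelGraph:
    parent: dict = field(default_factory=dict)   # supervoxel -> component parent
    edits: list = field(default_factory=list)    # append-only edit history

    def root(self, sv):
        """Follow parent pointers to the component's current root ID."""
        while self.parent.get(sv, sv) != sv:
            sv = self.parent[sv]
        return sv

    def merge(self, a, b, user):
        """Join two components and record who made the edit, and when."""
        ra, rb = self.root(a), self.root(b)
        if ra != rb:
            self.parent[rb] = ra
        self.edits.append({"op": "merge", "svs": (a, b),
                           "user": user, "t": time.time()})

g = SupervoxelGraph()
g.merge(101, 205, user="proofreader_7")
print(g.root(205))   # 101: both supervoxels now belong to one neuron
print(g.edits)       # history supports accuracy estimates or incentive systems
```

The append-only log is what makes the uses mentioned above (estimating proofreading accuracy, building incentive systems) possible without extra bookkeeping.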


Author(s):  
John A. Hunt

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets that are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ′ precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing, the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
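The two per-pixel processing modes are straightforward to express in code. A minimal sketch, assuming the stated sampling of 20 channels/eV so the 1 eV offset corresponds to 20 channels (array names are illustrative):

```python
# Sketch of the two processing modes: subtract the 1 eV-offset pair for an
# artifact-corrected difference spectrum, or realign and add for a normal one.
import numpy as np

CHANNELS_PER_EV = 20
shift = int(CHANNELS_PER_EV * 1.0)        # 1 eV offset -> 20 channels

def difference_spectrum(s1, s2):
    """Subtracting the offset pair cancels fixed channel-to-channel
    detector gain variations (the artifact correction)."""
    return s1 - s2

def summed_spectrum(s1, s2):
    """Numerically remove the energy offset, then add; the sign of the
    shift depends on which spectrum was offset."""
    return s1[shift:] + s2[:-shift]

s1, s2 = np.random.rand(1024), np.random.rand(1024)   # stand-ins for one pixel
diff = difference_spectrum(s1, s2)                    # length 1024
total = summed_spectrum(s1, s2)                       # length 1004 after overlap
```

Applied across all 80x80 pixels, either mode yields the 2D floating-point images described above.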


Author(s):  
Thomas W. Shattuck
James R. Anderson
Neil W. Tindale
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive x-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific Ocean in the summer of 1990. The mid-Pacific aerosol provides information on long-range particle transport, iron deposition, sea salt ageing, and halogen chemistry.

Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and in the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero because of finite detection limits. Many of the clusters show considerable overlap because of natural variability, agglomeration, and chemical reactivity.
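A minimal sketch of such a multivariate pipeline on a simulated EDS matrix follows; the element count, the log transform used to tame the skew and zeros, and the cluster count are illustrative choices, not those of the study:

```python
# Illustrative PCA + k-means pipeline for particle EDS data. The simulated
# matrix mimics the stated difficulties: skewed, zero-inflated intensities
# and unequal cluster sizes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.gamma(shape=0.5, scale=10.0, size=(5000, 8))  # particles x elements
X[X < 1.0] = 0.0                                      # finite detection limits

X_log = np.log1p(X)                       # tame skew; log1p keeps zeros finite
X_std = StandardScaler().fit_transform(X_log)

scores = PCA(n_components=3).fit_transform(X_std)     # factor/PCA step
labels = KMeans(n_clusters=5, n_init=10).fit_predict(scores)
print(np.bincount(labels))                # cluster sizes are typically unequal
```

The log transform and standardization address the non-normality and disparate variable ranges noted above, though overlapping clusters remain a limitation of any hard-partitioning method.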

