Various Approaches to the Quantitative Evaluation of Biological and Medical Data Using Mathematical Models

Symmetry ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 7
Author(s):  
Mária Ždímalová ◽  
Anuprava Chatterjee ◽  
Helena Kosnáčová ◽  
Mridul Ghosh ◽  
Sk Md Obaidullah ◽  
...  

Biomedical data (structured and unstructured) have grown dramatically in volume and complexity over the last few years. Innovative, intelligent, and autonomous scientific approaches are needed to examine the large data sets that are gradually becoming widely available. There is also an increasing demand for designing, analyzing, and understanding such complicated data sets in order to predict unique symmetric and asymmetric patterns. In this paper, we focus on different ways of processing biological and medical data. We provide an overview of known methods as well as a look at optimized mathematical approaches in the field of biological data analysis. We deal with the RGB threshold algorithm, new filtering based on the histogram and on the RGB model, the ImageJ program, and the structural similarity index method (SSIM). Finally, we compare the results with open-source software. We can confirm that our own software, based on new mathematical models, is a highly suitable tool for processing biological images and is important in research areas such as the detection of iron in biological samples. We also study the symmetric and asymmetric properties of iron occurrence as part of the analysis of real biological data. Unique approaches for clinical information gathering, organizing, analysis, information retrieval, and the inventive implementation of contemporary computing approaches are all part of this research project, which has much potential in biomedical research. These cutting-edge multidisciplinary techniques will enable the detection and retrieval of important symmetric and asymmetric patterns, the faster finding of pertinent data, and the opening of novel learning pathways.
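
As a rough illustration of the image-processing steps mentioned above (RGB thresholding and an SSIM comparison), the following minimal Python sketch uses NumPy and scikit-image; it is not the authors' software, and the file name and threshold values are hypothetical.

```python
# Illustrative sketch of RGB thresholding and an SSIM comparison
# (not the authors' implementation); thresholds and paths are hypothetical.
import numpy as np
from skimage import io
from skimage.metrics import structural_similarity as ssim

def rgb_threshold(image, lower, upper):
    """Return a binary mask of pixels whose R, G, B values all fall
    within the per-channel [lower, upper] bounds."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    return np.all((image >= lower) & (image <= upper), axis=-1)

# Hypothetical sample image and thresholds for iron-stained regions.
img = io.imread("sample_stained_tissue.png")[:, :, :3]
iron_mask = rgb_threshold(img, lower=(0, 0, 80), upper=(120, 120, 255))
print("Fraction of pixels flagged:", iron_mask.mean())

# SSIM between the original (grayscale) image and a masked, filtered version.
gray = img.mean(axis=-1)
filtered = np.clip(gray * iron_mask, 0, 255)
score = ssim(gray, filtered, data_range=255)
print("SSIM between original and filtered image:", score)
```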

2014 ◽  
Vol 11 (2) ◽  
pp. 68-79
Author(s):  
Matthias Klapperstück ◽  
Falk Schreiber

Summary: The visualization of biological data has gained increasing importance in recent years. A large number of methods and software tools are available that visualize biological data, including the combination of measured experimental data and biological networks. As networks grow in size, their handling and exploration become challenging tasks for the user. In addition, scientists are interested not just in investigating a single kind of network, but in combining different types of networks, such as metabolic, gene regulatory, and protein interaction networks. Therefore, fast access, abstract and dynamic views, and intuitive exploratory methods should be provided to search and extract information from the networks. This paper will introduce a conceptual framework for handling and combining multiple network sources that enables abstract viewing and exploration of large data sets, including additional experimental data. It will introduce a three-tier structure that links network data to multiple network views, discuss a proof-of-concept implementation, and show a specific visualization method for combining metabolic and gene regulatory networks in an example.
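
To make the three-tier idea more concrete (raw network sources, an integrated model, and derived views), here is a minimal illustrative sketch using networkx; the class and method names are hypothetical and are not those of the paper's proof-of-concept implementation.

```python
# Minimal sketch of a three-tier structure: raw networks -> integrated model -> views.
# Class and method names are hypothetical, not those of the paper.
import networkx as nx

class IntegratedNetworkModel:
    def __init__(self):
        self.layers = {}              # tier 1: raw network sources by type

    def add_network(self, kind, graph):
        self.layers[kind] = graph

    def combined_view(self, kinds):
        """Tier 2: merge selected network types into one annotated graph."""
        view = nx.DiGraph()
        for kind in kinds:
            for u, v, data in self.layers[kind].edges(data=True):
                view.add_edge(u, v, kind=kind, **data)
        return view

    def neighborhood_view(self, node, radius=1):
        """Tier 3: an abstract, zoomed-in view around one node."""
        full = self.combined_view(self.layers.keys())
        return nx.ego_graph(full, node, radius=radius)

# Usage with toy metabolic and gene-regulatory edges.
model = IntegratedNetworkModel()
model.add_network("metabolic", nx.DiGraph([("glucose", "G6P")]))
model.add_network("regulatory", nx.DiGraph([("TF1", "hexokinase_gene")]))
print(model.combined_view(["metabolic", "regulatory"]).edges(data=True))
```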


2013 ◽  
Vol 3 (4) ◽  
pp. 31-46 ◽  
Author(s):  
Hanaa Ismail Elshazly ◽  
Ahmad Taher Azar ◽  
Aboul Ella Hassanien ◽  
Abeer Mohamed Elkorany

Computational intelligence provides the biomedical domain with significant support. The application of machine learning techniques in medicine has evolved from physicians' needs. Screening, medical imaging, pattern classification, and prognosis are some examples of health care support systems. Medical data typically have their own characteristics, such as huge size, many features, and continuous, real-valued attributes that refer to patients' investigations. Therefore, discretization and feature selection are considered key issues in improving the knowledge extracted from patients' investigation records. In this paper, a hybrid system that integrates Rough Set (RS) theory and a Genetic Algorithm (GA) is presented for the efficient classification of medical data sets of different sizes and dimensionalities. The Genetic Algorithm is applied with the aim of reducing the dimension of the medical data sets, and RS decision rules are used for efficient classification. Furthermore, the proposed system applies Entropy Gain Information (EI) for the discretization process. Four biomedical data sets were tested with the proposed system (EI-GA-RS), and the highest score was obtained on three of them. Other hybrid techniques shared the highest accuracy with the proposed technique, but the proposed system remains among the top-performing systems for three different sets. EI as a discretization technique is also a common component of the best results on the mentioned data sets, while RS as an evaluator achieved the best results on three different data sets.
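
The sketch below compresses the two main ingredients, entropy-gain discretization and GA-based feature selection, into a short illustrative Python example; it is not the EI-GA-RS system, and a decision tree stands in for the rough-set rule classifier.

```python
# Compressed sketch of entropy-gain discretization plus GA feature selection.
# Not the EI-GA-RS system; a decision tree stands in for rough-set decision rules.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

def entropy(labels):
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def best_cut(feature, labels):
    """Pick the threshold with maximal information gain (simple binary discretization)."""
    base = entropy(labels)
    cuts = np.quantile(feature, np.linspace(0.1, 0.9, 9))
    gains = []
    for c in cuts:
        left, right = labels[feature <= c], labels[feature > c]
        cond = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gains.append(base - cond)
    return cuts[int(np.argmax(gains))]

# Discretize every feature into {0, 1} using its best entropy-gain cut.
Xd = np.column_stack([(X[:, j] > best_cut(X[:, j], y)).astype(int)
                      for j in range(X.shape[1])])

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, Xd[:, mask.astype(bool)], y, cv=3).mean()

# Tiny GA: keep the best half, uniform crossover, bit-flip mutation.
pop = rng.integers(0, 2, size=(20, Xd.shape[1]))
for gen in range(10):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10, size=2)]
        child = np.where(rng.random(Xd.shape[1]) < 0.5, a, b)
        flip = rng.random(Xd.shape[1]) < 0.02
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected features:", int(best.sum()), "accuracy:", fitness(best))
```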


Understanding the plankton ecosystem in the ocean requires detailed demographic analysis. It is impossible to sample the ocean adequately for such analysis, but progress can be made by analysing data sets generated by mathematical models provided they realistically simulate the ecosystem. The Lagrangian Ensemble method is well suited to demographic studies because it generates large data sets containing complete information on all the families living in the simulated ecosystem. It provides audit trails of individual families for unambiguous analysis of mechanisms responsible for the simulated changes in community and environment. Recent papers based on the Lagrangian Ensemble method are reviewed.
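
As a toy illustration of the ensemble idea only (not the Lagrangian Ensemble method itself), the sketch below tracks many computational plankton "families", updates each with invented growth and mortality rules, and keeps an audit trail so every demographic change can be traced back.

```python
# Toy particle-ensemble sketch: each computational particle represents a family
# of identical plankton; an audit trail records every state change.
# Growth and mortality rules here are invented purely for illustration.
import random

random.seed(1)

families = [{"id": i, "depth": random.uniform(0, 100), "count": 1000}
            for i in range(50)]
audit_trail = []   # one record per family per time step

for day in range(30):
    for fam in families:
        light = max(0.0, 1.0 - fam["depth"] / 100.0)   # toy light profile
        growth = 0.1 * light                            # toy growth rate
        mortality = 0.05
        before = fam["count"]
        fam["count"] = int(before * (1 + growth - mortality))
        fam["depth"] = min(100.0, max(0.0, fam["depth"] + random.gauss(0, 2)))
        audit_trail.append({"day": day, "family": fam["id"],
                            "before": before, "after": fam["count"],
                            "depth": fam["depth"]})

# Demographic analysis over the complete, unsampled data set.
total = sum(f["count"] for f in families)
print("total individuals after 30 days:", total)
print("audit records:", len(audit_trail))
```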


2020 ◽  
Vol 36 (4) ◽  
pp. 803-825
Author(s):  
Marco Fortini

Abstract: Record linkage addresses the problem of identifying pairs of records that come from different sources and refer to the same unit of interest. Fellegi and Sunter propose an optimal statistical test for assigning match status to the candidate pairs, in which the needed parameters are obtained through the EM algorithm applied directly to the set of candidate pairs, without recourse to training data. However, this procedure has quadratic complexity as the two lists to be matched grow. In addition, a large bias in the EM-estimated parameters is also produced in this case, so the problem is tackled by reducing the set of candidate pairs through filtering methods such as blocking. Unfortunately, the probability that excluded pairs are actually true matches cannot be assessed with such methods. The present work proposes an efficient approach in which the comparisons of records between lists are minimised, while the EM estimates are modified by modelling tables with structural zeros in order to obtain unbiased estimates of the parameters. The improvement achieved by the suggested method is shown by means of simulations and an application based on real data.
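
For orientation, the following sketch shows plain EM estimation of Fellegi-Sunter parameters (match prevalence and per-field m- and u-probabilities) on simulated binary agreement patterns; it does not include blocking or the structural-zero correction that is the paper's contribution.

```python
# Minimal sketch of EM for Fellegi-Sunter parameters (m, u, match prevalence)
# on binary agreement patterns; data are simulated, and neither blocking nor the
# structural-zero correction proposed in the paper is shown.
import numpy as np

rng = np.random.default_rng(0)
K, n = 3, 5000                     # comparison fields, candidate pairs
true_p = 0.1
true_m = np.array([0.9, 0.85, 0.8])
true_u = np.array([0.2, 0.1, 0.05])

is_match = rng.random(n) < true_p
probs = np.where(is_match[:, None], true_m, true_u)
gamma = (rng.random((n, K)) < probs).astype(float)   # agreement patterns

# EM initialisation.
p, m, u = 0.5, np.full(K, 0.7), np.full(K, 0.3)
for _ in range(100):
    # E-step: posterior probability that each pair is a match.
    lm = np.prod(m**gamma * (1 - m)**(1 - gamma), axis=1)
    lu = np.prod(u**gamma * (1 - u)**(1 - gamma), axis=1)
    w = p * lm / (p * lm + (1 - p) * lu)
    # M-step: update prevalence and per-field agreement probabilities.
    p = w.mean()
    m = (w[:, None] * gamma).sum(axis=0) / w.sum()
    u = ((1 - w)[:, None] * gamma).sum(axis=0) / (1 - w).sum()

print("estimated prevalence:", round(p, 3))
print("estimated m:", m.round(3), "estimated u:", u.round(3))
```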


Author(s):  
Lawrence O. Hall ◽  
Dmitry B. Goldgof ◽  
Juana Canul-Reich ◽  
Prodip Hore ◽  
Weijian Cheng ◽  
...  

This chapter examines how to scale algorithms that learn fuzzy models from the increasing amounts of labeled or unlabeled data that are becoming available. Large data repositories are increasingly available, such as records of network transmissions, customer transactions, medical data, and so on. A question arises about how to utilize the data effectively for both supervised and unsupervised fuzzy learning. This chapter will focus on ensemble approaches to learning fuzzy models for large data sets, which may be labeled or unlabeled. Further, the authors examine ways of scaling fuzzy clustering to extremely large data sets. Examples from existing data repositories, some quite large, will be given to show that the approaches discussed here are effective.
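
One simple way to approach fuzzy clustering of data too large to handle in a single pass is to cluster random chunks and then cluster the resulting chunk centroids; the sketch below illustrates that general idea with a plain NumPy fuzzy c-means and is not the authors' specific ensemble algorithm.

```python
# Sketch of scaling fuzzy c-means to large data by clustering random chunks
# and then clustering the chunk centroids; illustrates the general idea only.
import numpy as np

rng = np.random.default_rng(0)

def fuzzy_c_means(X, c, m=2.0, iters=50):
    """Plain fuzzy c-means; returns the cluster centers."""
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        dist = d ** (2.0 / (m - 1))
        U = 1.0 / (dist * (1.0 / dist).sum(axis=1, keepdims=True))
    return centers

# "Large" synthetic data set: three Gaussian blobs.
X = np.vstack([rng.normal(mu, 0.5, size=(20000, 2))
               for mu in ([0, 0], [5, 5], [0, 5])])

# Cluster manageable chunks, then cluster the collected chunk centers.
chunks = np.array_split(rng.permutation(X), 20)
chunk_centers = np.vstack([fuzzy_c_means(chunk, c=3) for chunk in chunks])
final_centers = fuzzy_c_means(chunk_centers, c=3)
print(final_centers.round(2))
```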


Author(s):  
Denise Fukumi Tsunoda ◽  
Heitor Silvério Lopes ◽  
Ana Tereza Vasconcelos

Bioinformatics means solving problems arising from biology using methods from computer science. The National Center for Biotechnology Information (www.ncbi.nih.gov) defines bioinformatics as: “…the field of science in which biology, computer science, and information technology merge into a single discipline...There are three important sub-disciplines within bioinformatics: the development of new algorithms and statistics with which to access relationships among members of large data sets; the analysis and interpretation of various types of data including nucleotide and amino acid sequences, protein domains, and protein structures; and the development and implementation of tools that enable efficient access and management of different types of information.”


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 552
Author(s):  
Hamid Mousavi ◽  
Mareike Buhl ◽  
Enrico Guiraud ◽  
Jakob Drefs ◽  
Jörg Lücke

Latent Variable Models (LVMs) are well-established tools to accomplish a range of different data processing tasks. Applications exploit the ability of LVMs to identify latent data structure in order to improve data (e.g., through denoising) or to estimate the relation between latent causes and measurements in medical data. In the latter case, LVMs in the form of noisy-OR Bayes nets represent the standard approach to relate binary latents (which represent diseases) to binary observables (which represent symptoms). Bayes nets with binary representation for symptoms may be perceived as a coarse approximation, however. In practice, real disease symptoms can range from absent through mild and intermediate to very severe. Therefore, using disease/symptom relations as motivation, we here ask how standard noisy-OR Bayes nets can be generalized to incorporate continuous observables, e.g., variables that model symptom severity in an interval from healthy to pathological. This transition from binary to interval data poses a number of challenges, including a transition from a Bernoulli to a Beta distribution to model symptom statistics. While noisy-OR-like approaches are constrained to model how causes determine the observables’ mean values, the use of Beta distributions additionally provides (and also requires) that the causes determine the observables’ variances. To meet the challenges emerging when generalizing from Bernoulli to Beta distributed observables, we investigate a novel LVM that uses a maximum non-linearity to model how the latents determine means and variances of the observables. Given the model and the goal of likelihood maximization, we then leverage recent theoretical results to derive an Expectation Maximization (EM) algorithm for the suggested LVM. We further show how variational EM can be used to efficiently scale the approach to large networks. Experimental results finally illustrate the efficacy of the proposed model using both synthetic and real data sets. Importantly, we show that the model produces reliable results in estimating causes using proofs of concept and first tests based on real medical data and on images.
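
As a rough sketch of the generative idea only (the paper's model uses a maximum non-linearity and variational EM, neither of which is shown here), the toy example below draws binary disease latents, maps them through a noisy-OR to a mean symptom severity, and samples continuous severities from Beta distributions with those means; all parameter values are invented.

```python
# Toy generative sketch: binary disease latents -> noisy-OR mean severity ->
# Beta-distributed continuous symptom observations. Not the paper's exact model;
# all parameter values are invented.
import numpy as np

rng = np.random.default_rng(0)
D, S, N = 4, 6, 1000               # diseases, symptoms, patients

prior = np.full(D, 0.15)                     # disease prevalences
W = rng.uniform(0.2, 0.9, size=(D, S))       # P(symptom caused | disease present)
kappa = 20.0                                 # Beta concentration (severity variance)
leak = 0.02                                  # background probability of mild severity

Z = (rng.random((N, D)) < prior).astype(float)   # latent disease indicators
# Noisy-OR mean severity: 1 - (1 - leak) * prod_d (1 - W[d, s])^z_d
mean = 1 - (1 - leak) * np.prod((1 - W[None, :, :]) ** Z[:, :, None], axis=1)
mean = np.clip(mean, 1e-3, 1 - 1e-3)
# Beta parameterised by mean and concentration: alpha = mean*kappa, beta = (1-mean)*kappa
severity = rng.beta(mean * kappa, (1 - mean) * kappa)

print("mean severity, patients with at least one disease:",
      round(severity[Z.sum(axis=1) > 0].mean(), 3))
print("mean severity, healthy patients:",
      round(severity[Z.sum(axis=1) == 0].mean(), 3))
```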


2021 ◽  
Author(s):  
Stephen Taylor

Molecular biology experiments are generating an unprecedented amount of information from a variety of different experimental modalities. DNA sequencing machines, proteomics, mass cytometry, and microscopes generate huge amounts of data every day. Not only is the data large, but it is also multidimensional. Understanding trends and getting actionable insights from these data requires techniques that allow comprehension at a high level but also insight into what underlies these trends. Lots of small errors or poor summarization can lead to false results and reproducibility issues in large data sets. Hence it is essential that we do not cherry-pick results to suit a hypothesis but instead examine all data and publish accurate insights in a data-driven way. This article will give an overview of some of the problems faced by the researcher in understanding epigenetic changes (which are related to changes in the physical structure of DNA) when presented with raw analysis results using visualization methods. We will also discuss new challenges that arise when using machine learning, which can be helped by visualization.


2014 ◽  
Author(s):  
R Daniel Kortschak ◽  
David L Adelson

bíogo is a framework designed to ease development and maintenance of computationally intensive bioinformatics applications. The library is written in the Go programming language, a garbage-collected, strictly typed, compiled language with built-in support for concurrent processing and performance comparable to C and Java. It provides a variety of data types and utility functions to facilitate manipulation and analysis of large-scale genomic and other biological data. bíogo uses a concise and expressive syntax, lowering the barriers to entry for researchers needing to process large data sets with custom analyses while retaining computational safety and ease of code review. We believe bíogo provides an excellent environment for training and research in computational biology because of its combination of strict typing, simple and expressive syntax, and high performance.


2013 ◽  
Vol 4 (2) ◽  
pp. 31-50 ◽  
Author(s):  
Simon Andrews ◽  
Constantinos Orphanides

Formal Concept Analysis (FCA) has been successfully applied to data in a number of problem domains. However, its use has tended to be on an ad hoc, bespoke basis, relying on FCA experts working closely with domain experts and requiring the production of specialised FCA software for the data analysis. The availability of generalised tools and techniques that might allow FCA to be applied to data more widely is limited. Two important issues present barriers: raw data is not normally in a form suitable for FCA and must undergo a process of transformation to make it suitable, and even when converted into a suitable form, real data sets tend to produce a large number of results that can be difficult to manage and interpret. This article describes how some open-source tools and techniques have been developed and used to address these issues and make FCA more widely available and applicable. Three examples of real data sets, and real problems related to them, are used to illustrate the application of the tools and techniques and to demonstrate how FCA can be used as a semantic technology to discover knowledge. Furthermore, it is shown how these tools and techniques enable FCA to deliver a visual and intuitive means of mining large data sets for association and implication rules that complements the semantic analysis. In fact, it transpires that FCA reveals hidden meaning in data that can then be examined in more detail by applying an FCA approach to traditional data mining methods.
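
To ground the FCA terminology, the sketch below computes the two derivation operators and checks a formal concept on a tiny hand-made binary context; it is only a toy example, not one of the article's tools or data sets.

```python
# Toy sketch of FCA derivation operators on a small binary context
# (objects x attributes); not one of the article's tools or data sets.
context = {
    "frog":  {"lives_in_water", "lives_on_land", "can_move"},
    "fish":  {"lives_in_water", "can_move"},
    "reed":  {"lives_in_water", "lives_on_land", "needs_chlorophyll"},
    "maize": {"lives_on_land", "needs_chlorophyll"},
}

def intent(objects):
    """Attributes shared by all objects in the set (derivation on objects)."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set()

def extent(attributes):
    """Objects possessing all given attributes (derivation on attributes)."""
    return {o for o, attrs in context.items() if attributes <= attrs}

# A formal concept is a pair (A, B) with extent(B) == A and intent(A) == B.
A = extent({"lives_in_water"})
B = intent(A)
print("extent:", sorted(A))
print("intent:", sorted(B))
print("is a formal concept:", extent(B) == A and intent(A) == B)
```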

