software toolkit
Recently Published Documents



2021 ◽  
Vol 11 (24) ◽  
pp. 12135
Author(s):  
László Beinrohr ◽  
Eszter Kail ◽  
Péter Piros ◽  
Erzsébet Tóth ◽  
Rita Fleiner ◽  
...  

Data science and machine learning are buzzwords of the early 21st century. Now that they pervade human civilization, how do these concepts translate to use by researchers and clinicians in the life sciences and medicine? Here, we describe a software toolkit that is just large enough in scale that it can be maintained and extended by a small team, and that is optimised for problems arising in small and medium-sized laboratories. In particular, a single person may manage the system from data ingestion, through statistics and preparation, to predictions. At the system's core is a graph-type database, which keeps it flexible with respect to irregular, constantly changing data types; such data types are common during explorative research. At the system's outermost shell, the concept of 'user stories' is introduced to help end-user researchers perform tasks separated by their expertise: these range from simple data input and data curation, through statistics, to predictions via machine learning algorithms. We compiled a sizable list of existing, modular Python libraries usable for data analysis, which may serve as a reference in the field and may be incorporated into this software. We also provide an insight into basic concepts, such as labelled vs. unlabelled data, supervised vs. unsupervised learning, regression vs. classification, and evaluation by different error metrics, as well as the more advanced concept of cross-validation. Finally, we show some examples from our laboratory using our blood sample and blood clot data from thrombosis patients (sufferers from stroke, heart and peripheral thrombosis disease), and how such tools can help to set realistic expectations and reveal caveats.
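The cross-validation concept mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper's toolkit: the function names, the toy values in `y`, and the baseline "model" (which simply predicts the training-set mean) are all assumptions chosen to keep the example dependency-free. Each fold is held out once, the baseline is fitted on the remaining data, and the mean absolute error is recorded as the error metric.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives folds whose sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate_mean_model(y, k=5, seed=0):
    """Score a baseline that always predicts the mean of its training targets."""
    folds = k_fold_indices(len(y), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held]
        prediction = sum(train) / len(train)  # "fit" on the training folds only
        test = [y[i] for i in held_out]
        scores.append(mean_absolute_error(test, [prediction] * len(test)))
    return scores


# Hypothetical toy targets, not data from the paper.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
scores = cross_validate_mean_model(y, k=5)
print(scores, sum(scores) / len(scores))
```

The spread of the per-fold scores, not only their mean, is what helps set realistic expectations about how a model might behave on unseen samples.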


2021 ◽  
pp. 1-35
Author(s):  
Johanna Björklund ◽  
Frank Drewes ◽  
Anna Jonsson

Abstract We show that a previously proposed algorithm for the N-best trees problem can be made more efficient by changing how it arranges and explores the search space. Given an integer N and a weighted tree automaton (wta) M over the tropical semiring, the algorithm computes N trees of minimal weight with respect to M. Compared to the original algorithm, the modifications increase the laziness of the evaluation strategy, which makes the new algorithm asymptotically more efficient than its predecessor. The algorithm is implemented in the software Betty, and compared to the state-of-the-art algorithm for extracting the N best runs, implemented in the software toolkit Tiburon. The data sets used in the experiments are wtas resulting from real-world natural language processing tasks, as well as artificially created wtas with varying degrees of nondeterminism. We find that Betty outperforms Tiburon on all tested data sets with respect to running time, while Tiburon seems to be the more memory-efficient choice.
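To make the notion of "weight with respect to M" concrete, here is a small sketch of how one tree is weighted by a wta over the tropical semiring: weights along a run are added, and the tree's weight is the minimum over all accepting runs. This is not the N-best algorithm from the paper and not code from Betty or Tiburon; the data layout (`transitions`, `final_weights`, trees as nested tuples) and the toy automaton are assumptions made for illustration.

```python
import math
from itertools import product


def run_weights(tree, transitions):
    """Map each state to the minimal weight of a run on `tree` ending in it.

    `tree` is a nested tuple (symbol, child_1, ..., child_n).
    `transitions` maps (symbol, tuple_of_child_states) to a list of
    (target_state, weight) pairs. Both layouts are illustrative assumptions.
    """
    symbol, children = tree[0], tree[1:]
    child_tables = [run_weights(child, transitions) for child in children]
    best = {}
    # For a leaf, product() yields a single empty tuple, so base is 0.
    for child_states in product(*[table.keys() for table in child_tables]):
        base = sum(table[q] for table, q in zip(child_tables, child_states))
        for state, weight in transitions.get((symbol, child_states), []):
            total = base + weight  # tropical "multiplication" is addition
            if total < best.get(state, math.inf):
                best[state] = total  # tropical "addition" is min
    return best


def tree_weight(tree, transitions, final_weights):
    """Minimal weight over all accepting runs; math.inf if there are none."""
    runs = run_weights(tree, transitions)
    return min(
        (w + final_weights[q] for q, w in runs.items() if q in final_weights),
        default=math.inf,
    )


# Hypothetical nondeterministic toy automaton.
transitions = {
    ("a", ()): [("q", 1.0), ("p", 3.0)],
    ("b", ()): [("q", 2.0)],
    ("f", ("q", "q")): [("q", 0.5)],
    ("f", ("p", "q")): [("q", 0.0)],
}
final_weights = {"q": 0.0}
tree = ("f", ("a",), ("b",))
print(tree_weight(tree, transitions, final_weights))
```

An N-best procedure has to rank many such trees without enumerating all of them, which is where the exploration order of the search space matters.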


2021 ◽  
Author(s):  
Cosmin Safta ◽  
Habib Najm ◽  
Oscar Diaz-Ibarra ◽  
Kyungjoo Kim

Author(s):  
Brandon K B Seah ◽  
Estienne C Swart

Abstract Summary: Ciliates are single-celled eukaryotes that eliminate specific, interspersed DNA sequences (internally eliminated sequences, IESs) from their genomes during development. These are challenging to annotate and assemble because IES-containing sequences are typically much less abundant in the cell than those without, and IES sequences themselves often contain repetitive and low-complexity sequences. Long read sequencing technologies from Pacific Biosciences and Oxford Nanopore have the potential to reconstruct longer IESs than has been possible with short reads, but require a different assembly strategy. Here we present BleTIES, a software toolkit for detecting, assembling, and analyzing IESs using mapped long reads. Availability and implementation: BleTIES is implemented in Python 3. Source code is available at https://github.com/Swart-lab/bleties (MIT license), and also distributed via Bioconda. Supplementary information: Benchmarking of BleTIES with published sequence data.


2021 ◽  
Author(s):  
Garrett Smith ◽  
Shravan Vasishth

We present a new software toolkit for implementing a broad class of theories of sentence processing. In this framework, processing a word in a sentence is viewed as a continuous-time random walk through a set of discrete states that encode information about the emerging structure of the sentence so far. The state space includes one or more special absorbing states, which, when reached, indicate the decision to move on to the next word of the sentence. This setup allows us to ask how long it takes to reach an absorbing state and what the probability of reaching this state is. We summarize a number of important statistics that can be directly related to human reading times and comprehension question performance. To illustrate the use of the toolkit, we model two types of garden paths, local coherence effects, and the ambiguity advantage using three qualitatively different theories of sentence processing. While the modeler must still make defensible theoretical and implementation choices, this framework represents an improvement over the descriptive, paper-pencil modeling that is the norm in psycholinguistics by facilitating quantitative evaluations of model performance and laying the groundwork for Bayesian fitting of free parameters in a model. An open-source Python package is provided.
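The two questions raised in this abstract (how long until absorption, and with what probability a given absorbing state is reached) have standard answers for a continuous-time Markov chain, sketched below. This is not the API of the authors' package: the function names, the `rates` layout, and the four-state toy chain are assumptions. For a transient state i with total exit rate λ_i, the expected time to absorption satisfies t_i = 1/λ_i + Σ_j (r_ij/λ_i) t_j, and the probability of ending in a chosen absorbing state satisfies an analogous linear system.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def absorption_summary(rates, start, target):
    """Return (mean time to any absorbing state, P(absorbed in target)).

    `rates[s]` maps successor states to transition rates; a state with
    no outgoing rates is treated as absorbing.
    """
    transient = [s for s in rates if rates[s]]
    index = {s: i for i, s in enumerate(transient)}
    n = len(transient)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b_time = [0.0] * n
    b_prob = [0.0] * n
    for s in transient:
        i = index[s]
        exit_rate = sum(rates[s].values())
        b_time[i] = 1.0 / exit_rate  # expected holding time in s
        for nxt, rate in rates[s].items():
            jump_prob = rate / exit_rate
            if nxt in index:
                a[i][index[nxt]] -= jump_prob  # builds I - P on transient states
            elif nxt == target:
                b_prob[i] += jump_prob
    times = solve(a, b_time)
    probs = solve(a, b_prob)
    return times[index[start]], probs[index[start]]


# Hypothetical toy chain; state names and rates are illustrative only.
rates = {
    "start": {"reanalysis": 2.0, "correct": 1.0, "wrong": 1.0},
    "reanalysis": {"start": 1.0, "correct": 3.0},
    "correct": {},
    "wrong": {},
}
mean_time, p_correct = absorption_summary(rates, "start", "correct")
print(mean_time, p_correct)
```

In a reading-time setting, the mean absorption time would play the role of a predicted processing time for a word, and the absorption probabilities that of predicted comprehension outcomes.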


2021 ◽  
Vol 2021 (2) ◽  
pp. 14-18
Author(s):  
Oleg Vdovichenko ◽  
Andrey Averchenkov

The article considers the application of the Apriori association rule construction algorithm to analyze the results of the thyroid gland ultrasound examination. The algorithm is applied to solve a specific problem of organizational support of the thyroid gland examination. A software toolkit has been developed that allows physicians to apply the specified algorithm to carry out the necessary research in the process of solving diagnostic problems.
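For readers unfamiliar with Apriori, the following is a compact sketch of the general algorithm, not the authors' toolkit. Frequent itemsets are grown level by level, candidates with an infrequent subset are pruned, and rules are then kept if their confidence (support of the whole itemset divided by support of the antecedent) reaches a threshold. The function names, thresholds, and the abstract item labels "A", "B", "C" are assumptions; they stand in for examination findings and are not real clinical data.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, itemset_support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                antecedent = frozenset(antecedent)
                confidence = itemset_support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical toy records; labels are placeholders, not clinical findings.
records = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
frequent = apriori(records, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.7)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

The lookup `frequent[antecedent]` is safe because every subset of a frequent itemset is itself frequent, which is the property Apriori relies on for pruning.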


2021 ◽  
pp. 104864
Author(s):  
Torsten Mayer-Gürr ◽  
Saniya Behzadpour ◽  
Annette Eicker ◽  
Matthias Ellmer ◽  
Beate Koch ◽  
...  

2021 ◽  
Author(s):  
Gavin Poludniowski ◽  
Artur Omar ◽  
Robert Bujila ◽  
Pedro Andreo

2021 ◽  
Author(s):  
Oscar Diaz-Ibarra ◽  
Kyungjoo Kim ◽  
Cosmin Safta ◽  
Habib Najm
