Bayesian Linear Regression: Teaching resources and software

2021
Author(s):  
Soumya Banerjee

Bayesian models are very important in modern data science. These models can be used to derive estimates for noisy and sparse data. This manuscript outlines the basics and derivations of a Bayesian linear regression model. Source code for performing Bayesian linear regression is also provided. I hope this resource will enable broader understanding of the basics of Bayesian models.

Author(s):  
Biliana S. Güner ◽  
Svetlozar T. Rachev ◽  
John S. J. Hsu ◽  
Frank J. Fabozzi

The unique IDs that firms assign to all important models typically appear in just three places: model documents, validation documents, and model inventory databases. Where the IDs do not, as a rule, appear is within the actual model source code. Incomplete model inventory information (including usage) is a chronic issue throughout the financial industry. Few firms can accurately answer such vexing questions as how many times each model in inventory was executed during the last year, which models exhibit significant seasonality, which models are used in each geographic region or legal entity, or whether any unvalidated models were used during the last year on any firm computer. This article will demonstrate that a root cause of model usage opacity is, unfortunately, that most models do not actually know who they are. This article will further explain how software-embedded model IDs can be leveraged to increase transparency and address some of the most difficult questions that may be posed about model usage.


2019 ◽  
Vol 37 (11) ◽  
pp. 1409-1410
Author(s):  
Joanna Emerson ◽  
Rachel Bacon ◽  
Alma Kent ◽  
Peter J. Neumann ◽  
Joshua T. Cohen

2019 ◽  
Vol 35 (23) ◽  
pp. 5063-5065
Author(s):  
Lee H Bergstrand ◽  
Josh D Neufeld ◽  
Andrew C Doxey

Summary: A critical step in comparative genomics is the identification of differences in the presence/absence of encoded biochemical pathways among organisms. Our library, Pygenprop, facilitates these comparisons using data from the Genome Properties database. Pygenprop is written in Python and, unlike existing libraries, it is compatible with a variety of tools in the Python data science ecosystem, such as Jupyter Notebooks for interactive analyses and scikit-learn for machine learning. Pygenprop assigns YES, NO, or PARTIAL support for each property based on InterProScan annotations of open reading frames from an organism’s genome. The library contains classes for representing the Genome Properties database as a whole and methods for detecting differences in property assignments between organisms. As the Genome Properties database grows, we anticipate widespread adoption of Pygenprop for routine genome analyses and integration within third-party bioinformatics software.

Availability and implementation: Pygenprop is written in Python and is compatible with Python 3.6 or higher. Source code is available under Apache Licence Version 2 at https://github.com/Micromeda/pygenprop. The package can be installed from both PyPI (https://pypi.org/project/pygenprop) and Anaconda (https://anaconda.org/lbergstrand/pygenprop). Documentation is available on Read the Docs (http://pygenprop.rtfd.io/).

