Machado: Open source genomics data integration framework

GigaScience ◽  
2020 ◽  
Vol 9 (9) ◽  
Author(s):  
Mauricio de Alvarenga Mudadu ◽  
Adhemar Zerlotini

Abstract

Background: Genome projects and multiomics experiments generate huge volumes of data that must be stored, mined, and transformed into useful knowledge. All this information is supposed to be accessible and, if possible, browsable afterwards. Computational biologists have been dealing with this scenario for more than a decade and have been implementing software and databases to meet this challenge. The GMOD (Generic Model Organism Database) biological relational database schema, known as Chado, is one of the few successful open source initiatives; it is widely adopted and many software packages are able to connect to it.

Findings: We have been developing an open source software package named Machado, a genomics data integration framework implemented in Python, to enable research groups to both store and visualize genomics data. The framework relies on the Chado database schema and should therefore be straightforward for current developers to adopt or to run on top of existing databases. It has several data-loading tools for genomics and transcriptomics data and also for annotation results from tools such as BLAST, InterproScan, OrthoMCL, and LSTrAP. There is an API to connect to JBrowse, and a web visualization tool is implemented using Django Views and Templates. The Haystack library integrated with the ElasticSearch engine was used to implement a Google-like search, i.e., a single auto-complete search box that provides fast results and filters.

Conclusion: Machado aims to be a modern object-relational framework that uses the latest Python libraries to produce an effective open source resource for genomics research.
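The combination described above (features stored in a relational schema, then exposed through a single search box with filters) can be illustrated with a minimal sketch. This is not Machado's code: Machado uses Django, Haystack, and ElasticSearch on top of the full Chado schema, whereas this sketch assumes only Python's standard sqlite3 module and a single flattened table loosely inspired by Chado's feature table. The column layout, the function names, and the demo rows are all hypothetical and exist only for illustration.

```python
import sqlite3

# Made-up demo rows: (uniquename, name, type, organism). Not real annotation data.
DEMO_ROWS = [
    ("gene0001", "kinase A", "gene", "Zea mays"),
    ("mrna0001", "kinase A transcript", "mRNA", "Zea mays"),
    ("gene0002", "kinesin B", "gene", "Zea mays"),
    ("gene0003", "ligase C", "gene", "Zea mays"),
]


def build_demo_db():
    """Create an in-memory database with one simplified, Chado-inspired table.

    Assumption: the real Chado schema links features to controlled-vocabulary
    terms and organisms through foreign keys; both are flattened to text here.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE feature (
            feature_id INTEGER PRIMARY KEY,
            uniquename TEXT NOT NULL UNIQUE,
            name       TEXT NOT NULL,
            type       TEXT NOT NULL,
            organism   TEXT NOT NULL
        )
        """
    )
    return conn


def load_features(conn, rows):
    """Bulk-insert (uniquename, name, type, organism) tuples."""
    conn.executemany(
        "INSERT INTO feature (uniquename, name, type, organism) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def _prefix_pattern(prefix):
    """Escape LIKE wildcards in user input, then append a trailing '%'."""
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped + "%"


def autocomplete(conn, prefix, feature_type=None, limit=5):
    """Return feature names starting with `prefix`, optionally filtered by type."""
    sql = "SELECT name FROM feature WHERE name LIKE ? ESCAPE '\\'"
    params = [_prefix_pattern(prefix)]
    if feature_type is not None:
        sql += " AND type = ?"
        params.append(feature_type)
    sql += " ORDER BY name LIMIT ?"
    params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]


def facet_counts(conn, prefix):
    """Count matching features per type, the numbers a filter sidebar would show."""
    sql = (
        "SELECT type, COUNT(*) FROM feature "
        "WHERE name LIKE ? ESCAPE '\\' GROUP BY type"
    )
    return dict(conn.execute(sql, [_prefix_pattern(prefix)]))


if __name__ == "__main__":
    conn = build_demo_db()
    load_features(conn, DEMO_ROWS)
    print(autocomplete(conn, "kin"))
    print(autocomplete(conn, "kin", feature_type="gene"))
    print(facet_counts(conn, "kin"))
```

A prefix query over a relational table is only a stand-in for a search engine index; a dedicated engine adds ranking, tokenization, and faster lookups on large datasets, which is presumably why the authors chose ElasticSearch.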

2020 ◽  
Author(s):  
Mauricio de Alvarenga Mudadu ◽  
Adhemar Zerlotini

Abstract

Background: Genome projects and multiomics experiments generate huge volumes of data that must be stored, mined, and transformed into useful knowledge. All this information is supposed to be accessible and, if possible, browsable afterwards. Computational biologists have been dealing with this scenario for over a decade and have been implementing software libraries, toolkits, platforms, and databases to meet this challenge. The GMOD (Generic Model Organism Database project) biological relational database schema, known as Chado, is one of the few successful open source initiatives; it is widely adopted and many software packages are able to connect to it.

Results: We have been developing an open source software package named Machado (https://github.com/lmb-embrapa/machado), a genomics data integration framework implemented in Python, to enable research groups to store, browse, query, and visualize genomics data. The framework relies on the Chado database schema and should therefore be straightforward for current developers to adopt or to run on top of existing databases. It has several data-loading tools for genomics and transcriptomics data and also for annotation results from tools such as BLAST, InterproScan, OrthoMCL, and LSTrAP. There is an API to connect to JBrowse, and a web browsing and visualisation tool is implemented using Django Views and Templates. The Haystack library integrated with the ElasticSearch engine was used to implement a Google-like search, i.e., a single auto-complete search box that provides fast results and incremental filters.

Conclusion: Machado aims to be a modern object-relational framework that uses the latest Python libraries to produce an effective open source resource for genomics research.


2008 ◽  
Vol 9 (6) ◽  
pp. R102 ◽  
Author(s):  
Brian D O'Connor ◽  
Allen Day ◽  
Scott Cain ◽  
Olivier Arnaiz ◽  
Linda Sperling ◽  
...  

2015 ◽  
Vol 44 (D1) ◽  
pp. D1195-D1201 ◽  
Author(s):  
Carson M. Andorf ◽  
Ethalinda K. Cannon ◽  
John L. Portwood ◽  
Jack M. Gardiner ◽  
Lisa C. Harper ◽  
...  

genesis ◽  
2015 ◽  
Vol 53 (8) ◽  
pp. 498-509 ◽  
Author(s):  
Leyla Ruzicka ◽  
Yvonne M. Bradford ◽  
Ken Frazer ◽  
Douglas G. Howe ◽  
Holly Paddock ◽  
...  

2008 ◽  
Vol 2008 ◽  
pp. 1-10 ◽  
Author(s):  
Carolyn J. Lawrence ◽  
Lisa C. Harper ◽  
Mary L. Schaeffer ◽  
Taner Z. Sen ◽  
Trent E. Seigfried ◽  
...  

In 2001 maize became the number one production crop in the world with the Food and Agriculture Organization of the United Nations reporting over 614 million tonnes produced. Its success is due to the high productivity per acre in tandem with a wide variety of commercial uses. Not only is maize an excellent source of food, feed, and fuel, but also its by-products are used in the production of various commercial products. Maize's unparalleled success in agriculture stems from basic research, the outcomes of which drive breeding and product development. In order for basic, translational, and applied researchers to benefit from others' investigations, newly generated data must be made freely and easily accessible. MaizeGDB is the maize research community's central repository for genetics and genomics information. The overall goals of MaizeGDB are to facilitate access to the outcomes of maize research by integrating new maize data into the database and to support the maize research community by coordinating group activities.


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 224
Author(s):  
Mihaela Muntean ◽  
Claudiu Brândaş ◽  
Tanita Cîrstea

An Application-to-Application integration framework for the cloud environment is proposed. The methodology is developed using a data symmetry approach. The implementation uses the Open Data Protocol (OData) service as the integrator. An important issue in the cloud environment is integrating transferred and processed data while ensuring its quality. An efficient way of ensuring the completeness and integrity of data transferred between different applications and systems is the symmetry of data integration. With these considerations, the integration of SAP Hybris Cloud for Customer with S/4 HANA Cloud was implemented.

