End-To-End Computer Vision Framework: An Open-Source Platform for Research and Education

Computer Vision is an interdisciplinary research field whose main purpose is to understand the surrounding environment as closely as possible to human perception. Image processing systems are continuously growing and expanding into more complex systems, usually tailored to the specific needs of the applications they serve. To better serve this purpose, research on the architecture and design of such systems is also important. We present the End-to-End Computer Vision Framework (EECVF), an open-source solution that aims to support researchers and teachers within the vast field of image processing. The framework incorporates Computer Vision features and Machine Learning models that researchers can use. Given the continuous need to add new Computer Vision algorithms in day-to-day research activity, the proposed framework has the advantage of a configurable and scalable architecture. Although the main focus of the framework is the Computer Vision processing pipeline, it also offers ways to incorporate more complex activities, such as training Machine Learning models. EECVF aims to become a useful tool for learning activities in the Computer Vision field, as it allows the learner and the teacher to handle only the topics at hand, and not the interconnections necessary for the visual processing flow.

A survey on various image processing techniques and machine learning models to detect, quantify and classify foliar plant disease

Proceedings of the Indian National Science Academy ◽

10.1007/s43538-021-00027-4 ◽

2021 ◽

Author(s):

Akruti Naik ◽

Hetal Thaker ◽

Dhaval Vyas

Keyword(s):

Machine Learning ◽

Image Processing ◽

Plant Disease ◽

Learning Models ◽

Image Processing Techniques ◽

Processing Techniques ◽

Machine Learning Models

A Platform to Manage the End-to-End Lifecycle of Batch-Prediction Machine Learning Models

2021 IEEE 15th International Symposium on Applied Computational Intelligence and Informatics (SACI) ◽

10.1109/saci51354.2021.9465588 ◽

2021 ◽

Author(s):

Adrian-Ioan Argesanu ◽

Gheorghe-Daniel Andreescu

Keyword(s):

Machine Learning ◽

Learning Models ◽

End To End ◽

Machine Learning Models

Machine Learning Boosted Docking (HASTEN): An Open-Source Tool To Accelerate Structure-based Virtual Screening Campaigns

10.26434/chemrxiv.14345849 ◽

2021 ◽

Author(s):

Tuomo Kalliokoski

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Open Source ◽

Learning Models ◽

Open Source Tool ◽

The Mean ◽

Machine Learning Models

The software macHine leArning booSTed dockiNg (HASTEN) was developed to accelerate structure-based virtual screening using machine learning models. It has been validated using datasets both from the literature (12 datasets, each containing three million molecules docked with FRED) and from in-house sources (one dataset of four million compounds docked with Glide). On the literature data, HASTEN showed reasonable performance, with a mean recall of 0.78 for the top one percent of scoring molecules after docking 10% of the dataset, whereas an excellent recall of 0.95 was achieved for the in-house data. The program can be used with any docking and machine learning methodology, and is freely available from https://github.com/TuomoKalliokoski/HASTEN.
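
The recall figure quoted above can be illustrated with a small calculation. The following is a minimal sketch, not code from HASTEN: the function name `top_fraction_recall` is hypothetical, and it assumes that a lower docking score means a better molecule. It measures what share of the truly best-scoring molecules ended up in the subset that was actually docked.

```python
def top_fraction_recall(true_scores, selected, top_fraction=0.01):
    """Share of the best-scoring molecules that appear in `selected`.

    true_scores: docking score for every molecule in the library
                 (assumption: lower score = better).
    selected:    indices of the molecules that were actually docked,
                 e.g. the 10% picked by a machine learning model.
    """
    n_top = max(1, round(len(true_scores) * top_fraction))
    # Rank molecule indices from best (lowest) to worst (highest) score.
    ranked = sorted(range(len(true_scores)), key=lambda i: true_scores[i])
    top = set(ranked[:n_top])
    return len(top & set(selected)) / n_top


# Toy library of 100 molecules whose score equals their index.
scores = [float(i) for i in range(100)]
# Suppose only molecules 0..4 were docked; measure recall of the top 10%.
recall = top_fraction_recall(scores, range(5), top_fraction=0.1)
```

In this toy case half of the ten best molecules were docked, so the recall of the top 10% is 0.5. The reported 0.78 and 0.95 are the same kind of quantity, computed for the top one percent.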

Improving Logging Prediction on Imbalanced Datasets

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2016040103 ◽

2016 ◽

Vol 7 (2) ◽

pp. 43-71 ◽

Cited By ~ 3

Author(s):

Sangeeta Lal ◽

Neetu Sardana ◽

Ashish Sureka

Keyword(s):

Machine Learning ◽

Open Source ◽

Class Imbalance ◽

Learning Model ◽

Learning Models ◽

Class Imbalance Problem ◽

Imbalanced Datasets ◽

Imbalance Problem ◽

Machine Learning Model ◽

Machine Learning Models

Logging is an important yet tough decision for OSS developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code constructs. The prediction performance of these models is limited by the class-imbalance problem, since the number of logged code constructs is small compared to the number of non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performance of J48, RF, and SVM classifiers for predicting logged catch-block and if-block code constructs on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, the LogIm model improves the performance of the baseline classifiers J48, RF, and SVM by 7.38%, 9.24%, and 4.6% for catch-block logging prediction, and by 12.11%, 14.95%, and 19.13% for if-block logging prediction.
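
The abstract does not describe how LogIm works internally, so the following is only a generic sketch of the ensemble-plus-threshold idea on imbalanced data, not the authors' algorithm. All names and numbers are hypothetical. It averages the positive-class probabilities of several base classifiers and picks the decision threshold that maximizes F1 on validation data, instead of using the default 0.5.

```python
def f1_score(y_true, y_pred):
    """F1 of the positive class (label 1); 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ensemble_probability(prob_lists):
    """Average per-sample positive-class probabilities across base classifiers."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


def predict(probs, threshold):
    """Label a sample positive when its probability reaches the threshold."""
    return [int(p >= threshold) for p in probs]


def best_threshold(y_true, probs):
    """Return the candidate threshold (0.01 .. 0.99) with the highest F1."""
    candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda t: f1_score(y_true, predict(probs, t)))


# Hypothetical validation set: 8 non-logged (0) and 2 logged (1) constructs.
y_val = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical probabilities from three base classifiers.
clf_a = [0.10, 0.20, 0.10, 0.30, 0.20, 0.10, 0.20, 0.30, 0.40, 0.45]
clf_b = [0.05, 0.10, 0.20, 0.20, 0.10, 0.30, 0.10, 0.20, 0.50, 0.35]
clf_c = [0.10, 0.10, 0.10, 0.25, 0.15, 0.20, 0.15, 0.25, 0.45, 0.40]

probs = ensemble_probability([clf_a, clf_b, clf_c])
threshold = best_threshold(y_val, probs)
```

With these made-up numbers the default threshold of 0.5 labels no sample as logged, because the minority class rarely receives high probabilities. A tuned, lower threshold recovers both logged constructs. This is the usual motivation for threshold adjustment under class imbalance.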

Saga: An Open Source Platform for Training Machine Learning Models and Community-driven Sharing of Techniques

2019 International Conference on Content-Based Multimedia Indexing (CBMI) ◽

10.1109/cbmi.2019.8877455 ◽

2019 ◽

Author(s):

Rune Johan Borgli ◽

Hakon Kvale Stensland ◽

Pal Halvorsen ◽

Michael Alexander Riegler

Keyword(s):

Machine Learning ◽

Open Source ◽

Learning Models ◽

Machine Learning Models

End-to-End Latency Prediction of Microservices Workflow on Kubernetes: A Comparative Evaluation of Machine Learning Models and Resource Metrics

Proceedings of the 54th Hawaii International Conference on System Sciences ◽

10.24251/hicss.2021.208 ◽

2021 ◽

Author(s):

Haytham Mohamed ◽

Omar El-Gayar

Keyword(s):

Machine Learning ◽

Comparative Evaluation ◽

Learning Models ◽

End To End ◽

Machine Learning Models

Arangopipe, a tool for machine learning meta-data management

Data Science ◽

10.3233/ds-210034 ◽

2021 ◽

pp. 1-15

Author(s):

Jörg Schad ◽

Rajiv Sambasivan ◽

Christopher Woodward

Keyword(s):

Machine Learning ◽

Life Cycle ◽

Open Source ◽

Data Model ◽

Application Programming Interface ◽

Learning Models ◽

Essential Components ◽

Application Programming ◽

Programming Interface ◽

Machine Learning Models

Experimenting with different models, documenting results and findings, and repeating these tasks are day-to-day activities for machine learning engineers and data scientists. There is a need to keep control of the machine-learning pipeline and its metadata. This allows users to iterate quickly through experiments and retrieve key findings and observations from historical activity. This is the need that Arangopipe serves. Arangopipe is an open-source tool that provides a data model that captures the essential components of any machine learning life cycle. Arangopipe provides an application programming interface that permits machine-learning engineers to record the details of the salient steps in building their machine learning models. The components of the data model and an overview of the application programming interface are provided, along with illustrative examples of basic and advanced machine learning workflows. Arangopipe is useful not only for users developing machine learning models but also for users deploying and maintaining them.

Glycowork: A Python package for glycan data science and machine learning

10.1101/2021.04.22.440981 ◽

2021 ◽

Author(s):

Luc Thomès ◽

Rebekka Burkholz ◽

Daniel Bojar

Keyword(s):

Machine Learning ◽

Open Source ◽

Data Science ◽

Biological Processes ◽

Biological Sequence ◽

Learning Models ◽

Related Data ◽

Strong Focus ◽

Python Package ◽

Machine Learning Models

As a biological sequence, glycans occur in every domain of life and comprise monosaccharides that are chained together to form oligo- or polysaccharides. While glycans are crucial for most biological processes, existing analysis modalities make it difficult for researchers with limited computational background to include information from these diverse and nonlinear sequences into standard workflows. Here, we present glycowork, an open-source Python package that was designed for the processing and analysis of glycan data by end users, with a strong focus on glycan-related data science and machine learning. Glycowork includes numerous functions to, for instance, automatically annotate glycan motifs and analyze their distributions via heatmaps and statistical enrichment. We also provide visualization methods, routines to interact with stored databases, trained machine learning models, and learned glycan representations. We envision that glycowork can extract further insights from any glycan dataset and demonstrate this with several workflows that analyze glycan motifs in various biological contexts. Glycowork can be freely accessed at https://github.com/BojarLab/glycowork/.

Comparing two classes of end-to-end machine-learning models in lung nodule detection and classification: MTANNs vs. CNNs

Pattern Recognition ◽

10.1016/j.patcog.2016.09.029 ◽

2017 ◽

Vol 63 ◽

pp. 476-486 ◽

Cited By ~ 71

Author(s):

Nima Tajbakhsh ◽

Kenji Suzuki

Keyword(s):

Machine Learning ◽

Lung Nodule ◽

Learning Models ◽

Lung Nodule Detection ◽

Nodule Detection ◽

End To End ◽

Machine Learning Models

Machine Learning for Organic Cage Property Prediction

10.26434/chemrxiv.6995018.v2 ◽

2018 ◽

Author(s):

Lukas Turcani ◽

Rebecca L. Greenaway ◽

Kim Jelfs

Keyword(s):

Machine Learning ◽

Open Source ◽

Data Sets ◽

Cavity Size ◽

Learning Models ◽

Property Prediction ◽

Online Tool ◽

Machine Learning Models

We use machine learning to predict shape persistence and cavity size in porous organic cages. The majority of hypothetical organic cages suffer from a lack of shape persistence and as a result lack intrinsic porosity, rendering them unsuitable for many applications. We have created the largest computational database of these molecules to date, numbering 63,472 cages, formed through a range of reaction chemistries and in multiple topologies. We study our database and identify features which lead to the formation of shape persistent cages. We find that the imine condensation of trialdehydes and diamines in a [4+6] reaction is the most likely to result in shape persistent cages, whereas thiol reactions are most likely to give collapsed cages. Using this database, we develop machine learning models capable of predicting shape persistence with an accuracy of up to 93%, reducing the time taken to predict this property to milliseconds, and removing the need for specialist software. In addition, we develop machine learning models for two other key properties of these molecules, cavity size and symmetry. We provide open-source implementations of our models, together with the accompanying data sets, and an online tool giving users access to our models to easily obtain predictions for a hypothetical cage prior to a synthesis attempt.