Using Genetic Programming for Data Science: Lessons Learned

Author(s):  
Steven Gustafson ◽  
Ram Narasimhan ◽  
Ravi Palla ◽  
Aisha Yousuf
2019 ◽  
pp. 447-465
Author(s):  
Kurt Stockinger ◽  
Martin Braschler ◽  
Thilo Stadelmann

2020 ◽  
Vol 3 (1) ◽  
pp. 289-314 ◽  
Author(s):  
James M. Hoffman ◽  
Allen J. Flynn ◽  
Justin E. Juskewitch ◽  
Robert R. Freimuth

Pharmacogenomic information must be incorporated into electronic health records (EHRs) with clinical decision support in order to fully realize its potential to improve drug therapy. Supported by various clinical knowledge resources, pharmacogenomic workflows have been implemented in several healthcare systems. Little standardization exists across these efforts, however, which limits scalability both within and across clinical sites. Limitations in information standards, knowledge management, and the capabilities of modern EHRs remain barriers to the widespread use of pharmacogenomics in the clinic, but ongoing efforts are addressing these challenges. Although much work remains to use pharmacogenomic information more effectively within clinical systems, the experiences of pioneering sites and the lessons learned from their programs may be instructive for other clinical areas beyond genomics. We present a vision of what can be achieved as informatics and data science converge to enable further adoption of pharmacogenomics in the clinic.


Author(s):  
Akeem Pedro ◽  
Anh-Tuan Pham-Hang ◽  
Phong Thanh Nguyen ◽  
Hai Chien Pham

Accident, injury, and fatality rates remain disproportionately high in the construction industry. Information from past mishaps provides an opportunity to acquire insights, gather lessons learned, and systematically improve safety outcomes. Advances in data science and Industry 4.0 present unprecedented opportunities for the industry to leverage, share, and reuse safety information more efficiently. However, the potential benefits of information sharing are missed because accident data are inconsistently formatted, non-machine-readable, and inaccessible. Hence, learning opportunities and insights cannot be captured and disseminated to proactively prevent accidents. To address these issues, a novel information sharing system is proposed that utilizes linked data, ontologies, and knowledge graph technologies. An ontological approach is developed to semantically model safety information and formalize knowledge pertaining to accident cases. A multi-algorithmic approach is developed to automatically process and convert accident case data to the Resource Description Framework (RDF), and the SPARQL protocol is deployed to enable query functionality. Trials and test scenarios utilizing a dataset of 200 real accident cases confirm the effectiveness and efficiency of the system in improving information access, retrieval, and reusability. The proposed development facilitates a new "open" information sharing paradigm with major implications for Industry 4.0 and data-driven applications in construction safety management.
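
The abstract names its technical stack (linked data, ontologies, RDF, SPARQL) without reproducing the ontology itself. Purely as an illustration, the following minimal Python sketch (using the rdflib library) shows how a single accident case could be expressed as RDF triples and queried with SPARQL; the `acc:` namespace and the property names (`accidentType`, `tradeOfVictim`, `severity`) are hypothetical placeholders, not the authors' published ontology.

```python
# Minimal sketch: modelling one accident case as RDF and querying it with
# SPARQL. Namespace and property names are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, RDF

ACC = Namespace("http://example.org/construction-safety#")  # assumed URI

g = Graph()
g.bind("acc", ACC)

# One accident case, as it might look after automatic conversion to RDF.
case = ACC["case-017"]
g.add((case, RDF.type, ACC.Accident))
g.add((case, ACC.accidentType, Literal("fall from height")))
g.add((case, ACC.tradeOfVictim, Literal("scaffolder")))
g.add((case, ACC.severity, Literal("fatal")))

# SPARQL query: all fall-related cases and the trades of the victims.
results = g.query("""
    PREFIX acc: <http://example.org/construction-safety#>
    SELECT ?case ?trade WHERE {
        ?case a acc:Accident ;
              acc:accidentType ?type ;
              acc:tradeOfVictim ?trade .
        FILTER(CONTAINS(STR(?type), "fall"))
    }
""")
for row in results:
    print(row.case, row.trade)
```

In a deployment like the one described, the graph would hold hundreds of cases behind a SPARQL endpoint, so safety managers could retrieve every precedent matching a planned activity rather than rereading accident reports by hand.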


2017 ◽  
Vol 2 (3) ◽  
pp. 40-49 ◽  
Author(s):  
Mohammed Zuhair Al-Taie ◽  
Naomie Salim ◽  
Adekunle Isiaka Obasa

The workflow of a data science project, from data understanding to deployment of an analytical model, begins with framing the problem at hand, a task that is typically business-oriented and requires human-to-human interaction. However, the three steps that follow in the pipeline, data understanding, feature extraction, and model building, are the key to a successful data science project. Failing to fully understand the requirements of each of these three steps can negatively affect the performance of the proposed system. Hence, the current study tries to answer the question: "What are the requirements of a successful data science project?" To answer it, we use the solution that we built to measure the relevance of local search results for small online e-businesses, which we submitted to the Kaggle data science platform, to shed light on why our solution did not achieve a top position among the competitors. We evaluate the design that we submitted to the competition in light of the three winning submissions. Our results reveal that thorough data preprocessing, well-defined features, and model ensembling are critical to building successful data science projects. This clarification provides insight into specific aspects of model design and can help others, including Kagglers, avoid common mistakes when approaching their data science projects.
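
Of the three success factors identified, model ensembling is the most concrete to demonstrate. The sketch below shows prediction averaging, one common ensembling pattern in Kaggle competitions; the models and the synthetic data are placeholders, not the authors' relevance-scoring pipeline.

```python
# Minimal sketch of prediction averaging, a simple form of model ensembling.
# Models and data are illustrative placeholders, not the study's pipeline.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three diverse base models; diversity is what makes the average useful.
models = [
    Ridge(),
    RandomForestRegressor(random_state=0),
    GradientBoostingRegressor(random_state=0),
]

# Average the per-model predictions to reduce variance relative to any
# single model.
ensemble_pred = np.mean(
    [m.fit(X_train, y_train).predict(X_test) for m in models], axis=0
)
print("ensemble RMSE:", np.sqrt(np.mean((ensemble_pred - y_test) ** 2)))
```

Weighted averages or a stacking meta-model are natural refinements, but even this unweighted average typically beats the weakest of its members.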


2018 ◽  
Vol 12 (2) ◽  
pp. 266-273
Author(s):  
Jez Cope ◽  
James Baker

Much time and energy is now devoted to developing the data analysis and data management skills of researchers. Far less attention is paid to developing the data skills of librarians themselves: these skills are often brought in through recruitment into niche roles rather than treated as a development need for the wider library workforce, and they are not widely recognised as important to the professional career development of librarians. We believe that building computational and data science capacity within academic libraries will directly benefit both librarians and the users we serve. Library Carpentry is a global effort to train librarians in technical areas that have traditionally been seen as the preserve of researchers, IT support, and systems librarians. Established non-profit volunteer organisations such as Software Carpentry and Data Carpentry offer introductory research software skills training focused on the needs and requirements of research scientists; Library Carpentry is a comparable introductory software skills training programme focused on the needs and requirements of library and information professionals. This paper describes how the material was developed and delivered, and reports on challenges faced, lessons learned, and future plans.

