Model Data

Author(s):  
Kai R. Larsen ◽  
Daniel S. Becker

After preparing your dataset, you should be familiar with the business problem, the subject matter, and the content of the dataset. This section is about modeling data: using data to train algorithms to create models that can predict future events or explain past ones. The section shows where data modeling fits in the overall machine learning pipeline. Traditionally, real-world data is stored in one or more databases or files. This data is extracted, and features and a target (T) are created and submitted to the “Model Data” stage (the topic of this section). Once this stage is complete, the resulting model is examined (Section V) and placed into production. With the model in the production system, current data generated in the real-world environment is fed into the system. In the example case of a diabetes patient, we enter a new patient’s information from the electronic health record into the system, and a database lookup retrieves additional data for feature creation.
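
As a minimal sketch of this pipeline, the code below trains a model on historical features and a target, then scores a newly arriving patient record. It assumes scikit-learn; the file name and columns are hypothetical, not taken from the text.

```python
# Minimal sketch of the "Model Data" stage, assuming scikit-learn and a
# hypothetical diabetes dataset; file and column names are illustrative only.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Historical data: engineered features plus the target T.
df = pd.read_csv("historical_patients.csv")          # hypothetical file
X, y = df.drop(columns=["T"]), df["T"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))

# In production: a new patient's record arrives, additional fields are looked
# up, features are rebuilt the same way, and the model scores the patient.
new_patient = X_test.iloc[[0]]                       # stand-in for a lookup
print("predicted probability:", model.predict_proba(new_patient)[0, 1])
```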

2020 ◽  
Vol 13 ◽  
pp. 175628642092268 ◽  
Author(s):  
Francesco Patti ◽  
Andrea Visconti ◽  
Antonio Capacchione ◽  
Sanjeev Roy ◽  
Maria Trojano ◽  
...  

Background: The CLARINET-MS study assessed the long-term effectiveness of cladribine tablets by following patients with multiple sclerosis (MS) in Italy, using data from the Italian MS Registry. Methods: Real-world data (RWD) from Italian MS patients who had participated in randomised clinical trials of cladribine tablets (RCTs; CLARITY, CLARITY Extension, ONWARD or ORACLE-MS) across 17 MS centres were obtained from the Italian MS Registry. RWD were collected during a set observation period, spanning from the last dose of cladribine tablets during the RCT (defined as baseline) to the last visit date in the registry, treatment switch to other disease-modifying drugs, date of last Expanded Disability Status Scale recording or date of the last relapse (whichever occurred last). Time-to-event analysis was completed using the Kaplan–Meier (KM) method. Median duration and associated 95% confidence intervals (CI) were estimated from the model. Results: The time span under observation in the Italian MS Registry was 1–137 (median 80.3) months. In the total Italian patient population ( n = 80), the KM estimates for the probability of being relapse-free at 12, 36 and 60 months after the last dose of cladribine tablets were 84.8%, 66.2% and 57.2%, respectively. The corresponding probability of being progression-free at 60 months after the last dose was 63.7%. The KM estimate for the probability of not initiating another disease-modifying treatment at 60 months after the last dose of cladribine tablets was 28.1%, and the median time-to-treatment change was 32.1 (95% CI 15.5–39.5) months. Conclusion: CLARINET-MS provides an indirect measure of the long-term effectiveness of cladribine tablets. Over half of MS patients analysed did not relapse or experience disability progression during 60 months of follow-up from the last dose, suggesting that cladribine tablets remain effective in years 3 and 4 after short courses at the beginning of years 1 and 2.
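
For readers unfamiliar with the method, the following is a minimal sketch of the Kaplan–Meier time-to-event analysis described above, using the lifelines library on synthetic data (not the registry data); the durations and event indicators are invented.

```python
# Illustrative Kaplan-Meier analysis in the spirit of CLARINET-MS, using the
# lifelines library and synthetic data (not the study's data).
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.utils import median_survival_times

rng = np.random.default_rng(0)
months_to_relapse = rng.exponential(scale=70, size=80)  # synthetic durations
observed = rng.random(80) < 0.6                         # 1 = relapse observed

kmf = KaplanMeierFitter()
kmf.fit(months_to_relapse, event_observed=observed, label="relapse-free")

# Probability of being relapse-free at 12, 36 and 60 months after baseline.
print(kmf.predict([12, 36, 60]))
# Median time-to-event and its 95% confidence interval.
print(kmf.median_survival_time_)
print(median_survival_times(kmf.confidence_interval_))
```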


Cancers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 875
Author(s):  
Kerri Beckmann ◽  
Hans Garmo ◽  
Ingela Franck Lissbrant ◽  
Pär Stattin

Real-world data (RWD), that is, data from sources other than controlled clinical trials, play an increasingly important role in medical research. The development of quality clinical registers, increasing access to administrative data sources, growing computing power and data linkage capacities have contributed to greater availability of RWD. Evidence derived from RWD increases our understanding of prostate cancer (PCa) aetiology, natural history and effective management. While randomised controlled trials offer the best level of evidence for establishing the efficacy of medical interventions and making causal inferences, studies using RWD offer complementary evidence about the effectiveness, long-term outcomes and safety of interventions in real-world settings. RWD provide the only means of addressing questions about risk factors and exposures that cannot be “controlled”, and of assessing rare outcomes. This review provides examples of the value of RWD for generating evidence about PCa, focusing on studies using data from a quality clinical register, namely the National Prostate Cancer Register (NPCR) Sweden, with longitudinal data on advanced PCa in Patient-overview Prostate Cancer (PPC) and data linkages to other sources in Prostate Cancer data Base Sweden (PCBaSe).


2021 ◽  
Vol 39 (28_suppl) ◽  
pp. 253-253
Author(s):  
Maureen Canavan ◽  
Xiaoliang Wang ◽  
Mustafa Ascha ◽  
Rebecca A. Miksad ◽  
Timothy N Showalter ◽  
...  

Background: Among patients with cancer, receipt of systemic oncolytic therapy near the end of life (EOL) does not improve outcomes and worsens patient and caregiver experience. Accordingly, the ASCO/NQF measure, Proportion Receiving Chemotherapy in the Last 14 Days of Life, was published in 2012. Over the last decade there has been exponential growth in high-cost targeted and immune therapies, which may be perceived as less toxic than traditional chemotherapy. In this study, we identified rates and types of EOL systemic therapy in today’s real-world practice; these can serve as benchmarks for cancer care organizations to drive improvement efforts. Methods: Using data from the nationwide Flatiron Health electronic health record (EHR)-derived de-identified database, we included patients who died during 2015 through 2019, were diagnosed after 2011, and who had documented cancer treatment. We identified the use of aggressive EOL systemic treatment (including chemotherapy, immunotherapy, and combinations thereof) at both 30 days and 14 days prior to death. We estimated standardized EOL rates using mixed-level logistic regression models adjusting for patient and practice-level factors. Year-specific adjusted rates were estimated in annualized stratified analysis. Results: We included 57,127 patients, 38% of whom had documentation of having received any type of systemic cancer treatment within 30 days of death (SD: 5%; range: 25%–56%), and 17% within 14 days of death (SD: 3%; range: 10%–30%). Chemotherapy alone was the most common EOL treatment received (18% at 30 days, 8% at 14 days), followed by immunotherapy (± other treatment) (11% at 30 days, 4% at 14 days). Overall rates of EOL treatment did not change over the study period: treatment within 30 days (39% in 2015 to 37% in 2019) and within 14 days (17% in 2015 to 17% in 2019) of death. However, the rates of chemotherapy alone within 30 days of death decreased from 24% to 14%, and within 14 days, from 10% to 6% during the study period. In comparison, rates for immunotherapy with chemotherapy (0%–6% for 30 days, 0%–2% for 14 days), and immunotherapy alone or with other treatment types (4%–13% for 30 days, 1%–4% for 14 days) increased over time for both 30 and 14 days. Conclusions: End-of-life systemic cancer treatment rates have not substantively changed over time despite national efforts and expert guidance. While rates of traditional chemotherapy have decreased, rates of costly immunotherapy and targeted therapy have increased, which has been associated with higher total cost of care and overall healthcare utilization. Future work should examine the drivers of end-of-life care in the era of immuno-oncology.
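
A hedged sketch of how the 30-day and 14-day EOL treatment flags might be computed is shown below; the file and column names are hypothetical, and the mixed-level logistic regression standardization described in the Methods is omitted.

```python
# Sketch of the end-of-life treatment flags described above, using pandas
# with hypothetical column names; model-based standardization is omitted.
import pandas as pd

df = pd.read_csv("cohort.csv", parse_dates=["death_date", "last_tx_date"])

days_before_death = (df["death_date"] - df["last_tx_date"]).dt.days
df["tx_within_30d"] = days_before_death.between(0, 30)
df["tx_within_14d"] = days_before_death.between(0, 14)

# Unadjusted rates, overall and by year of death (the study further
# standardizes these with mixed-level logistic regression).
print(df[["tx_within_30d", "tx_within_14d"]].mean())
print(df.groupby(df["death_date"].dt.year)[["tx_within_30d", "tx_within_14d"]].mean())
```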


Author(s):  
Siddhartha Bhattacharyya ◽  
Paramartha Dutta

The field of industrial informatics has emerged as one of the key disciplines for the intelligent management and dissemination of information in today’s world. With the advent of newer technical know-how, the subject of informative intelligence has assumed increasing importance in the industrial arena, thanks to the evolution of data-intensive industry. Real-world data exhibit varying amounts of unquantifiable uncertainty in their information content. Conventional logic is often unable to account for the associated uncertainty and imprecision, owing to the finiteness of observations and the quantifying propositions it employs. Fuzzy sets and fuzzy logic provide a logical framework for describing the varied ambiguity, uncertainty and imprecision exhibited in real-world data. The resulting fuzzy inference engine and fuzzy logic control theory supplement the power of the framework in the design of robust, failsafe real-life systems.
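
As a toy illustration of the idea (not code from the text), the sketch below defines a triangular fuzzy membership function and fires a single fuzzy rule, showing how membership is graded rather than binary; the sets and the rule are invented.

```python
# Minimal illustration of the fuzzy-set idea: graded membership instead of
# a hard true/false cut-off. Pure Python; the sets and rule are made up.
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set peaking at b over [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Overlapping fuzzy sets over a temperature reading.
warm = lambda t: triangular(t, 15, 25, 35)
hot = lambda t: triangular(t, 30, 45, 60)

t = 32.0
# A toy rule "IF warm AND hot THEN alert": AND is taken as min, so the rule
# fires to the degree both sets are satisfied, not as a binary switch.
alert_degree = min(warm(t), hot(t))
print(f"warm={warm(t):.2f} hot={hot(t):.2f} alert={alert_degree:.2f}")
```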


Author(s):  
Mario S. Staller ◽  
Swen Körner

Abstract Professionalism in law enforcement requires the identification and development of expertise in police use of force (PUOF) coaches. Effective PUOF training includes transfer from the training setting into the real-world environment of policing. The difference between working in the field and working as a PUOF coach has not been thoroughly investigated. However, research in other professional domains has shown that practical competence in the subject matter itself does not make a coach effective or successful. With this article, we conceptualize expert practice in PUOF instruction on the basis of a conflict management training setting in the security domain. First, by discussing a model of “territories of expertise”, we point out the dynamic and contextual character of expertise within the PUOF domain. Second, by conceptualizing expertise as a process and effect of communication, we provide a framework that describes and examines the interdependency between performance-based and reputation-based expertise. These considerations present two practical challenges, which we recommend professional law enforcement institutions engage with. We close by providing practical orientations and pointers for addressing these issues.


2009 ◽  
Vol 103 (1) ◽  
pp. 62-68
Author(s):  
Kathleen Cage Mittag ◽  
Sharon Taylor

Using activities to create and collect data is not a new idea. Teachers have been incorporating real-world data into their classes since at least the advent of the graphing calculator. Plenty of data collection activities and data sets exist, and the graphing calculator has made modeling data much easier. However, we were in search of a better physical model for a quadratic: we wanted students to see an actual parabola take shape in real time and then explore its characteristics, but we could not find such a hands-on model.
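
As a small illustration of modeling collected data with a quadratic, the sketch below fits a parabola to invented (time, height) measurements with numpy; a graphing calculator's quadratic regression performs the same computation.

```python
# Quadratic regression on collected data, as a graphing calculator would do:
# least-squares fit with numpy. The data points here are invented.
import numpy as np

# Hypothetical (time, height) measurements from a ball toss.
t = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
h = np.array([1.0, 2.6, 3.4, 3.5, 2.8, 1.3])

a, b, c = np.polyfit(t, h, deg=2)        # fit h = a*t^2 + b*t + c
print(f"h(t) = {a:.2f} t^2 + {b:.2f} t + {c:.2f}")
print("vertex at t =", -b / (2 * a))     # axis of symmetry of the parabola
```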


2019 ◽  
Vol 64 ◽  
pp. 1-20 ◽  
Author(s):  
Alireza Farhadi ◽  
Mohammad Ghodsi ◽  
Mohammad Taghi Hajiaghayi ◽  
Sébastien Lahaie ◽  
David Pennock ◽  
...  

We study fair allocation of indivisible goods to agents with unequal entitlements. Fair allocation has been the subject of many studies in both divisible and indivisible settings. Our emphasis is on the case where the goods are indivisible and agents have unequal entitlements. This problem is a generalization of the work by Procaccia and Wang (2014), wherein the agents are assumed to be symmetric with respect to their entitlements. Although Procaccia and Wang show an almost fair (constant approximation) allocation exists in their setting, our main result is in sharp contrast to their observation. We show that, in some cases with n agents, no allocation can guarantee better than a 1/n approximation of a fair allocation when the entitlements are not necessarily equal. Furthermore, we devise a simple algorithm that ensures a 1/n approximation guarantee. Our second result is for a restricted version of the problem where the valuation of every agent for each good is bounded by the total value they wish to receive in a fair allocation. Although this assumption might seem to be without loss of generality, we show it enables us to find a 1/2 approximation fair allocation via a greedy algorithm. Finally, we run some experiments on real-world data and show that, in practice, a fair allocation is likely to exist. We also support our experiments by showing positive results for two stochastic variants of the problem, namely stochastic agents and stochastic items.
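
The abstract does not spell out the algorithms, so the sketch below is only one plausible greedy reading, not the paper's method: it repeatedly hands the next good to the agent whose received share of its own total value lags its entitlement the most.

```python
# An illustrative greedy allocation for agents with unequal entitlements.
# NOT the paper's algorithm (the abstract gives no details); just one
# natural greedy heuristic for the asymmetric setting.
def greedy_allocate(values, entitlements):
    """values[i][g]: agent i's value for good g; entitlements sum to 1."""
    n, m = len(values), len(values[0])
    received = [0.0] * n                  # value received so far per agent
    totals = [sum(v) for v in values]     # each agent's value for all goods
    bundles = [[] for _ in range(n)]
    for g in range(m):                    # goods in arbitrary order
        # deficit: entitlement minus fraction of own total value received
        i = max(range(n),
                key=lambda i: entitlements[i] - received[i] / totals[i])
        bundles[i].append(g)
        received[i] += values[i][g]
    return bundles

# Two agents with 3/4 and 1/4 entitlements over three goods.
print(greedy_allocate([[5, 1, 3], [2, 4, 4]], [0.75, 0.25]))
```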


2020 ◽  
Author(s):  
Vernon J. Richardson ◽  
Marcia Weidenmier Watson

Technology is revolutionizing accounting. To survive, accountants must focus on areas where they can complement technology and carve out a competitive advantage where the expertise of accountants is uniquely needed. To do so, we highlight the new core competencies emphasizing the use of data analytics. We propose a new, revolutionized curriculum focusing on (1) providing students with a step-by-step framework for analyzing data that includes the use of statistics; (2) using data analytics across the accounting curriculum to build data analytics skills; and (3) incorporating real-world data into that analysis. This new curriculum combines business acumen, which provides context, with the technological adeptness to analyze data, and prepares the CPA professional for the future. We conclude by arguing that the accounting profession faces a choice: either master technology or be mastered by technology. The choice is ours. Act or be acted upon.


The game developed uses real-world map data to generate a real-world 3D environment in Augmented Reality (AR). It demonstrates one way of applying AR technology to a multiplayer role-playing game (RPG). The game can be played in two different modes, role-playing or multiplayer; the system is built as a multiplayer game in a role-playing environment, in which every player has their own role to play. The entire real-world game environment is loaded in AR onto any object that acts as a marker in the real-world environment. The system is developed using the Unity game engine to create the game environment, and the environment itself is generated from real-world data obtained through Mapbox. The Vuforia Studio framework is used to implement the game environment in AR, and the C# programming language is used to program gameplay actions and to generate and modify the environment at runtime.


Author(s):  
B. Khalfi ◽  
C. de Runz ◽  
S. Faiz ◽  
H. Akdag

In the real world, data is imperfect in various ways: imprecision, vagueness, uncertainty, ambiguity and inconsistency. For geographic data, the fuzzy aspect manifests mainly in the time, space and function of objects, and is due to a lack of precision. Researchers in the domain therefore emphasize the importance of modeling data structures in GIS, but also note their lack of adaptation to fuzzy data. The F-Perceptory approach manages the modeling of imperfect geographic information with UML. This management is essential for remaining faithful to reality and for better guiding the user in decision-making. However, the approach does not handle fuzzy complex geographic objects, that is, multiple objects with similar or different geographic shapes. In this paper, we therefore propose to improve the F-Perceptory approach by handling the modeling of fuzzy complex geographic objects. In a second step, we propose its transformation to UML modeling.
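
F-Perceptory itself is a UML-based approach; purely as an illustration of what a "fuzzy complex geographic object" might look like in code, the sketch below composes an object from several shapes, each carrying a membership degree. All names and fields are invented.

```python
# Illustrative reading of a "fuzzy complex geographic object": a named object
# composed of several component shapes, each held with a membership degree
# in [0, 1]. This only mirrors the idea in code; it is not F-Perceptory.
from dataclasses import dataclass, field

@dataclass
class FuzzyShape:
    kind: str          # e.g. "polygon", "line", "point"
    coordinates: list  # vertices of the shape
    membership: float  # how certainly the shape belongs to the object

@dataclass
class FuzzyComplexGeoObject:
    name: str
    parts: list = field(default_factory=list)  # FuzzyShape components

    def core(self, threshold=1.0):
        """Parts that certainly belong to the object."""
        return [p for p in self.parts if p.membership >= threshold]

lake = FuzzyComplexGeoObject("lake")
lake.parts.append(FuzzyShape("polygon", [(0, 0), (1, 0), (1, 1)], 1.0))
lake.parts.append(FuzzyShape("polygon", [(1, 0), (2, 0), (2, 1)], 0.4))
print(len(lake.core()), "certain part(s)")
```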

