Advances in Data Mining and Database Management - Emerging Methods in Predictive Analytics
Published by IGI Global
ISBN: 9781466650633, 9781466650640
Total documents: 15 | H-index: 1

Author(s):
Wesam Elshamy, William H. Hsu

Topic models are probabilistic models for discovering topical themes in collections of documents. These models provide a means of organizing what would otherwise be unstructured collections. The first wave of topic models was able to discover the prevailing topics in a large collection of documents spanning a period of time. These time-invariant models were not capable of modeling (1) the time-varying number of topics they discover and (2) the changing structure of these topics over time. A few models were later developed to address these two deficiencies: the online hierarchical Dirichlet process models documents with a time-varying number of topics, and the continuous-time dynamic topic model evolves topic structure in continuous time. In this chapter, the authors present the continuous-time infinite dynamic topic model, which combines the advantages of these two models. It is a probabilistic topic model that changes both the number of topics and the topic structure over continuous time.
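To make the nonparametric idea concrete, the brief sketch below fits a hierarchical Dirichlet process topic model, in which the number of topics is inferred from the data rather than fixed in advance. It illustrates the HDP family the chapter builds on, not the continuous-time infinite dynamic topic model itself; the gensim library and the toy corpus are assumptions.

```python
# Minimal sketch: nonparametric topic discovery with an HDP model (gensim).
from gensim.corpora import Dictionary
from gensim.models import HdpModel

docs = [
    ["topic", "models", "discover", "themes", "in", "documents"],
    ["dirichlet", "process", "allows", "unbounded", "topics"],
    ["dynamic", "models", "evolve", "topics", "over", "time"],
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# The HDP infers an effective number of topics from the data.
hdp = HdpModel(corpus, id2word=dictionary)
for topic in hdp.print_topics(num_topics=3, num_words=4):
    print(topic)
```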


Author(s):  
Sunita Soni

Medical data mining has great potential for exploring hidden patterns in data sets from the medical domain. Predictive modeling approaches from data mining have been systematically applied to the prognosis, diagnosis, and treatment planning of chronic disease. For example, a classification system can help the physician predict whether a patient is likely to have a certain disease, or, by considering the output of the classification model, the physician can make a better decision about the treatment to be applied. Once the model is evaluated and verified, it may be embedded within clinical information systems. The objective of this chapter is to study the various predictive data mining methods extensively and to evaluate them in terms of accuracy, computational time, comprehensibility of the results, ease of use of the algorithm, and advantages and disadvantages for relatively naive medical users. The research shows that there is no single best prediction tool; instead, the best-performing algorithm depends on the features of the dataset being analyzed.
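As an illustration of the kind of comparison the chapter undertakes, the sketch below evaluates two common classifiers on a public medical dataset using cross-validated accuracy; the dataset, the model choices, and the scikit-learn usage are assumptions for demonstration, not taken from the chapter.

```python
# Minimal sketch: comparing predictive models on a medical dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Candidate models; no single tool is assumed to be best for every dataset.
models = {
    "logistic_regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean cross-validated accuracy = {scores.mean():.3f}")
```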


Author(s):
Jyothi Pillai, O. P. Vyas

Data mining is widely used to extract knowledge from large databases in an attempt to discover existing trends and new patterns. While data mining refers to information extraction, soft computing is more inclined toward information processing. Soft computing exploits the tolerance for imprecision, uncertainty, approximate reasoning, and partial truth to achieve tractability, robustness, and low-cost solutions. For effective knowledge discovery from large databases, soft computing and data mining can be merged. Soft computing techniques include Fuzzy Logic (FL), Neural Networks (NN), Genetic Algorithms (GA), and Rough Sets (RS). FL and RS are highly suitable for handling different types of uncertainty in huge data. NNs are a nonparametric, robust technique and provide good learning and generalization capabilities in data-rich environments. GAs provide efficient search algorithms for selecting a model from mixed-media data based on some priority criterion. Association Rule Mining (ARM) and itemset mining have been a focus of data mining research for a decade, covering the discovery of frequent itemsets and their corresponding association rules, the extraction of rare itemsets, and the inclusion of temporal and fuzzy concepts in discovered patterns. The objective of this chapter is to explore the use of soft computing approaches in itemset utility mining, for both frequent and rare itemsets. In addition, a literature review of applications of soft computing techniques in temporal mining is provided.
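For readers new to itemset mining, the sketch below shows the crisp (non-fuzzy) baseline: frequent itemsets and association rules mined with the Apriori algorithm. The mlxtend library and the toy transactions are assumptions; the soft computing extensions discussed in the chapter would build on this kind of output.

```python
# Minimal sketch: frequent itemset and association rule mining (Apriori).
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [
    ["milk", "bread", "butter"],
    ["bread", "butter"],
    ["milk", "bread"],
    ["milk", "butter"],
]
te = TransactionEncoder()
df = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

# Itemsets appearing in at least half of the transactions.
frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```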


Author(s):  
Rituparna Das

Liquidity Risk Management (LRM) in the banking industry happens at two levels: (1) the Central Bank (i.e., the regulator) and (2) the commercial banks. For the Central Bank, the term "liquidity" means the monetary base, consisting of the currency and the reserves in the banking system; this constitutes the supply side of the interest-rate market. Being the sole supplier of the monetary base, the Central Bank can target interest rates by varying its supply, and vice versa. There are several ways of squeezing liquidity out of the system and pumping it in, including auctioning and redeeming government securities. Before taking such recourse, however, the Central Bank needs an assessment of the system's liquidity requirement and applies forecasting techniques, which are mostly econometric in nature and involve time-series data. This chapter explores this process.
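As a hedged illustration of such econometric forecasting, the sketch below fits a univariate ARIMA model to a synthetic liquidity series and produces a short-horizon forecast; the series, the model order, and the statsmodels usage are assumptions, not the chapter's actual methodology.

```python
# Minimal sketch: time-series forecast of a synthetic liquidity series.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
# Synthetic "banking-system liquidity" series with drift and noise.
liquidity = pd.Series(1000 + np.cumsum(rng.normal(0.5, 5.0, 250)))

model = ARIMA(liquidity, order=(1, 1, 1)).fit()
forecast = model.forecast(steps=10)  # ten-step-ahead liquidity forecast
print(forecast)
```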


Author(s):  
Robert L. Foster Jr.

Recent studies show that the current method of statically allocating and assigning RF spectrum is restrictive and inefficient, and they identify Dynamic Spectrum Access (DSA) as a feasible alternative. The author defines a system design that implements a declarative policy language using Semantic Web technologies. Swarms of multi-band, reconfigurable Cognitive Radio (CR) devices with fast channel switching and real-time spectrum sensing capabilities have been identified as candidates for Multi-Channel Jamming Electronic Attacks (EA). Past demonstrations have shown that a single CR can be deployed in an EA on 802.11 networks. To extend the behavior of CRs beyond single-radio multi-channel jamming, this chapter introduces a system architecture design that relies on a minimal set of CR capabilities to reduce the cost of designing the system.
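The sketch below gives a highly simplified picture of declarative policy evaluation for spectrum access, with plain Python data structures standing in for the Semantic Web policy language the chapter describes; the bands, rules, and thresholds are invented for illustration.

```python
# Minimal sketch: declarative rules deciding whether a radio may transmit.
from dataclasses import dataclass

@dataclass
class PolicyRule:
    min_mhz: float
    max_mhz: float
    max_power_dbm: float
    allow: bool

# Declarative policy: each rule describes a band and a decision (assumed values).
POLICY = [
    PolicyRule(2400.0, 2483.5, 20.0, allow=True),   # ISM band, low power only
    PolicyRule(2483.5, 2500.0, 0.0, allow=False),   # protected band
]

def may_transmit(freq_mhz: float, power_dbm: float) -> bool:
    for rule in POLICY:
        if rule.min_mhz <= freq_mhz < rule.max_mhz:
            return rule.allow and power_dbm <= rule.max_power_dbm
    return False  # default deny when no rule covers the frequency

print(may_transmit(2412.0, 15.0))  # True
print(may_transmit(2490.0, 10.0))  # False
```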


Author(s):
Ming Yang, William H. Hsu, Surya Teja Kallumadi

In this chapter, the authors survey the general problem of analyzing a social network in order to make predictions about its behavior, content, or the systems and phenomena that generated it. They begin by defining five basic tasks that can be performed using social networks: (1) link prediction; (2) pathway and community formation; (3) recommendation and decision support; (4) risk analysis; and (5) planning, especially causal interventional planning. Next, they discuss frameworks for using predictive analytics, availability of annotation, text associated with (or produced within) a social network, information propagation history (e.g., upvotes and shares), trust, and reputation data. They also review challenges such as imbalanced and partial data, concept drift especially as it manifests within social media, and the need for active learning, online learning, and transfer learning. They then discuss general methodologies for predictive analytics involving network topology and dynamics, heterogeneous information network analysis, stochastic simulation, and topic modeling using the abovementioned text corpora. They continue by describing applications such as predicting “who will follow whom?” in a social network, making entity-to-entity recommendations (person-to-person, business-to-business [B2B], consumer-to-business [C2B], or business-to-consumer [B2C]), and analyzing big data (especially transactional data) for Customer Relationship Management (CRM) applications. Finally, the authors examine a few specific recommender systems and systems for interaction discovery, as part of brief case studies.
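As a small illustration of task (1), link prediction, the sketch below scores candidate edges in a toy graph by the Jaccard similarity of their neighborhoods; the networkx library and the example graph are assumptions, and real systems would combine such topological scores with the richer signals discussed above.

```python
# Minimal sketch: neighborhood-overlap link prediction on a toy social graph.
import networkx as nx

G = nx.Graph([("alice", "bob"), ("alice", "carol"),
              ("bob", "dave"), ("carol", "dave"), ("dave", "erin")])

# Score non-edges by Jaccard similarity of their neighborhoods:
# higher scores suggest likely future "who will follow whom" links.
candidates = sorted(nx.jaccard_coefficient(G),
                    key=lambda t: t[2], reverse=True)
for u, v, score in candidates[:3]:
    print(f"{u} -- {v}: {score:.2f}")
```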


Author(s):  
Josh Weese

Pitch detection and instrument identification can be achieved with relatively high accuracy for monophonic signals in music; however, accurately classifying polyphonic signals remains an unsolved research problem. Pitch and instrument classification is a subset of Music Information Retrieval (MIR) and automatic music transcription, both of which have numerous research and real-world applications. Several areas of research are covered in this chapter, including the fast Fourier transform, onset detection, convolution, and filtering. Polyphonic signals with many different voices and frequencies can be exceptionally complex. This chapter presents a new model for representing the spectral structure of polyphonic signals: the Uniform MAx Gaussian Envelope (UMAGE). The new spectral envelope closely approximates the distribution of frequency components in the spectrum while resisting rapid oscillation, and it generalizes well without losing the representation of the original spectrum.
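The sketch below shows the monophonic starting point the chapter builds from: estimating the pitch of a single tone from the peak of its FFT magnitude spectrum. The synthetic 440 Hz signal is an assumption, and this is not the UMAGE envelope itself.

```python
# Minimal sketch: monophonic pitch estimation from the FFT magnitude peak.
import numpy as np

fs = 44100                     # sample rate in Hz
t = np.arange(0, 0.5, 1 / fs)  # half a second of audio
signal = np.sin(2 * np.pi * 440.0 * t)  # A4 sine tone

# Window the signal, take the real FFT, and locate the dominant bin.
spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
print(f"dominant frequency: {freqs[np.argmax(spectrum)]:.1f} Hz")  # ~440 Hz
```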


Author(s):
William Alberto Cruz Castañeda, Renato Garcia Ojeda

According to the World Health Organization, Healthcare Technology (HT) is defined as the application of techniques and knowledge in the form of devices, medicines, vaccines, procedures, and systems in order to solve healthcare problems and enhance the quality of life. Clinical Engineering has emerged as an interdisciplinary profession in the areas of medical equipment and technology management. With the proper support of Information and Communication Technologies (ICTs), many of the challenges in this field can be addressed through ubiquitous environments and services that allow the acquisition, processing, diagnosis, transmission, and sharing of information in real time. Ubiquitous healthcare is a new paradigm that enables models and tools that improve these processes through monitoring, evaluation, prediction, and decision-making about the condition of medical equipment. This chapter presents a ubiquitous management methodology for predictive maintenance, supported by ICTs and predictive analysis techniques, that enhances decision-making about medical equipment.
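As a hedged sketch of what a predictive maintenance decision might look like, the example below combines a few condition indicators that a ubiquitous monitoring environment could collect into a simple maintenance score; the indicators, weights, and threshold are assumptions, not the chapter's methodology.

```python
# Minimal sketch: flagging medical equipment for predictive maintenance.
from dataclasses import dataclass

@dataclass
class EquipmentStatus:
    device_id: str
    usage_hours: float   # hours since last service
    error_rate: float    # self-test failures per 100 cycles
    temperature_c: float # internal operating temperature

def maintenance_score(s: EquipmentStatus) -> float:
    # Weighted combination of normalized indicators; weights are assumptions.
    return (0.5 * min(s.usage_hours / 500.0, 1.0)
            + 0.3 * min(s.error_rate / 5.0, 1.0)
            + 0.2 * min(s.temperature_c / 60.0, 1.0))

status = EquipmentStatus("infusion-pump-17", usage_hours=420,
                         error_rate=2.5, temperature_c=48)
score = maintenance_score(status)
print(f"{status.device_id}: score={score:.2f}, "
      f"{'schedule maintenance' if score > 0.6 else 'ok'}")
```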


Author(s):  
Misha Voloshin

User authentication is the keystone of information security. Even the most craftily built and diligently monitored computer system will crumble if there is a flaw in its user authentication system (Minaev, 2010). A hacker able to exploit such a flaw can convince the computer system that he is a legitimate user, possibly even a specific legitimate user, gaining the ability to read or modify that user's data or to implicate that user in misdeeds that could lead to personal or professional harm. A hacker could even impersonate the system administrator herself, gaining the ability not only to access all of the system's data but also to subvert the very network monitors and automated alerting systems that would notify the real administrator of the hacker's activity (occupytheweb, 2013). This chapter introduces a mechanism that an administrator can use to strengthen a computer's user authentication system and to trigger a lockout and/or email an alert when an impostor is suspected of accessing a user's account. It works by measuring the time intervals between keystrokes as a user types, relying on the fact that most individuals have distinct and identifiable typing patterns that can be discerned through statistical analysis.
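A minimal sketch of the idea follows: inter-keystroke intervals from a session are compared against a stored per-user profile, and the session is flagged when too many intervals are statistical outliers. The z-score cutoff and outlier ratio are assumptions; a production system would use a richer statistical model than this.

```python
# Minimal sketch: keystroke-timing check against a stored typing profile.
import statistics

def intervals(timestamps):
    """Milliseconds between consecutive key presses."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def looks_like_impostor(profile_intervals, session_intervals,
                        z_cutoff=3.0, max_outlier_ratio=0.4):
    mean = statistics.mean(profile_intervals)
    stdev = statistics.stdev(profile_intervals) or 1.0
    outliers = sum(abs((x - mean) / stdev) > z_cutoff
                   for x in session_intervals)
    return outliers / len(session_intervals) > max_outlier_ratio

profile = intervals([0, 110, 205, 330, 440, 560, 650])      # enrolled typing
session = intervals([0, 300, 710, 1150, 1400, 1900, 2300])  # much slower typist
print(looks_like_impostor(profile, session))  # True -> trigger lockout/alert
```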


Author(s):  
John Benjamin Cassel

This chapter provides a stakeholder discovery model for distributed risk governance that is suitable for machine learning and decision-theoretic planning. Distributed risk governance concerns situations in which the underlying risk is not localized or has unknown locality, so any initial interaction with stakeholders is limited and educational and participatory initiatives are costly. Anticipating the initial reaction to communications is therefore critical. To capture this initial reaction, the author samples the population of potential stakeholders to discover both their concerns and their knowledge while handling inaccuracies and contradictions, and the stakeholder discovery model presented here accommodates these inconsistencies. Stakeholder discovery provides a timely strategic assessment of the risk situation. This assessment forecasts projected stakeholder actions and uses reinforcement learning to determine whether those actions are in line with stakeholders' strategic interests or whether better choices exist. Unlike other reinforcement learning formulations, it does not take the state space, criteria, potential observations, other agents, actions, or rewards for granted, but discovers these factors non-parametrically. Overall, this chapter introduces machine learning researchers and risk governance professionals to the compatibility between non-parametric models and early-stage stakeholder discovery problems and addresses widely known biases and deficits within risk governance and intelligence practices.
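As a loose illustration of non-parametric discovery, the sketch below lets a Dirichlet-process mixture decide how many stakeholder groups a set of toy survey responses supports, rather than fixing that number in advance; the data, features, and scikit-learn model are assumptions and do not represent the author's formulation.

```python
# Minimal sketch: non-parametric grouping of stakeholder survey responses.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy responses: two latent concern profiles in two dimensions.
responses = np.vstack([rng.normal([0.2, 0.8], 0.05, size=(30, 2)),
                       rng.normal([0.9, 0.1], 0.05, size=(30, 2))])

mixture = BayesianGaussianMixture(
    n_components=10,  # upper bound; unused components get near-zero weight
    weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(responses)

effective = int(np.sum(mixture.weights_ > 0.05))
print(f"effective stakeholder groups discovered: {effective}")
```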

