Map/Reduce Design and Implementation of Apriori Algorithm for Handling Voluminous Data-Sets

2012 ◽  
Vol 3 (6) ◽  
pp. 29-39 ◽  
Author(s):  
Anjan K Koundinya

2017 ◽  
Vol 7 (1.5) ◽  
pp. 217
Author(s):  
M. Nagalakshmi ◽  
I. Surya Prabha ◽  
K. Anil

Apriori is one of the key algorithms for generating frequent itemsets. Analysing frequent itemsets is a crucial step in analysing structured data and in finding association relationships among items. It is also an elementary foundation for supervised learning, which encompasses classifier and feature extraction methods, and applying this technique is essential to understanding the behaviour of structured data. Most of the structured data in the scientific domain are voluminous, and processing such data requires state-of-the-art computing machines; setting up such an infrastructure is expensive. A distributed environment such as a clustered setup is therefore employed to tackle such scenarios. The Apache Hadoop distribution is one of the cluster frameworks for distributed environments, and it helps by distributing voluminous data across a number of nodes in the cluster. This paper focuses on the map/reduce design and implementation of the Apriori algorithm for structured data analysis.
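As a rough illustration of the map/reduce formulation described above (not the authors' implementation), the sketch below simulates one Apriori pass in plain Python: a mapper emits (itemset, 1) pairs for each candidate itemset found in a transaction, and a reducer sums the counts and filters by minimum support. The transaction data and `min_support` value are invented for the example.

```python
from collections import defaultdict
from itertools import combinations

def mapper(transaction, candidates):
    # Emit (itemset, 1) for every candidate itemset contained in the transaction.
    for c in candidates:
        if set(c).issubset(transaction):
            yield (c, 1)

def reducer(pairs, min_support):
    # Sum counts per itemset and keep those meeting the support threshold.
    counts = defaultdict(int)
    for itemset, one in pairs:
        counts[itemset] += one
    return {k: v for k, v in counts.items() if v >= min_support}

def apriori_pass(transactions, candidates, min_support):
    # In-process stand-in for one distributed map/reduce round.
    pairs = [p for t in transactions for p in mapper(t, candidates)]
    return reducer(pairs, min_support)

transactions = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}]
items = sorted({i for t in transactions for i in t})
# Pass 1: frequent 1-itemsets
f1 = apriori_pass(transactions, [(i,) for i in items], min_support=2)
# Pass 2: candidate 2-itemsets built from frequent 1-itemsets
c2 = [tuple(sorted(p)) for p in combinations(sorted(i[0] for i in f1), 2)]
f2 = apriori_pass(transactions, c2, min_support=2)
```

On a real Hadoop cluster the mapper and reducer would run as distributed tasks over partitions of the transaction file; here both phases run in-process only to show the data flow.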


2017 ◽  
Vol 12 (9) ◽  
pp. 1218-1223 ◽  
Author(s):  
Jared A. Bailey ◽  
Paul B. Gastin ◽  
Luke Mackey ◽  
Dan B. Dwyer

Context: Most previous investigations of player load in netball have used subjective methodologies, with few using objective methodologies. While all studies report differences in player activities or total load between playing positions, it is unclear how the differences in player activity explain differences in positional load. Purpose: To objectively quantify the load associated with typical activities for all positions in elite netball. Methods: The player load of all playing positions in an elite netball team was measured during matches using wearable accelerometers. Video recordings of the matches were also analyzed to record the start time and duration of 13 commonly reported netball activities. The load associated with each activity was determined by time-aligning both data sets (load and activity). Results: Off-ball guarding produced the highest player load per instance, while jogging produced the greatest player load per match. Nonlocomotor activities contributed least to total match load for attacking positions (goal shooter [GS], goal attack [GA], and wing attack [WA]) and most for defending positions (goalkeeper [GK], goal defense [GD], and wing defense [WD]). Specifically, centers (Cs) produced the greatest jogging load, WA and WD accumulated the greatest running load, and GS and WA accumulated the greatest shuffling load. WD and Cs accumulated the greatest guarding load, while WD and GK accumulated the greatest off-ball guarding load. Conclusions: All positions exhibited different contributions from locomotor and nonlocomotor activities toward total match load. In addition, the same activity can have different contributions toward total match load, depending on the position. This has implications for future design and implementation of position-specific training programs.
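The key analytical step above is time-aligning the accelerometer load stream with the video-coded activity intervals. The sketch below is a minimal, hypothetical version of that alignment (the study's actual data format and tooling are not described here): each load sample is attributed to the activity whose interval contains its timestamp.

```python
def load_per_activity(samples, activities):
    """Attribute each (time, load) sample to the activity whose interval contains it.

    samples: list of (t, load) tuples from the accelerometer stream.
    activities: list of (start, duration, name) tuples from the video coding.
    """
    totals = {name: 0.0 for _, _, name in activities}
    for t, load in samples:
        for start, duration, name in activities:
            if start <= t < start + duration:
                totals[name] += load
                break  # each sample counts toward at most one activity
    return totals

# Invented toy data: load samples and two coded activity intervals (seconds)
samples = [(0.5, 1.0), (1.5, 2.0), (2.5, 0.5), (3.5, 1.5)]
activities = [(0, 2, "jogging"), (2, 2, "off-ball guarding")]
totals = load_per_activity(samples, activities)
```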


2016 ◽  
Vol 6 (2) ◽  
pp. 1-23 ◽  
Author(s):  
Surbhi Bhatia ◽  
Manisha Sharma ◽  
Komal Kumar Bhatia

Due to the sudden and explosive growth of web technologies, a huge quantity of user-generated content is available online. People's experiences and opinions play an important role in the decision-making process. While facts make it easy to search for information on a topic, retrieving opinions is still a crucial task, and many studies on opinion mining must be undertaken efficiently in order to extract constructive opinionated information from these reviews. The present work focuses on the design and implementation of an Opinion Crawler, which downloads opinions from various sites while ignoring the rest of the web. It also detects web pages that are updated frequently and calculates a revisit timestamp for each of them in order to extract relevant opinions. The performance of the Opinion Crawler is evaluated on real data sets and shown to be accurate in terms of the precision and recall quality attributes.
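The revisit-timestamp idea can be sketched with a simple adaptive policy (an assumption for illustration; the paper's exact formula is not given here): shorten the revisit interval when a page has changed since the last visit, and lengthen it when it has not.

```python
def next_revisit_interval(prev_interval, changed, min_i=1.0, max_i=96.0):
    # Halve the interval (revisit sooner) when the page changed since the last
    # visit, double it when it did not, clamped to [min_i, max_i] hours.
    # The bounds and the factor of 2 are illustrative choices, not the paper's.
    interval = prev_interval / 2 if changed else prev_interval * 2
    return max(min_i, min(max_i, interval))

# A page observed changed, changed, then unchanged across three visits:
interval = 24.0
for changed in [True, True, False]:
    interval = next_revisit_interval(interval, changed)
```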


Author(s):  
Sherif Sakr

The use of XML continues to grow in popularity: large repositories of XML documents are emerging, and users are likely to pose increasingly complex queries over these data sets. In 2001 the World Wide Web Consortium (W3C) adopted XQuery as the standard XML query language. In this article, we describe the design and implementation of an efficient and scalable purely relational XQuery processor which translates expressions of the XQuery language into equivalent SQL evaluation scripts. The experiments in this article demonstrate the efficiency and scalability of our purely relational approach in comparison with the native XML/XQuery functionality supported by conventional RDBMSs, and show that the purely relational approach to implementing an XQuery processor deserves to be pursued further.
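To illustrate the general idea of compiling an XML query into relational SQL (a toy sketch only, not the processor described in the article), the function below maps a simple child-axis path onto self-joins over a hypothetical edge table `node(id, parent, tag)`:

```python
def path_to_sql(path):
    """Translate a simple child-axis XPath such as '/site/people/person'
    into SQL over a generic edge table node(id, parent, tag).
    Handles only plain child steps; predicates, axes, and ordering are omitted."""
    tags = [t for t in path.split("/") if t]
    joins, preds = [], []
    for i, tag in enumerate(tags):
        alias = f"n{i}"
        joins.append(f"node {alias}")
        preds.append(f"{alias}.tag = '{tag}'")
        if i == 0:
            preds.append(f"{alias}.parent IS NULL")   # document root
        else:
            preds.append(f"{alias}.parent = n{i-1}.id")  # child-of join
    return (f"SELECT n{len(tags)-1}.id FROM " + ", ".join(joins)
            + " WHERE " + " AND ".join(preds))

sql = path_to_sql("/site/people/person")
```

A full XQuery-to-SQL compiler must additionally preserve document order and handle FLWOR expressions, which is where the engineering effort of such processors lies.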


2020 ◽  
pp. 016555152093095
Author(s):  
Gustavo Candela ◽  
Pilar Escobar ◽  
Rafael C Carrasco ◽  
Manuel Marco-Such

Cultural heritage institutions have recently started to share their metadata as Linked Open Data (LOD) in order to disseminate and enrich them. The publication of large bibliographic data sets as LOD is a challenge that requires the design and implementation of custom methods for the transformation, management, querying and enrichment of the data. In this report, the methodology defined by previous research for the evaluation of the quality of LOD is analysed and adapted to the specific case of Resource Description Framework (RDF) triples containing standard bibliographic information. The specified quality measures are reported in the case of four highly relevant libraries.
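One commonly reported quality measure for such RDF data is completeness: the fraction of resources that carry all required bibliographic properties. The snippet below is a minimal, self-contained illustration of that idea over (subject, predicate, object) string triples; the property set and sample data are invented and do not come from the report.

```python
DCT = "http://purl.org/dc/terms/"

def completeness(triples, required=(DCT + "title", DCT + "creator")):
    """Fraction of subjects that carry every required property.

    triples: iterable of (subject, predicate, object) strings.
    """
    props = {}
    for s, p, o in triples:
        props.setdefault(s, set()).add(p)
    subjects = list(props)
    ok = sum(1 for s in subjects if all(r in props[s] for r in required))
    return ok / len(subjects) if subjects else 1.0

# Invented toy data: one complete record, one missing its creator
triples = [
    ("urn:book1", DCT + "title", "Don Quijote"),
    ("urn:book1", DCT + "creator", "Cervantes"),
    ("urn:book2", DCT + "title", "Anonymous work"),
]
score = completeness(triples)
```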


2013 ◽  
Vol 850-851 ◽  
pp. 557-560
Author(s):  
Hai Guang Huang

A novel data-mart-based system for the fishery rescue field was designed and implemented. The system runs an ETL process to consolidate original data from various databases and data warehouses, and then reorganizes the data into the fishery rescue data mart. Next, online analytical processing (OLAP) is carried out and statistical reports are generated automatically. In particular, quick configuration schemes are designed to configure query dimensions and OLAP data sets; the configuration file is transformed into statistical interfaces automatically through a wizard-style process. The system provides various forms of report files, including Crystal Reports, Flash graphical reports, and two-dimensional data grids. In addition, a wizard-style interface guides users in customizing inquiry processes, making it possible for non-technical staff to access customized reports. Characterized by quick configuration, safety, and flexibility, the system has been successfully applied in a city fishery rescue department.
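The quick-configuration idea can be sketched as follows (hypothetical configuration keys and table names; the system's actual schema is not described in the abstract): a report configuration declaring dimensions and a measure is turned into a GROUP BY query for the data mart.

```python
def build_report_sql(cfg):
    """Generate an aggregation query from a declarative report configuration.

    cfg keys (illustrative): fact_table, dimensions (list), measure, aggregate.
    """
    dims = ", ".join(cfg["dimensions"])
    agg = (f'{cfg["aggregate"]}({cfg["measure"]}) '
           f'AS {cfg["measure"]}_{cfg["aggregate"].lower()}')
    return f'SELECT {dims}, {agg} FROM {cfg["fact_table"]} GROUP BY {dims}'

# Invented example configuration for a fishery-rescue statistics report
cfg = {
    "fact_table": "rescue_events",
    "dimensions": ["region", "month"],
    "measure": "vessels_rescued",
    "aggregate": "SUM",
}
sql = build_report_sql(cfg)
```

A wizard would collect exactly these fields from the user, so adding a new report requires editing configuration rather than code.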


2013 ◽  
Vol 411-414 ◽  
pp. 440-443
Author(s):  
Wei Jiang Zheng ◽  
Bing Luo ◽  
Zheng Guang Hu ◽  
Zhong Liang Lv

Meteorology Geographic Information System (MeteoGIS) is a professional meteorological GIS platform built on fully independent intellectual property. It applies national innovative GIS technologies to the meteorological scenario; MeteoGIS supports multiple databases, browsers, and a variety of development environments, and has good cross-platform capability, along with the capacity to manage and distribute massive vector and raster data. MeteoGIS extends meteorological data models and data sets, and is able to produce, lay out, and print meteorological thematic maps. It integrates algorithms for meteorological applications and special-purpose analysis. The platform comprises development kits, a data engine, desktop software, and Web development platforms.


2013 ◽  
Vol 756-759 ◽  
pp. 1300-1303
Author(s):  
Peng Fei Liu ◽  
Yan Hua Chen ◽  
Wen Jie Xie ◽  
Qiao Yi Hu

XML has received wide interest for data exchange and information management on both traditional desktop computing platforms and emerging mobile computing platforms. However, traditional XML retrieval does not work on mobile devices owing to the limitations and diversity of mobile platforms. Considering that XML retrieval on mobile devices will become increasingly popular, this article presents the design and implementation of an XML retrieval and result-clustering model on the Android platform: the XML parser and retrieval engine are built on jaxen and dom4j, and results are grouped with the K-means clustering algorithm. As an example of usage, we tested the prototype on several data sets in a mobile scenario and demonstrated the feasibility of the proposed approach. The model is available on the mobile XML Retrieval project website: http://code.google.com/p/mobilexmlretrieval/.
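As a rough sketch of the result-clustering stage (the article's engine uses jaxen/dom4j on Android; the minimal K-means below is plain Python with invented toy vectors), retrieved XML fragments represented as term-frequency vectors can be grouped like this:

```python
import random

def dist2(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-means on lists of floats; returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for each point.
        assign = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return assign

# Toy term-frequency vectors for four retrieved XML fragments
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
assign = kmeans(points, 2)
```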


2014 ◽  
Vol 654 ◽  
pp. 378-381
Author(s):  
Yu Lin Liu ◽  
Yan Wang ◽  
Jian Tao Zhou

In traditional software testing, a large collection of test cases for the system under test is generated automatically, but executing all of them is not feasible in practice. Normally, we test a particular function of the system, so selecting the test cases relevant to that function is very important. This paper focuses on selecting test cases for a particular function of the system under test based on a CPN model, using a purpose-based selection method. Test case selection involves a great deal of repeated calculation and operation, a characteristic that makes it well suited to the parallelism of cloud computing. Accordingly, this paper addresses the test case selection problem by using MapReduce programming on the Hadoop platform to design a test case selection tool that improves the efficiency and service capability of test selection; the experimental results are consistent with the expected results.
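Purpose-based selection maps naturally onto MapReduce: each mapper checks whether a test case covers the target function and emits its id, and the reducer collects the selected ids. The in-process sketch below (invented test-suite format; not the paper's tool) shows that data flow:

```python
def mapper(test_case, target):
    # Emit (target, id) if the test case's covered functions include the target.
    if target in test_case["covers"]:
        yield (target, test_case["id"])

def reducer(key, ids):
    # Collect all selected test case ids for the target function.
    return {key: sorted(ids)}

def select(test_cases, target):
    # In-process stand-in for a distributed map phase followed by one reduce.
    emitted = [kv for tc in test_cases for kv in mapper(tc, target)]
    return reducer(target, [i for _, i in emitted])

# Invented test suite: each case records which functions its path exercises
suite = [
    {"id": "tc1", "covers": {"login", "logout"}},
    {"id": "tc2", "covers": {"transfer"}},
    {"id": "tc3", "covers": {"login", "transfer"}},
]
selected = select(suite, "login")
```

On Hadoop, the mapper would run in parallel over partitions of the generated test-case set, which is where the claimed efficiency gain comes from.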

