Using data-mining to identify and study patterns in lexical innovation on the web

2018 ◽  
Vol 25 (1) ◽  
pp. 174-200
Author(s):  
Daphné Kerremans ◽  
Jelena Prokić ◽  
Quirin Würschinger ◽  
Hans-Jörg Schmid

This paper presents the NeoCrawler, a tailor-made web crawler that identifies and retrieves neologisms from the Internet and systematically monitors the use of detected neologisms on the web by means of weekly searches. It enables researchers to use the web as a corpus in order to investigate the dynamics of lexical innovation on a large scale and on a systematic basis. The NeoCrawler is an innovative web-mining tool that opens up new opportunities for linguists to tackle a number of unresolved and under-researched issues in the field of lexical innovation. The paper presents the design and the most important characteristics of two modules, the Discoverer and the Observer, with regard to the usage-based study of lexical innovation and diffusion.

Author(s):  
Parimala Boobalan

With recent advances in supercomputer technologies, large-scale, high-precision, and realistic 3D model simulations have become dominant in the fields of solar-terrestrial physics, virtual reality, and health. Since 3D numeric data generated through simulation contain more valuable information than was available in the past, innovative techniques for efficiently extracting such information are required. One such technique is visualization: the process of turning phenomena, events, or relations not directly visible to the human eye into a visible form. Visualizing numeric data generated by observation equipment, simulations, and other means is an effective way of gaining intuitive insight into the overall picture of the data of interest. Meanwhile, data mining is known as the art of extracting valuable information from large amounts of data relating to finance, marketing, the Internet, and the natural sciences, and of turning that information into knowledge.


2011 ◽  
pp. 236-253
Author(s):  
Kuldeep Kumar ◽  
John Baker

Data mining has emerged as one of the hottest topics in recent years. It is an extraordinarily broad area and is growing in several directions. With the advancement of the Internet and the availability of cheap, powerful computers, data is flooding the market at a tremendous pace. However, the technologies for navigating, exploring, visualizing and summarizing large databases are still in their infancy. The quantity and diversity of data available for decision making have increased dramatically during the past decade, and large databases are being built to hold and deliver these data. Data mining is defined as the process of seeking interesting or valuable information within large data sets. Examples of data mining applications in management science include the analysis of direct-mailing strategies, sales data analysis for customer segmentation, credit card fraud detection, and mass customization. With the advancement of the Internet and the World Wide Web, both management scientists and interested end-users can obtain large data sets for their research from this source. The Web not only contains a vast amount of useful information, but also provides a powerful infrastructure for communication and information sharing. For example, Ma, Liu and Wong (2000) have developed a system called DS-Web that uses the Web to help data mining. A recent survey of Web mining research is given in the paper by Kosala and Blockeel (2000).


Data Mining ◽  
2013 ◽  
pp. 1312-1319
Author(s):  
Marco Scarnò

CASPUR allows many Italian academic institutions located in the Centre-South of Italy to access more than 7 million articles through a digital library platform. The behaviour of its users was analyzed by considering their “traces”, which are stored in the web server log file. Using several web mining and data mining techniques, the author discovered a gradual and dynamic change in the way articles are accessed. In particular, there is evidence of an increase in journal browsing relative to searching. This phenomenon was interpreted using the idea that browsing better meets the needs of users who want to keep abreast of the latest advances in their scientific field than a more generic search inside the digital library does.


2013 ◽  
Vol 433-435 ◽  
pp. 1885-1889
Author(s):  
Lu Feng ◽  
Zhan Quan Wen ◽  
Jie Mei Lin

We applied the principles of hyperlink analysis to mine website data according to hyperlink-analysis indicators, selecting Taobao.com as the object of study. The evaluation indicators of network marketing effect were page views, sales quantity, sales, and the number of times a store was bookmarked. Our research finds that Taobao.com stores can use data mining tools to achieve a very good marketing effect.


Author(s):  
Jayanti Mehra ◽  
Ramjeevan Singh Thakur

Weblog analysis takes raw data from access logs and analyzes them to extract statistical information. This information covers a variety of data on website activity, such as the average number of hits, the total number of user visits, failed and successful cached hits, average viewing time, and average path length through the website; diagnostic information such as page-not-found errors and server errors; server information, which includes exit and entry pages, single-access pages, and top visited pages; and requester information, such as which search engines are used, keywords, top referring sites, and so on. In general, the website administrator uses this kind of knowledge to make the system perform better, to help with site maintenance, and to support marketing decisions. Most advanced web mining systems use this kind of information to derive more complex interpretations using data mining procedures such as association rules, clustering, and classification.
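The kind of summary statistics described above can be sketched in a few lines. This is only an illustration under stated assumptions: the log lines are assumed to follow the Apache-style Common Log Format, and the names `LOG_PATTERN`, `summarize_log`, and `SAMPLE` are hypothetical, not taken from the paper.

```python
import re
from collections import Counter

# Assumption: Apache-style Common Log Format, e.g.
# host ident user [time] "METHOD /path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize_log(lines):
    """Return basic access statistics for an iterable of log lines."""
    hits = 0
    hosts = set()
    pages = Counter()
    not_found = 0
    server_errors = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        hits += 1
        hosts.add(match["host"])
        pages[match["path"]] += 1
        status = int(match["status"])
        if status == 404:
            not_found += 1  # page-not-found errors
        elif 500 <= status <= 599:
            server_errors += 1  # server-side errors
    return {
        "hits": hits,
        "unique_hosts": len(hosts),
        "not_found": not_found,
        "server_errors": server_errors,
        "top_pages": pages.most_common(3),
    }


# Hypothetical sample data, invented purely for illustration.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2020:13:56:01 +0000] "GET /missing HTTP/1.1" 404 153',
    '10.0.0.1 - - [10/Oct/2020:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.3 - - [10/Oct/2020:13:58:40 +0000] "GET /api HTTP/1.1" 500 0',
    'not a log line',
]

if __name__ == "__main__":
    print(summarize_log(SAMPLE))
```

Counts like these are the raw material for the more advanced steps the abstract mentions (association rules, clustering, classification), which would operate on sessions or page sets derived from the same parsed records.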


2017 ◽  
Vol 871 ◽  
pp. 44-51
Author(s):  
Christian Sand ◽  
Florian Renz ◽  
Akin Cüneyt Aslanpinar ◽  
Jörg Franke

Modern large-scale assembly lines need to deliver a highly varied and flexible output while achieving 0 ppm scrap. This is becoming more and more demanding due to the increasing complexity of the products. Developing process monitoring strategies that increase the productivity, flexibility, and reliability of the entire assembly process will therefore be a major step for manufacturing. It is necessary to advance the entire chained assembly line instead of only isolated processes and stations. For this reason, technological processes have to be assessed as a chain of upstream and downstream partial processes instead of being considered in isolation. [3] Moreover, data mining projects depend on the available data bases, and additional data sources may increase the derived knowledge. [2] These ideas can be extended by adding energy data measurements to process and quality data. Existing monitoring approaches to reduce scrap usually use dashboards linked with process and quality data. [5] This paper therefore presents a new methodology that applies data mining analysis to energy data for assembly presses as well as complete assembly lines for electromagnetic actuators. This novel holistic approach, realized by a Quick Reaction System, makes it possible to increase efficiency while decreasing energy and resource consumption for actuator manufacturing on large-scale assembly lines. In particular, the data base consists of process and quality data enriched by energy data measurements. This approach enables comprehensive process characterization as well as monitoring of whole assembly lines using data mining tools. Furthermore, the paper describes a quantitative evaluation of its data-mining-based event detection of critical process parameters.


2002 ◽  
Vol 185 ◽  
pp. 168-169
Author(s):  
S. Morgan

Fourier coefficients are a valuable tool in the study of a wide variety of pulsating stars. They can be used to derive various physical parameters, including mass, luminosity, metallicity and effective temperature, and are frequently used to discriminate between different pulsation modes. With the increase in large-scale surveys and the availability of data on the Internet, the number of Fourier coefficients available for study has expanded greatly, and it is difficult to find all current data for individual stars or a subset of stars. To assist others in obtaining and making use of Fourier coefficients, an archive of published values of Fourier coefficients has been set up. Users can search for data on individual stars or for a range of parameters. Several Java programs are used to display the data in a variety of ways. The archive is located at the Web site http://www.earth.uni.edu/fourier/.
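For orientation, one widely used convention for the Fourier decomposition of a pulsating star's light curve is sketched below. The abstract does not state which form the archive uses, so the cosine form and the symbols (magnitude $m(t)$, epoch $t_0$, angular frequency $\omega$, order $N$) are assumptions made for illustration.

```latex
m(t) = A_0 + \sum_{k=1}^{N} A_k \cos\bigl(k\,\omega\,(t - t_0) + \phi_k\bigr),
\qquad
R_{k1} = \frac{A_k}{A_1},
\qquad
\phi_{k1} = \phi_k - k\,\phi_1
```

Under this convention, the amplitude ratios $R_{k1}$ and phase differences $\phi_{k1}$ are the quantities typically tabulated and compared between stars, since they do not depend on the choice of epoch.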


2013 ◽  
Vol 846-847 ◽  
pp. 1868-1872
Author(s):  
Shuai Gang

In recent years, the number and size of Web services on the Internet have grown rapidly, and industry and academia have begun to study them. If a web service cannot be found among Internet resources, it becomes meaningless. Large-scale management and its associated problems are therefore central to the study of Internet service resources. This paper studies large-scale distributed web services in network resources based on SOA architecture ideas. It also designs a unified management and organization system that takes ideological and political education as its content, and it proposes an SN network resource service model for ideological and political education. Given the development and popularization of the Internet today, the study of Internet resources for ideological and political education in this paper provides a theoretical reference for innovation in ideological and political education.

