Data Mining Medical Digital Libraries

Author(s):  
Colleen Cunningham ◽  
Xiaohua Hu

Given the exponential growth rate of medical data and the accompanying biomedical literature, more than 10,000 documents per week (Leroy et al., 2003), it has become increasingly necessary to apply data mining techniques to medical digital libraries in order to assess a more complete view of genes, their biological functions and diseases. Data mining techniques, as applied to digital libraries, are also known as text mining.

2011 ◽  
pp. 334-340
Author(s):  
Colleen Cunningham

Given the exponential growth rate of medical data and the accompanying biomedical literature, more than 10,000 documents per week (Leroy et al., 2003), it has become increasingly necessary to apply data mining techniques to medical digital libraries in order to assess a more complete view of genes, their biological functions and diseases. Data mining techniques, as applied to digital libraries, are also known as text mining.


2008 ◽  
pp. 1810-1816
Author(s):  
Colleen Cunningham ◽  
Xiaohua Hu

Given the exponential growth rate of medical data and the accompanying biomedical literature, more than 10,000 documents per week (Leroy et al., 2003), it has become increasingly necessary to apply data mining techniques to medical digital libraries in order to assess a more complete view of genes, their biological functions and diseases. Data mining techniques, as applied to digital libraries, are also known as text mining.


Author(s):  
Mahwish Abid ◽  
Muhammad Usman ◽  
Muhammad Waleed Ashraf

As technology advances and computer use grows, plagiarism is increasing day by day. Plagiarism is the wrongful appropriation of someone else’s work. Manual detection of plagiarism is difficult, so the process should be automated. Various tools can be used for plagiarism detection; some work on intrinsic plagiarism, while others work on extrinsic plagiarism. Data mining is a field that can help detect plagiarism and improve the efficiency of the process. Different data mining techniques can be used to detect plagiarism, including text mining, clustering, bi-grams, tri-grams, and n-grams.
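
As a rough illustration of how n-grams can support extrinsic plagiarism detection, the sketch below compares two texts by the overlap of their word n-gram sets. It is not the method from the paper above: the function names `word_ngrams` and `jaccard_similarity`, the whitespace tokenization, and the choice of Jaccard overlap are all assumptions made for this example.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in text (lowercased, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(text_a, text_b, n=3):
    """Jaccard overlap of the word n-gram sets of two texts, in [0, 1].

    Hypothetical helper: a high score suggests shared passages, but any
    decision threshold would have to be chosen empirically.
    """
    grams_a = word_ngrams(text_a, n)
    grams_b = word_ngrams(text_b, n)
    if not grams_a and not grams_b:
        return 0.0  # both texts are shorter than n words
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

With `n=2` the comparison uses bi-grams and with `n=3` tri-grams; larger values of `n` are stricter because longer word sequences must match exactly.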


1998 ◽  
Vol 01 (04) ◽  
pp. 473-486 ◽  
Author(s):  
Roberto Baviera ◽  
Michele Pasquini ◽  
Maurizio Serva ◽  
Angelo Vulpiani

We consider a stochastic model of investment on an asset in a stock market for a prudent investor. She decides to buy permanent goods with a fraction α of the maximum amount of money owned in her life in order that her economic level never decreases. The optimal strategy is obtained by maximizing the exponential growth rate for a fixed α. We derive analytical expressions for the typical exponential growth rate of the capital and its fluctuations by solving a one-dimensional random walk with drift.


2014 ◽  
Vol 25 (08) ◽  
pp. 937-953
Author(s):  
ARSENY M. SHUR

We study FAD-languages, which are regular languages defined by finite sets of forbidden factors, together with their “canonical” recognizing automata. We are mainly interested in the possible asymptotic orders of growth for such languages. We analyze certain simplifications of sets of forbidden factors and show that they “almost” preserve the canonical automata. Using this result and structural properties of canonical automata, we describe an algorithm that effectively lists all canonical automata having a sink strong component isomorphic to a given digraph, or reports that no such automata exist. This algorithm can be used, in particular, to prove the existence of a FAD-language over a given alphabet with a given exponential growth rate. On the other hand, we give an example showing that the algorithm cannot prove non-existence of a FAD-language having a given growth rate. Finally, we provide some examples of canonical automata with a nontrivial condensation graph and of FAD-languages with a “complex” order of growth.


2000 ◽  
Vol 63 (2) ◽  
pp. 268-272 ◽  
Author(s):  
DANA M. McELROY ◽  
LEE-ANN JAYKUS ◽  
PEGGY M. FOEGEDING

The growth of psychrotrophic Bacillus cereus 404 from spores in boiled rice was examined experimentally at 15, 20, and 30°C. Using the Gompertz function, observed growth was modeled, and these kinetic values were compared with kinetic values for the growth of mesophilic vegetative cells as predicted by the U.S. Department of Agriculture's Pathogen Modeling Program, version 5.1. An analysis of variance indicated no statistically significant difference between observed and predicted values. A graphical comparison of kinetic values demonstrated that modeled predictions were “fail safe” for generation time and exponential growth rate at all temperatures. The model also was fail safe for lag-phase duration at 20 and 30°C but not at 15°C. Bias factors of 0.55, 0.82, and 1.82 for generation time, lag-phase duration, and exponential growth rate, respectively, indicated that the model generally was fail safe and hence provided a margin of safety in its growth predictions. Accuracy factors of 1.82, 1.60, and 1.82 for generation time, lag-phase duration, and exponential growth rate, respectively, quantitatively demonstrated the degree of difference between predicted and observed values. Although the Pathogen Modeling Program produced reasonably accurate predictions of the growth of psychrotrophic B. cereus from spores in boiled rice, the margin of safety provided by the model may be more conservative than desired for some applications. It is recommended that if microbial growth modeling is to be applied to any food safety or processing situation, it is best to validate the model before use. Once experimental data are gathered, graphical and quantitative methods of analysis can be useful tools for evaluating specific trends in model prediction and identifying important deviations between predicted and observed data.
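
The bias and accuracy factors reported above can be illustrated with a short sketch. The abstract does not say how the authors computed them, so this assumes the definitions commonly used in predictive microbiology: the bias factor is 10 raised to the mean of log10(predicted/observed), and the accuracy factor is 10 raised to the mean of the absolute value of that log ratio. The function names are hypothetical.

```python
import math


def bias_factor(predicted, observed):
    """10 ** mean(log10(predicted / observed)) under the assumed definition.

    A value below 1 means predictions are, on average, smaller than
    observations; a value above 1 means they are larger.
    """
    logs = [math.log10(p / o) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def accuracy_factor(predicted, observed):
    """10 ** mean(|log10(predicted / observed)|) under the assumed definition.

    Always at least 1; it measures the average spread between predicted
    and observed values regardless of direction.
    """
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))
```

Under these assumed definitions, when every prediction errs in the same direction the accuracy factor equals the bias factor or its reciprocal. That is consistent with the reported pairs for generation time (0.55 and 1.82, since 1/0.55 ≈ 1.82) and for exponential growth rate (1.82 and 1.82).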


Author(s):  
Scott Nicholson ◽  
Jeffrey Stanton

Most people think of a library as the little brick building in the heart of their community or the big brick building in the center of a campus. These notions greatly oversimplify the world of libraries, however. Most large commercial organizations have dedicated in-house library operations, as do schools, non-governmental organizations, as well as local, state, and federal governments. With the increasing use of the Internet and the World Wide Web, digital libraries have burgeoned, and these serve a huge variety of different user audiences. With this expanded view of libraries, two key insights arise. First, libraries are typically embedded within larger institutions. Corporate libraries serve their corporations, academic libraries serve their universities, and public libraries serve taxpaying communities who elect overseeing representatives. Second, libraries play a pivotal role within their institutions as repositories and providers of information resources. In the provider role, libraries represent in microcosm the intellectual and learning activities of the people who comprise the institution. This fact provides the basis for the strategic importance of library data mining: By ascertaining what users are seeking, bibliomining can reveal insights that have meaning in the context of the library’s host institution. Use of data mining to examine library data might be aptly termed bibliomining. With widespread adoption of computerized catalogs and search facilities over the past quarter century, library and information scientists have often used bibliometric methods (e.g., the discovery of patterns in authorship and citation within a field) to explore patterns in bibliographic information. During the same period, various researchers have developed and tested data mining techniques—advanced statistical and visualization methods to locate non-trivial patterns in large data sets. 
Bibliomining refers to the use of these bibliometric and data mining techniques to explore the enormous quantities of data generated by the typical automated library.


1994 ◽  
Vol 05 (02) ◽  
pp. 213-218 ◽  
Author(s):  
GABRIEL P. PATERNAIN ◽  
MIGUEL PATERNAIN

Using Yomdin's Theorem [8], we show that for a compact Riemannian manifold M, the geodesic entropy — defined as the exponential growth rate of the average number of geodesic segments between two points — is ≤ the topological entropy of the geodesic flow of M. We also show that if M is simply connected and N ⊂ M is a compact simply connected submanifold, then the exponential growth rate of the sequence given by the Betti numbers of the space of paths starting in N and ending in a fixed point of M, is bounded above by the topological entropy of the geodesic flow on the normal sphere bundle of N.


2018 ◽  
Vol 150 ◽  
pp. 06003 ◽  
Author(s):  
Saima Anwar Lashari ◽  
Rosziati Ibrahim ◽  
Norhalina Senan ◽  
N. S. A. M. Taujuddin

This paper investigates the existing practices and prospects of medical data classification based on data mining techniques. It highlights major advanced classification approaches used to enhance classification accuracy. Past research has provided literature on medical data classification using data mining techniques. From extensive literature analysis, it is found that data mining techniques are very effective for the task of classification. This paper comparatively analyses current advances in the classification of medical data. The findings of the study show that the existing classification of medical data can be improved further. Nonetheless, more research is needed to identify and reduce the ambiguities in classification and so achieve better precision.

