Blog Snippets Based Drug Effects Extraction System Using Lexical and Grammatical Restrictions

Obtaining medical information has a beneficial influence on patients' treatment and quality of life (QOL). The authors aim to build a system that helps patients collect narrative information. Extracting information from text written by patients makes it possible to acquire information that is easy to understand and encouraging. Additionally, with large-scale data, the system can be used to discover unknown effects or patterns. As a first step, this paper extracts descriptions of the effects caused by taking drugs, as triplets of expressions, from snippets of illness survival blogs. It proposes a method that extracts the triplets using specific clue words and parsing results, so that extraction works on blogs written in free natural language. Moreover, recall was improved by combining the proposed method with a baseline system, and precision was improved by filtering with dictionaries created from existing medical documents.

Vertical Integration Between Providers With Possible Cloud Migration

Advances in Computer and Electrical Engineering - Advanced Methodologies and Technologies in Network Architecture, Mobile Computing, and Data Analytics ◽

10.4018/978-1-5225-7598-6.ch020 ◽

2019 ◽

pp. 274-284

Author(s):

Aleksandra Kostic-Ljubisavljevic ◽

Branka Mikavica

Keyword(s):

Large Scale ◽

Virtual Machines ◽

Wholesale Price ◽

Rejection Rate ◽

Large Scale Data ◽

Cloud Migration ◽

Computational Resources ◽

Charging Strategy ◽

Scale Data

All vertically integrated participants in the content provisioning process are influenced by bandwidth requirements. Provisioning self-owned resources to satisfy peak bandwidth demand leads to network underutilization and is cost-ineffective, while under-provisioning leads to rejection of customers' requests. Vertically integrated providers need to consider cloud migration in order to minimize costs and improve their customers' quality of service and quality of experience. Cloud providers maintain large-scale data centers to offer storage and computational resources in the form of virtual machine instances. They offer different pricing plans: reservation, on-demand, and spot pricing. To obtain an optimal integration charging strategy, revenue sharing, cost sharing, and wholesale pricing are frequently applied. The vertically integrated content provider's incentives for cloud migration can induce significant complexity in integration contracts and, consequently, improvements in costs and in the request rejection rate.

A Natural Language Processing Tool for Large-Scale Data Extraction from Echocardiography Reports

PLoS ONE ◽

10.1371/journal.pone.0153749 ◽

2016 ◽

Vol 11 (4) ◽

pp. e0153749 ◽

Cited By ~ 20

Author(s):

Chinmoy Nath ◽

Mazen S. Albaghdadi ◽

Siddhartha R. Jonnalagadda

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Large Scale ◽

Data Extraction ◽

Large Scale Data ◽

Natural Language Processing Tool ◽

Scale Data

Clustering the Percentage of Illiteracy at Ages 15-44 by Province Using the K-Means Algorithm

KLIK - KUMPULAN JURNAL ILMU KOMPUTER ◽

10.20527/klik.v7i3.329 ◽

2020 ◽

Vol 7 (3) ◽

pp. 230

Author(s):

Saifullah Saifullah ◽

Nani Hidayati

Keyword(s):

Data Mining ◽

Human Resources ◽

Data Clustering ◽

Large Scale ◽

Market Basket ◽

Large Scale Data ◽

The Government ◽

Large Scale Data Processing ◽

Scale Data

Data mining is a method that is often needed in large-scale data processing, so it plays an important role in many fields, including industry, finance, weather, science and technology. Data mining techniques include classification, clustering, regression, variable selection, and market basket analysis. Illiteracy is one of the factors that hinder the quality of human resources, and eradicating illiteracy in the community is one of the basic requirements for improving that quality. The purpose of this study is to cluster illiterate communities by province in Indonesia. The result is a clustering of illiteracy data for the 15-44 age group: the high group has 1 node, the low group has 27 nodes, and the medium group has 6 nodes. The results of this study serve as input for the government in determining province-based illiteracy eradication policies in Indonesia. Keywords: Illiterate, Data mining, K-Means Clustering
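
The three-group clustering described in this abstract can be illustrated with a minimal one-dimensional K-means sketch. The province names and percentages below are hypothetical placeholders, not the study's data, and seeding the centroids at the minimum, mean, and maximum is an assumption made only to keep the example deterministic.

```python
from statistics import mean

def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data; returns (centroids, labels)."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical illiteracy percentages per province (NOT the study's data).
rates = {"A": 0.3, "B": 0.5, "C": 0.4, "D": 2.1, "E": 2.4, "F": 9.8}
values = list(rates.values())

# Assumed initialisation: low, middle, and high seeds.
init = [min(values), mean(values), max(values)]
centroids, labels = kmeans_1d(values, init)

names = ["low", "medium", "high"]
groups = {province: names[lab] for province, lab in zip(rates, labels)}
print(groups)
```

Because the seeds are ordered from low to high and the data are one-dimensional, the cluster indices keep that order, which is what allows them to be named "low", "medium", and "high" directly.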

Obtaining High-Quality Label by Distinguishing between Easy and Hard Items in Crowdsourcing

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/413 ◽

2017 ◽

Cited By ~ 4

Author(s):

Wei Wang ◽

Xiang-Yu Guo ◽

Shao-Yuan Li ◽

Yuan Jiang ◽

Zhi-Hua Zhou

Keyword(s):

Large Scale ◽

Experimental Results ◽

Training Set ◽

High Quality ◽

Quality Label ◽

Large Scale Data ◽

Voluntary Workers ◽

Scale Data

Crowdsourcing systems make it possible to hire voluntary workers to label large-scale data by offering them small monetary payments. Usually, the taskmaster needs to collect high-quality labels, but the quality of labels obtained from the crowd may not satisfy this requirement. In this paper, we study the problem of obtaining high-quality labels from the crowd and present an approach for learning the difficulty of items in crowdsourcing: we construct a small training set of items with estimated difficulty and then learn a model to predict the difficulty of future items. With the predicted difficulty, we can distinguish between easy and hard items to obtain high-quality labels. For easy items, the quality of the labels inferred from the crowd can be high enough to satisfy the requirement; for hard items, for which the crowd cannot provide high-quality labels, it is better to choose a more knowledgeable crowd or employ specialized workers to label them. The experimental results demonstrate that the proposed approach, by learning to distinguish between easy and hard items, can significantly improve label quality.
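
The easy/hard routing idea can be sketched as follows. Note the substitution: the paper learns a model that predicts item difficulty, whereas this sketch uses the agreement ratio of the crowd votes as a simple stand-in for difficulty. The threshold of 0.8 and the item data are hypothetical.

```python
from collections import Counter

def aggregate(votes):
    """Return (majority_label, agreement_ratio) for one item's crowd votes."""
    counts = Counter(votes)
    label, top = counts.most_common(1)[0]
    return label, top / len(votes)

def route(items, threshold=0.8):
    """Split items into easy ones (crowd label accepted) and hard ones.

    Agreement is only a proxy for difficulty here; the threshold is an
    assumed value, not one taken from the paper.
    """
    easy, hard = {}, []
    for item_id, votes in items.items():
        label, agreement = aggregate(votes)
        if agreement >= threshold:
            easy[item_id] = label      # crowd label is trusted
        else:
            hard.append(item_id)       # send to specialized workers
    return easy, hard

# Hypothetical crowd votes for three items.
crowd_votes = {
    "item1": ["cat", "cat", "cat", "cat", "cat"],
    "item2": ["cat", "dog", "cat", "dog", "bird"],
    "item3": ["dog", "dog", "dog", "dog", "cat"],
}
easy, hard = route(crowd_votes)
print(easy, hard)
```

Items that end up in `hard` are the ones the abstract suggests handing to a more knowledgeable crowd or to specialized workers.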

Describing and improving homoeopathy

British Homeopathic Journal ◽

10.1016/s0007-0785(05)80859-2 ◽

1994 ◽

Vol 83 (03) ◽

pp. 135-141 ◽

Cited By ~ 2

Author(s):

P. Fisher ◽

R. Van Haselen

Keyword(s):

Data Collection ◽

Large Scale ◽

Daily Practice ◽

Efficacy And Safety ◽

Modern Information Technology ◽

Large Scale Data ◽

Modern Information ◽

User Friendly ◽

Scale Data

Large-scale data collection combined with modern information technology is a powerful tool to evaluate the efficacy and safety of homoeopathy. It also has great potential to improve homoeopathic practice. Data collection has not been widely used in homoeopathy. This appears to be due to the clumsiness of the methodology and the perception that it is of little value to daily practice. Three protocols addressing different aspects of this issue are presented:
- A proposal to establish a common basic data collection methodology for homoeopaths throughout Europe.
- A systematic survey of the results of homoeopathic treatment of patients with rheumatoid arthritis using quality of life and objective assessments.
- Verification of a set of homoeopathic prescribing features for Rhus toxicodendron.
These proposals are designed to be ‘user-friendly’ and to provide practical information relevant to daily homoeopathic practice.

Evaluating the healthiness of chain-restaurant menu items using crowdsourcing: a new method

Public Health Nutrition ◽

10.1017/s1368980016001804 ◽

2016 ◽

Vol 20 (1) ◽

pp. 18-24 ◽

Cited By ~ 3

Author(s):

Lenard I Lesser ◽

Leslie Wu ◽

Timothy B Matthiessen ◽

Harold S Luft

Keyword(s):

Public Health ◽

Nutritional Quality ◽

Large Scale ◽

Registered Dietitian ◽

Food Items ◽

Large Scale Data ◽

Nutrient Profiling ◽

The Cost ◽

Scale Data

Objective: To develop a technology-based method for evaluating the nutritional quality of chain-restaurant menus to increase the efficiency and lower the cost of large-scale data analysis of food items. Design: Using a Modified Nutrient Profiling Index (MNPI), we assessed chain-restaurant items from the MenuStat database with a process involving three steps: (i) testing ‘extreme’ scores; (ii) crowdsourcing to analyse fruit, nut and vegetable (FNV) amounts; and (iii) analysis of the ambiguous items by a registered dietitian. Results: In applying the approach to assess 22 422 foods, only 3566 could not be scored automatically based on MenuStat data and required further evaluation to determine healthiness. Items for which there was low agreement between trusted crowd workers, or where the FNV amount was estimated to be >40 %, were sent to a registered dietitian. Crowdsourcing was able to evaluate 3199, leaving only 367 to be reviewed by the registered dietitian. Overall, 7 % of items were categorized as healthy. The healthiest category was soups (26 % healthy), while desserts were the least healthy (2 % healthy). Conclusions: An algorithm incorporating crowdsourcing and a dietitian can quickly and efficiently analyse restaurant menus, allowing public health researchers to analyse the healthiness of menu items.
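
The three-step triage reported in this abstract can be checked with a short arithmetic sketch. The item counts and the 40 % FNV cutoff come from the abstract. The function name `needs_dietitian` and the `min_agreement` value are hypothetical, since the abstract does not say how "low agreement" was defined.

```python
# Counts reported in the abstract.
TOTAL_ITEMS = 22422
NEEDED_REVIEW = 3566    # could not be scored automatically
CROWD_RESOLVED = 3199   # evaluated by crowdsourcing

def needs_dietitian(agreement, fnv_percent, min_agreement=0.7):
    """Routing rule from the abstract: low crowd agreement or FNV > 40 %.

    min_agreement is an assumed cutoff; the abstract gives no number.
    """
    return agreement < min_agreement or fnv_percent > 40

auto_scored = TOTAL_ITEMS - NEEDED_REVIEW
dietitian_reviewed = NEEDED_REVIEW - CROWD_RESOLVED

print(f"scored automatically: {auto_scored} ({auto_scored / TOTAL_ITEMS:.1%})")
print(f"left for the dietitian: {dietitian_reviewed}")  # 367, as in the abstract
```

The subtraction reproduces the 367 items the abstract says were left for the registered dietitian, which is the point of the method: the expensive expert step handles only a small remainder of the 22 422 items.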

Assessing the quality of large-scale data standards: A case of XBRL GAAP Taxonomy

Decision Support Systems ◽

10.1016/j.dss.2014.01.006 ◽

2014 ◽

Vol 59 ◽

pp. 351-360 ◽

Cited By ~ 14

Author(s):

Hongwei Zhu ◽

Harris Wu

Keyword(s):

Large Scale ◽

Data Standards ◽

Large Scale Data ◽

Scale Data

Trust, Religiousity, Income, Quality of Accounting Information, and Muzaki Decision to Pay Zakat

JURNAL AKUNTANSI DAN KEUANGAN ISLAM ◽

10.35836/jakis.v9i1.217 ◽

2021 ◽

Vol 9 (1) ◽

pp. 39-58

Author(s):

Efri Syamsul Bahri ◽

Ade Suhaeti ◽

Nursanita Nasution ◽

...

Keyword(s):

Large Scale ◽

Sampling Method ◽

Negative Impact ◽

Linear Regression Analysis ◽

Accounting Information ◽

Multiple Linear Regression Analysis ◽

Large Scale Data ◽

Quality Of Accounting Information ◽

Scale Data

This study tests the factors that influence the decision of muzaki in channeling zakat, namely trust, religiosity, income, and quality of accounting information. It is a survey of 40 muzaki from the Amil Zakat Institution (known as LAZ) Zakat Sukses in Depok, selected with the purposive sampling method. The data were analyzed with multiple linear regression in SPSS 25. The results indicate that trust, religiosity, income, and the quality of accounting information simultaneously influence the decision of muzaki to distribute zakat through LAZ Zakat Sukses in Depok. Partially, trust, religiosity, and income positively affect the decision of muzaki to distribute zakat through LAZ Zakat Sukses, whereas the quality of accounting information has a negative impact on that decision. The scope of this study is limited to the muzaki at LAZ Zakat Sukses Depok, so the results may not be nationally representative. Similar studies that collect larger-scale data over broader areas would therefore be useful. The implication is that LAZ Zakat Sukses needs to demonstrate its zakat management performance to increase muzaki trust.

Measuring happiness of large-scaled online Turkish unstructured data (Preprint)

10.2196/preprints.24037 ◽

2020 ◽

Author(s):

Esra Kahya Özyirmidokuz ◽

Kumru Uyar ◽

Raian Ali ◽

Eduard Alexandru Stoica ◽

Betül Karakaş

Keyword(s):

Social Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Large Scale ◽

Well Being ◽

Emotional Awareness ◽

Large Scale Data ◽

Processing Algorithms ◽

Scale Data

BACKGROUND: Measuring online Turkish happiness requires a Turkish happiness dictionary that reflects cultural and linguistic norms and social values, rather than a translation-oriented method. Analyzing data while neglecting cultural characteristics will not be reliable. The Turkish translation of an English word in the Affective Norms for English Words (ANEW) dictionary does not express the same feeling as a Turkish word. In addition, existing emotional dictionaries were not developed specifically for social networks with emoticons. OBJECTIVE: This research presents the Turkish Happiness Index (THI), a set of psychological normative happiness scores for measuring the average level of happiness in Turkish online unstructured large-scale data. A well-being informatics analytics study is also carried out using THI. METHODS: The Turkish Happiness Index was generated entirely from social networks. 20000 words were extracted from social networks with web text mining, and natural language processing algorithms were applied. After data reduction, a quantitative research methodology was applied. The happiness scores were determined based on 667 participants' subjective happiness levels and their thoughts about the 1874 Turkish words. An alexithymia scale was also used to identify the emotional awareness of the participants. The words were evaluated on the valence dimension using the Self-Assessment Manikin on an online platform. NLP was used to measure the online Turkish happiness of the data. Data were collected from Facebook with the negative #war and positive #family hashtags over one month using a third-party software tool. After the data were converted to documents, natural language processing algorithms including tokenization, transformation, filtering and stemming were applied. The happiness levels of the documents based on hashtags were determined using the Turkish Happiness Index dictionary.
RESULTS: THI, which contains 345 words and their happiness scores in the Turkish language, was developed. The THI is given in Appendix 1. We also compare the words of the dictionaries to understand the cultural differences. CONCLUSIONS: THI provides researchers with standard materials through which they can automatically measure the online happiness of Turkish large-scale data. THI can be used in real-time big data analytics.
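
A dictionary-based scoring step of the kind described in the methods can be sketched as below. The lexicon entries, their scores, the stopword list, and the averaging rule are all hypothetical illustrations. They are not taken from the THI, and stemming is left out for brevity.

```python
import re

# Hypothetical lexicon: word -> happiness (valence) score.
# These entries and values are illustrative only, NOT the published THI.
HYPOTHETICAL_THI = {"mutlu": 8.0, "aile": 7.5, "savaş": 2.0}

# Hypothetical stopword list used for the filtering step.
STOPWORDS = {"ve", "bir", "ile"}

def tokenize(text):
    """Lowercase and split into word tokens.

    Note: str.lower() is not Turkish-aware (it maps 'I' to 'i', not 'ı'),
    so a real pipeline would need locale-specific case folding.
    """
    return re.findall(r"\w+", text.lower())

def happiness_score(text, lexicon=HYPOTHETICAL_THI):
    """Average lexicon score of the known words, or None if none match."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]  # filtering
    scores = [lexicon[t] for t in tokens if t in lexicon]       # lookup
    return sum(scores) / len(scores) if scores else None

print(happiness_score("mutlu bir aile"))
print(happiness_score("#savaş"))
```

Averaging over matched words is one simple aggregation choice. The abstract does not state how document-level happiness was computed from word scores, so that part is an assumption.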

Assessment of quality of life in patients with the effects of transient ischemic stroke

East European Journal of Neurology ◽

10.33444/2411-5797.2016.6(12).37-39 ◽

2016 ◽

pp. 37-39

Author(s):

A. Babirad

Keyword(s):

Quality Of Life ◽

Ischemic Stroke ◽

Large Scale ◽

Cerebrovascular Diseases ◽

Control Measures ◽

Sociological Research ◽

Cerebrovascular Events ◽

Sf 36 ◽

Angioplasty And Stenting

Cerebrovascular diseases are a worldwide problem today and, according to forecasts, will remain one in the near future. The main risk factors for the development of ischemic disorders of the cerebral circulation include obesity and aging, arterial hypertension, smoking, diabetes mellitus and heart disease. An effective strategy for the prevention of cerebrovascular events is based on the implementation of large-scale risk control measures, including the use of antiaggregant and anticoagulant therapy and invasive interventions such as atherectomy, angioplasty and stenting. In this connection, the joint efforts of neurologists, cardiologists, angiosurgeons, endocrinologists and other specialists are the basis for achieving an acceptable clinical outcome. A review of the SF-36 method for assessing the quality of life in patients with the effects of transient ischemic stroke is presented. Quality-of-life assessment is recognized in medical practice and research worldwide as an indicator that is also used to assess the quality of the health system and in general sociological research.
