Web News Data Extraction Technology Based on Text Keywords

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Kun Zhang

In order to shorten the time for users to query news on the Internet, this paper studies and designs a network news data extraction technology, which can obtain the main news information through the extraction of news text keywords. Firstly, the TF-IDF keyword extraction algorithm, TextRank keyword extraction algorithm, and LDA keyword extraction algorithm are analyzed to understand the keyword extraction process, and the TF-IDF algorithm is optimized by Zipf’s law. By introducing the idea of model fusion, five schemes based on waterfall fusion and parallel combination fusion are designed, and the effects of the five schemes are verified by experiments. It is found that the designed extraction technology has a good effect on network news data extraction. News keyword extraction has a great application prospect, which can provide the basis for the research fields of news key phrases, news abstracts, and so on.
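As a rough illustration of the first of the three algorithms named in this abstract, the following is a minimal sketch of plain TF-IDF keyword ranking in Python. It does not reproduce the Zipf's-law optimization or the five fusion schemes, whose details the abstract does not give. The function name `tfidf_keywords`, the whitespace tokenizer, and the alphabetical tie-breaking are assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the terms of docs[doc_index] by TF-IDF and return the top_k.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption, not the paper's preprocessing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    tokens = tokenized[doc_index]
    tf = Counter(tokens)

    # TF = relative frequency in the document; IDF = log(N / df).
    scores = {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

A term that occurs in every document (such as "the" in a small English corpus) receives an IDF of zero and therefore never outranks a term that is specific to one news item, which is the property that makes the measure usable for keyword extraction.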

Pharmaceutics ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 358 ◽  
Author(s):  
Chiara R. M. Brambilla ◽  
Ogochukwu Lilian Okafor-Muo ◽  
Hany Hassanin ◽  
Amr ElShaer

Three-dimensional (3D) printing is a recent technology that makes it possible to manufacture personalised dosage forms and has a broad range of applications. One of the most developed applications is the manufacture of oral solid dosage forms, and the four 3DP techniques most used for this purpose are FDM, inkjet 3DP, SLA and SLS. This systematic review statistically analyzes the current 3DP techniques employed in manufacturing oral solid formulations and assesses the recent trends of this new technology. The work was organised into four steps: (1) screening of the articles, definition of the inclusion and exclusion criteria and classification of the articles into two main groups (included/excluded); (2) quantification and characterisation of the included articles; (3) evaluation of the validity of data and of the data extraction process; (4) data analysis, discussion, and conclusion to define which technique offers the best properties for the manufacture of oral solid formulations. For the SLS 3DP technique, all the characterisation tests required by the BP (drug content, drug dissolution profile, hardness, friability, disintegration time and uniformity of weight), except for the friability test, were performed in the majority of articles. However, it is not possible to define which of the four 3DP techniques is the most suitable for the manufacture of oral solid formulations, because the selection is affected by different parameters, such as the type of formulation and the physical-mechanical properties to be achieved. Moreover, each technique has its specific advantages and disadvantages: for FDM the biggest challenge is degradation of the drug due to the high printing temperature, whereas for SLA it is the toxicity and carcinogenic risk of the photopolymerising material.


2013 ◽  
Vol 278-280 ◽  
pp. 2058-2064
Author(s):  
Cheng Ying Chi ◽  
Hong Li ◽  
Xue Gang Zhan ◽  
Sheng Nan Jiang

In this paper, by analyzing the structure of web news texts, we propose an improved term-weighting measure for hot topic detection and a topic-weighting scheme for hot topic ranking. A comparison of experimental results shows that our method is effective and that the resulting ranking of hot topics is closer to reality.


2010 ◽  
Vol 44-47 ◽  
pp. 4041-4049 ◽  
Author(s):  
Hong Zhao ◽  
Chen Sheng Bai ◽  
Song Zhu

Search engines can bring a lot of benefit to a website. For a site, each page’s search engine ranking is very important. Search engine optimization (SEO) is used to move a web page higher in the search engine ranking. To use SEO, a web page needs to set its keywords as “keywords”. The paper focuses on the content of a given word, and extracts the keywords of each page by calculating the word frequency. The algorithm is implemented in C#. The keyword settings of a web page are of great importance on the information and products
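The word-frequency approach described in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the C# used by the authors, and it is not their implementation: the stop-word list, the function names `frequency_keywords` and `keywords_meta_tag`, and the assumption that the extracted keywords end up in an HTML `keywords` meta tag are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def frequency_keywords(page_text, top_k=5):
    """Return the top_k most frequent non-stop-words in page_text."""
    # Lowercase the text and keep runs of ASCII letters as words.
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)

    # Most frequent first; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


def keywords_meta_tag(keywords):
    """Format keywords as an HTML meta tag (assumed output target)."""
    return '<meta name="keywords" content="{}">'.format(", ".join(keywords))
```

Counting raw frequency is the simplest possible ranking; it needs no corpus beyond the page itself, at the cost of favouring words that are merely common rather than distinctive.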


2021 ◽  
Vol 16 (10) ◽  
pp. 1934578X2110461
Author(s):  
Hua Jiang ◽  
Jun Li ◽  
Ning Zhang ◽  
Hai-Yang He ◽  
Jia-Min An ◽  
...  

Chlorogenic acid has been proved to have cardiovascular protection, antibacterial, antiviral, hemostatic, and hypolipidemia effects. Modern scientific research on the bioactivity of chlorogenic acid has been extended to the fields of food, medicine, health care and daily-use chemical industry. The aim of this research was to optimize the extraction conditions for chlorogenic acid from Eucommia ulmoides (Eucommiaceae) leaves. The significant variables were screened and optimized by a combination of Plackett-Burman test and Box-Behnken design. Optimum extraction parameters with ethanol concentration of 50%, solvent pH value of 3, and particle size of 60 mesh were determined according to variance analysis and contour plots. Under these conditions, the yield of chlorogenic acid was up to 4.36 mg/g, which was basically consistent with the theoretical prediction value of 4.50 mg/g. This study also proved the potential antioxidant activity of E. ulmoides leaves. The optimal extract of E. ulmoides leaves rich in chlorogenic acid showed the highest antioxidant activity in the FRAP method, which was 219.8 μM Trolox equivalents (TE) per g extract weight (EW) (μM TE/g EW). The DPPH method gave a similar value (168 μM TE/g EW) to the ABTS method (152 μM TE/g EW). The established extraction process was efficient in the recovery of chlorogenic acid from E. ulmoides leaves, encouraging its valorization as a cheap and sustainable alternative for the isolation of chlorogenic acid.


2010 ◽  
Vol 160-162 ◽  
pp. 704-708
Author(s):  
Ru Bing Han

The surface of a wafer is easily contaminated by organic pollutants. Supercritical fluid extraction technology works well for extracting organic pollutants. Whether the extraction process influences the surface performance of the wafer can be determined through SEM (scanning electron microscope), AFM (atomic force microscope), and XPS (X-ray photoelectron spectroscopy). The features and the electronic structure of the wafer before and after supercritical CO2 extraction are compared to determine how the supercritical CO2 extraction process influences the wafer surface performance. The conclusion helps to determine whether the extraction technology can be applied to wafer surface cleaning. Tests show that the supercritical CO2 extraction process has almost no influence on the surface performance of the wafer, and that supercritical CO2 extraction technology has good prospects in wafer cleaning.


Author(s):  
Yevgeny Shamin ◽  
Dmitry Zhevnenko ◽  
Fedor Meschaninov ◽  
Vladislav Kozhevnikov ◽  
Yevgeny Gornev

The work is devoted to the analysis of various approaches to the problem of extracting the parameters of empirical memristor models. The peculiarities of the extraction process are described, and an original version of the extraction algorithm is proposed. The proposed algorithm is compared with the other algorithms considered.


Author(s):  
Francisco Andres Rivera-Quiroz ◽  
Jeremy Miller

Traditional taxonomic publications have served as a biological data repository accumulating vast amounts of data on species diversity, geographical and temporal distributions, ecological interactions, taxonomic relations, among many other types of information. However, the fragmented nature of taxonomic literature has made this data difficult to access and use to its full potential. Current anthropogenic impact on biodiversity demands faster knowledge generation, but also making better use of what we already have. This could help us make better-informed decisions about conservation and resources management. In past years, several efforts have been made to make taxonomic literature more mobilized and accessible. These include online publications, open access journals, the digitization of old paper literature and improved availability through online specialized repositories such as the Biodiversity Heritage Library (BHL) and the World Spider Catalog (WSC), among others. Although easy to share, PDF publications still have most of their biodiversity data embedded in strings of text, making them less dynamic and more difficult or impossible to read and analyze without a human interpreter. Recently developed tools such as GoldenGATE-Imagine (GGI) allow PDFs to be transformed into XML files in which taxonomically relevant data are extracted and categorized. These data can then be aggregated in databases such as Plazi TreatmentBank, where they can be re-explored, queried and analyzed. Here we combined several of these cybertaxonomic tools to test the data extraction process for one potential application: the design and planning of an expedition to collect fresh material in the field. We targeted the ground spider Teutamus politus and other related species from the Teutamus group (TG) (Araneae; Liocranidae). These spiders are known from South East Asia and have been cataloged in the family Liocranidae; however, their relations, biology and evolution are still poorly understood.
We marked-up 56 publications that contained taxonomic treatments with specimen records for the Liocranidae. Of these publications, 20 contained information on members of the TG. Geographical distributions and occurrences of 90 TG species were analyzed based on 1,309 specimen records. These data were used to design our field collection in a way that allowed us to optimize the collection of adult specimens of our target taxa. The TG genera were most common in Indonesia, Thailand and Malaysia. From these, Thailand was the second richest but had the most records of T. politus. Seasonal distribution of TG specimens in Thailand suggested June and July as the best time for collecting adults. Based on these analyses, we decided to sample from mid-July to mid-August 2018 in the three Thai provinces that combined most records of TG species and T. politus. Relying on the results of our literature analyses and using standard collection methods for ground spiders, we captured at least one specimen of every TG genus reported for Thailand. Our one-month expedition captured 231 TG spiders; from these, T. politus was the most abundant species with 188 specimens (95 adults). By comparison, a total of 196 specimens of the TG and 66 of T. politus had been reported for the same provinces in the last 40 years. Our sampling greatly increased the number of available specimens, especially for the genera Teutamus and Oedignatha. Also, we extended the known distribution of Oedignatha and Sesieutes within Thailand. These results illustrate the relevance of making biodiversity data contained within taxonomic treatments accessible and reusable. It also exemplifies one potential use of taxonomic legacy data: to more efficiently use existing biodiversity data to fill knowledge gaps. A similar approach can be used to study neglected or interesting taxa and geographic areas, generating a better biodiversity documentation that could aid in decision making, management and conservation.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Norhazlina Ibrahim ◽  
Safeza Mohd Sapian

Purpose: This study, using a systematic literature review (SLR), aims to highlight and summarise current studies on the factors influencing customers’ Islamic home financing (IHF) selection and Islamic banking product preference, which has gained popularity within the banking sector over the past three decades. The SLR could map evolution and research fields, recommend a particular categorisation and determine primary issues to demonstrate current trends, future research directions and theoretical development.
Design/methodology/approach: The SLR was performed with a four-step reporting standard for the systematic evidence syntheses review method (research question formulation, systematic searching, quality assessment and data extraction) using 33 screened articles between 2008 and 2020 from two primary databases (Scopus and Web of Science) and one supporting database (Google Scholar).
Findings: The resulting factors could be categorised into four primary themes: consumer behaviour, consumer attributes, bank attributes and bank attributes (Islamic). The themes were subsequently divided into 16 sub-themes. Notably, all the factors proved essential for consumers’ evolving preferences and product competitiveness in the market.
Research limitations/implications: This study encountered two limitations based on database selection and research period.
Practical implications: This SLR aimed to offer useful insights into the factors that should be prioritised by financial institutions for marketing approaches by investigating consumer behaviours.
Originality/value: This study pioneered an SLR on the study area for useful insights into the current research limitations and recommendations on future study directions. Specifically, the study method facilitated critical discussions and comparisons to past research outcomes and objectivity with triangulation from distinct perspectives.

