INTELLIGENT KNOWLEDGE EXTRACTION FROM THE WEB

Author(s):  
JESÚS CARDEÑOSA ◽  
EDMUNDO TOVAR

Many websites are poorly structured, and their users cannot find the information they need. For this reason, many papers propose techniques for finding the right information for a user. Most of these techniques focus on finding the required information across the whole Internet. Often the owner of a website provides incomplete or imprecise information that is of little use to the user. Restructuring the information is often enough to detect missing information, inconsistencies and imprecisions. However, this work is normally very difficult to carry out without degrading the performance of the website. The authors have developed a novel application that exploits the existing information in a website more profitably by restructuring it without the intervention of the content provider. This paper describes the authors' experience during their participation in the European Commission ESPRIT 29158 FLEX Project.

Sensi Journal ◽  
2020 ◽  
Vol 6 (2) ◽  
pp. 236-246
Author(s):  
Ilamsyah Ilamsyah ◽  
Yulianto Yulianto ◽  
Tri Vita Febriani

A company needs a sound and appropriate system for receiving and transferring goods. At PDAM Tirta Kerta Raharja, Tangerang Regency, the receipt and transfer of goods from the central warehouse to the branch warehouse is currently done manually and is still ineffective and inaccurate, because the Head of Subdivision uses paper documents as the submission medium: the PPBP for receipts and the MPPW for mutations of goods. The Head of Subdivision enters receipt and mutation data manually, which takes a relatively long time because, whenever a transfer of goods is requested, the inventory in the central warehouse must be checked first. It is therefore necessary to design a web-based, database-backed information system for the receipt and transfer of goods from the central warehouse to the branch warehouse, so that the process becomes more effective, efficient and accurate. Such a system makes it easier for the Head of Subdivision to input data on the receipt and transfer of goods and to control stock inventory periodically. Data were collected through observation, interviews and a literature study of previous research. System analysis follows the Waterfall method, design uses object-oriented visual modeling with UML, and the system is programmed in PHP with MySQL as the database.


2020 ◽  
Vol 27 (3) ◽  
pp. 284-301
Author(s):  
Salvatore Fabio Nicolosi ◽  
Lisette Mustert

In a resolution adopted on 1 February 2018, the European Committee of the Regions noted that a legislative proposal of the European Commission concerning a Regulation that changes the rules governing the EU regional funds for 2014-2020 did not comply with the principle of subsidiarity. Accordingly, the Committee considered challenging the legislative proposal before the Court of Justice if the proposal was formally agreed upon. Although at a later stage the European Commission decided to take into account the Committee’s argument and amended the proposal accordingly, this context offers the chance to investigate in more detail the role of the Committee of the Regions in the legislative process of the EU and, more specifically, its role as a watchdog of the principle of subsidiarity. This paper aims to shed light on a rather neglected aspect of EU constitutional practice, namely the potential of the Committee of the Regions to contribute to the legislative process, and to answer the question of whether this Committee is the right body to guarantee compliance with the principle of subsidiarity.


2021 ◽  
Vol 2 (4) ◽  
pp. 42-48
Author(s):  
S. V. ZAYTSEV ◽  

In March 2018 the European Commission presented a proposal to adopt a digital services tax (DST) on certain types of revenues of multinational digital companies. The purpose of the digital services tax is to compensate in the short term for the low level of corporate taxation of these companies in the European Union and thus meet the urgent need of civil society for greater tax fairness. The DST is presented as an indirect tax on turnover and is often compared to value-added tax (VAT). In this article, the author seeks to highlight the many differences that exist between the harmonized European Union VAT and the new DST. In addition, the author challenges the idea that the DST will actually be an indirect tax and, most importantly, that it will effectively increase tax justice in the European Union.


Bioderecho.es ◽  
2021 ◽  
Author(s):  
Gloria María González Suárez

Because of the situation we currently face with the COVID-19 pandemic, the introduction of a digital green certificate has been proposed on several occasions. On 17 March 2021, the European Commission presented a proposal to create the certificate in order to facilitate the exercise of the right of free movement within the European Union during the pandemic. This raises various legal questions regarding the protection of health data, the right of free movement, and the efficacy and proportionality of the measures, which must be analyzed from both a legal and an ethical point of view, since on certain occasions the application of measures may affect citizens' right to equality.


2020 ◽  
Vol 5 (4) ◽  
pp. 43-55
Author(s):  
Gianpiero Bianchi ◽  
Renato Bruni ◽  
Cinzia Daraio ◽  
Antonio Laureti Palma ◽  
Giulio Perani ◽  
...  

Abstract

Purpose: The main objective of this work is to show the potentialities of recently developed approaches for automatic knowledge extraction directly from the universities’ websites. The information automatically extracted can be potentially updated with a frequency higher than once per year, and be safe from manipulations or misinterpretations. Moreover, this approach allows us flexibility in collecting indicators about the efficiency of universities’ websites and their effectiveness in disseminating key contents. These new indicators can complement traditional indicators of scientific research (e.g. number of articles and number of citations) and teaching (e.g. number of students and graduates) by introducing further dimensions to allow new insights for “profiling” the analyzed universities.

Design/methodology/approach: Webometrics relies on web mining methods and techniques to perform quantitative analyses of the web. This study implements an advanced application of the webometric approach, exploiting all three categories of web mining: web content mining, web structure mining, and web usage mining. The information to compute our indicators has been extracted from the universities’ websites by using web scraping and text mining techniques. The scraped information has been stored in a NoSQL DB in a semi-structured form to allow for retrieving information efficiently by text mining techniques. This provides increased flexibility in the design of new indicators, opening the door to new types of analyses. Some data have also been collected by means of batch interrogations of search engines (Bing, www.bing.com) or from a leading provider of Web analytics (SimilarWeb, http://www.similarweb.com). The information extracted from the Web has been combined with the University structural information taken from the European Tertiary Education Register (https://eter.joanneum.at/#/home), a database collecting information on Higher Education Institutions (HEIs) at European level. All the above was used to perform a clusterization of 79 Italian universities based on structural and digital indicators.

Findings: The main findings of this study concern the evaluation of the potential in digitalization of universities, in particular by presenting techniques for the automatic extraction of information from the web to build indicators of quality and impact of universities’ websites. These indicators can complement traditional indicators and can be used to identify groups of universities with common features using clustering techniques working with the above indicators.

Research limitations: The results reported in this study refer to Italian universities only, but the approach could be extended to other university systems abroad.

Practical implications: The approach proposed in this study and its illustration on Italian universities show the usefulness of recently introduced automatic data extraction and web scraping approaches and their practical relevance for characterizing and profiling the activities of universities on the basis of their websites. The approach could be applied to other university systems.

Originality/value: This work applies for the first time to university websites some recently introduced techniques for automatic knowledge extraction based on web scraping, optical character recognition and nontrivial text mining operations (Bruni & Bianchi, 2020).


Tourism ◽  
2020 ◽  
Vol 68 (4) ◽  
pp. 466-481
Author(s):  
Dario Bertocchi ◽  
Nicola Camatti ◽  
Jan Van der Borg

Following the precedent set by the Tourism Observatory (TO) run by the European Commission-DG GROW a few years ago, several initiatives have taken place to design and manage tourism observatories at both the transnational and local level. However, these initiatives do not yet seem able to provide adequate operational responses to the challenges that the Commission launched with the original TO. While the opportunities offered by Web 2.0 still do not seem to have been sufficiently taken advantage of, such initiatives also have not yet developed suitable methodologies to operationally include the tourism industry in the studies and monitoring performed by the TOs. This work presents the lessons learnt from the ShapeTourism prototype, which includes two different tools: an observatory with official and unofficial indicators, and a simulation tool to predict different scenarios and different sustainability levels, designed specifically to overcome the aforementioned limits. The prototype was tested in 2017 on the entire eligible area of the 2014-2020 MED Programme, covering 52 regions. The potentialities of this tool are shown through the creation of indicators, benchmarking and applications.


1984 ◽  
Vol 4 (7) ◽  
pp. 1278-1285 ◽  
Author(s):  
J Hicks ◽  
J Strathern ◽  
A Klar ◽  
S Ismail ◽  
J Broach

The SAD mutation, an extra mating type cassette, has been shown to arise from an unequal mitotic crossover between the MAT and HMR loci, resulting in the formation of a hybrid cassette and a duplication of the MAT-HMR interval. The SAD cassette contains the "a" information and left-hand flanking regions from the parental HMRa cassette and the right-hand flanking sequences of the parental MAT cassette. This arrangement of flanking sequences causes a leaky but reproducible mating phenotype correlated with a low-level expression of the cassette as measured by RNA blotting. This weak expression is attributed to the loss of one flanking control site normally present at the silent HM storage loci.


Author(s):  
Paolo Cappellari ◽  
Jie Shi ◽  
Mark Roantree ◽  
Crionna Tobin ◽  
Niall Moyna
