Lessons Learned from Creating a Balanced Corpus from Online Data

Author(s):  
Roberts Darģis ◽  
Kristīne Levāne-Petrova ◽  
Ilmārs Poikāns

This paper describes lessons learned from developing the most recent Balanced Corpus of Modern Latvian (LVK2018) from various online sources. Most new corpora are created from data obtained from various text holders, which requires a cooperation agreement with each of them. Reaching these agreements is a difficult and time-consuming task, and it may not be necessary if the resource to be created does not run to hundreds of millions of words. Although many different resources are available on the Internet today for a particular language, finding viable online resources to create a balanced corpus is still a challenging task. Developing a balanced corpus from various online sources does not require agreements with text holders, but it presents many more technical challenges, including text extraction, cleaning and validation.

Author(s):  
Aubrey Bloomfield ◽  
Sean Jacobs

The Internet and social media are increasingly becoming sources on the African past and present in ways that will influence, to some extent, how history will be learnt and the form that methods of historical research will take. Social media have increasingly dislodged print journalism as “the first rough draft of history” and tended to democratize and hasten information sharing and communication. Historians are working through difficult debates about the Internet as a source archive, the usability of websites, and related matters. The debate over online resources and their use in historical and other studies on one level remains unresolved. Nevertheless, online sources add another rich layer to narratives, stories, and perspectives that are already being recorded or told, and in this regard they will add to the storehouse of empirical data to be crunched by future historians.


2016 ◽  
Vol 78 (4) ◽  
pp. 335-337
Author(s):  
William D. Stansfield

When biology students are in the field or in the laboratory observing common animals or pictures thereof, we would like them to be able to identify some of the differences between, say, a frog and a toad, or a hare and a rabbit. These differences may be anatomical, physiological, behavioral, reproductive, or developmental. This article suggests a way for students at the high school or higher educational levels to learn how to use the Internet to distinguish between some common or well-known animal pairs (such as butterflies and moths). A starter list of online sources of information is provided for distinguishing between 16 such animal pairs.


2015 ◽  
Vol 10 (2) ◽  
pp. 188-202 ◽  
Author(s):  
Jose Rodrigo Cordoba-Pachon ◽  
Cecilia Loureiro-Koechlin

Purpose – Qualitative research has made important contributions to social science by enabling researchers to engage with people and get an in-depth understanding of their views, beliefs and perceptions about social phenomena. With new and electronically mediated forms of human interaction (e.g. the online world), there are new opportunities for researchers to gather data and participate with or observe people in online groups. The purpose of this paper is to present features, challenges and possibilities for online ethnography as an innovative form of qualitative research. Design/methodology/approach – Ethnography is about telling a story about what happens in a particular setting or settings. In order to do this online, it is important to revisit, adopt and adapt some ideas about traditional (offline) ethnography. The paper distinguishes online ethnography from other types of research. It draws some generic features of online ethnography and identifies challenges for it. With these ideas in mind the paper presents and reflects on an online ethnography of software developers. Findings – Online ethnography can provide valuable insights about social phenomena. The paper identifies generic features of this approach and a number of challenges related to its practice. These challenges have to do with the choice of settings, use of online data for research, representation of people and generation of valuable and useful knowledge. The paper also highlights issues for future consideration in research and practice. Practical implications – The ethnography helped the researcher to identify and address a number of methodological challenges in practice and position herself in relation to relevant audiences she wanted to speak to. The paper also suggests different orientations to online ethnography. Lessons learned highlight potential contributions as well as further possibilities for qualitative research in the online world.
Originality/value – Online ethnography offers possibilities to engage with a global audience of research subjects. For academics and practitioners the paper opens up possibilities to use online tools for research and it shows that the use of these tools can help overcome difficulties in access and interaction with people and to study a diversity of research topics, not only those that exist online. The paper offers guidance for researchers about where to start and how to proceed if they want to conduct online ethnography and generate useful and valuable knowledge in their area of interest.


Author(s):  
Julia Hörnle

Jurisdiction is the foundational concept for both national laws and international law as it provides the link between the sovereign government and its territory, and ultimately its people. The internet challenges this concept at its root: data travels across the internet without respecting political borders or territory. This book is about this Jurisdictional Challenge created by internet technologies. The Jurisdictional Challenge arises as civil disputes, criminal cases, and regulatory action span different countries, raising questions as to the international competence of courts, law enforcement, and regulators. From a technological standpoint, geography is largely irrelevant for online data flows and this raises the question of who governs “YouTubistan.” Services, communication, and interaction occur online between persons who may be located in different countries. Data is stored and processed online in data centres remote from the actual user, with cloud computing provided as a utility. Illegal acts such as hacking, identity theft and fraud, cyberespionage, propagation of terrorist propaganda, hate speech, defamation, revenge porn, and illegal marketplaces (such as Silkroad) may all be remotely targeted at a country, or simply create effects in many countries. Software applications (“apps”) developed by a software developer in one country are seamlessly downloaded by users on their mobile devices worldwide, without regard to applicable consumer protection, data protection, intellectual property, or media law. Therefore, the internet has created multifaceted and complex challenges for the concept of jurisdiction and conflicts of law. Traditionally, jurisdiction in private law and jurisdiction in public law have belonged to different areas of law, namely private international law and (public) international law. The unique feature of this book is that it explores the notion of jurisdiction in different branches of “the” law.
It analyses legislation and jurisprudence to extract how the concept of jurisdiction is applied in internet cases, taking a comparative law approach, focusing on EU, English, German, and US law. This synthesis and comparison of approaches across the board has produced new insights on how we should tackle the Jurisdictional Challenge. The first three chapters explain the Jurisdictional Challenge created by the internet and place this in the context of technology, sovereignty, territory, and media regulation. The following four chapters focus on public law aspects, namely criminal law and data protection jurisdiction. The next five chapters are about private law disputes, including cross-border B2C e-commerce, online privacy and defamation disputes, and internet intellectual property disputes. The final chapter harnesses the insights from the different areas of law examined.


2019 ◽  
Vol 4 (2) ◽  
pp. 81
Author(s):  
Fransiska Timoria Samosir

In this era of globalization, internet usage is growing very rapidly. This is reflected in the use of gadgets among students, who are always connected to the Internet. Students like to find information in real or audiovisual form. The presence of YouTube also has a great impact on this generation. This forms the basis for examining how YouTube is used as a learning medium. This research takes a descriptive qualitative approach. The informants are students of the Faculty of Social and Political Sciences, University of Bengkulu. There were 16 informants in this study. Data were collected through interviews. The results of this study indicate that students have a high level of gadget usage and are always connected to the internet. Students open the YouTube application on their gadgets almost daily. Students use YouTube as a learning medium to add knowledge and support their coursework. The rate of YouTube usage is influenced by gender, courses and classes. Keywords: gadget, youtube, internet


2017 ◽  
Vol 25 (2) ◽  
pp. 13
Author(s):  
Jo Ann Carr

This article reviews the development of three Web-based education resources and the potential for each of these resources to meet the needs of users for a 'killer app'. Three case studies (the Annotated List of Education Journals, the IDEAS Portal Web Site and the Eisenhower National Clearinghouse Web Site) review the purpose, audience, content, funding, publicity and structure of the sites. Differences in staffing, funding and the centrality of these sites to the mission of their sponsoring institutions impacted the growth of these sites. Technological changes and the diffuse nature of the Internet also impacted the development of these resources.


Author(s):  
Ahmad Syafi'i

ABSTRACT  Writing is one of the fundamental skills in language learning. In this digital era, when everything can be done online, students prefer to find references on websites or other online sources. Students hesitate to produce their own ideas because they can get anything they want on the internet. This greatly affects the quality of students' mastery of writing skills. Thus, teachers should have many alternatives to help students, especially EFL students, improve their writing performance through online-based learning. Grammarly is one possible solution that can serve as the students' writing assistant while teachers assign exercises. Students will benefit from working with Grammarly in that they can produce whatever language they know, and Grammarly will then help them correct their work.


Author(s):  
Duanning Zhou ◽  
Arsen Djatej ◽  
Robert Sarikas ◽  
David Senteney

This chapter discusses a growth framework for industry web portals, which present a new opportunity in the internet business. The framework contains five stages: business plan stage, website development stage, attraction stage, entrenchment stage, and defense stage. The actions to be taken and strategies to be applied in each stage are set out. Two industry web portals are investigated in detail. The two examples illustrate the applicability of the proposed growth framework to the real world. The combination of a conceptual growth framework and the application of this conceptual framework to two real-world examples yields a set of guidelines based in large part on lessons learned from the two examples. Thus, this chapter provides a concept-based growth framework and a set of real-world-based guidelines that may provide a practical benefit to industry web portal business practitioners.


Author(s):  
Leila Zemmouchi-Ghomari

Data play a central role in the effectiveness and efficiency of web applications, such as the Semantic Web. However, data are distributed across a very large number of online sources, due to which a significant effort is needed to integrate this data for its proper utilization. A promising solution to this issue is the linked data initiative, which is based on four principles related to publishing web data and facilitating interlinked and structured online data rather than the existing web of documents. The basic ideas, techniques, and applications of the linked data initiative are surveyed in this paper. The authors discuss some Linked Data open issues and potential tracks to address these pending questions.

