unstructured information
Recently Published Documents


TOTAL DOCUMENTS: 143 (FIVE YEARS 41)
H-INDEX: 12 (FIVE YEARS 2)

2022 ◽  
pp. 1876-1891
Author(s):  
A. Jayanthiladevi ◽  
Surendararavindhan ◽  
Sakthivel

Big data describes information volumes, from petabytes to exabytes, of structured, semi-structured, and unstructured information that can potentially be mined for insight. Fast data is data streaming into applications and computing environments from hundreds of thousands to millions of endpoints. Fast data is distinct from big data. There is no question that we will continue generating large volumes of data, especially as the number and variety of handheld units and internet-connected devices are expected to grow exponentially. Data streaming analytics is vital for disruptive applications. Streaming analytics permits the processing of terabytes of data in memory. This chapter explores fast data and big data with IoT streaming analytics.


Data ◽  
2021 ◽  
Vol 6 (12) ◽  
pp. 129
Author(s):  
Panagiotis Panagiotidis ◽  
Kyriakos Giannakis ◽  
Nikolaos Angelopoulos ◽  
Angelos Liapis

Recent tragic marine incidents indicate that more efficient safety procedures and emergency management systems are needed. During the 2014–2019 period, 320 accidents cost 496 lives, and 5424 accidents caused 6210 injuries. Ideally, we need historical data from real ship accident cases to develop data-driven solutions. According to the literature, the most critical factor in the post-incident management phase is human error. However, no structured datasets record the crew’s actions during an incident and the human factors that contributed to its occurrence. To overcome these limitations, we use the unstructured information in accident reports compiled by governmental organisations to create a new, well-structured dataset of maritime accidents and provide guidance for its usage. Our dataset contains all the information that the majority of marine datasets include, such as the place, the date, and the conditions during the post-incident phase, e.g., weather data. Additionally, the proposed dataset contains attributes related to each incident’s environmental and financial impact, as well as a concise description of the post-incident events, highlighting the crew’s actions and the human factors that contributed to the incident. We use this dataset to predict an incident’s impact and to provide data-driven directions for improving post-incident safety procedures for specific types of ships.


2021 ◽  
pp. 001112872110475
Author(s):  
Roos Geurts ◽  
Niels Raaijmakers ◽  
Marc J. M. H. Delsing ◽  
Toine Spapens ◽  
Jacqueline Wientjes ◽  
...  

Following the EU Victim Directive, Dutch police officers are obliged to assess a victim’s vulnerability to repeat victimization. This study explored the utility of unstructured police information for the prediction of repeat victimization, as well as its incremental value over and above structured police information. Police records over a period of 6 years were retrieved for a sample of 116,680 victims. Unstructured information was transformed into numeric features using count-vector and TF-IDF methods. Classification models were built using decision trees and random forests. AUC values indicate that a combination of structured and unstructured police information could be used to correctly classify a majority of repeat and non-repeat victims.
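The feature-extraction step named in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the sample notes are invented, the helper names (`tokenize`, `count_vectors`, `tfidf_vectors`) are hypothetical, and the TF-IDF variant used here (relative term frequency times log(N/df)) is only one common choice. The classification step with decision trees and random forests is not covered.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def count_vectors(docs):
    """Return (vocabulary, rows); each row holds raw term counts per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[t] for t in vocab] for c in counts]
    return vocab, rows

def tfidf_vectors(docs):
    """Return (vocabulary, rows) weighted by tf * idf.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    vocab, rows = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = [sum(1 for r in rows if r[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for r in rows:
        total = sum(r) or 1  # guard against empty documents
        weighted.append([(c / total) * w for c, w in zip(r, idf)])
    return vocab, weighted

# Invented free-text notes, for illustration only.
notes = [
    "Victim reported a burglary at home",
    "Victim reported threats from a neighbour",
    "Burglary at a shop",
]
vocab, counts = count_vectors(notes)
_, weights = tfidf_vectors(notes)
```

A term that occurs in every note (here "a") receives a TF-IDF weight of zero, while a term unique to one note is weighted most heavily. That is the practical difference between the two representations.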


Author(s):  
A. Mukasheva

The purpose of this article is to study one of the methods of social network analysis: text sentiment analysis. Today, social media has become such a large source of data that social network analysis is used for many purposes, from setting up targeted advertising for a cosmetics store to preventing riots at the state level. There are various methods for analyzing social networks, such as graph methods, text sentiment analysis, and audio and video object analysis. Among them, sentiment analysis is widely used for political, social, and consumer research, and also for cybersecurity. Since sentiment analysis of text involves analyzing the emotional opinions expressed in the text, the first step is to define the term opinion. An opinion can be simple, that is, a positive, negative, or neutral emotion towards a particular object or one of its aspects. A comparison is also an opinion, but one devoid of emotional connotation. To work with simple opinions, the first task of text sentiment analysis is to classify the text. There are three levels of classification: at the level of the text, at the level of the sentence, and at the level of an aspect of the object. After classifying the text at the desired level, the next task is to extract structured data from unstructured information. This problem can be solved using the five-tuple method. One of the important elements of the tuple is the aspect about which an opinion is expressed. Next, aspect-based sentiment analysis is applied, which involves identifying aspects of the desired object and assessing the polarity of sentiment for each aspect. This task is divided into two sub-tasks: aspect extraction and aspect classification. Sentiment analysis has limitations, such as the detection of sarcasm and the difficulty of working with abbreviated words.
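The five-tuple idea can be sketched as a small data structure. The abstract does not list the tuple's fields, so this sketch assumes the commonly cited quintuple formulation (entity, aspect, sentiment, opinion holder, time). The aspect list, polarity lexicon, and `extract_opinions` helper are all hypothetical and far simpler than a real aspect-based sentiment system.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Opinion:
    """One opinion as a five-tuple (field names are an assumption)."""
    entity: str
    aspect: str
    sentiment: str  # "positive", "negative", or "neutral"
    holder: str
    time: str

# Hypothetical toy resources; a real system would learn or curate these.
ASPECT_TERMS = {"battery", "screen", "price"}
POLARITY = {"great": 1, "good": 1, "bright": 1,
            "poor": -1, "bad": -1, "expensive": -1}

def extract_opinions(text, entity, holder, time):
    """Turn free text into Opinion tuples, one per aspect mention.

    Sub-task 1 (aspect extraction): find known aspect terms in each clause.
    Sub-task 2 (aspect classification): sum lexicon polarities in the clause.
    """
    opinions = []
    for clause in re.split(r"[.,;!?]|\bbut\b|\band\b", text.lower()):
        words = re.findall(r"[a-z]+", clause)
        score = sum(POLARITY.get(w, 0) for w in words)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for w in words:
            if w in ASPECT_TERMS:
                opinions.append(Opinion(entity, w, label, holder, time))
    return opinions

opinions = extract_opinions(
    "The screen is bright, but the battery is poor.",
    entity="PhoneX", holder="user42", time="2021-05-01",
)
```

Splitting on clause boundaries keeps each polarity word attached to the nearest aspect. This is a crude heuristic and, as the abstract notes for sentiment analysis in general, it would fail on sarcasm and abbreviations.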


Author(s):  
Aihsan Suhail ◽  

In today's developed world, people around the globe communicate every day through various platforms on the Web. It has been reported that about 71% of online shoppers worldwide read online reviews before purchasing a product. Product reviews, especially the early reviews (i.e., the reviews posted in the early stage of a product), have a strong impact on subsequent product sales. We call the users who posted the early reviews "early reviewers". Although early reviewers contribute only a small proportion of reviews, their opinions can determine the success or failure of new products and services. It is important for companies to identify early reviewers, since their feedback can help companies adjust marketing strategies and improve product designs, which can ultimately lead to the success of their new products. Every day, a massive amount of unstructured information is generated. This information takes the form of text gathered from forums, social media sites, and reviews. Such information is termed big data. User opinions relate to a wide range of topics, including specific products. These reviews can be mined using various technologies and are of great importance for making predictions, since they directly convey the viewpoint of the majority. Online reviews have also become an important source of information for users before making an informed purchase decision. Early reviewers' ratings and the helpfulness scores they receive are likely to influence product popularity. The challenge is to gather all the reviews, then find and analyze the sentiments, in order to identify a product that scores a high rating.


Author(s):  
MV Shivaani

Comparative analysis commands special attention in financial analysis as it not only facilitates understanding of year-on-year changes but also of trends in the performance and position of a company. It is often a go-to tool for competitor analysis. In this note, I illustrate the use of R (software), its allied packages, and textual analysis algorithms to extend the use of comparative analysis to ‘unstructured’ information presented in the MD&A section of annual reports. For this use case, I consider two giant tech rivals, Apple and Amazon, and present a comparative analysis of their MD&A section using Cosine and Jaccard similarity measures. I also compare the most important words based on tf-idf and sentiments for each company and across the two companies. When supplemented with financial information, comparative analysis can offer novel insights for analysts, managers, researchers, and academics and is a valuable tool to include in accounting curricula.
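The two similarity measures named in this note can be illustrated with a short sketch. The note itself works in R; the Python below is a stand-in rather than the author's code, the two sample sentences are invented rather than taken from any annual report, and the function names are hypothetical. Cosine similarity is computed on raw term counts and Jaccard similarity on sets of distinct words.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two term-count vectors."""
    ca, cb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def jaccard_similarity(text_a, text_b):
    """Shared distinct words divided by all distinct words."""
    sa, sb = set(tokens(text_a)), set(tokens(text_b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

# Invented MD&A-style sentences, for illustration only.
doc_a = "Net sales increased due to higher product sales"
doc_b = "Net sales decreased due to lower service sales"
cos = cosine_similarity(doc_a, doc_b)
jac = jaccard_similarity(doc_a, doc_b)
```

Cosine similarity rewards repeated shared terms, because counts enter the dot product. Jaccard similarity ignores frequency and looks only at vocabulary overlap, so the two measures can rank document pairs differently.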


2021 ◽  
Vol 2 (1) ◽  
pp. 71-77
Author(s):  
R. A. Bagutdinov ◽  
D. V. Bezhuashvili

Currently, the amount of information available for data mining in transport systems is increasing, mainly because of the growing number of heterogeneous sources. The relevance of the topic lies in the need to collect, process, aggregate, and model large volumes of unstructured information that cannot be effectively processed by traditional methods. With the increasing flow and diversity of vehicles, there is a need to optimize transportation and logistics processes and to improve the overall safety of road traffic. The creation of an information knowledge base will help to solve a number of important problems, including the efficiency of road use, reduction of toxic emissions, control and unloading of traffic flows, reduction in the number of accidents, and prompt notification of services. The idea of developing a unified centralized traffic control system is described. To collect, store, and process heterogeneous information, it is proposed to use a cloud infrastructure with split computation. For the purpose of high-quality processing and aggregation of heterogeneous information, it is recommended to investigate hidden dependencies in the data, build and analyze various aggregation options, and interpret them in relation to specific tasks. The system should connect all participants in ground traffic, collect dissimilar materials that can be obtained from their devices and a variety of sensors, and also automate management and decision-making in transport systems. Unstructured information must be correctly interpreted, categorized, and consistently labeled to identify implicit relationships between data. The scientific novelty of the research consists in the formation of the functions of the system being developed and the description of the main aspects, requirements, interfaces, models, and methods for aggregating heterogeneous data. The results of the work can be used not only for analyzing big data in the field of transport, but also in other areas when solving problems of processing heterogeneous information.


2021 ◽  
Author(s):  
David Chartash ◽  
Marc B Rosenman ◽  
Johan Bollen ◽  
Markus Dickinson ◽  
Stephen M Downs

Abstract

Background: The act of diagnosis is one which precipitates semiotic closure, the complex integration of signs and symptoms through cognitive perspectives to ultimately activate causal reasoning and calibrate the assignment of a disease entity to the patient. In writing about this act, physicians encode both structured and unstructured information into the medical record. Unstructured information contains a latent structure which entwines both the cognitive components of the diagnostic act and the linguistic patterns associated with clinical documentation. Existing models of clinical language primarily use a physical or dialogic model of information as their basis, and do not adequately account for the complexity inherent in the diagnostic act.

Methods: Framing the diagnostic information collected in clinical care as a narrative, we developed a model representative of said information, accounting for its content and structure, as well as the inherent complexity therein. Using an exemplar text, we present the use of known predication and semantic relations from ontological (the Unified Medical Language System) and linguistic theory (Rhetorical Structure Theory) to facilitate the operationalization of the model, and analyze the result.

Results: The resulting model is demonstrated to be complex, representative of the clinical narrative text, and is fundamentally aligned with the clinical acts of both documentation and diagnosis. We find the model’s representation of the cognitive aspects of narrative consistent with models of reading, as well as an adequate model of information as presented by clinical medicine and the clinical sub-language.

Conclusions: We present a model to represent diagnostic information in the physician’s note which accounts for the clinical and textual narrative precipitated by the cognition involved in encoding said information into the unstructured medical record. This model prepends the development of (computational) linguistic models of the clinical sublanguage within the physician’s note as it relates to diagnosis, beyond the information level of the lexical unit. Such analysis would facilitate better reflection on the structure and meaning of the clinical note, offering improvements to medical education and care.


2021 ◽  
Vol 3 (1) ◽  
pp. 58-86
Author(s):  
Khikmah Susanti ◽  
Mercy Lona Darwaty Ryndang Sriganda

Many YouTube channels have sprung up in the last 15 years, owned by both the general public and artists. Their owners become YouTubers offering content that is both informative and entertaining. DiTivi is a YouTube channel owned by Didi Riyadi, a well-known Indonesian artist. As DiTivi's content creator, Didi produces several programmes, one of which is the Ferdy and Didi Show, which provides informative and entertaining content. The purpose of this research is to examine the communication style of Ferdy and Didi as hosts of the Ferdy and Didi Show according to four indicators: language selection, word selection, pronunciation technique, and delivery of the message source. The method used is a qualitative descriptive approach. Data were collected from the Ferdy and Didi Show broadcasts on the DiTivi YouTube channel by selecting two episodes for study based on time and situational context. The results show that the communication style developed by Ferdy and Didi is aggressive and assertive. The show uses a concept of intimate talk, in which the conversation takes place in a friendly and informal atmosphere with two-way communication; both hosts play their roles well as communicator and communicant, so that each cue receives a quick response. The language selected is Indonesian with the Betawi dialect, mixed with Sundanese and English. The word selection contains entertainment and unstructured information, with words delivered in inverted and repeated forms. In pronunciation technique, the hosts differ in delivery: Ferdy uses a soft and calm voice, while Didi uses a firmer and clearer voice. In delivering the source of the message, both draw on their own fields of experience and on the frames of reference of other people's thoughts.

