A study on time models in graph databases for security log analysis

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Daniel Hofer ◽  
Markus Jäger ◽  
Aya Khaled Youssef Sayed Mohamed ◽  
Josef Küng

Purpose Log files are a crucial source of information for computer security experts. The time domain is especially important because, in most cases, timestamps are the only links between events caused by attackers, faulty systems or simple errors and their corresponding log file entries. To store and analyse this log information in graph databases, a suitable model for storing and connecting timestamps and their events is needed. This paper aims to identify and evaluate different approaches to storing timestamps in graph databases, along with their individual benefits and drawbacks. Design/methodology/approach We analyse three different approaches for representing and storing timestamp information in graph databases. To check the models, we set up four questions typical of log file analysis and tested them against each model. During the evaluation, we used performance and other properties as metrics for how suitable each model is for representing the log files’ timestamp information. In the last part, we try to improve one promising model. Findings We conclude that the simplest model, using the fewest graph database-specific concepts, is also the one yielding the simplest and fastest queries. Research limitations/implications Limitations of this research are that only one graph database was studied, and improvements to the query engine might change future results. Originality/value The study addresses the issue of storing timestamps in graph databases in a meaningful, practical and efficient way. The results can be used as a pattern for similar scenarios and applications.
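As a rough illustration of the kind of modelling choices compared here, the sketch below shows two common ways of attaching timestamps to log events in a property graph such as Neo4j: as a plain property on the event node, and as an explicit tree of time nodes. The labels, schema and connection details are illustrative assumptions, not the exact models evaluated in the paper.

```python
# Illustrative sketch (not the paper's exact models): two common ways to attach
# timestamps to log events in a property graph such as Neo4j.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Model A: timestamp stored as a plain property on the event node.
CREATE_EVENT_PROPERTY = """
CREATE (e:LogEvent {message: $message, ts: datetime($ts)})
"""

# Model B: events linked to explicit time nodes (a simple time tree), so range
# queries traverse Year -> Month -> Day nodes instead of filtering a property.
CREATE_EVENT_TIME_TREE = """
MERGE (y:Year {value: $year})
MERGE (y)-[:HAS_MONTH]->(m:Month {value: $month})
MERGE (m)-[:HAS_DAY]->(d:Day {value: $day})
CREATE (e:LogEvent {message: $message})
CREATE (e)-[:OCCURRED_ON]->(d)
"""

with driver.session() as session:
    session.run(CREATE_EVENT_PROPERTY,
                message="failed login", ts="2021-03-01T12:00:00")
    session.run(CREATE_EVENT_TIME_TREE,
                message="failed login", year=2021, month=3, day=1)
driver.close()
```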

2020 ◽  
Vol 31 (6) ◽  
pp. 1587-1601
Author(s):  
Md. Sazol Ahmmed ◽  
Md. Faisal Arif ◽  
Md. Mosharraf Hossain

Purpose Solid waste (SW) is a result of rapid urbanization and industrialization, and it is increasing daily with the growing population. This paper focuses on predicting SW generation in the city of Dhaka and finding sustainable pathways for minimizing the gaps in the existing system. Design/methodology/approach In this paper, a questionnaire survey of the Dhaka South City Corporation (DSCC) was conducted. Monthly SW generation data for the city of Dhaka over several years were collected to develop an artificial neural network (ANN) model, which was used for accurate prediction of SW generation. Findings First, using an ANN with one hidden layer and varying the number of neurons in that layer, different models were created and tested. According to the R values (training, test, all), the structure with six neurons in the hidden layer was selected as the most suitable model. Finally, six gaps were found in the existing solid waste management (SWM) system of the DSCC. These gaps are the main barriers to better SWM. Originality/value The authors propose that the best model for prediction is 12-6-3, with training and testing results of 0.9972 and 0.80380, respectively, so the resulting prediction is very close to the actual data. The paper identifies opportunities to address those gaps properly, so that the DSCC can achieve better results regarding the SW problem.
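For readers unfamiliar with the 12-6-3 notation, the sketch below builds a feed-forward network with 12 inputs, one hidden layer of six neurons and three outputs using scikit-learn's MLPRegressor. The feature layout, random data and scoring are placeholders, not the DSCC survey data or the exact training setup used in the study.

```python
# A minimal sketch of a 12-6-3 feed-forward network (12 inputs, one hidden
# layer with 6 neurons, 3 outputs); the data here are placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((120, 12))   # e.g. 12 monthly input features per sample (assumed)
y = rng.random((120, 3))    # e.g. 3 predicted waste-generation figures (assumed)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(6,),   # the single hidden layer with 6 neurons
                     max_iter=5000, random_state=0)
model.fit(X_train, y_train)

print("R^2 (train):", model.score(X_train, y_train))
print("R^2 (test): ", model.score(X_test, y_test))
```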


2015 ◽  
Vol 33 (4) ◽  
pp. 610-623 ◽  
Author(s):  
Tobias Blanke ◽  
Michael Bryant ◽  
Reto Speck

Purpose – In 2010 the European Holocaust Research Infrastructure (EHRI) was funded to support research into the Holocaust. The project follows on from significant past efforts to develop and record the collections of the Holocaust in several national initiatives. The purpose of this paper is to introduce EHRI’s efforts to create a flexible research environment using graph databases. The authors concentrate on the added features and design decisions that enable efficient processing of collection information as a graph. Design/methodology/approach – The paper concentrates on the specific customisations EHRI had to develop, as the graph database approach is new and the authors could not rely on existing solutions. The authors describe the serialisations of collections in the graph that provide for efficient processing. Because the EHRI infrastructure is highly distributed, the authors also had to invest significant effort in reliable distributed access control mechanisms. Finally, the authors analyse the user-facing work on a portal and a virtual research environment (VRE) for discovering, sharing and analysing Holocaust material. Findings – Using the novel graph database approach, the authors first present how collection information can be modelled as graphs and why this is effective. Second, they show how collection information is made persistent and describe the complex access management system they have developed. Third, they outline how user interaction with the data is integrated through a VRE. Originality/value – Scholars require specialised access to information. The authors present the results of the work to develop integrated research with Holocaust collections for researchers, together with proposals for a socio-technical ecosystem based on graph database technologies. The use of graph databases is new, and the authors needed to work on several innovative customisations to make them work in the domain.
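As a hypothetical illustration of serialising collection information as a graph, the sketch below models a small collection hierarchy and a simple group-based read permission in Neo4j. The labels, properties and permission scheme are assumptions for illustration, not EHRI's actual schema or access control mechanism.

```python
# Hypothetical sketch (not EHRI's actual schema): a collection hierarchy and a
# simple per-group access rule modelled as a property graph in Neo4j.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

LOAD_COLLECTION = """
MERGE (c:Collection {id: $cid, title: $ctitle})
MERGE (u:ArchivalUnit {id: $uid, title: $utitle})
MERGE (c)-[:CONTAINS]->(u)
MERGE (g:Group {name: $group})
MERGE (g)-[:CAN_READ]->(c)
"""

with driver.session() as session:
    session.run(LOAD_COLLECTION,
                cid="coll-001", ctitle="Example fonds",
                uid="unit-001", utitle="Example file",
                group="researchers")
driver.close()
```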


2016 ◽  
Vol 28 (3) ◽  
pp. 98-114
Author(s):  
Mona Lundin ◽  
Johan Lundin

Purpose In this study, online in-service training for people employed in the food production industry is scrutinized. The purpose of this study is to analyse how the participants adapt to such online environments in terms of the kind of discussions they establish. The more specific interest relates to how the participants discuss current work experiences in relation to the contents of quality assurance they are expected to learn. Design/methodology/approach The data analyzed are Web discussions in the form of chat log files from ten courses. Findings The results show that, on the one hand, general principles have to be substantiated in the form of concrete examples to actually function as principles and, on the other hand, concrete examples become interesting only if they have a bearing on a more general issue. Another interesting finding is that the course participants gradually take over the vocabulary of quality assurance; they more frequently write about their work in terms of, e.g. criteria, relevance, estimations and hazards. The conclusion is that Web discussions as part of in-service training constitute a new arena for reflection in and on practice. Originality/value This is interesting to explore, as the training is designed to meet the needs of employers and employees to learn the new set of rules and procedures that regulate the European food industry. In this respect, the training activities are of direct relevance to daily work practices. Simultaneously, online environments seem to offer flexibility and thus constitute a solution for training in a dispersed industry.


2020 ◽  
Vol 36 (8) ◽  
pp. 29-31

Purpose Reviews the latest management developments across the globe and pinpoints practical implications from cutting-edge research and case studies. Design/methodology/approach This briefing is prepared by an independent writer who adds their own impartial comments and places the articles in context. Findings The problem with developing a reputation of being something of an oracle in the business world is that all of a sudden, everyone expects you to pull off the trick of interpreting the future on a daily basis. Like a freak show circus act or one-hit wonder pop singer, people expect you to perform when they see you, and they expect you to perform the thing that made you famous, even if it is the one thing in the world you don’t want to do. And when you fail to deliver on these heightened expectations, you are dismissed as a one trick pony, however good that trick is in the first place. Originality/value The briefing saves busy executives and researchers hours of reading time by selecting only the very best, most pertinent information and presenting it in a condensed and easy-to-digest format.


Kybernetes ◽  
2019 ◽  
Vol 49 (4) ◽  
pp. 1083-1102
Author(s):  
Georgios N. Aretoulis ◽  
Jason Papathanasiou ◽  
Fani Antoniou

Purpose This paper aims to rank and identify the most efficient project managers (PMs) based on personality traits, using the Preference Ranking Organization METHod for Enrichment Evaluations (PROMETHEE) methodology. Design/methodology/approach The proposed methodology relies on the five personality traits, which were used as the selection criteria. A questionnaire survey among 82 experienced engineers was used to estimate the required weights per personality trait. A second, two-part questionnaire survey aimed at recording the PMs’ profiles and assessing the performance of the personality traits per PM. The PMs with the most years of experience were selected to be ranked through Visual PROMETHEE. Findings The findings suggest that a competent PM is one who scores low on the “Neuroticism” trait and especially high on the “Conscientiousness” trait. Research limitations/implications The research applied a psychometric test specifically designed for Greek people. Furthermore, the proposed methodology ranks the PMs based on personality characteristics and does not consider technical skills. The type of project is also not considered in the ranking process. Practical implications The findings could contribute to the selection of the PM who maximizes the project team’s performance. Social implications Improved project team communication and collaboration lead to improved project performance. This is an additional benefit for society, especially in the delivery of public infrastructure projects. Many public infrastructure projects deviate significantly in terms of cost and schedule, which is an additional burden for the public and society. Proper project management through efficient PMs would save people’s money and time. Originality/value Identification of the best PM based on a combination of multicriteria decision-making and psychometric tests focusing on personality traits.
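For orientation, the sketch below implements a minimal PROMETHEE II ranking with the "usual" preference function over the five personality traits, treating "Neuroticism" as a criterion to minimise. The PM names, trait scores and weights are illustrative placeholders, not the survey results or the Visual PROMETHEE configuration used in the paper.

```python
# A minimal PROMETHEE II sketch with the "usual" preference function; all
# names, scores and weights below are illustrative assumptions.
import numpy as np

pms = ["PM-A", "PM-B", "PM-C"]
criteria = ["Openness", "Conscientiousness", "Extraversion", "Agreeableness", "Neuroticism"]
maximize = [True, True, True, True, False]          # low Neuroticism is better
weights = np.array([0.15, 0.35, 0.15, 0.15, 0.20])  # assumed weights, sum to 1

# rows: PMs, columns: trait scores (illustrative)
scores = np.array([
    [3.8, 4.5, 3.2, 4.0, 2.1],
    [4.1, 3.9, 4.4, 3.5, 3.0],
    [3.5, 4.2, 3.8, 4.2, 2.6],
])

n = len(pms)
pi = np.zeros((n, n))                               # aggregated preference index pi(a, b)
for a in range(n):
    for b in range(n):
        if a == b:
            continue
        for j, w in enumerate(weights):
            d = scores[a, j] - scores[b, j]
            if not maximize[j]:
                d = -d
            pi[a, b] += w * (1.0 if d > 0 else 0.0)  # "usual" preference function

phi_plus = pi.sum(axis=1) / (n - 1)    # positive (leaving) flow
phi_minus = pi.sum(axis=0) / (n - 1)   # negative (entering) flow
net_flow = phi_plus - phi_minus        # PROMETHEE II net flow

for name, phi in sorted(zip(pms, net_flow), key=lambda t: -t[1]):
    print(f"{name}: net flow = {phi:+.3f}")
```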


2021 ◽  
Vol 22 (S2) ◽  
Author(s):  
Daniele D’Agostino ◽  
Pietro Liò ◽  
Marco Aldinucci ◽  
Ivan Merelli

Abstract Background High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. Usually, these data are represented as matrices describing the binary contacts among the different chromosome regions. On the other hand, a graph-based representation can be advantageous to describe the complex topology achieved by the DNA in the nucleus of eukaryotic cells. Methods Here we discuss the use of a graph database for storing and analysing data obtained by performing Hi-C experiments. The main issue is the size of the produced data and, with a graph-based representation, the consequent need to adequately manage a large number of edges (contacts) connecting nodes (genes), which represent the sources of information. Currently available graph visualisation tools and libraries fall short with Hi-C data in this respect. The use of graph databases, instead, supports both the analysis and the visualisation of the spatial pattern present in Hi-C data, in particular for comparing different experiments or for efficiently re-mapping omics data in a space-aware context. In particular, the possibility of describing graphs through statistical indicators and, even more, the capability of correlating them through statistical distributions allows highlighting similarities and differences among different Hi-C experiments, in different cell conditions or different cell types. Results These concepts have been implemented in NeoHiC, an open-source and user-friendly web application for the progressive visualisation and analysis of Hi-C networks based on the use of the Neo4j graph database (version 3.5). Conclusion With the accumulation of more experiments, the tool will provide invaluable support to compare neighbours of genes across experiments and conditions, helping to highlight changes in functional domains and identify new co-organised genomic compartments.
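As a rough sketch of the graph-database idea (not NeoHiC's actual data model), the example below stores Hi-C contacts as weighted relationships between chromosome-region nodes in Neo4j and queries the neighbours of one region. Labels, properties and connection details are assumptions for illustration.

```python
# Illustrative sketch (not NeoHiC's schema): Hi-C contacts as weighted edges
# between chromosome-region nodes in Neo4j, plus a neighbour query.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

LOAD_CONTACT = """
MERGE (a:Region {chrom: $chrom_a, start: $start_a})
MERGE (b:Region {chrom: $chrom_b, start: $start_b})
MERGE (a)-[c:CONTACT {experiment: $exp}]->(b)
SET c.frequency = $freq
"""

NEIGHBOURS = """
MATCH (a:Region {chrom: $chrom, start: $start})-[c:CONTACT]-(b:Region)
RETURN b.chrom AS chrom, b.start AS start, c.frequency AS frequency
ORDER BY c.frequency DESC
"""

with driver.session() as session:
    session.run(LOAD_CONTACT, chrom_a="chr1", start_a=1_000_000,
                chrom_b="chr1", start_b=1_250_000, exp="sample-1", freq=42)
    for record in session.run(NEIGHBOURS, chrom="chr1", start=1_000_000):
        print(record["chrom"], record["start"], record["frequency"])
driver.close()
```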


2021 ◽  
Vol 11 (13) ◽  
pp. 5944
Author(s):  
Gunwoo Lee ◽  
Jongpil Jeong

Semiconductor equipment consists of a complex system in which numerous components are organically connected and controlled by many controllers. The EventLog records all the information available during system processes. Because the EventLog records system runtime information, developers and engineers can use it to understand system behavior and identify possible problems, so it is essential for troubleshooting and maintenance. However, because the EventLog is text-based, complex to view and stores a large quantity of information, the file size is very large. For long processes, the log comprises several files, and engineers must look through many of them, which makes it difficult to find the cause of a problem and therefore requires a long analysis time. In addition, when the EventLog files become large, they cannot be retained for a prolonged period because they consume a large amount of hard disk space on the CTC computer. In this paper, we propose a method to reduce the size of existing text-based log files. Our proposed method stores and visualizes text-based EventLogs in a database, making problems easier to approach than with the existing text-based analysis and making it easier for engineers to analyze log files.
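As a minimal sketch of the general approach of moving text-based logs into a database, the example below parses assumed log lines into structured rows and stores them in SQLite, so problems can be found with queries instead of scanning large files. The line format, fields and schema are assumptions, not the actual semiconductor EventLog format or the database used in the paper.

```python
# A minimal sketch: parse text-based log lines into structured rows and store
# them in a database (SQLite here for brevity). The line format and column
# names are assumptions, not the real EventLog format.
import re
import sqlite3

LINE_RE = re.compile(r"^(?P<ts>\S+ \S+)\s+(?P<level>\w+)\s+(?P<component>\S+)\s+(?P<message>.*)$")

sample_lines = [
    "2021-06-01 10:00:01 INFO loader wafer 17 loaded",
    "2021-06-01 10:00:05 ERROR chamber-2 pressure out of range",
]

conn = sqlite3.connect("eventlog.db")
conn.execute("""CREATE TABLE IF NOT EXISTS events
                (ts TEXT, level TEXT, component TEXT, message TEXT)""")

for line in sample_lines:
    m = LINE_RE.match(line)
    if m:  # skip lines that do not match the assumed format
        conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
                     (m["ts"], m["level"], m["component"], m["message"]))
conn.commit()

# Structured queries now replace scanning huge text files, e.g. all errors:
for row in conn.execute("SELECT ts, component, message FROM events WHERE level = 'ERROR'"):
    print(row)
conn.close()
```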


Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Claire M Simpson ◽  
Florian Gnad

Abstract Graph representations provide an elegant solution to capture and analyze complex molecular mechanisms in the cell. Co-expression networks are undirected graph representations of transcriptional co-behavior indicating (co-)regulations, functional modules or even physical interactions between the corresponding gene products. The growing avalanche of available RNA sequencing (RNAseq) data fuels the construction of such networks, which are usually stored in relational databases like most other biological data. Inferring linkage by recursive multiple-join statements, however, is computationally expensive and complex to design in relational databases. In contrast, graph databases store and represent complex interconnected data as nodes, edges and properties, making it fast and intuitive to query and analyze relationships. While graph-based database technologies are on their way from a fringe domain to going mainstream, there are only a few studies reporting their application to biological data. We used the graph database management system Neo4j to store and analyze co-expression networks derived from RNAseq data from The Cancer Genome Atlas. Comparing co-expression in tumors versus healthy tissues in six cancer types revealed significant perturbation tracing back to erroneous or rewired gene regulation. Applying centrality, community detection and pathfinding graph algorithms uncovered the destruction or creation of central nodes, modules and relationships in co-expression networks of tumors. Given the speed, accuracy and straightforwardness of managing these densely connected networks, we conclude that graph databases are ready for entering the arena of biological data.
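As a small sketch of how such a network can be derived before loading it into a graph database, the example below thresholds pairwise Pearson correlations computed from an expression matrix to obtain a co-expression edge list. The gene names, random data and cut-off are illustrative, not the TCGA-derived networks analysed in the paper.

```python
# A minimal sketch: build a co-expression edge list from a genes x samples
# expression matrix by thresholding pairwise Pearson correlation.
# All data and the cut-off below are illustrative placeholders.
import numpy as np

genes = ["TP53", "EGFR", "MYC", "BRCA1"]
expression = np.random.default_rng(0).random((len(genes), 50))  # 50 samples (assumed)

corr = np.corrcoef(expression)   # gene-by-gene Pearson correlation matrix
threshold = 0.7                  # assumed co-expression cut-off

edges = [(genes[i], genes[j], round(float(corr[i, j]), 3))
         for i in range(len(genes))
         for j in range(i + 1, len(genes))
         if abs(corr[i, j]) >= threshold]

# Each (gene_a, gene_b, weight) tuple could then be loaded into Neo4j as a
# CO_EXPRESSED relationship, e.g. with
#   MERGE (a:Gene {name: $a}) MERGE (b:Gene {name: $b})
#   MERGE (a)-[r:CO_EXPRESSED]->(b) SET r.weight = $w
print(edges)
```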


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ann Sophie K. Löhde ◽  
Giovanna Campopiano ◽  
Andrea Calabrò

Purpose Challenging the static view of family business governance, we propose a model of owner–manager relationships derived from the configurational analysis of managerial behavior and change in governance structure. Design/methodology/approach Stemming from social exchange theory and building on the 4C model proposed by Miller and Le Breton-Miller (2005), we consider the evolving owner–manager relationship in four main configurations. On the one hand, we account for family businesses shifting from a generalized to a restricted exchange system, and vice versa, according to whether a family manager misbehaves in a stewardship-oriented governance structure or a nonfamily manager succeeds in building a trusting relationship in an agency-oriented governance structure. On the other hand, we consider that family firms will strengthen a generalized exchange system, rather than a restricted one, according to whether a family manager contributes to the stewardship-oriented culture in the business or a nonfamily manager proves to be driven by extrinsic rewards. Four scenarios are analyzed in terms of the managerial behavior and governance structure that characterize the phases of the relationship between owners and managers. Findings Various factors trigger managerial behavior, making the firm deviate from or further build on what is assumed by stewardship and agency theories (i.e. proorganizational versus opportunistic behavior, respectively), which determine the governance structure over time. Workplace deviance, asymmetric altruism and patriarchy on the one hand, and proorganizational behavior, relationship building and long-term commitment on the other, are found to determine how the manager behaves and thus characterize the owner's reactions in terms of governance mechanisms. This enables us to present a dynamic view of governance structures, which adapt to the actual attitudes and behaviors of employed managers. Research limitations/implications As time is a relevant dimension affecting individual behavior and triggering change in an organization, one must consider family business governance as being dynamic in nature. Moreover, it is not family membership that determines the most appropriate governance structure but the owner–manager relationship that evolves over time, thus contributing to the 4C model. Originality/value The proposed model integrates social exchange theory and the 4C model to predict changes in governance structure, as summarized in the final framework we propose.


Author(s):  
Jozef Kapusta ◽  
Michal Munk ◽  
Dominik Halvoník ◽  
Martin Drlík

If we are talking about user behavior analytics, we have to understand what the main sources of valuable information are. One of these sources is definitely the web server. There are multiple places where we can extract the necessary data. The most common ways are to search for these data in the access log, error log and custom log files of the web server, proxy server log files, web browser logs, browser cookies, etc. A web server log, in its default form known as a Common Log File (W3C, 1995), keeps information about the IP address, the date and time of the visit, and the accessed and referenced resource. There are standardized methodologies containing several steps that lead to extracting new knowledge from the provided data. Usually, the first step in each of them is to identify users, users' sessions, page views and clickstreams. This process is called pre-processing. The main goal of this stage is to take an unprocessed web server log file as input and, after processing, output meaningful representations that can be used in the next phase. In this paper, we describe in detail user session identification, which can be considered the most important part of data pre-processing. Our paper aims to compare user/session identification using the STT with user/session identification using cookies. This comparison was performed with respect to the quality of the sequential rules generated, i.e., a comparison was made regarding the generation of useful, trivial and inexplicable rules.
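As a hedged sketch of time-threshold session identification (assuming STT refers to a session time threshold, i.e. an inactivity cut-off), the example below parses Common Log Format lines and splits each IP's requests into sessions whenever the gap between consecutive requests exceeds 30 minutes. The threshold value and log lines are assumptions for illustration, not the rules or data set used in the paper.

```python
# A hedged sketch: group Common Log Format requests from the same IP into one
# session until the gap between consecutive requests exceeds a cut-off.
# Interpreting STT as such an inactivity threshold is an assumption.
import re
from datetime import datetime, timedelta
from collections import defaultdict

CLF_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)"')
THRESHOLD = timedelta(minutes=30)   # assumed inactivity cut-off

def parse(line):
    m = CLF_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return m["ip"], ts, m["request"]

def sessions(lines):
    per_ip = defaultdict(list)
    for parsed in filter(None, map(parse, lines)):
        per_ip[parsed[0]].append(parsed)
    result = []
    for ip, hits in per_ip.items():
        hits.sort(key=lambda h: h[1])
        current = [hits[0]]
        for prev, curr in zip(hits, hits[1:]):
            if curr[1] - prev[1] > THRESHOLD:   # gap too long: start a new session
                result.append(current)
                current = []
            current.append(curr)
        result.append(current)
    return result

log = [
    '192.0.2.1 - - [01/Mar/2021:12:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '192.0.2.1 - - [01/Mar/2021:12:10:00 +0000] "GET /page2.html HTTP/1.1" 200 256',
    '192.0.2.1 - - [01/Mar/2021:13:30:00 +0000] "GET /index.html HTTP/1.1" 200 512',
]
for s in sessions(log):
    print([req for _, _, req in s])
```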

