Predicting and Analysis of Phishing Attacks and Breaches In E-Commerce Websites

Author(s):  
N. Ram Mohan ◽  
N. Praveen Kumar

Analyzing cyber incident data sets is an important method for deepening our understanding of the evolution of the threat landscape. This is a relatively new research topic, and many studies remain to be done. In this paper, we report a statistical analysis of a breach incident data set covering 12 years (2005–2017) of cyber hacking activities, including malware attacks. We show that, in contrast to the findings reported in the literature, both hacking breach incident inter-arrival times and breach sizes should be modeled by stochastic processes rather than by distributions, because they exhibit autocorrelations. We then propose particular stochastic process models to fit, respectively, the inter-arrival times and the breach sizes, and we show that these models can predict both quantities. To gain deeper insight into the evolution of hacking breach incidents, we conduct both qualitative and quantitative trend analyses on the data set. We draw a set of cyber security insights, including that the threat of cyber hacks is indeed getting worse in terms of frequency, but not in terms of the magnitude of the damage.
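
To make the modelling idea concrete, the following is a minimal Python sketch, using synthetic placeholder data rather than the paper's breach data set, of checking inter-arrival times for autocorrelation and fitting a simple ARMA-type stochastic process with statsmodels; the model order and all parameter values are illustrative assumptions, not the authors' choices.

```python
# Minimal sketch: test inter-arrival times for autocorrelation and fit a simple
# stochastic (ARMA-type) model. The series below is synthetic, not the breach data set.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
n = 300
noise = rng.normal(0.0, 1.0, n)
x = np.empty(n)
x[0] = noise[0]
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + noise[t]          # AR(1) latent structure
inter_arrival = np.exp(1.5 + 0.4 * x)          # positive, skewed "days between incidents"

# If the series were i.i.d., lag-1 autocorrelation would be near zero; a clearly
# non-zero value motivates a stochastic-process model rather than a plain distribution.
lag1 = np.corrcoef(inter_arrival[:-1], inter_arrival[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1:.2f}")

# Fit an ARMA(1,1)-style model (an assumption, not the paper's exact model) and
# produce short-horizon predictions of the next inter-arrival times.
fit = ARIMA(inter_arrival, order=(1, 0, 1)).fit()
print("next 3 predicted inter-arrival times (days):", np.round(fit.forecast(steps=3), 1))
```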

2007 ◽  
Vol 73 ◽  
pp. 169-190 ◽  
Author(s):  
Mandy Jay ◽  
Michael P. Richards

This paper presents the results of new research into British Iron Age diet. Specifically, it summarises the existing evidence and compares this with new evidence obtained from stable isotope analysis. The isotope data come from both humans and animals from ten British middle Iron Age sites at four locations in East Yorkshire, East Lothian, Hampshire, and Cornwall. These represent the only significant data set of comparative humans (n = 138) and animals (n = 212) for this period currently available for the UK. They are discussed here alongside other evidence for diet during the middle Iron Age in Britain. In particular, the question of whether fish, or other aquatic foods, were a major dietary resource during this period is examined. The isotopic data suggest similar dietary protein consumption patterns across the groups, both within local populations and between them, although outliers do exist which may indicate mobile individuals moving into the sites. The diet generally includes a high level of animal protein, with little indication of the use of marine resources at any isotopically distinguishable level, even when the sites are situated directly on the coast. The nitrogen isotopic values also show absolute variation across these locations that is indicative of environmental background differences rather than differential consumption patterns; this is discussed in the context of the difficulty of interpreting isotopic data without a complete understanding of the ‘baseline’ values for any particular time and place. This reinforces the need for significant numbers of contemporaneous animals from the same locations to be analysed when interpreting human data sets.
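
As a sketch of the baseline reasoning described above, the short Python example below, with entirely made-up δ15N values and hypothetical column names, expresses each human nitrogen isotope value as an offset from the mean of contemporaneous animals at the same location, so that locations with different environmental baselines can still be compared.

```python
# A minimal, illustrative sketch with made-up d15N values: express each human value as an
# offset from the mean of contemporaneous animals at the same location ("baseline").
import pandas as pd

samples = pd.DataFrame({
    "location": ["East Yorkshire"] * 4 + ["Hampshire"] * 4,
    "kind": ["human", "human", "animal", "animal"] * 2,
    "d15N": [9.8, 10.1, 5.9, 6.2, 11.0, 11.4, 7.3, 7.6],   # per mil, hypothetical
})

# Per-location animal baseline (this is why enough contemporaneous animals are needed).
baseline = (samples[samples["kind"] == "animal"]
            .groupby("location", as_index=False)["d15N"].mean()
            .rename(columns={"d15N": "animal_baseline"}))

humans = samples[samples["kind"] == "human"].merge(baseline, on="location")
humans["offset_from_baseline"] = humans["d15N"] - humans["animal_baseline"]

# Similar offsets at both locations would suggest similar dietary protein sources, even
# though the absolute values differ because of environmental background differences.
print(humans[["location", "d15N", "offset_from_baseline"]])
```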


2016 ◽  
Vol 12 (2) ◽  
pp. 182-203 ◽  
Author(s):  
Joke H. van Velzen

This mixed methods study had two purposes: to investigate (a) the realistic meaning of awareness and understanding as the underlying constructs of general knowledge of the learning process and (b) a procedure for data consolidation. The participants were 11th-grade high school and first-year university students. Integrated data collection and data transformation yielded positive but small correlations between awareness and understanding. A comparison of the newly created combined and integrated data sets showed that the integrated data set produced the expected statistically significant outcome, which was in line with the participants’ developmental difference. This study can contribute to mixed methods research because it proposes a procedure for data consolidation and a new research design.


2014 ◽  
Vol 52 (4) ◽  
pp. 737-754 ◽  
Author(s):  
Margit Raich ◽  
Julia Müller ◽  
Dagmar Abfalter

Purpose – The purpose of this paper is to provide insightful evidence of phenomena in organization and management theory. Textual data sets consist of two different elements, namely qualitative and quantitative aspects. Researchers often combine methods to harness both aspects. However, they frequently do this in a comparative, convergent, or sequential way. Design/methodology/approach – The paper illustrates and discusses a hybrid textual data analysis approach employing the qualitative software application GABEK-WinRelan in a case study of an Austrian retail bank. Findings – The paper argues that a hybrid analysis method, fully intertwining qualitative and quantitative analysis simultaneously on the same textual data set, can deliver new insight into more facets of a data set. Originality/value – A hybrid approach is not a universally applicable solution to approaching research and management problems. Rather, this paper aims at triggering and intensifying scientific discussion about stronger integration of qualitative and quantitative data and analysis methods in management research.


Author(s):  
James N. Mihell ◽  
Cameron Rout

Proponents of new pipeline projects are often asked by regulators to provide estimates of risk and reliability for their proposed pipeline. On existing pipelines, the availability of operating and assessment data is generally considered to be essential to the task of performing an accurate and defensible risk or reliability assessment. For proposed or new pipelines, the absence of these data presents a significant challenge to those performing the analysis. Reliance on industry incident data presents problems, since the vast majority of loss-of-containment incidents relate to older pipelines in which the design, routing criteria, material properties, material manufacturing processes, and early operating practices differ significantly from those that are characteristic of modern pipelines. As a consequence, much of the available failure incident data does not accurately reflect the threats, or the magnitudes of the threats, that are associated with modern pipelines. In order to address this problem, ‘adjustment factors’ are often applied against incident data to try to account for threat differences between the source data and the intended application. The selection of these adjustment factors can often be quite subjective and open to judgment, however, and therefore difficult to justify. With the rapidly growing practice of regular in-line inspection (ILI) on transmission pipelines, an extensive repository of ILI data has been accumulated, much of it relating to modern pipelines. Through the judicious selection of source data, ILI data sets can be mined so that an analogue data set can be created that constitutes a reasonable representation of the attributes of reliability of a specific new pipeline of interest. Key reliability properties, such as tool error distribution, feature incidence rate, feature size distribution, and apparent feature growth rate distribution, can be derived from such analogue data. By applying these reliability properties in an analysis along with known pipeline design and material properties and their associated distributions, and by taking into consideration planned inspection intervals, a reliability basis can be derived for estimating pipeline risk and reliability. Estimates of risk and reliability that are derived in this manner employ methodologies that are repeatable, defensible, transparent, and free of subjectivity. This paper outlines an approach for completing risk and reliability estimates on new pipelines, and presents the results of some sample calculations. The reliability estimates illustrated are based on an approach whereby corrosion feature size and growth rates are obtained from analogue ILI data sets and treated as random variables. In that regard, the estimates constitute the probability of exceeding a limit state that approximates the condition for failure.
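
As an illustration of the kind of reliability calculation described above, the following Python sketch runs a simple Monte Carlo estimate of the probability of exceeding a corrosion limit state. All distributions, parameter values, and the 80%-of-wall limit state are hypothetical assumptions standing in for analogue ILI-derived inputs, not the authors' actual method or data.

```python
# A minimal Monte Carlo sketch of a corrosion limit-state exceedance estimate.
# All distributions and parameter values are hypothetical assumptions standing in for
# inputs that would be derived from an analogue ILI data set.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                  # Monte Carlo trials
wall = 7.1                   # nominal wall thickness, mm (assumed)
interval = 7.0               # planned re-inspection interval, years (assumed)

# Reported feature depths plus an ILI tool sizing error (both assumed distributions).
reported_depth = rng.weibull(1.5, n) * 0.10 * wall        # mm
tool_error = rng.normal(0.0, 0.08 * wall, n)              # mm, ~8% of wall, 1 s.d.
true_depth = np.clip(reported_depth + tool_error, 0.0, wall)

# Assumed corrosion growth-rate distribution (mm/year).
growth_rate = rng.lognormal(mean=np.log(0.2), sigma=0.5, size=n)

# Simple limit state: depth at the end of the interval exceeds 80% of wall thickness.
depth_at_interval = true_depth + growth_rate * interval
prob_exceedance = np.mean(depth_at_interval > 0.8 * wall)
print(f"estimated probability of exceeding the limit state: {prob_exceedance:.1e}")
```

Treating feature size and growth rate as random variables in this way is what makes the estimate repeatable: the same analogue inputs always yield the same exceedance probability, without subjective adjustment factors.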


1988 ◽  
Vol 110 (2) ◽  
pp. 172-179 ◽  
Author(s):  
H. El-Tahan ◽  
S. Venkatesh ◽  
M. El-Tahan

This paper describes the evaluation of a model for predicting the drift of iceberg ensembles. The model was developed in preparation for providing an iceberg forecasting service off the Canadian east coast north of about 45°N. It was envisaged that 1–5 day forecasts of iceberg ensemble drift would be available. Following a critical examination of all available data, 10 data sets containing up to 404 icebergs in the Grand Banks area off Newfoundland were selected for detailed study. The winds measured in the vicinity of the study area, as well as the detailed current system developed by the International Ice Patrol, were used as inputs to the model. A discussion of the accuracy and limitations of the input data is presented. Qualitative and quantitative criteria were used to evaluate model performance. Applying these criteria to the results of the computer simulations, it is shown that the model provides good predictions; the degree of predictive success varied from one data set to another. The study demonstrated the validity of the assumption of random positioning for icebergs within a grid block, especially for ensembles with large numbers of icebergs. It was found that an “average” iceberg size can be used to represent all icebergs. The study also showed that, in order to achieve improved results, it will be necessary to account for deterioration (the complete melting of icebergs), especially during the summer months.
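
The sketch below illustrates, in Python, the general kind of ensemble drift calculation discussed above: icebergs seeded at random positions within a grid block are advected by an assumed constant current plus a wind-drift fraction. The forcing values, the wind factor, and the time stepping are placeholder assumptions, not the operational model's inputs.

```python
# A minimal kinematic sketch of ensemble iceberg drift: constant current plus a
# wind-drift fraction, with random initial positions inside one grid block.
# Forcing values, the wind factor, and the time step are placeholder assumptions.
import numpy as np

rng = np.random.default_rng(2)
n_icebergs = 100
wind_factor = 0.02                    # assumed fraction of wind speed added to drift
dt = 6 * 3600.0                       # 6-hour step, in seconds
steps = 20                            # 5-day forecast

current = np.array([0.05, -0.10])     # m/s, eastward / northward (assumed constant)
wind = np.array([8.0, 2.0])           # m/s (assumed constant)

# Random seeding within a 20 km x 20 km grid block, per the study's positioning assumption.
pos = rng.uniform(0.0, 20_000.0, size=(n_icebergs, 2))
start = pos.copy()

for _ in range(steps):
    drift = current + wind_factor * wind      # same drift velocity for every iceberg here
    pos = pos + drift * dt                    # simple forward (Euler) step

mean_disp_km = (pos - start).mean(axis=0) / 1000.0
print("mean 5-day displacement (km east, km north):", np.round(mean_disp_km, 1))
```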


2010 ◽  
Vol 18 (4) ◽  
pp. 499-505 ◽  
Author(s):  
Nathaniel Beck

The issue of how qualitative and quantitative information can be used together is critical. Brady, Collier, and Seawright (BCS) have argued that “causal process observations” can be adjoined to “data set observations.” This implies that qualitative methods can be used to add information to quantitative data sets. In a symposium in Political Analysis, I argued that such qualitative information cannot be adjoined in any meaningful way to quantitative data sets. In that symposium, the original authors offered several defenses, but, in the end, BCS can be seen as recommending good, but hopefully standard, research design practices that are normally thought of as central in the quantitative arena. It is good that BCS remind us that no amount of fancy statistics can save a bad research design.


Author(s):  
Rahul Yadav ◽  
Phalguni Pathak ◽  
Saumya Saraswat

In recent years, deep learning frameworks have been applied in various domains, including malware detection software, self-driving cars, and identity recognition cameras, and have shown promising performance; at the same time, adversarial attacks have become a crucial security threat to many deep learning applications. Deep learning techniques are now a core part of several cyber security applications such as intrusion detection, Android malware detection, spam and malware classification, binary analysis, and phishing detection. One of the major research challenges in this field is the lack of a comprehensive data set that reflects contemporary network traffic scenarios, a broad range of low-footprint intrusions, and in-depth structured information about the network traffic. Many of the benchmark data sets used to evaluate network intrusion detection systems were developed a decade ago. In this paper, we provide a focused literature survey of data sets used for network-based intrusion detection and characterize in detail the underlying packet- and flow-based network data used for intrusion detection in cyber security. Because data sets play a vital role in intrusion detection, we describe the available cyber security data sets and provide a categorization of them.
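
To show the workflow these benchmark data sets are meant to support, here is a minimal, hypothetical Python example that trains a classifier on labelled flow features; the file name network_flows.csv and the "label" column are assumptions for illustration, not a reference to any specific benchmark.

```python
# A minimal, hypothetical workflow for a flow-based intrusion detection data set:
# the file name and the "label" column are placeholders, not a specific benchmark.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

flows = pd.read_csv("network_flows.csv")          # hypothetical labelled flow features
X = flows.drop(columns=["label"])
y = flows["label"]                                 # e.g. normal vs. attack

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```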


Geophysics ◽  
2009 ◽  
Vol 74 (6) ◽  
pp. WCB71-WCB79 ◽  
Author(s):  
Stephan Husen ◽  
Tobias Diehl ◽  
Edi Kissling

Despite the increase in quality and number of seismic stations in many parts of the world, accurate timing of individual arrival times remains crucial for many tomographic applications. To achieve a data set of high quality, arrival times need to be picked with high accuracy, including a proper assessment of the uncertainty of timing and phase identification, and a high level of consistency. We have investigated the effects of data quantity and quality on the solution quality in local earthquake tomography. We have compared tomographic results obtained with synthetic and real data of two very different data sets. The first data set consisted of a large set of arrival times of low precision and unknown accuracy taken from the International Seismological Centre (ISC) Bulletin for the greater Alpine region. The second high-quality data set for the same region was seven times smaller and was obtained by automated quality-weighted repicking. During a first series of inversions, synthetic data resembling the two data sets were inverted with the same amount of Gaussian distributed noise added. Subsequently, during a second series of inversions, the noise level was increased successively for ISC data to study the effect of larger Gaussian distributed error on the solution quality. Finally, the real data for both data sets were inverted. These investigations showed that, for Gaussian distributed error, a smaller data set of high quality could achieve a similar or better solution quality than a data set seven times larger but about four times lower in quality. Our results further suggest that the quality of the ISC Bulletin is degraded significantly by inconsistencies, strongly limiting the use of this large data set for local earthquake tomography studies.
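
The quality-versus-quantity trade-off described above can be illustrated with a toy linear inverse problem in Python; the sketch below (a synthetic operator and assumed noise levels, not the authors' tomography code or data) compares a large noisy data set with one seven times smaller but four times less noisy.

```python
# A toy linear inverse problem illustrating the quality-vs-quantity trade-off:
# compare a large, noisy data set with one seven times smaller but four times cleaner.
# This is a synthetic stand-in, not the tomography code or data used in the study.
import numpy as np

rng = np.random.default_rng(3)
m_true = rng.normal(0.0, 1.0, 50)                          # "true" model parameters

def relative_model_error(n_obs, noise_std):
    G = rng.normal(0.0, 1.0, (n_obs, m_true.size))         # toy forward operator
    d = G @ m_true + rng.normal(0.0, noise_std, n_obs)     # noisy observations
    m_est, *_ = np.linalg.lstsq(G, d, rcond=None)          # least-squares "inversion"
    return np.linalg.norm(m_est - m_true) / np.linalg.norm(m_true)

print("large / noisy data set :", round(relative_model_error(n_obs=1400, noise_std=0.8), 3))
print("small / clean data set :", round(relative_model_error(n_obs=200, noise_std=0.2), 3))
```

Under Gaussian-distributed noise, the smaller but cleaner data set recovers the model parameters at least as well, which mirrors the paper's synthetic-test result.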


2018 ◽  
Vol 154 (2) ◽  
pp. 149-155
Author(s):  
Michael Archer

1. Yearly records of worker Vespula germanica (Fabricius) taken in suction traps at Silwood Park (28 years) and at Rothamsted Research (39 years) are examined. 2. Using the autocorrelation function (ACF), a significant negative 1-year lag followed by a lesser, non-significant positive 2-year lag was found in all, or parts of, each data set, indicating an underlying population dynamic of a 2-year cycle with a damped waveform. 3. The minimum number of years before the 2-year cycle with damped waveform became apparent varied between 17 and 26, or the cycle was not found at all in some data sets. 4. Ecological factors delaying or preventing the occurrence of the 2-year cycle are considered.
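
A minimal Python sketch of the ACF diagnostic described above follows; the yearly counts are synthetic and the alternation strength is an assumption, chosen only to reproduce the qualitative signature of a damped 2-year cycle (a significant negative lag-1 value followed by a weaker positive lag-2 value).

```python
# Synthetic yearly counts with an assumed alternation between good and poor years,
# used only to show the ACF signature described above.
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(4)
years = 39
counts = np.empty(years)
counts[0] = 100.0
for t in range(1, years):
    # A good year tends to be followed by a poor one (damped 2-year cycle).
    counts[t] = 100.0 - 0.6 * (counts[t - 1] - 100.0) + rng.normal(0.0, 15.0)

r, conf = acf(counts, nlags=5, alpha=0.05, fft=True)
for lag in (1, 2):
    print(f"lag {lag}: r = {r[lag]:+.2f}, 95% CI = [{conf[lag][0]:+.2f}, {conf[lag][1]:+.2f}]")
```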


2018 ◽  
Vol 21 (2) ◽  
pp. 117-124 ◽  
Author(s):  
Bakhtyar Sepehri ◽  
Nematollah Omidikia ◽  
Mohsen Kompany-Zareh ◽  
Raouf Ghavami

Aims & Scope: In this research, eight variable selection approaches were used to investigate the effect of variable selection on the predictive power and stability of CoMFA models. Materials & Methods: Three data sets, comprising 36 EPAC antagonists, 79 CD38 inhibitors, and 57 ATAD2 bromodomain inhibitors, were modelled by CoMFA. First, for each of the three data sets, a CoMFA model with all CoMFA descriptors was created; then, by applying each variable selection method, a new CoMFA model was developed, so that nine CoMFA models were built per data set. The results show that noisy and uninformative variables affect CoMFA results. Based on the created models, applying five variable selection approaches, namely FFD, SRD-FFD, IVE-PLS, SRD-UVE-PLS, and SPA-jackknife, increases the predictive power and stability of CoMFA models significantly. Result & Conclusion: Among them, SPA-jackknife removes most of the variables, while FFD retains most of them. FFD and IVE-PLS are time-consuming processes, whereas SRD-FFD and SRD-UVE-PLS run in a few seconds. In addition, applying FFD, SRD-FFD, IVE-PLS, and SRD-UVE-PLS preserves CoMFA contour map information for both fields.
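
As a simplified illustration of why variable selection matters for models of this kind, the Python sketch below (synthetic descriptors and a generic PLS regression, not CoMFA fields or the software used in the paper) shows cross-validated predictive power dropping when many uninformative variables are added and recovering when only the informative ones are kept.

```python
# Synthetic descriptors and a generic PLS regression (not CoMFA fields): cross-validated
# predictive power (q2) drops when many uninformative variables are added and recovers
# when only the informative variables are kept.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_mol, n_informative, n_noise = 60, 20, 400

X_info = rng.normal(size=(n_mol, n_informative))
y = X_info @ rng.normal(size=n_informative) + rng.normal(scale=0.5, size=n_mol)
X_all = np.hstack([X_info, rng.normal(size=(n_mol, n_noise))])   # add uninformative columns

pls = PLSRegression(n_components=5)
q2_all = cross_val_score(pls, X_all, y, cv=5, scoring="r2").mean()
q2_selected = cross_val_score(pls, X_info, y, cv=5, scoring="r2").mean()
print(f"q2 with all variables:         {q2_all:.2f}")
print(f"q2 with informative variables: {q2_selected:.2f}")
```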

