Lake and mire isolation data set for the estimation of post-glacial land uplift in Fennoscandia

2020 ◽  
Vol 12 (2) ◽  
pp. 869-873
Author(s):  
Jari Pohjola ◽  
Jari Turunen ◽  
Tarmo Lipping

Abstract. Postglacial land uplift is a complex process related to the continental ice retreat that took place about 10 000 years ago, which initiated the viscoelastic rebound of the Earth's crust back towards its equilibrium state. To empirically model the land uplift process based on the past behaviour of shoreline displacement, data points of known spatial location, elevation and dating are needed. Such data can be obtained by studying the isolation of lakes and mires from the sea. Archaeological data on human settlements (e.g. human remains, fireplaces) are also very useful, as the settlements were indeed situated on dry land and were often located close to the coast. This information can be used to validate and update the postglacial land uplift model. In this paper, a collection of data underlying empirical land uplift modelling in Fennoscandia is presented. The data set is available at https://doi.org/10.1594/PANGAEA.905352 (Pohjola et al., 2019).
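As a rough illustration of how dated isolation elevations of this kind can feed an empirical shoreline-displacement model, the following sketch fits a simple exponential uplift curve to hypothetical (age, elevation) points for one site; the functional form and the sample values are assumptions for illustration, not the authors' model or data.

```python
# Minimal sketch: fit an empirical shoreline-displacement curve to dated
# lake/mire isolation points at one site. The exponential form and the
# sample data below are illustrative assumptions only.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical isolation data: age of isolation (calibrated years BP) and
# present-day elevation of the isolation threshold (m above sea level).
age_bp = np.array([9500.0, 8200.0, 6800.0, 5100.0, 3400.0, 1800.0])
elev_m = np.array([182.0, 148.0, 112.0, 78.0, 46.0, 21.0])

def shoreline_displacement(t, a, tau):
    """Empirical exponential curve: elevation gained since isolation time t."""
    return a * (np.exp(t / tau) - 1.0)

params, _ = curve_fit(shoreline_displacement, age_bp, elev_m, p0=(10.0, 5000.0))
a, tau = params
print(f"fitted amplitude a = {a:.1f} m, time constant tau = {tau:.0f} yr")
```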


Author(s):  
Simona Babiceanu ◽  
Sanhita Lahiri ◽  
Mena Lockwood

This study uses a suite of performance measures, developed to capture various aspects of congestion and reliability, to assess the impacts of safety projects on congestion. Safety projects are necessary to move Virginia's roadways toward safer operation, but they can contribute to congestion and unreliability during execution and can affect operations after execution. However, safety projects are assessed primarily for safety improvements, not for congestion. This study identifies an appropriate suite of measures and quantifies and compares the congestion and reliability impacts of safety projects on roadways before, during, and after project execution. The paper presents the performance measures, examines their sensitivity to operating conditions, defines thresholds for congestion and reliability, and demonstrates the measures using a set of Virginia safety projects. The data set consists of 10 projects totaling 92 mi and more than 1 million data points. The study found that, overall, safety projects tended to have a positive impact on congestion and reliability after completion, and that the congestion variability measures were sensitive to the reliability threshold. The study concludes with practical recommendations for primary measures that may be used to assess the overall impacts of safety projects: percent vehicle miles traveled (VMT) reliable with a customized threshold for Virginia; percent VMT delayed; and time to travel 10 mi. However, caution should be used when applying the results directly to other situations because of the limited number of projects used in the study.
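To make the VMT-weighted measures concrete, the sketch below computes percent-VMT-reliable and percent-VMT-delayed from per-segment records; the field names, the 1.5 and 1.3 thresholds, and the sample values are assumptions for illustration, not the study's calibrated definitions.

```python
# Illustrative computation of VMT-weighted congestion/reliability measures.
# Thresholds and data are placeholders, not the study's exact values.
import numpy as np

# Each row: (vehicle miles traveled, travel time index, planning time index)
segments = np.array([
    # VMT,   TTI,  PTI
    [12000, 1.10, 1.35],
    [ 8000, 1.60, 2.10],
    [15000, 1.05, 1.20],
    [ 9000, 1.30, 1.80],
])
vmt, tti, pti = segments[:, 0], segments[:, 1], segments[:, 2]

RELIABILITY_THRESHOLD = 1.5   # assumed PTI cutoff for "reliable" travel
DELAY_THRESHOLD = 1.3         # assumed TTI cutoff for "delayed" travel

pct_vmt_reliable = 100.0 * vmt[pti <= RELIABILITY_THRESHOLD].sum() / vmt.sum()
pct_vmt_delayed = 100.0 * vmt[tti > DELAY_THRESHOLD].sum() / vmt.sum()
print(f"percent VMT reliable: {pct_vmt_reliable:.1f}%")
print(f"percent VMT delayed:  {pct_vmt_delayed:.1f}%")
```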


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 37
Author(s):  
Shixun Wang ◽  
Qiang Chen

Boosting of ensemble learning models has made great progress, but most methods boost only a single modality. For this reason, a simple multiclass boosting framework that uses local similarity as the weak learner is extended here to multimodal multiclass boosting. First, with local similarity as the weak learner, the loss function is used to compute a baseline loss, and the logarithmic data points are binarized. The optimal local similarity is then found together with its corresponding loss; whichever of the two losses is smaller becomes the best loss so far. Second, the local similarity between pairs of points is calculated, and the loss is computed from this pairwise similarity. Finally, text and images are retrieved from each other, and the retrieval accuracy is obtained for text and images, respectively. Experimental results on standard data sets show that the multimodal multiclass boosting framework with local similarity as the weak learner compares favourably with other state-of-the-art methods.
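For readers unfamiliar with boosting of weak learners in general, the following sketch shows a standard multiclass AdaBoost loop over simple weak learners in scikit-learn; it is a generic stand-in for context only and does not implement the paper's local-similarity weak learner or its multimodal extension.

```python
# Generic multiclass boosting sketch (not the paper's method): each round fits
# a weak learner (a decision stump by default) on reweighted data and adds it
# to the ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_classes=3, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = AdaBoostClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```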


2021 ◽  
Author(s):  
Ahmed Al-Sabaa ◽  
Hany Gamal ◽  
Salaheldin Elkatatny

Abstract. The formation porosity of drilled rock is an important parameter that determines the formation storage capacity. The common industrial technique for acquiring rock porosity is the downhole logging tool. Usually, logging-while-drilling or wireline porosity logging provides a complete porosity log for the section of interest; however, operational constraints may preclude the logging job, in addition to its cost. The objective of this study is to provide an intelligent prediction model that predicts porosity from the drilling parameters. An artificial neural network (ANN), a tool of artificial intelligence (AI), was employed in this study to build the porosity prediction model based on drilling parameters: weight on bit (WOB), drill string rotating speed (RS), drilling torque (T), standpipe pressure (SPP), and mud pumping rate (Q). The novel contribution of this study is to provide a rock porosity model for complex lithology formations using drilling parameters in real time. The model was built using 2,700 data points from Well (A) with a 74:26 training-to-testing ratio. Several sensitivity analyses were performed to optimize the ANN model. The model was validated using an unseen data set (1,000 data points) from Well (B), which is located in the same field and drilled across the same complex lithology. The results showed high model performance for the training, testing, and validation processes. The overall accuracy of the model was determined in terms of the correlation coefficient (R) and the average absolute percentage error (AAPE). Overall, R was higher than 0.91 and AAPE was less than 6.1% for model building and validation. Predicting rock porosity while drilling in real time will save logging costs and, in addition, will provide a guide for the formation storage capacity and interpretation analysis.
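A minimal sketch of this kind of workflow is shown below: a small feed-forward network maps the five drilling parameters to porosity and is scored with R and AAPE. The network architecture, the 74:26 split, and the synthetic stand-in data are assumptions for illustration; the study's actual data and tuned model are not reproduced here.

```python
# Sketch of an ANN porosity-prediction workflow with synthetic placeholder data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Columns: WOB, RS, T, SPP, Q -- synthetic stand-ins for the drilling parameters.
X = rng.uniform(size=(2700, 5))
porosity = 0.05 + 0.25 * X[:, 0] * X[:, 4] + 0.01 * rng.normal(size=2700)  # toy target

n_train = int(0.74 * len(X))            # 74:26 training-to-testing split, as in the study
scaler = StandardScaler().fit(X[:n_train])
model = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=2000, random_state=0)
model.fit(scaler.transform(X[:n_train]), porosity[:n_train])

pred = model.predict(scaler.transform(X[n_train:]))
actual = porosity[n_train:]
r = np.corrcoef(actual, pred)[0, 1]                       # correlation coefficient R
aape = 100 * np.mean(np.abs((actual - pred) / actual))    # average absolute percentage error
print(f"R = {r:.3f}, AAPE = {aape:.2f}%")
```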


2018 ◽  
Vol 11 (2) ◽  
pp. 53-67
Author(s):  
Ajay Kumar ◽  
Shishir Kumar

Several initial center selection algorithms have been proposed in the literature for numerical data, but the values of categorical data are unordered, so these methods are not applicable to a categorical data set. This article investigates the initial center selection process for categorical data and then presents a new support-based initial center selection algorithm. The proposed algorithm measures the weight of the unique data points of an attribute with the help of support and then integrates these weights along the rows to obtain the support of every row. Further, the data object having the largest support is chosen as an initial center, followed by finding other centers that are at the greatest distance from the initially selected center. The quality of the proposed algorithm is compared with the random initial center selection method, Cao's method, Wu's method, and the method introduced by Khan and Ahmad. Experimental analysis on real data sets shows the effectiveness of the proposed algorithm.
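The sketch below follows the procedure as described in the abstract: per-attribute value supports are summed along each row, the highest-support row becomes the first center, and further centers are the rows farthest from those already chosen. The Hamming-style distance and the toy data are our assumptions, not necessarily the paper's exact specification.

```python
# Sketch of support-based initial center selection for categorical data.
import numpy as np

def support_based_centers(data, k):
    """data: 2-D array of categorical values; returns k initial center rows."""
    n_rows, n_cols = data.shape
    # Support of a value = its relative frequency within its attribute (column).
    row_support = np.zeros(n_rows)
    for j in range(n_cols):
        values, counts = np.unique(data[:, j], return_counts=True)
        freq = dict(zip(values, counts / n_rows))
        row_support += np.array([freq[v] for v in data[:, j]])

    centers = [int(np.argmax(row_support))]        # first center: row with largest support
    while len(centers) < k:
        # Distance of a row to the chosen centers = fewest mismatching attributes (assumed Hamming).
        dist_to_centers = np.array([
            min((data[i] != data[c]).sum() for c in centers) for i in range(n_rows)
        ])
        centers.append(int(np.argmax(dist_to_centers)))  # farthest row from chosen centers
    return data[centers]

toy = np.array([["a", "x"], ["a", "y"], ["b", "x"], ["c", "z"]])
print(support_based_centers(toy, k=2))
```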


2021 ◽  
Vol 50 (1) ◽  
pp. 138-152
Author(s):  
Mujeeb Ur Rehman ◽  
Dost Muhammad Khan

Recently, anomaly detection has attracted growing interest from data mining scientists, as its reputation has steadily increased in practical domains such as product marketing, fraud detection, medical diagnosis, fault detection, and many other fields. High-dimensional data subjected to outlier detection poses exceptional challenges for data mining experts because of the inherent problems of the curse of dimensionality and the resemblance of distant and adjoining points. Traditional algorithms and techniques have been applied to the full feature space for outlier detection. Customary methodologies concentrate largely on low-dimensional data and hence prove ineffective when discovering anomalies in a data set with a high number of dimensions. Digging out the anomalies present in a high-dimensional data set becomes very difficult and tiresome when all subsets of projections need to be explored. All data points in high-dimensional data behave like similar observations because of an intrinsic property of such data: the contrast between pairwise distances approaches zero as the number of dimensions extends towards infinity. This research work proposes a novel technique that explores the deviation among all data points and embeds its findings inside well-established density-based techniques. It is a state-of-the-art technique, as it opens a new breadth of research towards resolving the inherent problems of high-dimensional data, where outliers reside within clusters having different densities. A high-dimensional data set from the UCI Machine Learning Repository is chosen to test the proposed technique, and its results are compared with those of density-based techniques to evaluate its efficiency.
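The distance-concentration effect invoked above is easy to verify numerically: as the number of dimensions grows, the nearest and farthest neighbours of a point become almost equally far apart. The short sketch below demonstrates this on uniform random data (the data and dimension choices are arbitrary illustrations).

```python
# Numerical illustration of distance concentration in high dimensions.
import numpy as np

rng = np.random.default_rng(0)
for dim in (2, 10, 100, 1000):
    points = rng.uniform(size=(500, dim))
    d = np.linalg.norm(points - points[0], axis=1)[1:]   # distances from one reference point
    contrast = (d.max() - d.min()) / d.min()             # relative contrast between neighbours
    print(f"dim={dim:5d}  relative contrast={contrast:.3f}")
```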


2014 ◽  
Vol 23 (1) ◽  
pp. 75-82 ◽  
Author(s):  
Cagatay Catal

Abstract. Predicting the defect-prone modules when the previous defect labels of modules are limited is a challenging problem encountered in the software industry. Supervised classification approaches cannot build high-performance prediction models with few defect data, leading to the need for new methods, techniques, and tools. One solution is to combine labeled data points with unlabeled data points during the learning phase. Semi-supervised classification methods use not only labeled data points but also unlabeled ones to improve generalization capability. In this study, we evaluated four semi-supervised classification methods for semi-supervised defect prediction. Low-density separation (LDS), support vector machine (SVM), expectation-maximization (EM-SEMI), and class mass normalization (CMN) methods were investigated on the NASA data sets CM1, KC1, KC2, and PC1. Experimental results showed that the SVM and LDS algorithms outperform the CMN and EM-SEMI algorithms. In addition, the LDS algorithm performs much better than SVM when the data set is large. Based on this study, the LDS-based prediction approach is suggested for software defect prediction when fault data are limited.
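To illustrate the general idea of combining labeled and unlabeled modules during learning, the sketch below uses self-training around an SVM base learner on synthetic data; this is a generic stand-in (LDS, EM-SEMI, and CMN are not available in scikit-learn), and the fraction of unlabeled points is an arbitrary assumption.

```python
# Semi-supervised sketch: most labels are hidden, and a self-training SVM
# uses the unlabeled points alongside the labeled ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) < 0.9   # pretend 90% of modules have no defect label
y_partial = y.copy()
y_partial[unlabeled] = -1              # -1 marks unlabeled points for scikit-learn

model = SelfTrainingClassifier(SVC(probability=True))
model.fit(X, y_partial)
print("accuracy on the held-back labels:", model.score(X[unlabeled], y[unlabeled]))
```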


2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N × N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
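The landmark idea can be sketched as follows: fit the manifold embedding on a small subset of points and then map the remaining points onto it, so the N × N similarity matrix is never formed for the full data set. The sketch uses random landmark sampling and Isomap on toy data; the paper's local-curvature landmark criterion and its incremental update are not reproduced here.

```python
# Landmark-based manifold learning sketch (random landmarks, not the paper's
# curvature-based sampling): embed landmarks, then project the remaining points.
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, _ = make_swiss_roll(n_samples=3000, random_state=0)    # stand-in for hyperspectral pixels
rng = np.random.default_rng(0)
landmarks = rng.choice(len(X), size=300, replace=False)   # small landmark "skeleton"

embedder = Isomap(n_components=2).fit(X[landmarks])       # manifold fitted on landmarks only
X_embedded = embedder.transform(X)                        # remaining points mapped out-of-sample
print(X_embedded.shape)
```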


2021 ◽  
pp. 1-21
Author(s):  
Hany Gamal ◽  
Ahmed Alsaihati ◽  
Salaheldin Elkatatny ◽  
Saleh Haidary ◽  
Abdulazeez Abdulraheem

Abstract. The rock unconfined compressive strength (UCS) is one of the key parameters for geomechanical and reservoir modeling in the petroleum industry. Obtaining the UCS by conventional methods, such as experimental work or empirical correlations from logging data, is time-consuming and costly. To overcome these drawbacks, this paper utilizes artificial intelligence (AI) to predict the rock strength in real time from the drilling parameters using two AI tools. Random forest (RF) based on principal component analysis (PCA) and functional network (FN) techniques were employed to build two UCS prediction models based on drilling data such as the weight on bit (WOB), drill string rotating speed (RS), drilling torque (T), standpipe pressure (SPP), mud pumping rate (Q), and rate of penetration (ROP). The models were built using 2,333 data points from Well (A) with a 70:30 training-to-testing ratio. The models were validated using an unseen data set (1,300 data points) from Well (B), which is located in the same field and drilled across the same complex lithology. The PCA-based RF model outperformed the FN model in terms of the correlation coefficient (R) and the average absolute percentage error (AAPE). The overall accuracy for the PCA-based RF model was an R of 0.99 and an AAPE of 4.3%, while the FN model yielded an R of 0.97 and an AAPE of 8.5%. The validation results showed an R of 0.99 for RF and 0.96 for FN, while the AAPE was 4% and 7.9% for the RF and FN models, respectively. The developed PCA-based RF and FN models provide an accurate UCS estimation in real time from the drilling data, saving time and cost and enhancing well stability by generating a UCS log from the rig drilling data.
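A minimal sketch of the PCA-based random forest side of this workflow is given below, with the same R and AAPE scoring. The number of principal components, the forest size, the 70:30 split, and the synthetic stand-in data are assumptions for illustration; the functional network model is not sketched.

```python
# Sketch of a PCA-based random forest UCS workflow with synthetic placeholder data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
# Columns: WOB, RS, T, SPP, Q, ROP -- synthetic stand-ins for the drilling parameters.
X = rng.uniform(size=(2333, 6))
ucs = 40 + 80 * X[:, 0] - 30 * X[:, 5] + rng.normal(scale=2.0, size=2333)  # toy UCS target

n_train = int(0.7 * len(X))                      # 70:30 training-to-testing split
model = make_pipeline(
    PCA(n_components=4),                         # assumed number of retained components
    RandomForestRegressor(n_estimators=200, random_state=1),
)
model.fit(X[:n_train], ucs[:n_train])

pred, actual = model.predict(X[n_train:]), ucs[n_train:]
r = np.corrcoef(actual, pred)[0, 1]
aape = 100 * np.mean(np.abs((actual - pred) / actual))
print(f"R = {r:.2f}, AAPE = {aape:.1f}%")
```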


Author(s):  
Baoying Wang ◽  
Imad Rahal ◽  
Richard Leipold

Data clustering is a discovery process that partitions a data set into groups (clusters) such that data points within the same group have high similarity while being very dissimilar to points in other groups (Han & Kamber, 2001). The ultimate goal of data clustering is to discover natural groupings in a set of patterns, points, or objects without prior knowledge of any class labels. In fact, in the machine-learning literature, data clustering is typically regarded as a form of unsupervised learning as opposed to supervised learning. In unsupervised learning or clustering, there is no training function as in supervised learning. There are many applications for data clustering including, but not limited to, pattern recognition, data analysis, data compression, image processing, understanding genomic data, and market-basket research.
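A tiny example of the idea described above, using k-means to partition unlabeled points into groups, is shown below; the algorithm choice, the number of clusters, and the toy data are arbitrary illustrations rather than anything prescribed by the chapter.

```python
# Minimal clustering illustration: k-means groups points without class labels.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # unlabeled points
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(labels[:10])   # discovered group assignments, no prior labels used
```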

