A method for generating large datasets of organ geometries for radiotherapy treatment planning studies

Abstract Background. With the rapidly increasing application of adaptive radiotherapy, large datasets of organ geometries based on the patient’s anatomy are desired to support clinical application or research work, such as image segmentation, re-planning, and organ deformation analysis. Sometimes only limited datasets are available in clinical practice. In this study, we propose a new method to generate large datasets of organ geometries to be utilized in adaptive radiotherapy. Methods. Given a training dataset of organ shapes derived from daily cone-beam CT, we align them into a common coordinate frame and select one of the training surfaces as reference surface. A statistical shape model of organs was constructed, based on the establishment of point correspondence between surfaces and non-uniform rational B-spline (NURBS) representation. A principal component analysis is performed on the sampled surface points to capture the major variation modes of each organ. Results. A set of principal components and their respective coefficients, which represent organ surface deformation, were obtained, and a statistical analysis of the coefficients was performed. New sets of statistically equivalent coefficients can be constructed and assigned to the principal components, resulting in a larger geometry dataset for the patient’s organs. Conclusions. These generated organ geometries are realistic and statistically representative
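The workflow in this abstract (PCA on corresponded surface points, then drawing new coefficients) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the random stand-in data, and the assumption that the coefficients of each mode are independent and normally distributed are all hypothetical, and the NURBS correspondence step is taken as already done.

```python
import numpy as np

def fit_shape_model(shapes):
    """Fit a PCA shape model.

    shapes: (n_samples, n_coords) array; each row is one aligned surface,
    flattened so that column j refers to the same point on every surface.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # Rows of vt are the principal components (modes of variation).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coeffs = centered @ vt.T  # per-sample coefficients for each mode
    return mean, vt, coeffs

def sample_shapes(mean, components, coeffs, n_new, n_modes, rng):
    """Generate new shapes from the leading modes.

    Assumption (not from the abstract): each coefficient is drawn
    independently from a zero-mean normal with the training std.
    """
    std = coeffs[:, :n_modes].std(axis=0, ddof=1)
    new_coeffs = rng.normal(0.0, std, size=(n_new, n_modes))
    return mean + new_coeffs @ components[:n_modes]

# Hypothetical stand-in data: 10 training surfaces with 30 coordinates each.
rng = np.random.default_rng(0)
training = rng.normal(size=(10, 30))
mean, components, coeffs = fit_shape_model(training)
generated = sample_shapes(mean, components, coeffs, n_new=100, n_modes=3, rng=rng)
```

Keeping only the leading modes is a design choice in this sketch: it limits the generated surfaces to the dominant variation and discards modes that mostly carry noise.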


High quality statistical shape modelling of the human nasal cavity and applications

Royal Society Open Science ◽

10.1098/rsos.181558 ◽

2018 ◽

Vol 5 (12) ◽

pp. 181558 ◽

Cited By ~ 4

Author(s):

William Keustermans ◽

Toon Huysmans ◽

Femke Danckaers ◽

Andrzej Zarowski ◽

Bert Schmelzer ◽

...

Keyword(s):

Principal Components ◽

Demographic Data ◽

Principal Component ◽

Statistical Shape Model ◽

High Quality ◽

Shape Model ◽

Shape Modelling ◽

Morphological Variations ◽

Statistical Shape ◽

Human Nose

The human nose is a complex organ that shows large morphological variations and has many important functions. However, the relation between shape and function is not yet fully understood. In this work, we present a high quality statistical shape model of the human nose based on clinical CT data of 46 patients. A technique based on cylindrical parametrization was used to create a correspondence between the nasal shapes of the population. Applying principal component analysis to these corresponded nasal cavities yielded, with high precision, an average nasal geometry and the geometrical variations present in the population, known as principal components. The analysis led to 46 principal components, which account for 95% of the total geometrical variation captured. These variations are first discussed qualitatively, and the effect of the first five principal components on the average nasal shape is visualized. Thereafter, using this statistical shape model, two application examples that lead to quantitative data are shown: nasal shape as a function of age and gender, and a morphometric analysis of different anatomical regions. Shape models such as the one presented here can help to get a better understanding of nasal shape and variation, and their relationship with demographic data.


Estimating Growth in Height from Limited Longitudinal Growth Data Using Full-Curves Training Dataset: A Comparison of Two Procedures of Curve Optimization—Functional Principal Component Analysis and SITAR

Children ◽

10.3390/children8100934 ◽

2021 ◽

Vol 8 (10) ◽

pp. 934

Author(s):

Miroslav Králík ◽

Ondřej Klíma ◽

Martin Čuta ◽

Robert M. Malina ◽

Sławomir M. Kozieł ◽

...

Keyword(s):

Longitudinal Data ◽

Principal Components ◽

Reference Sample ◽

Principal Component ◽

Human Growth ◽

Functional Principal Component Analysis ◽

Training Dataset ◽

Estimation Of Parameters ◽

Growth Data ◽

Practical Applications

A variety of models are available for the estimation of parameters of the human growth curve. Several have been widely and successfully used with longitudinal data that are reasonably complete. On the other hand, the modeling of data for a limited number of observation points is problematic and requires the interpolation of the interval between points and often an extrapolation of the growth trajectory beyond the range of empirical limits (prediction). This study tested a new approach for fitting a relatively limited number of longitudinal data using the normal variation of human empirical growth curves. First, functional principal components analysis was done for curve phase and amplitude using complete and dense data sets for a reference sample (Brno Growth Study). Subsequently, artificial curves were generated with a combination of 12 of the principal components and applied for fitting to the newly analyzed data with the Levenberg–Marquardt optimization algorithm. The approach was tested on seven 5-points/year longitudinal data samples of adolescents extracted from the reference sample. The samples differed in their distance from the mean age at peak velocity for the sample and were tested by a permutation leave-one-out approach. The results indicated the potential of this method for growth modeling as a user-friendly application for practical applications in pediatrics, auxology and youth sport.


Structuring Assessments of Psychopathology

Journal of Individual Differences ◽

10.1027/1614-0001.27.2.87 ◽

2006 ◽

Vol 27 (2) ◽

pp. 87-92 ◽

Cited By ~ 2

Author(s):

Willem K.B. Hofstee ◽

Dick P.H. Barelds ◽

Jos M.F. Ten Berge

Keyword(s):

Principal Components ◽

Personality Assessment ◽

Clinical Sample ◽

Principal Component ◽

Normal Sample ◽

Normative Sample ◽

Assessment Data ◽

Obsessive Compulsive ◽

Oblique Rotation ◽

Two Samples

Hofstee and Ten Berge (2004a) have proposed a new look at personality assessment data, based on a bipolar proportional (-1, ..., 0, ..., +1) scale, a corresponding coefficient of raw-scores likeness L = ΣXY/N, and raw-scores principal component analysis. In a normal sample, the approach resulted in a structure dominated by a first principal component, according to which most people are faintly to mildly socially desirable. We hypothesized that a more differentiated structure would arise in a clinical sample. We analyzed the scores of 775 psychiatric clients on the 132 items of the Dutch Personality Questionnaire (NPV). In comparison to a normative sample (N = 3140), the eigenvalue for the first principal component appeared to be 1.7 times as small, indicating that such clients have less personality (social desirability) in common. Still, the match between the structures in the two samples was excellent after oblique rotation of the loadings. We applied the abridged m-dimensional circumplex design, by which persons are typed by their two highest scores on the principal components, to the scores on the first four principal components. We identified five types: Indignant (1-), Resilient (1-2+), Nervous (1-2-), Obsessive-Compulsive (1-3-), and Introverted (1-4-), covering 40% of the psychiatric sample. Some 26% of the individuals had negligible scores on all type vectors. We discuss the potential and the limitations of our approach in a clinical context.
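The likeness coefficient L = ΣXY/N from this abstract is simple enough to write out directly. The sketch below assumes that X and Y are two equally long lists of scores already expressed on the bipolar (-1 to +1) scale and that N is the number of paired scores; the function name is hypothetical.

```python
def raw_score_likeness(x, y):
    """Coefficient of raw-scores likeness, L = sum(X * Y) / N.

    Assumes x and y hold paired scores on the bipolar -1..+1 scale,
    with N taken as the number of pairs. No centering or scaling is
    applied, which is what makes it a raw-scores coefficient.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(a * b for a, b in zip(x, y)) / len(x)

# Identical profiles give a positive value; mirrored profiles a negative one.
same = raw_score_likeness([1.0, -1.0, 0.5], [1.0, -1.0, 0.5])
mirrored = raw_score_likeness([1.0, -1.0], [-1.0, 1.0])
```

Because every score lies between -1 and +1, L is itself bounded by -1 and +1, but unlike a correlation it reaches those bounds only when all scores sit at the scale extremes.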


Comparison of Principal Component Solutions in Two Populations

Methodology ◽

10.1027/1614-2241/a000099 ◽

2016 ◽

Vol 12 (1) ◽

pp. 11-20 ◽

Cited By ~ 1

Author(s):

Gregor Sočan

Keyword(s):

Simulation Study ◽

Principal Components ◽

Common Factor ◽

Principal Component ◽

Component Model ◽

Bootstrap Procedure ◽

Factor Loadings ◽

Component Loadings ◽

Principal Component Model ◽

Two Populations

Abstract. When principal component solutions are compared across two groups, a question arises whether the extracted components have the same interpretation in both populations. The problem can be approached by testing null hypotheses stating that the congruence coefficients between pairs of vectors of component loadings are equal to 1. Chan, Leung, Chan, Ho, and Yung (1999) proposed a bootstrap procedure for testing the hypothesis of perfect congruence between vectors of common factor loadings. We demonstrate that the procedure by Chan et al. is both theoretically and empirically inadequate for the application on principal components. We propose a modification of their procedure, which constructs the resampling space according to the characteristics of the principal component model. The results of a simulation study show satisfactory empirical properties of the modified procedure.
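As a small illustration of the quantity being tested, the sketch below computes a congruence coefficient between two loading vectors. It assumes the usual Tucker definition (sum of cross-products divided by the square root of the product of the sums of squares); the bootstrap procedure itself is not reproduced, and the function name and example loadings are hypothetical.

```python
import math

def congruence_coefficient(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    phi = sum(x_i * y_i) / sqrt(sum(x_i^2) * sum(y_i^2))
    Assumed here to be the coefficient the abstract refers to.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    cross = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0.0:
        raise ValueError("congruence is undefined for a zero vector")
    return cross / norm

# Proportional loadings are perfectly congruent, orthogonal ones are not.
phi_proportional = congruence_coefficient([0.2, 0.4, 0.6], [0.4, 0.8, 1.2])
phi_orthogonal = congruence_coefficient([1.0, 0.0], [0.0, 1.0])
```

The coefficient is insensitive to the overall scale of either vector, so a value of 1 means the two components have proportional loadings, not necessarily identical ones.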


Stormwater inflow prediction using radar rainfall data compressed by principal component analysis

Water Practice & Technology ◽

10.2166/wpt.2006.017 ◽

2006 ◽

Vol 1 (1) ◽

Author(s):

K. Katayama ◽

K. Kimijima ◽

O. Yamanaka ◽

A. Nagaiwa ◽

Y. Ono

Keyword(s):

Principal Component Analysis ◽

Prediction Model ◽

Principal Components ◽

Prediction Method ◽

Principal Component ◽

Component Analysis ◽

Rainfall Data ◽

Radar Rainfall ◽

Input Variables ◽

Inflow Prediction

This paper proposes a method of stormwater inflow prediction that uses radar rainfall data as the input of a prediction model constructed by system identification. The aim is to construct a compact system by reducing the dimension of the input data. Principal Component Analysis (PCA), which is widely used as a statistical method for data analysis and compression, is applied to pre-process the radar rainfall data. We then evaluate the proposed method using radar rainfall data and inflow data acquired in a combined sewer system. The study shows that a few principal components of radar rainfall data can be appropriate as input variables for a stormwater inflow prediction model. Consequently, we have established a procedure for stormwater inflow prediction using a few principal components of radar rainfall data.


DeepSSPred: A Deep Learning Based Sulfenylation site predictor via a novel n-segmented optimize federated feature encoder

Protein and Peptide Letters ◽

10.2174/0929866527666201202103411 ◽

2020 ◽

Vol 27 ◽

Author(s):

Zaheer Ullah Khan ◽

Dechang Pi

Keyword(s):

Large Scale ◽

Computational Models ◽

Research Work ◽

Training Data ◽

Training Dataset ◽

Validation Dataset ◽

Cytokine Signaling ◽

Minority Class ◽

Independent Dataset ◽

Feature Encoding

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting sulfenylation cysteine sites. However, the performance of these models has not been satisfactory due to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature; the resampling technique called synthetic minority oversampling (SMOTE) was then employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed, with a rigorous 10-fold jackknife cross-validation technique for model validation and authentication. Results: The proposed framework, with a strong discrete presentation of the feature space, a machine learning engine, and an unbiased presentation of the underlying training data, yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the first-best method. On an independent dataset, the existing first-best study failed to provide sufficient details. The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data, and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset, in comparison with the second-best method. These empirical analyses show the superior performance of the proposed model on both the training and the independent dataset in comparison with the existing literature. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes with a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several factors, such as the novel discriminative feature encoding schemes, the SMOTE technique, and the careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide potential insight for further prediction of S-sulfenylation characteristics and functionalities. Thus, we hope that our predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
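The abstract reports results as accuracy (ACC), sensitivity (Sn), specificity (Sp), and Matthews correlation coefficient (MCC). As a reading aid, the sketch below computes these four metrics from a binary confusion matrix using their standard definitions; the function name and the example counts are hypothetical and do not come from the paper.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: positive (e.g. SC-site) samples predicted correctly/incorrectly.
    tn/fp: negative (non-SC-site) samples predicted correctly/incorrectly.
    """
    sn = tp / (tp + fn)                    # sensitivity (recall)
    sp = tn / (tn + fp)                    # specificity
    acc = (tp + tn) / (tp + fp + tn + fn)  # accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By common convention MCC is reported as 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}

# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

MCC uses all four cells of the confusion matrix, which is why it is often preferred over accuracy when the classes are as imbalanced as the abstract describes.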


Evaluation functions for integral mapping

Geodesy and Cartography ◽

10.22389/0016-7126-2017-921-3-24-29 ◽

2017 ◽

Vol 921 (3) ◽

pp. 24-29 ◽

Cited By ~ 2

Author(s):

S.I. Lesnykh ◽

A.K. Cherkashin

Keyword(s):

Principal Component Analysis ◽

Environmental Factors ◽

Principal Components ◽

Principal Component ◽

Evaluation Function ◽

Final Value ◽

Evaluation Functions ◽

Geographical Environment ◽

Environmental Background ◽

Integral Mapping

The proposed procedure of integral mapping is based on calculating evaluation functions from integral indicators (II), taking into account the features of the local geographical environment, whereby geosystems in the same state but in different environments receive different estimates. The II is calculated by applying Principal Component Analysis to a forest database, which allows the weight of each indicator (attribute) to be taken into account in the II. The final value of the II is equal to the difference between the first (condition of the geosystem) and the second (condition of the environmental background) principal components. The evaluation functions are calculated from this value for various problems of integral mapping. Because the environmental factors of variability are excluded from the final value of the II, it becomes possible to find an invariant evaluation function and to determine its coefficients. Concepts and functions from reliability theory are used to make evaluation maps of the hazard of functioning and the stability of geosystems.


EXPRESS: Exploration of Principal Component Analysis: Deriving PCA Visually Using Spectra

Applied Spectroscopy ◽

10.1177/0003702820987847 ◽

2021 ◽

pp. 000370282098784

Author(s):

James Renwick Beattie ◽

Francis Esmonde-White

Keyword(s):

Principal Components ◽

Principal Component ◽

Original Data ◽

Specific Chemical ◽

Successive Refinement ◽

Minimal Loss ◽

Components Analysis ◽

The Mathematical Model ◽

Application Specific ◽

Reconstructed Data

Spectroscopy rapidly captures a large amount of data that is not directly interpretable. Principal Components Analysis (PCA) is widely used to simplify complex spectral datasets into comprehensible information by identifying recurring patterns in the data with minimal loss of information. The linear algebra underpinning PCA is not well understood by many applied analytical scientists and spectroscopists who use PCA. The meaning of features identified through PCA is often unclear. This manuscript traces the journey of the spectra themselves through the operations behind PCA, with each step illustrated by simulated spectra. PCA relies solely on the information within the spectra; consequently, the mathematical model is dependent on the nature of the data itself. The direct links between model and spectra allow concrete spectroscopic explanation of PCA, such as the scores representing “concentration” or “weights”. The principal components (loadings) are by definition hidden, repeated and uncorrelated spectral shapes that linearly combine to generate the observed spectra. They can be visualized as subtraction spectra between extreme differences within the dataset. Each PC is shown to be a successive refinement of the estimated spectra, improving the fit between PC reconstructed data and the original data. Understanding the data-led development of a PCA model shows how to interpret application specific chemical meaning of the PCA loadings and how to analyze scores. A critical benefit of PCA is its simplicity and the succinctness of its description of a dataset, making it powerful and flexible.
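The idea that each PC is a successive refinement of the reconstructed spectra can be illustrated with a short sketch. Everything here is an assumption made for illustration: the simulated spectra are two Gaussian bands mixed at random "concentrations" with a little noise, PCA is computed through an SVD of the mean-centered data, and the function name is hypothetical.

```python
import numpy as np

def pca_reconstruct(spectra, n_components):
    """Rebuild spectra from the mean plus the first n_components PCs."""
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # one row per spectrum
    loadings = vt[:n_components]                     # one row per PC
    return mean + scores @ loadings

# Simulated data: 20 spectra built from two hypothetical bands plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
band_a = np.exp(-((x - 0.3) / 0.05) ** 2)
band_b = np.exp(-((x - 0.7) / 0.08) ** 2)
concentrations = rng.uniform(0.0, 1.0, size=(20, 2))
spectra = concentrations @ np.vstack([band_a, band_b])
spectra = spectra + 0.01 * rng.normal(size=spectra.shape)

# Residual norm after reconstructing with 0, 1, 2 and 3 components.
errors = [float(np.linalg.norm(spectra - pca_reconstruct(spectra, k)))
          for k in range(4)]
```

In this setup the residual drops sharply over the first two components, since only two bands vary, and any further components mostly fit the added noise.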


Analytical Concentrations of Some Elements in Seeds and Crude Extracts from Aesculus hippocastanum, by ICP-OES Technique

Agronomy ◽

10.3390/agronomy11010047 ◽

2020 ◽

Vol 11 (1) ◽

pp. 47

Author(s):

Caterina Durante ◽

Marina Cocchi ◽

Lisa Lancellotti ◽

Laura Maletti ◽

Andrea Marchetti ◽

...

Keyword(s):

Inductively Coupled Plasma ◽

Research Work ◽

Principal Component ◽

Aesculus Hippocastanum ◽

Horse Chestnut ◽

Icp Oes ◽

Wild Type ◽

Commercial Sample ◽

Pure Species ◽

Crude Extracts

The metal content in some samples of horse chestnut seeds (Aesculus hippocastanum) was monitored over time (years 2016–2019) considering the two most common and representative Mediterranean varieties: the pure species (AHP, which gives white flowers) and a hybrid one (AHH, which gives pink flowers). The selected elemental composition of the samples was determined by applying the Inductively Coupled Plasma-Optical Emission Spectroscopy (ICP-OES) technique. Several samples obtained from different preliminary treatments of the peeled seeds were examined, such as: (i) floury samples (wild-type) mineralized with the wet method; (ii) the ashes of both AHP and AHH varieties; (iii) the fraction of total inorganic soluble salts (TISS). Furthermore, the hydroalcoholic crude extracts (as a tincture) were obtained according to the official Pharmacopoeia methods, and the relevant results were compared with those of a commercial sample, an herbal product-food supplement of similar characteristics. The main characteristics of this research work underline that the two botanical varieties give different distinctive characters, due to the Fe content (80.05 vs. 1.42 mg/100 g d.s., for AHP and AHH wild-type flour samples, respectively), along with K, Ca, Mn, Ni and Cu, which are more abundant in the AHP samples. Furthermore, the Principal Component Analysis (PCA) was applied to the experimental dataset in order to classify and discriminate the samples, in relation to their similar botanical origin, but different for the color of the bloom. These results can be useful for the traceability of raw materials potentially intended for the production of auxiliary systems of pharmacological interest.


The factor structure of the GHQ-60 in a community sample

Psychological Medicine ◽

10.1017/s0033291700002038 ◽

1988 ◽

Vol 18 (1) ◽

pp. 211-218 ◽

Cited By ~ 25

Author(s):

J. L. Vazquez-Barquero ◽

P. Williams ◽

J. F. Diez-Manrique ◽

J. Lequerica ◽

A. Arenal

Keyword(s):

Factor Structure ◽

Principal Components ◽

Good Description ◽

Social Performance ◽

Community Sample ◽

General Health Questionnaire ◽

Principal Component ◽

Northern Spain ◽

Relative Operating Characteristic ◽

Component Structure

Synopsis. The factor structure of the 60-item version of the General Health Questionnaire was explored, using data collected in a community study in a rural area of northern Spain. Six principal components, similar to those previously reported with this instrument, were found to provide a good description of the data structure. The 30-item and 12-item versions of the GHQ were then disembedded from the parent version, and further principal components analyses carried out. Again, the results were similar to previous studies: in each of the three versions analysed here, the two most important components represented a disturbance of mood (‘general dysphoria’), including aspects of anxiety, depression and irritability, and a disturbance of social performance (‘social function/optimism’). The principal component structure of the GHQ-60 was then utilized to calculate factor scores, and these were compared with PSE ratings using Relative Operating Characteristic (ROC) analysis. While four of the six factors discriminated well (area under the ROC curve 0.75 or more) between PSE ‘cases’ and ‘non-cases’, only one, depressive thoughts, was a good discriminator between depressed and non-depressed PSE ‘cases’.
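The area under the ROC curve used as the discrimination criterion here can be read as the probability that a randomly chosen 'case' has a higher factor score than a randomly chosen 'non-case', with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the example scores are hypothetical and are not data from the study.

```python
def roc_auc(case_scores, noncase_scores):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (case, non-case) pairs in which the case
    scores higher, with ties contributing one half.
    """
    if not case_scores or not noncase_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))

# Hypothetical factor scores for 'cases' and 'non-cases'.
auc_good = roc_auc([3, 4, 5], [1, 2, 3])  # strong separation
auc_chance = roc_auc([1, 2], [1, 2])      # no separation
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which puts the synopsis threshold of 0.75 midway between the two.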
