### American Association of Physicists in Medicine (AAPM) Task Group 119 (TG-119) IMRT treatment planning^{35} cases

All plans were generated and optimized using the segmented MLC (SMLC) mode of the RayStation planning system v5 (RaySearch Laboratories, Stockholm, Sweden). The structure datasets used in this study, including the computed tomography (CT) images and phantoms, were downloaded from the AAPM TG-119 cases^{36}. AAPM TG-119 also proposes IMRT targets and beam arrangements. For each plan, the beam arrangement (the number and angles of the beam fields) and the planned dose for the Elekta Versa HD linear accelerator with the Agility MLC (Elekta AB, Stockholm, Sweden) were set according to the AAPM TG-119 recommendations (Supplementary Table S1). Static IMRT plans with segmented MLC were generated in the RayStation v5 TPS, and the dose for each beam was calculated with the collapsed cone convolution (CCC) algorithm on a 2.0 mm grid.

### Simulated MLC leaf position error

In this study, we considered two main types of MLC position error: systematic and random. DICOM-RT plan files were exported from RayStation v5 to simulate the MLC position errors. The original MLC positions of all control points in the DICOM-RT plan file were changed to the specified error positions using in-house software developed in MATLAB 2018b (MathWorks, Natick, MA) (see the MATLAB source code in the Supplementary Information). The modified DICOM-RT plan file was then imported back into RayStation v5, as shown in Figure 1^{36}. To isolate the effect of the MLC position error, the MU values of all control points were kept identical between the error-free and error plans, and the same planning objective function was used. In addition, the plans were not re-optimized; only the dose was recalculated with the CCC algorithm.

As shown in Figure 1, two types of MLC position error were applied to each beam of the error-free plan:

For the systematic MLC position error plans, the leaves of one bank surrounding the planning target volume (PTV) were shifted by 0.5 mm, 1.0 mm, 1.5 mm, or 2.0 mm to the right of their original positions at all control points.

For the random MLC position error plans, the leaf positions in both banks were shifted at every control point by pseudo-random numbers drawn from a Gaussian distribution with a mean, *μ*, of 0.0 mm, 1.0 mm, or 2.0 mm and a standard deviation, *σ*, of 1.0 mm (1 sigma)^{10,37}. When a shifted leaf would have collided with the leaf on the opposite bank, its position was drawn again at random from the same Gaussian distribution to avoid the collision.
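The two error types can be illustrated with a minimal Python sketch. This is not the in-house MATLAB software used in the study; the function names, the sign convention (positive values meaning "to the right"), the redraw loop, and the fallback when no collision-free draw is found are all assumptions made for illustration.

```python
import random

def apply_systematic_error(bank, shift_mm):
    """Shift every leaf of one bank by the same fixed offset (mm)."""
    return [pos + shift_mm for pos in bank]

def apply_random_error(left, right, mu, sigma, rng, max_tries=1000):
    """Shift each leaf of both banks by a Gaussian offset N(mu, sigma).

    If the shifted leaf pair would overlap, both offsets are drawn again
    (assumed interpretation of the collision handling).
    """
    new_left, new_right = [], []
    for l, r in zip(left, right):
        for _ in range(max_tries):
            nl = l + rng.gauss(mu, sigma)
            nr = r + rng.gauss(mu, sigma)
            if nl <= nr:  # opposing leaves do not overlap
                break
        else:
            nl, nr = l, r  # hypothetical fallback: keep the original pair
        new_left.append(nl)
        new_right.append(nr)
    return new_left, new_right

# Hypothetical leaf positions (mm) for one control point.
rng = random.Random(0)
left = [-20.0, -15.0, -0.5]
right = [20.0, 15.0, 0.5]

sys_right = apply_systematic_error(right, 1.0)
rand_left, rand_right = apply_random_error(left, right, mu=0.0, sigma=1.0, rng=rng)
```

In a real workflow the same operation would be repeated for every control point of every beam before writing the positions back to the plan file.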

In total, we generated 35 plans with simulated MLC position errors, covering each beam angle and the total (all-beam) cases (Table 1).

### Gamma analysis

The gamma index method uses the percentage dose difference (DD) and the distance to agreement (DTA) to assess the agreement between the dose distributions of the error-free and error datasets^{38}. A full local 3D gamma analysis between the two dose files (error-free plan and error plan) was performed with criteria of 3%/3 mm, 3%/2 mm (recommended by AAPM TG-218), 2%/2 mm, and 1%/1 mm and a 10% threshold using PTW VeriSoft software, version 6.1 (PTW, Freiburg, Germany)^{7,10,38,39}. The 3D gamma analysis was performed for each beam angle and for all beams combined.
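To make the DD/DTA combination concrete, the following is a minimal one-dimensional sketch of a local gamma pass rate with a low-dose threshold. It is a brute-force illustration under stated assumptions (1D profiles on a common grid, no interpolation, local normalization to the reference dose at each point) and does not reproduce the 3D implementation in the commercial software used in the study; the function and variable names are hypothetical.

```python
import math

def gamma_pass_rate(ref, ev, spacing_mm, dd_percent, dta_mm, threshold_percent=10.0):
    """Percentage of reference points with gamma <= 1 (local normalization)."""
    cutoff = max(ref) * threshold_percent / 100.0
    gammas = []
    for i, d_ref in enumerate(ref):
        if d_ref < cutoff or d_ref <= 0.0:
            continue  # skip points below the low-dose threshold
        dose_tol = d_ref * dd_percent / 100.0  # local dose criterion
        best = math.inf
        for j, d_ev in enumerate(ev):
            dist = (j - i) * spacing_mm
            g = math.hypot(dist / dta_mm, (d_ev - d_ref) / dose_tol)
            best = min(best, g)
        gammas.append(best)
    if not gammas:
        raise ValueError("no points above the dose threshold")
    passed = sum(1 for g in gammas if g <= 1.0)
    return 100.0 * passed / len(gammas)

# Hypothetical profiles: the evaluated profile is the reference shifted by one sample.
ref = [0.0, 10.0, 50.0, 100.0, 50.0, 10.0, 0.0]
shifted = [0.0, 0.0, 10.0, 50.0, 100.0, 50.0, 10.0]

rate_3_3 = gamma_pass_rate(ref, shifted, spacing_mm=1.0, dd_percent=3.0, dta_mm=3.0)
rate_tight = gamma_pass_rate(ref, shifted, spacing_mm=1.0, dd_percent=3.0, dta_mm=0.5)
```

A 1 mm shift passes everywhere under a 3 mm DTA but fails under a 0.5 mm DTA, which is the behavior that makes tighter criteria more sensitive to leaf position errors.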

### SSIM

The structural similarity (SSIM) index is designed to compare image pairs (here, the dose maps of the error-free and error plans) in terms of luminance, contrast, and structure^{4,25,38,40}.

In this study, the overall SSIM index and its three sub-indices (luminance, contrast, and structure) were evaluated for each beam field. No preprocessing was required for the SSIM calculation because the dose maps had the same size. The SSIM index was calculated using MATLAB 2018b (MathWorks, Natick, MA) and is expressed as

$$SSIM \left(x,y\right)={\left[l\left(x,y\right)\right]}^{\alpha }\cdot {\left[c\left(x,y\right)\right]}^{\beta }\cdot {\left[s\left(x,y\right)\right]}^{\gamma }.$$

(1)

With the default settings \(\alpha =\beta =\gamma =1\) and \({C}_{3}= \frac{{C}_{2}}{2}\), the SSIM index can be computed as

$$SSIM\left(x,y\right)=\frac{(2{\mu }_{x}{\mu }_{y}+{C}_{1})(2{\sigma }_ {xy}+{C}_{2})}{({\mu }_{x}^{2}+{\mu }_{y}^{2}+{C}_{1})( {\sigma }_{x}^{2}+{\sigma }_{y}^{2}+{C}_{2})},$$

(2)

where \(l\left(x,y\right)\), \(c\left(x,y\right)\), and \(s\left(x,y\right)\) are the luminance, contrast, and structure sub-indices, respectively, and \({\mu }_{x}\) and \({\mu }_{y}\), \({\sigma }_{x}\) and \({\sigma }_{y}\), and \({\sigma }_{xy}\) are the local means, standard deviations, and cross-covariance of images x and y, respectively.

The specific parameters for the SSIM calculation were set based on previous work^{4,25,40}. The regularization constants were calculated as *C*_{1} = (*K*_{1}*L*)^{2}, *C*_{2} = (*K*_{2}*L*)^{2}, and *C*_{3} = *C*_{2}/2. *K*_{1} and *K*_{2} were set to 0.01 and 0.03, respectively, the default values suggested by Wang et al.^{40}. Pen et al.^{25} examined the effect of the regularization constants for different *K*_{1} and *K*_{2} values and suggested these default values as suitable for evaluating MLC position errors. The dynamic range, *L*, was set to 200, corresponding to the dose per fraction in this study.
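As a check on how the constants and the three sub-indices fit together, the sketch below evaluates the luminance (brightness), contrast, and structure terms with *K*_{1} = 0.01, *K*_{2} = 0.03, and *L* = 200. It is a simplification: the statistics are computed once over the whole array rather than in local windows, as the MATLAB implementation does, and the function names and sample values are hypothetical.

```python
import math

K1, K2, L = 0.01, 0.03, 200.0
C1 = (K1 * L) ** 2  # regularization constant for the luminance term
C2 = (K2 * L) ** 2  # regularization constant for the contrast term
C3 = C2 / 2.0       # regularization constant for the structure term

def ssim_components(x, y):
    """Return (l, c, s) from global statistics of two equal-length arrays."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    l = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    c = (2 * sd_x * sd_y + C2) / (var_x + var_y + C2)
    s = (cov_xy + C3) / (sd_x * sd_y + C3)
    return l, c, s

def ssim_global(x, y):
    """SSIM with alpha = beta = gamma = 1, i.e. the product l * c * s."""
    l, c, s = ssim_components(x, y)
    return l * c * s

# Hypothetical flattened dose maps.
x = [0.0, 50.0, 100.0, 150.0, 200.0]
y = [0.0, 48.0, 103.0, 149.0, 198.0]
value = ssim_global(x, y)
```

Because *C*_{3} = *C*_{2}/2, the product of the contrast and structure terms collapses to the single second factor of Eq. (2), so the three-term product and the closed form give the same value.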

### Dosiomics analysis

For the dosiomics analysis, two different dose distribution datasets were generated: (1) a subtracted error-free dataset (simulated error-free dose map - error-free dose map) and (2) a subtracted error dataset (simulated error dose map - error-free dose map) (Fig. 2). The error-free dose map is the dose distribution exported from RayStation after planning with all control points at their original MLC positions. Generating a subtracted error-free dose map requires two error-free dose maps, because subtracting the unmodified error-free dose map from itself yields zero in every pixel, and the radiomics analysis cannot then be performed. Therefore, simulated error-free dose maps were generated by systematically shifting the error-free dose maps in 0.01 mm steps. The similarity and correlation between the simulated and unmodified error-free dose maps were examined using the Wilcoxon signed-rank test and Spearman's rank correlation. The results showed no statistically significant difference (Wilcoxon signed-rank test, *p* > 0.05) and a strong correlation (Spearman's rank correlation coefficient > 0.97, *p* < 0.001) (Supplementary Table S2). The simulated error-free dose maps were therefore considered sufficiently similar to the error-free dose maps to produce the sub-error-free dose maps in this experiment.

The subtracted dose maps were classified into three types: sub-error-free (simulated error-free dose map - error-free dose map), sub-systematic error (simulated systematic error dose map - error-free dose map), and sub-random error (simulated random error dose map - error-free dose map). Class I comprises the error-free type and the combined systematic and random error types. Class II comprises the error-free type and the systematic error type. Class III comprises the error-free type and the random error type. The classes of error types for the subtracted dose maps are summarized in Table 2.

In this study, dosiomics analyses were performed for classes I, II, and III. A total of 275 fluence maps from 4 treatment plans, exported from the TPS as DICOM-RT files, were analyzed using the Local Image Features Extraction (LIFEx) software package, version 7.1.0 (http://www.lifexsoft.org)^{41}. For the dosiomics index calculations, the spatial resampling was 2 mm (X direction), 2 mm (Y direction), and 2 mm (Z direction) in Cartesian coordinates. The bin size for the intensity discretization was 1. The 34 radiomics features were classified into 1 conventional feature, 2 histogram features, and 31 texture features^{42,43,44}. Four matrices were used to determine the 31 texture features: the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), neighborhood gray-level difference matrix (NGLDM), and gray-level zone length matrix (GLZLM)^{26,42,43,44}. The GLCM was computed in 13 directions in 3D with a distance of one voxel between neighboring voxels and describes the arrangement of the voxel pairs used to compute the texture features. The GLRLM was computed in 13 directions and represents the size of the homogeneous runs for each gray level. The NGLDM describes the difference in gray level between a voxel and its 26 neighboring voxels in 3D. The GLZLM was computed directly in 3D and describes the size of the homogeneous zones for each gray level. Radiomics features were extracted from the entire subtracted dose map and standardized to obtain standardized scores (z-scores) (Supplementary Table S3).

### Statistical analysis

All statistical analyses were performed using RStudio (version 2021.09.1-372; RStudio Software Inc., Boston, MA, USA). Error-free data were labeled ‘0’, and data with systematic or random errors were labeled ‘1’. The gamma, SSIM, and dosiomics indices were examined and selected to develop the MLC position error prediction model. To prevent overfitting, the independence of all gamma, SSIM, and dosiomics indices was investigated using Spearman’s rank correlation, backward stepwise elimination, and a multicollinearity check. First, indices with a Spearman’s rank correlation coefficient higher than 0.8 were removed, after the Holm-Bonferroni method had been applied to the *p* values of all indices to correct for multiple comparisons. The remaining indices were then filtered by backward stepwise elimination. The indices that passed these two steps were finally screened for multicollinearity, retaining those with a variance inflation factor (VIF) < 4^{45}. In addition, univariate and multivariate logistic regression were also used for index selection.
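The first selection step can be sketched in a few lines of Python. The study performed it in R, so this is only an illustration: the helper names, the toy feature values, and the greedy rule for deciding which member of a correlated pair to drop (keep the earlier one) are assumptions, and the backward elimination and VIF steps are not shown.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[idx]))
        adjusted[idx] = running_max
    return adjusted

def drop_correlated(features, limit=0.8):
    """Greedily keep features whose |rho| with every kept feature is <= limit."""
    kept = []
    for name in features:
        if all(abs(spearman_rho(features[name], features[k])) <= limit for k in kept):
            kept.append(name)
    return kept

# Hypothetical index values for five plans.
features = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 11.0],  # monotone in "a", so rho = 1
    "c": [3.0, 1.0, 4.0, 2.0, 5.0],
}
selected = drop_correlated(features)
adjusted_p = holm_adjust([0.01, 0.04, 0.03])
```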

### Predictive model development and performance

Univariate and multivariate logistic regression models for MLC position error prediction were built in RStudio through the following process. To ensure the reproducibility of random sampling, the ‘*set.seed*’ function was used. The dataset was loaded and split into a training set and a test set using the ‘*createDataPartition*’ function. The training set comprised 60%–80% of the total dataset (sub-error-free dataset: 36, sub-systematic error dataset: 36, sub-random error dataset: 27) (Supplementary Table S4). The ‘*trainControl*’ function was used to define the training control object, specifying 10-fold cross-validation with 10 repeats via the ‘*repeatedcv*’ method. The ‘*twoClassSummary*’ function was used to summarize the results of the cross-validation process, and ‘*classProbs*’ was set to TRUE to enable probability estimation. ‘*grid*’ was used to find the optimal hyperparameters on the tuning grid. The ‘*train*’ function was used to train a logistic regression model on the training data with the ‘*glm*’ method and the training control object. The ‘*Accuracy*’ metric was used to assess the classification model in the train function. The ‘*predict*’ function was used to predict the class labels of the test data, and the ‘*confusionMatrix*’ and ‘*roc*’ functions were used to evaluate the model in terms of sensitivity, specificity, accuracy, and area under the curve (AUC) of the receiver operating characteristic (ROC) curve. The workflow for developing the error prediction model is shown in Figure 3.
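The evaluation metrics named above can be computed directly from the test-set labels and predicted probabilities. The following Python sketch is illustrative only (the study used R functions); the function names, the 0.5 decision threshold, and the sample labels and scores are assumptions.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = error)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test-set labels and predicted error probabilities.
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The pairwise formulation of the AUC is equivalent to the area under the empirical ROC curve and avoids building the curve explicitly.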

### DVH

DVHs for the AAPM TG-119^{36} IMRT treatment planning cases were generated using RayStation v5. Based on studies of site-specific patient quality assurance (QA) criteria, the criteria for a significant difference were an underdose for the PTV and ≥3% for the OARs^{46,47,48}.