Table 1: Number of trials used to generate the model database and to evaluate the model, inter-rater reliability of the judges’ scores (ICC), mean scores (± standard deviation) of the database trials and the evaluation trials, and results of the Wilcoxon rank-sum test.

Apparatus      Database   Evaluation   ICC_all   Mean Score of      Mean Score of        Z      p
               Trials     Trials                 Database Trials    Evaluation Trials
Floor          38         20           0.75      3.94 ± 1.90        3.99 ± 1.99          0.08   .818
Balance Beam   37         20           0.85      3.94 ± 2.23        3.52 ± 2.31          1.24   .216
Vault          37         20           0.83      3.35 ± 1.90        3.42 ± 1.81          0.23   .933

Notes: The inter-rater reliability (ICC) was calculated across all three judges. Scores ranged from one to six points and were assigned according to the judging guidelines of the German Gymnastics Federation for young talented gymnasts [41].
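
For readers who want to see how a Z and p pair like those in the table can arise, the following is a minimal sketch of a Wilcoxon rank-sum test using the large-sample normal approximation. It is an illustration under stated assumptions, not the authors' analysis: the source does not say whether an exact test, a tie correction, or a continuity correction was applied, so none is used here. The function name `rank_sum_test` and the score lists are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Assumption: no tie correction and no continuity correction;
    tied values simply receive their average rank.
    Returns (z, p).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda t: t[0])
    n1, n2 = len(x), len(y)

    # Sum the (average) ranks of the first sample.
    w = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied values share the mean rank.
        avg_rank = (i + 1 + j) / 2.0
        w += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    # Mean and standard deviation of the rank sum under the null hypothesis.
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)

    z = (w - mean_w) / sd_w
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical judge scores on the one-to-six scale (not data from the study).
database_scores = [1.0, 2.5, 3.0, 4.0, 4.5, 5.5, 6.0]
evaluation_scores = [1.5, 2.0, 3.5, 4.0, 5.0]

z, p = rank_sum_test(database_scores, evaluation_scores)
print(f"Z = {z:.2f}, p = {p:.3f}")
```

A |Z| close to zero with a large p, as in all three rows of the table, indicates that the test found no evidence of a difference between the score distributions of the database trials and the evaluation trials.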