An IERI – International Educational Research Institute Journal

Table 9 Item-by-country kappa (human-machine inter-rater reliability)

From: Combining machine translation and automated scoring in international large-scale assessments

 

Item      C1    C2    C3    C4    C5    C6    C7    C8    Average
Item 1    0.72  0.81  0.87  0.80  0.82  0.83  0.82  0.73  0.80
Item 2    0.83  0.98  0.91  0.89  0.90  0.94  0.88  0.90  0.90
Item 3    0.94  0.93  0.92  0.87  0.80  0.97  0.94  0.93  0.91
Item 4    0.89  0.85  0.60  0.78  0.71  0.53  0.84  0.69  0.74
Item 5    0.62  0.65  0.66  0.59  0.58  0.70  0.36  0.70  0.61
Item 6    0.95  0.90  0.89  0.81  0.91  0.53  0.86  0.65  0.81
Average   0.82  0.85  0.81  0.79  0.79  0.75  0.78  0.77  0.80

Note: C1 and C2 = German-speaking countries; C3 = French-speaking country; C4 = Turkish-speaking country; C5 = English-speaking country; C6 and C7 = Chinese-speaking countries; C8 = Korean-speaking country.
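
The "Average" row and column can be cross-checked against the individual cells. The following is a minimal Python sketch, assuming the averages are unweighted arithmetic means of the rounded kappa values shown in the table; the source does not state how they were computed, so small rounding differences are possible. The names `KAPPA`, `COUNTRIES`, and `mean` are illustrative and do not come from the article.

```python
# Kappa values transcribed from the table (rows: items, columns: C1..C8).
COUNTRIES = ["C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8"]
KAPPA = {
    "Item 1": [0.72, 0.81, 0.87, 0.80, 0.82, 0.83, 0.82, 0.73],
    "Item 2": [0.83, 0.98, 0.91, 0.89, 0.90, 0.94, 0.88, 0.90],
    "Item 3": [0.94, 0.93, 0.92, 0.87, 0.80, 0.97, 0.94, 0.93],
    "Item 4": [0.89, 0.85, 0.60, 0.78, 0.71, 0.53, 0.84, 0.69],
    "Item 5": [0.62, 0.65, 0.66, 0.59, 0.58, 0.70, 0.36, 0.70],
    "Item 6": [0.95, 0.90, 0.89, 0.81, 0.91, 0.53, 0.86, 0.65],
}


def mean(values):
    """Unweighted arithmetic mean (an assumption about how the table was built)."""
    return sum(values) / len(values)


# Row averages: one mean per item across the eight countries.
item_means = {item: mean(row) for item, row in KAPPA.items()}

# Column averages: one mean per country across the six items.
country_means = {
    country: mean([row[i] for row in KAPPA.values()])
    for i, country in enumerate(COUNTRIES)
}

# Grand average over all 48 item-by-country cells.
overall_mean = mean([value for row in KAPPA.values() for value in row])

for item, value in item_means.items():
    print(f"{item}: {value:.3f}")
for country, value in country_means.items():
    print(f"{country}: {value:.3f}")
print(f"Overall: {overall_mean:.3f}")
```

Because the cells are already rounded to two decimals, a recomputed mean can differ from the published one in the last digit; comparing with a small tolerance is more appropriate than testing for exact equality.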