Cross-Classified Tables

This section collects several cross-classified tables that have been analyzed in previous studies.

library(tidyverse)
library(gnm)
library(broom)
library(vcdExtra)
library(logmult)
library(knitr)

Logan (1983) Table 2

These data are also used on p. 27 of 入江・菅澤・橋本 (2022), 『標準 ベイズ統計学』 (朝倉書店).

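# Cell counts entered row by row (one line per level of O)
Freq_logan_1983 <- c(109, 36, 27, 24,  1,
                      43,  8, 16, 15,  0,
                      70, 27, 79, 55,  1,
                      58, 27, 54, 94,  2,
                      15,  7, 26, 29, 15
                      )
d_logan <- tibble(Freq = Freq_logan_1983,
                  O = gl(n = 5, k = 5, length = 5 * 5),  # row index: 1 1 1 1 1 2 ...
                  D = gl(n = 5, k = 1, length = 5 * 5))  # column index: 1 2 3 4 5 1 ...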
d_logan
# A tibble: 25 × 3
    Freq O     D    
   <dbl> <fct> <fct>
 1   109 1     1    
 2    36 1     2    
 3    27 1     3    
 4    24 1     4    
 5     1 1     5    
 6    43 2     1    
 7     8 2     2    
 8    16 2     3    
 9    15 2     4    
10     0 2     5    
# ℹ 15 more rows
xtabs(Freq ~ O + D, data = d_logan)
   D
O     1   2   3   4   5
  1 109  36  27  24   1
  2  43   8  16  15   0
  3  70  27  79  55   1
  4  58  27  54  94   2
  5  15   7  26  29  15
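
The two `gl()` calls above lay the counts out row by row: `O` changes once every five cells, while `D` cycles through its levels within each block. A minimal base-R sketch to check this layout against a matrix filled by row (the names `freq`, `tab`, and `m` are introduced here only for illustration):

```r
# Counts from Logan (1983) Table 2, entered row by row (O = rows, D = columns)
freq <- c(109, 36, 27, 24,  1,
           43,  8, 16, 15,  0,
           70, 27, 79, 55,  1,
           58, 27, 54, 94,  2,
           15,  7, 26, 29, 15)

# gl(n, k) repeats each of the n levels k times; `length` recycles the pattern
O <- gl(n = 5, k = 5, length = 25)  # 1 1 1 1 1 2 2 2 2 2 ...
D <- gl(n = 5, k = 1, length = 25)  # 1 2 3 4 5 1 2 3 4 5 ...

tab <- xtabs(freq ~ O + D)

# The same table built directly as a matrix filled by row
m <- matrix(freq, nrow = 5, byrow = TRUE)

# TRUE when every cell of the xtabs result matches the matrix
all(as.vector(tab) == as.vector(m))
```

If the counts had been typed column by column instead, swapping the `k` arguments of the two `gl()` calls would give the matching index pattern.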

Datasets in the gnm package

data(package = "gnm")$results |> data.frame() |> select("Item", "Title") |> knitr::kable()
| Item | Title |
|:-----|:------|
| House2001 | Data on twenty roll calls in the US House of Representatives, 2001 |
| backPain | Data on Back Pain Prognosis, from Anderson (1984) |
| barley | Jenkyn’s Data on Leaf-blotch on Barley |
| barleyHeights | Heights of Barley Plants |
| cautres | Data on Class, Religion and Vote in France |
| erikson | Intergenerational Class Mobility in England/Wales, France and Sweden |
| friend | Occupation of Respondents and Their Closest Friend |
| mentalHealth | Data on Mental Health and Socioeconomic Status |
| voting | Data on Social Mobility and the Labour Vote |
| wheat | Wheat Yields from Mexican Field Trials |
| yaish | Class Mobility by Level of Education in Israel |
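
The listing above works because `data(package = ...)` returns an object whose `results` component is a character matrix with the columns `Package`, `LibPath`, `Item`, and `Title`. A small sketch of the same idea in base R only, using the `datasets` package that ships with R in place of gnm (the names `res` and `listing` are illustrative):

```r
# List the datasets of a package; `datasets` is installed with every R
res <- data(package = "datasets")$results

# `results` is a character matrix with one row per dataset
colnames(res)

# Keep only the dataset name and its title, as in the listing above
listing <- as.data.frame(res)[, c("Item", "Title")]
head(listing, 3)
```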

mentalHealth is a dataset that is also used in this book. erikson was analyzed by Erikson, Goldthorpe, and Portocarero (1982) and later reanalyzed by Xie (1992). The data are taken from the Appendix of Hauser (1984).

erikson
, , country = EW

      destination
origin    I   II  III  IVa  IVb  IVc V/VI VIIa VIIb
  I     311  130   79   24   22    7   70   44    1
  II    161  128   66   22   11    6  112   47    1
  III   128  109   89   26   25    3  197  113    4
  IVa    88   83   43   72   41    5  112   64    4
  IVb    36   45   38   27   47    3  110   80    4
  IVc    43   23   25   16   14   99   86   81   40
  V/VI  356  375  325  108  140    5 1506  839   22
  VIIa  150  180  187   48   74    9  802  685   15
  VIIb   12   14   18    5   18   10   96  114   56

, , country = F

      destination
origin    I   II  III  IVa  IVb  IVc V/VI VIIa VIIb
  I     105   72   19    9    8    3   26   11    1
  II     59  113   37    9   14    0   54   34    2
  III    40   86   64   10   20    4  103   61    4
  IVa    38   37   17   38   23    2   36   22    1
  IVb    40   68   55   38   95   10   92   74    7
  IVc    27   74   77   27   52  461  156  286   73
  V/VI   36  138   93   22   38    5  339  189    9
  VIIa   22   88   79   18   24    8  235  209   11
  VIIb    4   18   26    9   14   19   68  107   47

, , country = S

      destination
origin    I   II  III  IVa  IVb  IVc V/VI VIIa VIIb
  I      52   15   13    3    2    0   11    7    0
  II     30   27   14    3    4    0   27   12    2
  III    10   19   10    2    4    0   16   11    1
  IVa    26   24    5   20    8    1   33   22    0
  IVb     8   13    6    3    9    4   31   20    1
  IVc    24   47   44   17   22   92  132  144   21
  V/VI   33   89   40   13   18    5  188  104    5
  VIIa   32   49   28   14   17    5  159  109    4
  VIIb    5   10    3    0    6    3   33   42    8
d_erikson <- data.frame(erikson) |> tibble()
d_erikson
# A tibble: 243 × 4
   origin destination country  Freq
   <fct>  <fct>       <fct>   <dbl>
 1 I      I           EW        311
 2 II     I           EW        161
 3 III    I           EW        128
 4 IVa    I           EW         88
 5 IVb    I           EW         36
 6 IVc    I           EW         43
 7 V/VI   I           EW        356
 8 VIIa   I           EW        150
 9 VIIb   I           EW         12
10 I      II          EW        130
# ℹ 233 more rows
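
`data.frame()` applied to a contingency table returns one row per cell with the count in a `Freq` column, and `xtabs()` reverses that conversion. A minimal sketch of the round trip in base R, using only the cells for origins and destinations I and II from the EW and F layers printed above, so that it runs without gnm (the names `arr`, `tab`, `long`, and `back` are illustrative):

```r
# 2 x 2 x 2 excerpt of the erikson table (origin x destination x country)
arr <- array(
  c(311, 161, 130, 128,   # EW, filled column by column
    105,  59,  72, 113),  # F
  dim = c(2, 2, 2),
  dimnames = list(origin      = c("I", "II"),
                  destination = c("I", "II"),
                  country     = c("EW", "F"))
)
tab <- as.table(arr)

# Table -> long format: one row per cell, counts in `Freq`
long <- as.data.frame(tab)
long

# Long format -> table: rebuild the three-way table
back <- xtabs(Freq ~ origin + destination + country, data = long)

# TRUE when the cell counts survive the round trip
all(as.vector(back) == as.vector(tab))
```

The long format is the one that model-fitting functions taking a `data` argument usually expect, while the table format is easier to read and to slice by layer.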

Datasets in the vcdExtra package

data(package = "vcdExtra")$results |> data.frame() |> select("Item", "Title") |> knitr::kable()
| Item | Title |
|:-----|:------|
| Abortion | Abortion Opinion Data |
| Accident | Traffic Accident Victims in France in 1958 |
| AirCrash | Air Crash Data |
| Alligator | Alligator Food Choice |
| Asbestos | Effect of Exposure to Asbestos |
| Bartlett | Bartlett Data on Plum Root Cuttings |
| Burt | Burt (1950) Data on Hair, Eyes, Head and Stature |
| Caesar | Risk Factors for Infection in Caesarian Births |
| Cancer | Survival of Breast Cancer Patients |
| Cormorants | Advertising Behavior by Males Cormorants |
| CyclingDeaths | London Cycling Deaths |
| DaytonSurvey | Dayton Student Survey on Substance Use |
| Depends | Dependencies of R Packages |
| Detergent | Detergent preference data |
| Donner | Survival in the Donner Party |
| Draft1970 | USA 1970 Draft Lottery Data |
| Draft1970table | USA 1970 Draft Lottery Table |
| Dyke | Sources of Knowledge of Cancer |
| Fungicide | Carcinogenic Effects of a Fungicide |
| GSS | General Social Survey- Sex and Party affiliation |
| Geissler | Geissler’s Data on the Human Sex Ratio |
| Gilby | Clothing and Intelligence Rating of Children |
| Glass | British Social Mobility from Glass(1954) |
| HairEyePlace | Hair Color and Eye Color in Caithness and Aberdeen |
| Hauser79 | Hauser (1979) Data on Social Mobility |
| Heart | Sex, Occupation and Heart Disease |
| Heckman | Labour Force Participation of Married Women 1967-1971 |
| HospVisits | Hospital Visits Data |
| HouseTasks | Household Tasks Performed by Husbands and Wives |
| Hoyt | Minnesota High School Graduates |
| ICU | ICU data set |
| JobSat | Cross-classification of job satisfaction by income |
| Mammograms | Mammogram Ratings |
| Mental | Mental Impairment and Parents SES |
| Mice | Mice Depletion Data |
| Mobility | Social Mobility data |
| PhdPubs | Publications of PhD Candidates |
| ShakeWords | Shakespeare’s Word Type Frequencies |
| TV | TV Viewing Data |
| Titanicp | Passengers on the Titanic |
| Toxaemia | Toxaemia Symptoms in Pregnancy |
| Vietnam | Student Opinion about the Vietnam War |
| Vote1980 | Race and Politics in the 1980 Presidential Vote |
| WorkerSat | Worker Satisfaction Data |
| Yamaguchi87 | Occupational Mobility in Three Countries |

Hauser79 is the dataset used in Hauser (1980) (Table 1. Counts in a Classification of Mobility from Father’s (or Other Family Head’s) Occupation to Son’s First Full-Time Civilian Occupation: U.S. Men Aged 20-64 in 1973).

Hauser79
    Son Father Freq
1  UpNM   UpNM 1414
2  LoNM   UpNM  521
3   UpM   UpNM  302
4   LoM   UpNM  643
5  Farm   UpNM   40
6  UpNM   LoNM  724
7  LoNM   LoNM  524
8   UpM   LoNM  254
9   LoM   LoNM  703
10 Farm   LoNM   48
11 UpNM    UpM  798
12 LoNM    UpM  648
13  UpM    UpM  856
14  LoM    UpM 1676
15 Farm    UpM  108
16 UpNM    LoM  756
17 LoNM    LoM  914
18  UpM    LoM  771
19  LoM    LoM 3325
20 Farm    LoM  237
21 UpNM   Farm  409
22 LoNM   Farm  357
23  UpM   Farm  441
24  LoM   Farm 1611
25 Farm   Farm 1832
tab_Hauser79 <- xtabs(Freq ~ Father + Son, data = Hauser79)
tab_Hauser79
      Son
Father UpNM LoNM  UpM  LoM Farm
  UpNM 1414  521  302  643   40
  LoNM  724  524  254  703   48
  UpM   798  648  856 1676  108
  LoM   756  914  771 3325  237
  Farm  409  357  441 1611 1832
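
In `xtabs()`, the order of the terms on the right-hand side of the formula decides which variable forms the rows; the column order of the data frame plays no role. A minimal base-R sketch, retyping the Hauser79 counts listed above so that it runs without vcdExtra (the names `lev`, `d`, `by_father`, and `by_son` are illustrative):

```r
lev <- c("UpNM", "LoNM", "UpM", "LoM", "Farm")

# expand.grid() varies its first argument fastest, matching the
# printed data frame, where Son changes within each Father category
d <- expand.grid(Son    = factor(lev, levels = lev),
                 Father = factor(lev, levels = lev))
d$Freq <- c(1414, 521, 302,  643,   40,
             724, 524, 254,  703,   48,
             798, 648, 856, 1676,  108,
             756, 914, 771, 3325,  237,
             409, 357, 441, 1611, 1832)

# Father in rows, Son in columns (the layout used above)
by_father <- xtabs(Freq ~ Father + Son, data = d)

# Swapping the terms transposes the table
by_son <- xtabs(Freq ~ Son + Father, data = d)

by_father["UpNM", "LoNM"]  # cell for Father = UpNM, Son = LoNM
all(as.vector(t(by_father)) == as.vector(by_son))
```

Putting the origin (Father) in rows follows the usual convention for mobility tables, where each row describes the destinations reached from one origin.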

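Yamaguchi87 contains the data for three countries (the United States, the United Kingdom, and Japan) used in Yamaguchi (1987). The data have also been reanalyzed by Goodman et al. (1998) and Xie (1992).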

Yamaguchi87
    Son Father Country Freq
1  UpNM   UpNM      US 1275
2  LoNM   UpNM      US  364
3   UpM   UpNM      US  274
4   LoM   UpNM      US  272
5  Farm   UpNM      US   17
6  UpNM   LoNM      US 1055
7  LoNM   LoNM      US  597
8   UpM   LoNM      US  394
9   LoM   LoNM      US  443
10 Farm   LoNM      US   31
11 UpNM    UpM      US 1043
12 LoNM    UpM      US  587
13  UpM    UpM      US 1045
14  LoM    UpM      US  951
15 Farm    UpM      US   47
16 UpNM    LoM      US 1159
17 LoNM    LoM      US  791
18  UpM    LoM      US 1323
19  LoM    LoM      US 2046
20 Farm    LoM      US   52
21 UpNM   Farm      US  666
22 LoNM   Farm      US  496
23  UpM   Farm      US 1031
24  LoM   Farm      US 1632
25 Farm   Farm      US  646
26 UpNM   UpNM      UK  474
27 LoNM   UpNM      UK  129
28  UpM   UpNM      UK   87
29  LoM   UpNM      UK  124
30 Farm   UpNM      UK   11
31 UpNM   LoNM      UK  300
32 LoNM   LoNM      UK  218
33  UpM   LoNM      UK  171
34  LoM   LoNM      UK  220
35 Farm   LoNM      UK    8
36 UpNM    UpM      UK  438
37 LoNM    UpM      UK  254
38  UpM    UpM      UK  669
39  LoM    UpM      UK  703
40 Farm    UpM      UK   16
41 UpNM    LoM      UK  601
42 LoNM    LoM      UK  388
43  UpM    LoM      UK  932
44  LoM    LoM      UK 1789
45 Farm    LoM      UK   37
46 UpNM   Farm      UK   76
47 LoNM   Farm      UK   56
48  UpM   Farm      UK  125
49  LoM   Farm      UK  295
50 Farm   Farm      UK  191
51 UpNM   UpNM   Japan  127
52 LoNM   UpNM   Japan  101
53  UpM   UpNM   Japan   24
54  LoM   UpNM   Japan   30
55 Farm   UpNM   Japan   12
56 UpNM   LoNM   Japan   86
57 LoNM   LoNM   Japan  207
58  UpM   LoNM   Japan   64
59  LoM   LoNM   Japan   61
60 Farm   LoNM   Japan   13
61 UpNM    UpM   Japan   43
62 LoNM    UpM   Japan   73
63  UpM    UpM   Japan  122
64  LoM    UpM   Japan   60
65 Farm    UpM   Japan   13
66 UpNM    LoM   Japan   35
67 LoNM    LoM   Japan   51
68  UpM    LoM   Japan   62
69  LoM    LoM   Japan   66
70 Farm    LoM   Japan   11
71 UpNM   Farm   Japan  109
72 LoNM   Farm   Japan  206
73  UpM   Farm   Japan  184
74  LoM   Farm   Japan  253
75 Farm   Farm   Japan  325
tab_Yamaguchi87 <- xtabs(Freq ~ Father + Son + Country, data = Yamaguchi87)
tab_Yamaguchi87
, , Country = US

      Son
Father UpNM LoNM  UpM  LoM Farm
  UpNM 1275  364  274  272   17
  LoNM 1055  597  394  443   31
  UpM  1043  587 1045  951   47
  LoM  1159  791 1323 2046   52
  Farm  666  496 1031 1632  646

, , Country = UK

      Son
Father UpNM LoNM  UpM  LoM Farm
  UpNM  474  129   87  124   11
  LoNM  300  218  171  220    8
  UpM   438  254  669  703   16
  LoM   601  388  932 1789   37
  Farm   76   56  125  295  191

, , Country = Japan

      Son
Father UpNM LoNM  UpM  LoM Farm
  UpNM  127  101   24   30   12
  LoNM   86  207   64   61   13
  UpM    43   73  122   60   13
  LoM    35   51   62   66   11
  Farm  109  206  184  253  325

Datasets in the logmult package

data(package = "logmult")$results |> data.frame() |> select("Item", "Title") |> knitr::kable()
| Item | Title |
|:-----|:------|
| color | Two Cross-Classifications of Eye Color by Hair Color |
| criminal | Dropped Criminal Charges, Denmark, 1955-1958 |
| gss7590 | Education and Occupational Attainment Among White Men and Women in the United States, 1975-1990 |
| gss8590 | Education and Occupational Attainment Among Women in the United States, 1985-1990 |
| gss88 | Major Occupation by Years of Schooling in the United States, 1988 |
| hg16 | Son’s occupation by father’s occupation for 16 countries in the 1960s and 1970s |
| ocg1973 | Intergenerational Mobility in the United States, 1973 |

References

Erikson, Robert, John H. Goldthorpe, and Lucienne Portocarero. 1982. “Social Fluidity in Industrial Nations: England, France and Sweden.” The British Journal of Sociology 33 (1): 1–34. https://doi.org/10.2307/589335.
Goodman, Leo A., Michael Hout, Yu Xie, and Kazuo Yamaguchi. 1998. “Statistical Methods and Graphical Displays for Analyzing How the Association Between Two Qualitative Variables Differs Among Countries, Among Groups, or over Time: A Modified Regression-Type Approach / Comments / Rejoinder.” Sociological Methodology 28: 175.
Hauser, Robert M. 1980. “Some Exploratory Methods for Modeling Mobility Tables and Other Cross-Classified Data.” Sociological Methodology 11: 413. https://doi.org/10.2307/270871.
———. 1984. “Vertical Class Mobility in England, France, and Sweden.” Acta Sociologica 27 (2): 87–110. https://doi.org/10.1177/000169938402700201.
Xie, Yu. 1992. “The Log-Multiplicative Layer Effect Model for Comparing Mobility Tables.” American Sociological Review 57 (3): 380–95. https://doi.org/10.2307/2096242.
Yamaguchi, Kazuo. 1987. “Models for Comparing Mobility Tables: Toward Parsimony and Substance.” American Sociological Review 52 (4): 482–94. https://doi.org/10.2307/2095293.