---
title: 'SOCI424/624: Dyads, triads, and homophily'
author: '(anonymous for peer evaluation)'
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
# load the 'igraph' package
library(igraph)
```
# Analysis
## Part 1: Dyads and triads
### Preschooler ethology
Today, we'll be using data on the interactions of a group of preschool students from Strayer and Trudel's (1984) study. The researchers observed several groups of children, aged 1-6, in a Montreal daycare over the course of two years. They catalogued two types of behavior: 'agonistic' acts (things like biting, making faces, and stealing), and 'affiliative' acts (things like smiling, following, and holding hands). This was part of a larger body of literature from the 1970s and 1980s that used ethological methods of animal behavioral analysis to examine the behavior of children.
The relations for this data are stored as simple lists of edges. For example, the first few lines of the agonistic relations file look like this:
1, 2
1, 3
1, 4
1, 5
...
indicating that child 1 displayed agonistic behavior toward children 2, 3, 4, 5, and so on. For the current analysis, I have simplified the data to represent unvalued (binary) relations. For the agonistic data, an edge represents at least one observed agonistic act. Affinitive (friendly) behavior was much more common among the students, so an edge in the affinitive data represents at least *five* observed affinitive acts.
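To make the edge-list format concrete, here is a minimal sketch that parses a few lines like those above into a data frame. The inline text stands in for a real file, and both the column names `from` and `to` and the `header = FALSE` setting are assumptions made for illustration; the actual files may include a header row.

```r
# Toy edge list, inlined as text so the example is self-contained.
# (Hypothetical data; not the Strayer and Trudel files.)
toy_csv <- "1, 2
1, 3
1, 4
2, 1"

# header = FALSE and the column names are assumptions about the file layout.
toy_edges <- read.csv(text = toy_csv, header = FALSE,
                      col.names = c("from", "to"))

toy_edges        # one row per directed relation (sender, receiver)
nrow(toy_edges)  # number of relations in the toy list
```

Each row is one directed relation, so counting rows counts relations.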
There are three files you will need to load for this worksheet:
- The agonistic data is available at
- The affinitive data is available at
- The vertex attributes are at
### Task 1A: Loading the data
- *Use R's `read.csv()` function to download all three of these files directly into data frames representing edges and vertices. Name the agonistic data frame `ag_edges`, the affinitive data frame `af_edges`, and the vertex attributes `children`.*
```{r 1a_1}
# (your code here)
```
- *Inspect the two relational data frames (either in the console or in RStudio's table viewer). Then, in the space below, write code to calculate how many agonistic relations and how many affinitive relations there are between these children.*
```{r 1a_2}
# (your code here)
```
- *Use the `graph_from_data_frame()` function in R to create two separate network objects: `agnet` for the agonistic network and `afnet` for the affiliative network. You will want to provide two arguments to the function: (1) a data frame with network relations, and (2) a data frame with vertex attributes (specified as `vertices`). Refer to the documentation for `graph_from_data_frame` for more details.*
```{r 1a_3}
# (your code here)
```
- *Igraph provides functions `vcount()` and `ecount()` to count the number of vertices (nodes) and edges (relations) in a network. Write code below to answer the following questions: How many children are in the network? (The answer should be the same for both networks.) How many relations does each network have? (The answer should be different for each network.)*
```{r 1a_4}
# (your code here)
```
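As a rough illustration of the pattern in this task, the sketch below builds a directed network from two small data frames and counts its vertices and edges. The data and the object names (`toy_edges`, `toy_children`, `toy_net`) are hypothetical and are not the worksheet files.

```r
library(igraph)

# Hypothetical toy data: four children, three directed relations.
# Child 4 appears in the vertex table but in no relation.
toy_edges <- data.frame(from = c(1, 1, 2), to = c(2, 3, 3))
toy_children <- data.frame(name = c(1, 2, 3, 4),
                           gender = c("G", "B", "G", "B"))

# The first column of `vertices` is matched against the ids in the edge list;
# the remaining columns become vertex attributes.
toy_net <- graph_from_data_frame(toy_edges, directed = TRUE,
                                 vertices = toy_children)

vcount(toy_net)  # 4: the isolated child is still a vertex
ecount(toy_net)  # 3
```

Because the vertex table defines the vertex set, a child with no relations in one edge list is still included, which is why two networks built from the same vertex table have the same number of vertices.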
### Task 1B: Dyadic and triadic analysis
In this task you will learn about the dyads (pairs of children) in the network.
- *Use igraph's `dyad_census()` function to run a dyad census on each of the two networks. What do the numbers mean? Do you notice a difference in the prevalence of certain dyads between the two networks?*
```{r 1b_1}
# your code here
```
Recall that the reciprocity of a network is the probability that any given directed edge in that network is reciprocated. That is, if the edge A→B exists, reciprocity is the probability that the edge B→A also exists.
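To make this definition concrete, here is a small sketch on a hypothetical four-child network (not the worksheet data). It runs a dyad census and then checks that the reciprocity reported by igraph matches the share of directed edges that belong to a mutual pair.

```r
library(igraph)

# Hypothetical toy network: A and B target each other; the other ties are one-way.
toy_edges <- data.frame(from = c("A", "B", "A", "C"),
                        to   = c("B", "A", "C", "D"))
toy <- graph_from_data_frame(toy_edges, directed = TRUE)

# Dyad census: counts of mutual, asymmetric, and null pairs of vertices.
dc <- dyad_census(toy)
dc$mut   # 1 (A <-> B)
dc$asym  # 2 (A -> C and C -> D)
dc$null  # 3 (pairs with no tie in either direction)

# Reciprocity: share of directed edges whose reverse edge also exists.
reciprocity(toy)           # 2 of the 4 edges are reciprocated: 0.5
2 * dc$mut / ecount(toy)   # the same value computed from the census
```

Each mutual dyad contributes two reciprocated edges, which is where the factor of 2 comes from.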
- *Use igraph's `reciprocity()` function to calculate the reciprocity of the two networks. Is there a large difference? How should these numbers be interpreted?*
```{r 1b_2}
# your code here
```
- *Calculate the global _transitivity_ of each set of relations (there is an R function for this). Is there a large difference?*
```{r 1b_3}
# your code here
```
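For intuition about what global transitivity measures, the following sketch computes it on a hypothetical toy network and reproduces the value by hand. It assumes igraph's `transitivity()` with `type = "global"`, which treats a directed graph as undirected.

```r
library(igraph)

# Hypothetical toy network: a triangle among A, B, C plus one extra tie to D.
toy_edges <- data.frame(from = c("A", "B", "A", "C"),
                        to   = c("B", "C", "C", "D"))
toy <- graph_from_data_frame(toy_edges, directed = TRUE)

# Global transitivity = 3 * (number of triangles) / (number of connected triples).
t_global <- transitivity(toy, type = "global")

# By hand: 1 triangle; connected triples centred on A (1), B (1), and C (3).
t_by_hand <- 3 * 1 / (1 + 1 + 3)

t_global
t_by_hand  # 0.6
```

Edge direction plays no role here: A→C and C→A would give the same result.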
- ***Bonus question for more experienced R users**: Holland and Leinhardt (1971) characterized strongly hierarchical networks as those that lacked a set of "forbidden" intransitive triads (021C, 030C, 111D, 111U, 120C, 201, 210), arguing that these seven triads would be rare in such networks. Using the `triad_census()` function (and the documentation to see which triad is which), characterize the prevalence of these intransitive triads in the agonistic and affinitive networks. By this metric, is one of these networks more hierarchical than the other?*
```{r 1b_4}
# your code here
```
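Since `triad_census()` returns an unnamed vector of 16 counts, one possible first step is to attach the triad labels. The sketch below does this on a hypothetical four-vertex network; the label order is taken from the igraph documentation for `triad_census()` and should be checked against your installed version.

```r
library(igraph)

# Hypothetical toy network: a directed 3-cycle (A -> B -> C -> A) plus D -> A.
toy_edges <- data.frame(from = c("A", "B", "C", "D"),
                        to   = c("B", "C", "A", "A"))
toy <- graph_from_data_frame(toy_edges, directed = TRUE)

# Triad labels in the order documented for triad_census().
triad_labels <- c("003", "012", "102", "021D", "021U", "021C", "111D", "111U",
                  "030T", "030C", "201", "120D", "120U", "120C", "210", "300")

tc <- triad_census(toy)
names(tc) <- triad_labels

# The seven intransitive ("forbidden") triad types.
forbidden <- c("021C", "030C", "111D", "111U", "120C", "201", "210")

tc[tc > 0]          # triad types that actually occur
sum(tc[forbidden])  # number of intransitive triads
sum(tc)             # total triads: choose(4, 3) = 4
```

In this toy network the 3-cycle is a 030C triad and D→A→B is a 021C triad, so two of the four triads are intransitive.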
## Part 2: Gender homophily
Strayer and Trudel also reported data on the gender of the children they studied. They provide no description of how the data on gender was collected, but, as was common in studies from the period, each child's gender was treated unproblematically as either "♂" or "♀". We will refer to these as "boy" and "girl", respectively.
Data on the children's gender was provided in the vertex attribute file you downloaded at the beginning of this worksheet, and was included in `agnet` and `afnet` when you constructed the graphs.
### Task 2A: Adding vertex attributes
The igraph function `V()` lets you query and interact with *vertex sequences* in a network. So, e.g., `V(afnet)` will give you a sequence of all 16 vertices (nodes) in the affinity network. The `V()` function also lets you read or alter vertex *attributes* — information (such as gender) about each of the vertices in a network. Igraph uses the "dollar sign" notation to access attributes of vertices, so `V(afnet)$name` would give you the "name" associated with each vertex in the network (by default, the vertex names are just numbers from 1 to the number of vertices). You can also use this syntax to create new attributes, or to alter existing ones: `V(afnet)$species <- 'human'` would create a new "species" vertex attribute, and set it to "human" for every child in the network.
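The attribute syntax described above can be tried out on a small throwaway network first. This sketch uses a hypothetical four-vertex ring (not the worksheet data), with made-up names and genders.

```r
library(igraph)

# Hypothetical toy network: four vertices connected in a ring.
toy <- make_ring(4)

V(toy)                    # the vertex sequence
V(toy)$name <- c("a", "b", "c", "d")        # create a "name" attribute
V(toy)$gender <- c("G", "B", "G", "G")      # one value per vertex
V(toy)$species <- "human"                   # a single value is recycled

V(toy)$species            # "human" repeated four times
table(V(toy)$gender)      # tabulate an attribute: 1 "B" and 3 "G"
```

Assigning a single value sets it for every vertex, while assigning a vector sets one value per vertex in sequence order.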
- *Examine the `gender` vertex attribute for both networks. How many girls are there in each network? How many boys? (The `table()` function, which calculates tabulations and cross-tabulations, provides one convenient way of answering these questions. But there are other ways to do this, too!)*
```{r 2a_2}
# your code here
```
- *In the next task, we will also need a numeric representation of the gender categories. The following code will create a new attribute on both networks called `gender_cat` that takes a value of 1 for boys and 2 for girls. You don't need to change this code, but you should look at it and make sure you understand what it is doing. Note the use of square brackets ("[]") to _index_ (i.e. select) the vertex sequences.*
```{r 2a_3}
# agonistic network
V(agnet)$gender_cat <- 1
V(agnet)$gender_cat[V(agnet)$gender=='G'] <- 2
# affinity network
V(afnet)$gender_cat <- 1
V(afnet)$gender_cat[V(afnet)$gender=='G'] <- 2
```
### Task 2B: Characterizing homophily
- *As we discussed in class, one way to characterize the overall level of homophily in a network is with `assortativity`. Use the `assortativity_nominal()` function to calculate the gender assortativity of the agonism and affinity networks. (Note: this function requires two arguments---the network and the category vertex attribute. Use `V(agnet)$gender_cat` or `V(afnet)$gender_cat` for this second argument.) Is there a difference between the two networks' values?*
```{r 2b_1}
# your code here
```
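To calibrate how assortativity values read, here is a sketch of the two extreme cases on hypothetical four-vertex networks. The category vector `types` plays the role of `gender_cat`; the graphs are undirected only to keep the example small.

```r
library(igraph)

# Hypothetical categories: vertices 1-2 are in category 1, vertices 3-4 in category 2.
types <- c(1L, 1L, 2L, 2L)

# Every tie stays within a category (1-2 and 3-4).
same <- make_graph(edges = c(1, 2, 3, 4), n = 4, directed = FALSE)

# Every tie crosses categories (1-3 and 2-4).
mixed <- make_graph(edges = c(1, 3, 2, 4), n = 4, directed = FALSE)

r_same  <- assortativity_nominal(same, types, directed = FALSE)
r_mixed <- assortativity_nominal(mixed, types, directed = FALSE)

r_same   #  1: perfectly homophilous
r_mixed  # -1: perfectly heterophilous
```

Values near 0 indicate mixing close to what would be expected if ties ignored category membership.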
Another common way to discuss homophily is to examine *homophilous* (within-category) and *heterophilous* (between-category) relations. The `E()` function in igraph is an analog to the `V()` function that will allow us to look at the sequence of *edges* in the network and to create subsets of those edges based on type. The next few prompts will walk you through using the `E()` function to count inter- and intra-gender relations.
- *We will first need to enumerate the vertices in each gender category. You can do this using the `V()` function and its special indexing syntax. For example `V(agnet)[gender=='B']` will return the set of vertices whose `gender` attribute is 'B'. Create four vertex sequences, `agboys`, `aggirls`, `afboys`, and `afgirls`, that contain only the boys and girls, respectively. You'll need to do this separately for each network, since vertex sequences are specific to their network.*
```{r 2b_2}
# your code here
```
- *Now enumerate the within-gender (gender homophilous) and between-gender (gender heterophilous) ties in each of your networks. You can do this easily using the indexing syntax from the `E()` function. As an example: `E(agnet)[agboys %--% agboys]` will enumerate the edges that connect a boy to a boy in the agonistic network. How many edges are homophilous within each gender category in each of the two networks?*
```{r 2b_3}
# your code here
```
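The vertex and edge indexing used in the last two prompts can be sketched on a hypothetical toy network as follows. The names `boys` and `girls` are illustrative; note that `%--%` selects edges between two vertex sets regardless of direction.

```r
library(igraph)

# Hypothetical toy network: two boys (a, b) and two girls (c, d).
toy_children <- data.frame(name = c("a", "b", "c", "d"),
                           gender = c("B", "B", "G", "G"))
toy_edges <- data.frame(from = c("a", "a", "c", "d", "b"),
                        to   = c("b", "c", "d", "c", "d"))
toy <- graph_from_data_frame(toy_edges, directed = TRUE,
                             vertices = toy_children)

# Vertex sequences selected by attribute value.
boys  <- V(toy)[gender == "B"]
girls <- V(toy)[gender == "G"]

# Edge sequences selected by the categories of their endpoints.
bb <- E(toy)[boys %--% boys]     # boy-boy ties
gg <- E(toy)[girls %--% girls]   # girl-girl ties
bg <- E(toy)[boys %--% girls]    # ties between a boy and a girl

length(bb)  # 1 (a -> b)
length(gg)  # 2 (c -> d and d -> c)
length(bg)  # 2 (a -> c and b -> d)
```

The three counts add up to the total number of edges, which is a quick check that no tie was missed or counted twice.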
- ***Bonus question for more experienced R users**: Using the information you have on gender of the children, calculate the baseline homophily for boys and for girls (i.e. the expected homophily if relations were random). Then calculate the actual homophily for boys and girls within the agonistic and affinity networks. Are these networks more or less homophilous than the baseline level?*
```{r 2b_4}
# your code here
```
## Part 3: Visualizing vertex attributes
### Task 3A: Using vertex attributes for visualization
- *Vertex attributes in igraph do a lot of work. They describe empirical characteristics of vertices (e.g. gender), but they can also describe the visualization of nodes in a network when plotting. Create a new vertex attribute called `color`, and set it to `"green"` for all of the vertices in both networks. Plot the networks to see what they look like.*
```{r 3a_1}
# your code here
```
- *Now change the `color` attribute only for the boys, to make them visually distinct from the girls. (You can choose any color---the `colors()` command will list the ridiculously long list of named colors in R). Plot the networks again. Keeping in mind the caveats about network visualizations from class, what do the visualizations tell you? Do you see any gendered patterns? Are there any obvious differences between those gendered patterns between the two networks?*
```{r 3a_2}
# your code here
```
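As a sketch of how the `color` attribute drives plotting, the example below colors a hypothetical toy network by gender. The colors and the seed value are arbitrary choices; the seed only makes the layout repeatable between runs.

```r
library(igraph)

# Hypothetical toy network: a ring of six children with made-up genders.
toy <- make_ring(6, directed = TRUE)
V(toy)$gender <- c("G", "B", "G", "G", "B", "B")

# Set one color for everyone, then overwrite it for a subset of vertices.
V(toy)$color <- "green"
V(toy)$color[V(toy)$gender == "B"] <- "orange"

# plot() picks up the `color` vertex attribute automatically.
set.seed(42)  # arbitrary seed so the layout is the same on each run
plot(toy, edge.arrow.size = 0.5)
```

Other plotting attributes, such as `size` or `label`, follow the same pattern of storing the value on the vertices before calling `plot()`.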
# Discussion
- In part 1 above, you described features of the dyads (e.g. reciprocity) and triads (e.g. transitivity) of the two networks. Talk about the differences you observed between the agonistic and affinity networks in these structural measures. Considering the types of relations described (and what you know about preschool-aged children), are these patterns what you would expect? Do you think that transitivity (which ignores the direction of edges) describes the same kinds of structural patterns in the agonistic versus the affinity networks?
- (your response)
- In the example above, and in most of the examples from the reading, homophily was analyzed on *categorical* variables---gender, race, political affiliation, etc. Does the concept of homophily require categorical distinctions? What would homophily look like with less reductive conceptions of, say, race or gender that recognize complexity of the social distinctions they describe (e.g. multiracial or gender non-binary people)? How might you measure such homophily?
- (your response)
- It is often the case in network analysis that different network statistics are related to one another in surprising ways. Thinking about networks in general, how would a tendency toward *homophily* encourage *transitivity* in a network? How would a tendency toward *transitivity* encourage *homophily*? Are transitivity and homophily separate structuring "forces" in a network, or just different aspects of a single underlying force?
- (your response)
# References
Holland, Paul W., and Samuel Leinhardt. "Transitivity in Structural Models of Small Groups." *Small Group Research* 2, no. 2 (1971): 107--24.
Strayer, F. F., and M. Trudel. "Developmental Changes in the Nature and Function of Social Dominance among Young Children." *Ethology and Sociobiology*, The Study of Adaptiveness of Aggressive, Dominance, and Conflict Resolution Strategies in Humans and Nonhuman Primates, 5, no. 4 (January 1, 1984): 279--95. [https://doi.org/10.1016/0162-3095(84)90007-4](https://doi.org/10.1016/0162-3095(84)90007-4){.uri}.