**Due Monday, Sept. 28 by the start of class (8:35am, Eastern Daylight Time)**

This lab will introduce you to network data representations in R. **To complete the lab, you should create an R script that will address all of the questions below.** Responses should use plain, descriptive language to address the questions. The text of your responses can be included in the file as comments (putting a `#` character before a line of text will tell R to ignore that line). The following illustrates a good format:

```
###
# Question 1a
###
x <- 1:10
# The code above creates a vector of numbers from 1 to 10.
# These are some of my favorite numbers.
```

**Southern culture**

The data we will use for this lab comes from the book *Deep South: A Social Anthropological Study of Caste and Class* (Davis, Gardner, and Gardner [1941] 2009). Among the book’s detailed accounts of early–20th century social systems in the Southern United States, it contains data on the participation of a sample of 18 upper-class white women in a series of 14 social events in 1936. The authors provide the following figure to illustrate the data:

We will use the data presented by Davis, Gardner, and Gardner ([1941] 2009) to examine the social structure among the 18 women in the figure.

It is always a good idea to visually inspect a source of data before you start analyzing it in R. According to the figure above, what are the *fewest* and the *most* social events that any woman participated in? Which women were most active (i.e. participated in the most events)?

Is there anything you can tell informally about the social structure among these women just by looking at the events table? Do you think the women are separated into different cliques? Do you think some of the people might be more central to the social life than others?

One way to represent data of this type in R is as a matrix with values `1` (representing participation) and `0` (representing non-participation). The following snippet creates such a matrix, with 18 rows and 14 columns. However, it is missing data for Miss Charlotte McDowd, Miss Ruth Desand, and Mrs. Flora Price, who are erroneously indicated to have not participated in any events. *Using the figure above, update the code below to include participation for these three women.* (You can copy and paste the code into your own file in RStudio and edit it there.)

```
# this code uses R's `matrix()` function to take a sequence of numbers
# and convert them to a 2-dimensional matrix. For this to work, it needs
# to know how many rows to expect (`nrow=18`), and whether to fill the
# matrix column-by-column, or row-by-row (`byrow=TRUE`)
events <- matrix(c(
1,1,1,1,1,1,0,1,1,0,0,0,0,0,
1,1,1,0,1,1,1,1,0,0,0,0,0,0,
0,1,1,1,1,1,1,1,1,0,0,0,0,0,
1,0,1,1,1,1,1,1,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,1,0,1,1,0,1,0,0,0,0,0,0,
0,0,0,0,1,1,1,1,0,0,0,0,0,0,
0,0,0,0,0,1,0,1,1,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,1,1,1,0,0,1,0,0,
0,0,0,0,0,0,0,1,1,1,0,1,0,0,
0,0,0,0,0,0,0,1,1,1,0,1,1,1,
0,0,0,0,0,0,1,1,1,1,0,1,1,1,
0,0,0,0,0,1,1,0,1,1,1,1,1,1,
0,0,0,0,0,0,1,1,0,1,1,1,1,1,
0,0,0,0,0,0,0,1,1,1,0,1,0,0,
0,0,0,0,0,0,0,0,1,0,1,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0
), nrow=18, byrow=TRUE)
```
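If the role of `byrow=TRUE` is unclear, the following minimal sketch may help. It uses a small made-up vector (`vals`, `by_row`, and `by_col` are hypothetical names, not part of the lab data) to show how the same numbers land in different cells depending on the fill order.

```r
# Toy data (not from the lab): six values to arrange into 2 rows and 3 columns
vals <- c(1, 0, 1,
          0, 1, 1)

# byrow=TRUE fills one row at a time, matching the way the values are typed above
by_row <- matrix(vals, nrow = 2, byrow = TRUE)

# the default (byrow=FALSE) fills one column at a time instead
by_col <- matrix(vals, nrow = 2)

by_row       # rows read 1,0,1 and 0,1,1
by_col       # same values, different arrangement
dim(by_row)  # number of rows, then number of columns
```

Because the `events` data are typed one woman per line, filling by row keeps each woman's attendance record in a single row of the matrix.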

Now that the table is represented in the matrix `events`, we can use R to tell us a bit more about it. The functions `rowSums(events)` and `colSums(events)` will add up all of the numbers in each row and in each column of `events`, respectively. Using these functions, answer the following questions (*show your code for each*):

- Which event was the most popular (had the most attendees)?
- How many women attended exactly four events?

One way to infer the relationships among these 18 women is to ask how frequently they attended the same event. Presumably, women who have a closer relationship will also attend more events together.

Fortunately for us, there is a very basic technique from linear algebra that will turn an *affiliation matrix* like the one you created above into a *co-occurrence* matrix that will measure co-attendance at events. To do so, we will use matrix multiplication. You don’t need to know anything about linear algebra for this to work, and for the time being, you can just treat it like a magic wand (we will cover this more in a later class). *Use the following command to make a co-occurrence matrix for event attendance*:

```
# `%*%` is the R command for matrix multiplication,
# and the `t()` function transposes a matrix
co_attend <- events %*% t(events)
```
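To make the "magic wand" a little less mysterious, the sketch below applies the same operation to a tiny made-up matrix. The names `toy`, `toy_co`, and the row labels are hypothetical. It checks one cell by hand: each cell of the product multiplies two rows element by element and adds up the results.

```r
# Toy affiliation matrix (not the lab data): 3 people (rows) by 3 events (columns)
toy <- matrix(c(
  1, 1, 0,
  0, 1, 1,
  1, 1, 1
), nrow = 3, byrow = TRUE)
rownames(toy) <- c("A", "B", "C")

# same operation as in the lab: the matrix times its own transpose
toy_co <- toy %*% t(toy)
toy_co

# the cell for A and B, computed by hand from their two rows
sum(toy["A", ] * toy["B", ])
toy_co["A", "B"]
```

The result has one row and one column per person, so its shape depends on the number of rows in the original matrix, not the number of events.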

Visually inspect this new matrix `co_attend`, either by typing it into the R console on its own or by clicking on it in the RStudio “environment” pane.

- How many rows and columns does it have?
- What number is in the third row of the first column, and what does it represent in plain language?
- What number is in the last row of the last column, and what does it represent in plain language?
- Where would you look in this matrix to tell how many events Mrs. Flora Price co-attended with Miss Laura Mandeville?

Use the `max()` and `min()` functions to determine the maximum and minimum number of events that any pair of women in this sample attended together.

Make a new matrix, called `co_attend_3plus`, that has the same shape as `co_attend`, but whose values are either `TRUE` or `FALSE`, indicating for each pair of women whether they have attended *at least 3* events together. (*Note: This one may be a little tricky for new R users. The operator `>=` will tell you whether the value on the left is at least as big as the value on the right.*)

- How does `co_attend_3plus` represent a different type of relationship between the pairs of women than `co_attend`?
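If you have not used comparison operators on a matrix before, the following sketch on made-up numbers shows how `>=` behaves. The names `toy_counts` and `at_least_2` are hypothetical, and the threshold of 2 is chosen only for illustration.

```r
# Toy matrix of counts (not the lab data)
toy_counts <- matrix(c(
  4, 1, 2,
  1, 3, 0,
  2, 0, 2
), nrow = 3, byrow = TRUE)

# the comparison is applied to every cell separately
at_least_2 <- toy_counts >= 2

at_least_2              # TRUE wherever the count is 2 or more
dim(at_least_2)         # same shape as toy_counts
is.logical(at_least_2)  # the values are TRUE/FALSE rather than numbers
```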

You will now convert this data into network representations in order to visualize it for further analysis. I will gloss over some of the details here—we will cover these methods in detail later on.

**Note:** The following questions require you to have the `igraph` add-on package installed. This does not come with R by default, so you will have to install it yourself (if you haven’t already). To install `igraph` you can either:

- Use “Tools > Install Packages …” in RStudio, and type “`igraph`” into the search bar, **OR**
- Enter `install.packages('igraph')` into the R console

After you have installed the package, you will then need to load it into your R session with the command `library(igraph)`.

The function `graph_from_adjacency_matrix()` from the `igraph` package converts an *adjacency matrix* like the co-attendance matrix you created above into a `graph` object. It automatically figures out how many nodes there are in the network by looking at the number of rows/columns in the matrix, and constructs edges between those nodes based on the data. Use the following command to create an *undirected*, *weighted* network from the `co_attend` matrix you already made:

```
# The `diag=FALSE` argument tells R to ignore the *diagonal* of the matrix
# (the 1st row of the 1st column, 2nd row of 2nd column, etc), which would
# create edges from each node back to itself. We will talk more about such
# `loop` edges later.
event_net <- graph_from_adjacency_matrix(
  co_attend, mode='undirected', weighted=TRUE, diag=FALSE)
```
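To see what this conversion does on a scale you can check by eye, here is a minimal sketch using a made-up 3-by-3 matrix. The names `toy_co` and `toy_net` and the labels "A", "B", "C" are hypothetical, and the sketch assumes `igraph` is already installed.

```r
library(igraph)

# Toy symmetric co-occurrence matrix (not the lab data)
toy_co <- matrix(c(
  2, 1, 0,
  1, 3, 2,
  0, 2, 2
), nrow = 3, byrow = TRUE)
rownames(toy_co) <- colnames(toy_co) <- c("A", "B", "C")

toy_net <- graph_from_adjacency_matrix(
  toy_co, mode = "undirected", weighted = TRUE, diag = FALSE)

vcount(toy_net)     # number of nodes: one per row/column of the matrix
ecount(toy_net)     # number of edges: one per nonzero pair off the diagonal
E(toy_net)$weight   # edge weights, taken from the matrix cells
```

A and C share a value of 0, so no edge is drawn between them, and the diagonal values are skipped because of `diag = FALSE`.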

We will talk about plotting networks in detail later in the course, but for the moment we can mostly use the default options built into `igraph`. Use the following two commands to (a) tell `igraph` that we want the weight of the edges to be reflected in the width of the lines in the plot, and (b) plot `event_net` using the default parameters. Don’t worry about how these commands work yet.

```
E(event_net)$width <- E(event_net)$weight
plot(event_net)
```

Look at the plot that is created. What kinds of patterns can you see? Who is central and who is peripheral to the social structure this network represents?

We are now going to create and plot an *unweighted* network from the matrix `co_attend_3plus` you created earlier, based on a different measure of “relationship” between the women.

```
event_net_3plus <- graph_from_adjacency_matrix(
  co_attend_3plus, mode='undirected', diag=FALSE)
plot(event_net_3plus)
```

Execute the commands above and inspect this new network visually, then answer the following questions:

- What is different about this network?
- Why does it look so different?
- In what ways is this new network a better representation of the social structure among the women? In what ways is it worse?

Davis, Allison, Burleigh Bradford Gardner, and Mary R. Gardner. (1941) 2009. *Deep South: A Social Anthropological Study of Caste and Class*. Univ of South Carolina Press.