[go: up one dir, main page]

Academia.eduAcademia.edu
“Diagnosing Multicollinearity in Exponential Random Graph Models” Forthcoming, Sociological Methods and Research. Code to implement measure Scott W Duxbury Doctoral Candidate The Ohio State University Department of Sociology ABSTRACT: Exponential random graph models (ERGM) have been widely applied in the social sciences in the past ten years. However, diagnostics for ERGM have lagged behind their use. Collinearity-type problems can emerge without detection when fitting ERGM, yielding inconsistent model estimates and problematizing inference from parameters. This article provides a method to detect multicollinearity in ERGM. It outlines the problem and provides a method to calculate the variance inflation factor from ERGM parameters. It then evaluates the method with a Monte Carlo simulation, fitting 216,000 ERGMs and calculating the variance inflation factors for each model. The distribution of variance inflation factors is analyzed using multilevel regression to determine what network characteristics lend themselves to collinearitytype problems. The relationship between variance inflation factors and unstable standard errors (a standard sign of collinearity) is also examined. The method is shown to effectively detect multicollinearity and guidelines for interpretation are discussed. 1 Advances in the statistical modeling of social networks have provided a much-needed analytic tool for social scientists. Exponential random graph models (ERGM) have been widely applied to test theories of racial boundaries (Wimmer and Lewis 2010), group processes (Goodreau, Kitts, and Morris 2009), gang violence (Papachristos 2009; Papachristos, Hureau, and Braga 2013), and social structure in total institutions (Kreager et al., forthcoming), among other topics. The main benefit of ERGM is the ability to simultaneously model and compare (exogenous) actor level attributes against contextual (endogenous) network effects (Robins et al. 2007; Lusher, Robins and Koskinen 2013). The basics of using exponential models for understanding social networks has been around for some time (e.g. Holland and Leinhardt 1981; Frank and Strauss 1986); however, initial statistical models based on pseudo-likelihood estimation were shown to be highly problematic (see Handcock et al. 2003a) and have only recently become widespread through the application of Markov Chain Monte Carlo (MCMC) maximum likelihood (MLE) methods (Geyer and Thompson 1992). The larger social science community has embraced these methods due to novel software made available in the R environment (Handcock et al. 2003b). Despite widespread use of ERGM, the variety of diagnostic tools commonly used to evaluate issues with correlation among the explanatory variables and model assessment in more conventional statistical models have yet to be adapted for ERGM. Particularly, the issue of multicollinearity— when at least one explanatory variables varies as a linear function of other explanatory variables —often goes unchecked and, by extension, undetected. The lack of multicollinearity diagnostics for statistical network models is not surprising. The standard approach to diagnosing multicollinearity in a regression model—the variance inflation factor (VIF)—implies independent observations. Consequently, translating this tool to network data is 2 an unclear task. Moreover, extremely high multicollinearity in an ERGM may lead to model degeneracy and non-convergence (Lusher et al. 2013; O’Malley and Onnela 2014), and so highly collinear models may not yield problematic estimates because they fail to yield estimates at all. These practical constraints do not subvert the issue that multicollinearity in ERGM can affect parameter and variance estimates, inhibiting inference from coefficients. Moreover, unique collinearity-type problems arise when using ERGM. First, highly collinear models may have multiple solutions to the likelihood function. In such cases, a single ERGM specification may result in multiple models from which the best fitted model cannot be determined (Hunter et al. 2008; Salter-Townshend and Murphy 2015). Second, variance estimates calculated through Markov Chain Monte Carlo (MCMC) maximum likelihood estimation (MLE) may be unstable when collinearity exists. Consequently, the same model can be rerun and yield different parameter estimates (Chandrasekhar and Jackson 2014). Third, since endogenous network parameters all often vary in tandem with the networks’ density, an increasing number of structural terms may increase the likelihood of collinearity-type problems (Snijders, Robins, and Handcock 2006). This is particularly problematic since the inclusion of endogenous parameters is often the primary appeal of using ERGM, and dyad-dependency—where global network structures influence the probability of tie formation—may be especially prevalent in dense networks. Finally, since endogenous covariates are often highly correlated, a collinear network variable may affect the parameter and variance estimates of other network covariates. How can the VIF be translated to ERGM? What VIF value would indicate problematic levels of multicollinearity? What ERGM specifications are more or less vulnerable to collinearity-type problems? This article addresses these questions by developing a method to calculate the VIF in ERGM. It introduces a technique to calculate shared variance between 3 parameters using a distribution of networks simulated from the parameter estimates of an ERGM. The method is evaluated using a Monte Carlo simulation experiment. ERGM specifications which correlate with large VIFs are identified using multilevel regression and the relationship between VIFs and standard error instability is investigated. PROBLEMS DIAGNOSING MULTICOLLINEARITY IN EXPONENTIAL RANDOM GRAPH MODELS Formally, ERGM is defined as: (1) 1 Pr(𝐘 = 𝒚) = ( ) exp⁡{∑ 𝜂𝐴 𝑔𝐴 (𝐲)} 𝑘 Where A is a type of network configuration, 𝜂𝐴 is the parameter term, 𝑔𝐴 (𝐲) is the network statistic, and k is a normalizing constant that ensures interpretable results and that the probability model sums to 1. The model assumes that the observed network is the result of a stochastic process and is only one realization from a population of possible networks. For (endogenous) change statistics, coefficients represent the log-odds of an edge connecting two vertices; for node-level covariates, coefficients indicate whether actors with a specific attribute are more or less likely to be connected to alters in the network (Lusher, Koskinen, and Robins 2013). The general approach to fitting an ERGM is to assume (1) that the likelihood principle is true, (2) that the data (e.g, the observed network) is good evidence of the generative (stochastic) distribution, and (3) that the likelihood formulation of the problem is reasonable. The researcher then provides the structural parameters of interest and finds the parameter weights which maximize the likelihood of the observed network. Computationally, this is done using MCMC MLE methods (see Handcock et al. 2008) because of the intractability of k in large networks. 4 Oftentimes, pseudo-likelihood estimates are used as starting values for the MCMC algorithm,1 though this is not necessary for the model to succeed. The MCMC procedure simulates a distribution of sufficient network statistics by making incrementally small changes to the network in sequential steps, where each step toggles whether a pair of actors are connected by an edge. If the new network has a higher probability than the preceding network, or if the ratio of the probability of the old network to the probability of the new network is relatively low, the new network is used as the starting network in the next step of the sequence. If this is not the case, then the old network is used again (see Koskinen and Snijders 2013 for a technical introduction). As simulated networks converge on the parameterized sufficient statistics, formal tests (such as a Gelman-Rubin test) are used to evaluate whether the Markov Chain has reached equilibrium. The result of the procedure is a joint probability distribution of sufficient network statistics. Random draws are made from the distribution to derive parameter estimates. The log-likelihood of the ERGM is evaluated iteratively until it reaches convergence. The MCMC procedure is often repeated many times to ensure that the solution to the log-likelihood is a global maximum rather than a local one (Snijders 2002). The benefit of this approach is that contextual network effects, such as triadic closure or the influence of shared partnerships, can be accurately estimated. Some techniques exist to evaluate multicollinearity, but these are rudimentary and often imprecise. First, the posterior bivariate correlation matrix can be used to gauge when correlations are high. However, bivariate correlations are often very large when working with network data (Ripley et al. 2017:59), and so determining when a correlation should warrant concern is a difficult task. Further, multicollinearity often results from a linear combination of correlations— a condition which is undetectable through the bivariate correlation matrix alone. This problem 5 may be particularly acute when a researcher includes multiple endogenous terms because higher order substructures often depend on lower-order ones and most endogenous effects are somewhat influenced by network density (Faust 2007). Second, multicollinearity may cause problems in calculating the covariance matrix between the parameter estimates. The Hessian matrix is often unsolvable when ERGM parameters are highly correlated. Since the covariance matrix is retrieved by solving the Hessian matrix, a problematic Hessian matrix may indicate multicollinearity. However, the Hessian matrix may be unsolvable for any variety of reasons that causes the likelihood function to fail; for example, insufficient Markov Chain length. Moreover, the Hessian matrix is not always unsolvable when multicollinearity is present. Thus, while an unsolvable Hessian matrix or large bivariate correlations may suggest multicollinearity, they cannot identify the problem directly; nor can these methods pinpoint when combinations of explanatory variables are highly correlated. These shortcomings highlight the need for a diagnostic tool for multicollinearity. Evaluating correlations One large problem in detecting multicollinearity in ERGM relates to translating the VIF to the ERGM context. VIFs in GLM are based on the assumption of accurate and otherwise independent correlations between predictor variables (Fox and Monelli 1992; Fox 2008). The general approach is to regress the term of interest on all other explanatory variables in the model using OLS. The resulting 𝑅 2 is used to calculate a VIF for that variable. In this approach, 𝑅 2 is calculated as: (2) 𝑅𝑦2 ∑(𝑦𝑖 − ŷ𝑖 )2 = 1 −⁡ ∑(𝑦𝑖 − 𝑦̅)2 6 Where y is the outcome variable and ŷ is the predicted value of y. VIF is calculated as: (3) 𝑉𝐼𝐹𝑦 = 1 1 − 𝑅𝑦2 This approach works well in GLMs because observations are independent and errors are assumed to be uncorrelated. When predictor variables are highly correlated, variance estimates inflate and can be subsequently detected. However, variance estimates resulting from an OLS regression are inaccurate when working with network data because observations are not independent and identically distributed (iid). Utilizing this method on network data may lead to 𝑅 2 values that do not reflect shared variance among network parameters, and therefore VIFs that do not accurately estimate multidimensional correlations between explanatory variables. This problem is likely to increase as contextual network effects are included in an ERGM. The proposed alternative for ERGMs is to simply calculate 𝑅 2 from the fitted parameters of an existing model. This approach is particularly appealing for statistical network models, which explicitly seek to evaluate independent correlations between variables when using non-iid data. By calculating 𝑅 2 directly from the correlation matrix between parameters of a converged ERGM, resulting VIFs will capture shared variance between explanatory variables. This method assumes that an ERGM has already yielded some set of parameter estimates. After yielding estimates, a distribution of networks can be simulated from the parameters of an ERGM using MCMC techniques (Geyer and Thompson 1992; Hunter et al. 2008). An accurate bivariate correlation matrix R can be calculated from the sufficient statistics of these networks. R can then be broken into a vector of correlations 𝒓𝒙𝒚 , where subscript y represents the explanatory variable of interest and x represents all other explanatory variables in the ERGM, and a square 7 correlation matrix 𝑹𝒙𝒙 consisting of all explanatory variables in the model other than y. Correlations with the edges term (the number of the edges in the network) should be excluded from both 𝒓𝒙𝒚 and 𝑹𝒙𝒙 . Thus, in the fitted ERGM, 𝑅 2 for a focal predictor variable y as explained by other predictor variables 𝑥𝑖 can be calculated as (e.g. Rencher 2002): (4)2 ⁡𝑅𝑦2 = 𝒓𝑻𝒙𝒚 𝑹−𝟏 𝒙𝒙 𝒓𝒙𝒚 This 𝑅 2 can then be used in Eq. (3) to calculate the VIF for y. It may seem curious to simulate a distribution of networks rather than to calculate R from the ERGM directly. MCMC MLE explores unlikely network configurations when approaching convergence. Consequently, the correlation matrix of a fit ERGM may reflect some error implicit in the simulation procedure. It is therefore important that the researcher simulate a distribution of networks from a converged ERGM and use this posterior distribution of networks to calculate R. To summarize, the proposed approach is to: 1) Fit an ERGM. 2) Simulate a large distribution of networks from the ERGM parameters.3 3) Calculate R from the distribution of networks. 4) Calculate 𝑅𝑦2 from R (Eq. (4)), treating the explanatory variable of interest as y. 5) Calculate the VIF for y using Eq. (3). Proof of concept Since the properties of VIF in OLS regression are well understood, a simple way to evaluate this method is to compare the VIFs in ERGM to the VIFs retrieved from OLS regression. A Monte Carlo experiment compared estimates from a dyadic independent ERGM to 8 those of an OLS regression. The ERGM was binary, predicting an undirected 75-vertex network with three node-level attributes. The same predictors for the OLS model were used as node-level main-effects in the ERGM. The outcome (network) was a random network, and the attributes were simulated random continuous variables. Both models were fit to the same simulated data 1,000 times to calculate three VIFs for OLS regressions and ERGMs. The ERGM correlation matrix was calculated from a posterior distribution of 1,000 simulated networks. MCMC MLE was used instead of pseudo-likelihood to estimate the ERGM. In the OLS regression, VIFs were calculated using Eq. (2) and Eq. (3); the ERGM used the approach described above. The distribution of VIFs for the OLS regression yielded a mean of 1.027, a standard deviation of 0.028, and a range of 1.000 to 1.278. The distribution of VIFs for ERGM yielded a mean of 1.030, a standard deviation of 0.032, and a range of 1.00 to 1.284. The two distributions were positively correlated with a correlation coefficient of 0.876 (p < 0.001). In a second condition, the variance-mean ratio for the randomized independent variables was set to 0.2 to introduce relatively high levels of collinearity among the predictor variables. The distribution of VIFs for the OLS regression in this condition yielded a mean of 3.343, a standard deviation of 0.528 and a range of 2.069 to 5.737. The distribution of VIFs for ERGM yielded a mean of 3.919, a standard deviation of 0.685, and a range of 2.174 to 7.648. The two distributions were positively correlated with a correlation coefficient of 0.893 (p < 0.001). These results indicate that the proposed method for estimating VIFs yields comparable estimates to OLS VIFs when calculated on network outcomes. Evaluating ERGM VIFs 9 It is worth discussing here the meaning of VIFs in the ERGM context. There currently is not a clear estimate of variance explained for ERGMs, including the 𝑅 2 in Eq. (4). ‘Spectral’ goodness of fit measures for social networks have only been developed recently (Shore and Lubin 2015) and they do not easily translate to estimating multicollinearity. Instead, the calculation for VIFs proposed in Eq. (4) estimates shared variance, where increasing values indicate higher multivariate correlations between explanatory variables. Thus, the method detects correlations as a linear combination of variables, but it is not clear how VIFs should be interpreted on their own or how multidimensional correlations are related to parameter or variance estimates. The remainder of this paper is dedicated to 1) defining when VIF values are problematic and 2) identifying ERGM specifications that are particularly vulnerable to multicollinearity. AN EXPERIMENTAL APPROACH The main ambiguity in evaluating VIFs in ERGM is that MCMC estimation is especially tolerant to highly correlated predictor variables. Consequently, the standard rule of thumb for evaluating VIFs—where a VIF larger than 4 indicates concerning levels of collinearity, and a VIF greater than 10 indicates problematic levels of collinearity (Chatterjee and Price 1991; O’Brien 2007)—does not translate well to the ERGM context because network data is often highly correlated (e.g. Snijders 2002; Ripley et al. 2017), and so VIFs will most likely be much larger than in GLM. Moreover, the VIF in ERGM does not have a straightforward interpretation as in OLS, where √𝑉𝐼𝐹 is the factor by which variance estimates are inflated. How should researchers interpret VIFs in ERGM? What ERGM specifications may increase the risk of multicollinearity? 10 To answer these questions, I conducted a Monte Carlo experiment to investigate the properties (or distribution) of VIFs under a variety of ERGM specifications. Monte Carlo simulation generates random samples of data from fixed specifications (Mooney 1997). The underlying logic of this strategy is to create a distribution of VIFs that reflect a wide array of ERGM specifications and network configurations. The distribution of VIFs resulting from these experiments are then analyzed using multilevel regression to evaluate which ERGM specifications may result in multicollinearity. While an uncommon approach, regression on simulated data is useful to evaluate multidimensional correlations that may not be apparent through descriptive statistics alone (Kim and Bearman 1997; Hanaki et al. 2007). The simulation entailed generating the data, estimating an ERGM on the data, and then saving the VIF for each parameter in the model. The simulated networks are all undirected and each have 75 vertices. I focus on the simplest case of unweighted networks, where all cells in the adjacency matrix have a value of either 0 or 1. An ERGM is fit to the network per the specifications of that factorial condition. A posterior distribution of 1,000 networks is then simulated from the fitted ERGM. A correlation matrix is retrieved from this distribution, and the VIF for each explanatory variable is calculated using Eq. (4) and Eq. (3). Manipulations The factorial design of the experiment varies network density, network topology, ERGM parameters, and variance-mean ratios of node-level attributes to create a distribution of ERGM VIFs that reflect variation in realistic ERGM specifications. Each network is predicted with three random node-level attributes specified as main-effects. The variance-mean ratio of the node-level attributes is varied between 0.5, 1, and 5 to introduce or reduce correlation. Network density 11 varies within realistic values (0.1, 0.2, 0.3) (Steglich, Snijders, and Pearson 2010) to gain insight to the range of potential VIFs in networks with different densities. The simulated networks’ topological features are either random, scale-free, or small-world—global network structures that often emerge in real-world social networks (Watts and Strogatz 1998; Barabasi and Albert 1999; Newman, Strogatz, and Watts 2001). [Table 1 here] The final factor in the experiment is the number and type of endogenous parameters. When these are included, there are three different endogenous terms: degree correlation, gwesp, and degree popularity. Degree correlation measures the (Pearson’s) correlation of degree scores between vertices in the network (Newman 2002); degree popularity is an actors’ degree score multiplied by its square root (Snijders et al 2010); gwesp measures localized clustering through edgewise shared partnership—a transitivity measure of shared edges between three or more vertices.4 Formally, the terms can be defined as: 𝐷𝑒𝑔𝑟𝑒𝑒⁡𝑝𝑜𝑝𝑢𝑙𝑎𝑟𝑖𝑡𝑦 = ⁡ ∑ 𝑐𝑖 √𝑐𝑖 where c is the degree score for actor i, 𝐷𝑒𝑔𝑟𝑒𝑒⁡𝑐𝑜𝑟𝑟𝑒𝑙𝑎𝑡𝑖𝑜𝑛 = ⁡ ∑𝑦𝑐 𝑦𝑐(𝑤𝑦𝑐 − 𝑞𝑦 𝑞𝑐 ) 𝜎2𝑞 where y is a tie, w is the joint probability distribution of the degree scores of two actors connected by y, and q is the normalized distribution of 𝑤𝑦𝑐 , and 𝛼 𝑛−2 𝑔𝑤𝑒𝑠𝑝 = ⁡ e ∑{1 − (1 − e−𝑎 )𝑖 }𝑝𝑖 𝑖=1 where p is the number of edgewise partnerships for i and α is a decay parameter (Hunter 2007). Although not necessary, α is typically provided by the researcher to reduce computational 12 burden. These terms were chosen for their theoretical relevance and common usage (e.g. the effects of global and local status hierarchies) and to mirror the data generating process for certain network topologies, namely localized clustering and preferential attachment (Snijders et al. 2010). Either none, one, two or all three terms were included, depending on the condition. One level of this factor corresponds to no endogenous terms, three levels correspond to one endogenous factor (gwesp, degree correlation, or degree popularity), three levels correspond to two endogenous factors (gwesp and degree correlation, gwesp and degree popularity, or degree correlation and degree popularity), and the final level corresponds to all three endogenous variables.5 These combinations yield eight categories for the condition. Thus, a single ERGM could yield three (no endogenous variables), four (one endogenous variable), five (two endogenous variables), or six VIFs (three endogenous variables), depending on the level of the factor. To summarize, the Monte Carlo experiment varied the network density (3 categories), network topology (3 categories), the variance-mean ratio of node-level variables (3 categories), and specifications for the right-hand side of the ERGM equation (8 categories). The entire experiment included 216 conditions (3 x 3 x 3 x 8 = 216). With 1,000 replications per condition, the entire sample space is the estimates from 216,000 ERGMs.6 Below are summary statistics of the estimates; I also model the differences between models as a function of the manipulated factors in the Monte Carlo experiment. With an understanding of the properties of the VIF, the method’s association with instability in parameter variance estimates is evaluated. Supplemental analyses are also available for large networks with 750 vertices (online supplement). An extended examination of the parameter space of the ERGMs is provided in the Appendix. 13 RESULTS The distribution of ERGM VIFs The first concern is to characterize the distribution of VIFs retrieved from the Monte Carlo experiment to understand how the proposed method behaves across a range of ERGM specifications. Unlike VIFs in GLM, ERGM estimation is uniquely tolerant to high correlations, and so models with multicollinearity problems may yield absurdly high VIFs (especially those models which have a problematic Hessian matrix). The range of VIFs in the experiment stretches from 1.00 to 1.28⁡x⁡1012 . The mean VIF is 6.07⁡x⁡106 with a standard deviation of 2.04 x 109 . The median VIF is 1.64. VIFs vary by network topology. Particularly, ERGMs fit to scale-free networks tend to generate large VIFs. The range of VIF in scale-free networks stretches between 1 and 1.28⁡x⁡1012 . To compare, the mean VIF for scale-free networks is 1.83 x⁡107 with a standard deviation of 3.54 x⁡109 , while the VIF for random networks is 456 with a standard deviation of 9.57 x⁡105 and an upper limit of 2.75 x⁡107 . The mean VIF for small-world networks is 2,007 with a standard deviation of 2.75 x⁡105 and upper limit of 1.31⁡x⁡108 . This suggests that networks exhibiting preferential attachment may be at high risk for multicollinearity. What ERGM specifications are likely to produce collinearity-type problems? The next step is to identify ERGM specifications which may increase the likelihood of multicollinearity. Prior research suggests that an increasing amount of endogenous terms may increase the likelihood of encountering collinearity-type problems (Lusher et al. 2013; Chandrasekhar and Jackson 2014). However, it is unclear whether a simple increase in the count of endogenous parameters increases the risk of multicollinearity by complicating the MCMC 14 procedure or if certain endogenous variables simply tend to be highly correlated with other variables at a greater rate than node-level characteristics. Multicollinearity may also decrease as network density increases because the specific attributes or endogenous effects which correlate with selection patterns may be easier to identify as actors connect with one another more frequently. There may also be an interactive effect, where endogenous parameters specified on high density networks generate multicollinearity because endogenous parameters tend to be highly correlated with the edges of the network. These potential trends are investigated empirically using multilevel regression on the distribution of VIFs (VIFs nested in ERGMs). Since VIFs are heavily skewed, 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 is taken to trend the data linearly. The mean 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 is 0.929 with a standard deviation of 0.165, and a range of 0.228 to 1.395. Figure 1 demonstrates that this transformation approximates a normal distribution. 𝐿𝑜𝑔(𝑉𝐼𝐹)0.1 is predicted with network topology, the variance-mean ratio of node-level attributes, network density, and the right-hand specifications of the ERGM equation. The right-hand specification of the ERGM equation is operationalized as seven dummy variables, one for each combination of endogenous effects. Dyad-independent ERGMs are omitted as the reference category. The model includes a randomly varying intercept at the second level to account for potential nesting in ERGMs. [Figure 1 here] Model 1 includes network density, variance mean ratios, network topology, and the effects of endogenous parameters (Table 2). Increases in density are correlated with growing VIFs. VIFs decrease as the variance-mean ratio of exogenous variables increases. Scale-free networks are associated with higher VIFs than random networks, while small-world networks are associated with lower VIFs. When only one endogenous effect is specified in the ERGM, VIFs 15 tend to be lower than when only node-level variables are specified. However, when more than one endogenous parameter is specified in the model, VIFs increase. These trends are demonstrated in Figure 2, which shows that VIFs tend to cluster around higher values as the number of endogenous variables in an ERGM increases. Notably, when an ERGM is fit with two-degree related parameters (degree popularity and degree correlation), the effect size is larger than in other two-endogenous variable combinations. These results indicate that researchers should use caution when specifying multiple endogenous variables in ERGM, especially multiple degree-related parameters. [Table 2 here] [Figure 2 here] Model 2 includes interactions between ERGM specifications and network density. For the most part, specifying one endogenous variable correlates with lower VIFs. However, as more endogenous variables are specified and network density grows, VIFs tend to increase. Figure 3 shows this graphically. VIFs tend to be low in models specifying a single endogenous parameter, net of network density. As density increases, VIFs decrease when gwesp is the only endogenous parameter in the model. These results suggest that specifying a single endogenous effect may help offset multicollinearity in otherwise dyad-independent models of dense networks. [Figure 3 here] Still, endogenous parameters generate larger VIFs as more parameters are included in an ERGM (Figure 3). The most consistent effect is when degree popularity and degree correlation are simultaneously specified. This variable combination shows little fluctuation across levels of network density. When gwesp is specified alongside either degree correlation or degree popularity, the effect size is approximately linear across levels of network density. Importantly, 16 the effect of specifying three endogenous parameters is greater at even low levels of network density than specifying any other combination of gwesp, degree popularity, and degree correlation at high levels of network density. The predicted value of VIFs in an ERGM fit to a 0.1 density network with three endogenous variables is 7.17 (𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 = 1.07); at high levels of network density (0.3) it is 22.42 (𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 = 1.12). To compare, the predicted value for an ERGM specifying degree popularity and degree correlation on a dense network is 5.10 (𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 = 1.05). Thus, researchers should be aware of the risk of multicollinearity when specifying endogenous effects in ERGMS of dense networks. Since the range of VIFs varies between networks with different topologies, it is useful to examine how the VIF behaves in networks with diverse global structures. As in the full model, the dependent variable is transformed to 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 to trend the data linearly. For random networks, the mean 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 is 0.925, with a standard deviation of 0.158 and a range of 0.305 to 1.329. For scale-free networks, the mean is 0.945, with a standard deviation of 0.173 and a range of 0.228 to 1.395. For small-world networks, the mean to 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 is 0.917, with a standard deviation of 0.164 and a range of 0.299 to 1.340. This transformation approximates a normal distribution in all topologies. Table 3 shows the results for multilevel regression of networks with different topologies. There is little variation between models. However, the main effect of high network density (0.3) is only statistically significant for scale-free networks, indicating that dense scale-free networks are at greater risk for multicollinearity than either random or small-world networks. Similarly, the effect of gwesp on VIFs is negative for scale-free networks, but positive for random and small-world networks. Specifying three endogenous variables in scale-free networks also has a noticeably larger effect on VIFs compared to either random or small-world networks. Thus, 17 researchers should use caution when specifying numerous endogenous effects in networks that exhibit preferential attachment, regardless of network density. Researchers should also be aware of the increased risk of multicollinearity in ERGMs fit to dense scale-free networks. [Table 3 here] Figure 4 explores interaction effects across network topologies. When three variables are in the model, VIFs increase alongside network density. The effect size is greatest, as well as most consistent, for scale-free networks. Even at low levels of network density (0.1), the predicted VIF value for ERGMs fit to scale-free networks with all three endogenous parameters is 22.42 (𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 = 1.12). ERGMs with three endogenous variables fit to random networks exhibit a marginal decrease in VIF size as network density increases. Effects are mostly consistent across network topologies when there are two variables in the ERGM. The combination of degree parameters does not fluctuate across levels of network density. Alternatively, the interactive effect of gwesp and degree correlation is approximately linear as network density rises. One notable difference is that the combination of degree popularity and gwesp exhibits a steep marginal increase between network densities of 0.2 and 0.3 in scale-free networks. Turning to models predicted with a single endogenous variable, two trends stand out. First, at high levels of network density, node-level variables show a marginal increase in effect size for scale-free networks. This is not surprising, since most edges are connected to only a few actors in scale-free networks. Second, a model specifying only gwesp in small-world networks is associated with decreasing VIFs as the network becomes denser. This is also intuitive, since gwesp parameterizes local clustering—a defining trait of small-world networks (Watts and Strogatz 1998). [Figure 4 here] 18 Figure 5 explores how the effects of specific variables vary across network topologies. Across the plots, violins for the three topologies are similarly dispersed with comparable medians. This indicates that variable specifications tend to have similar effects on multivariate correlation across network topologies. However, there are a few notable differences. Random networks tend to have the widest violins across types of variables, indicating that VIFs in random networks concentrate more densely around similar values compared to either small-world or scale-free networks. The densest concentrations of values for gwesp in small-world networks sit near the median, indicating that the effects of gwesp tend to be most consistent in small-world networks compared to either scale-free or random networks. Scale-free networks have the most variability in the concentration of VIF values, with larger concentrations of values for structural characteristics of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 > 1.25 than either random or scale-free networks. These values reflect the trends revealed in Table 3: structural specifications may elicit collinearity problems in networks formed through preferential attachment, while the gwesp statistic tends to be a consistent estimator in networks exhibiting localized clustering. [Figure 5 here] A Monte Carlo simulation of ERGM VIFs gives insights to their properties. VIFs are highly skewed in ERGM and frequently generate values far above what would be acceptable in GLM. Multilevel regression identifies conditions which may increase or reduce VIFs in ERGM. Scale-free networks tend to exhibit higher VIFs than random networks, while small-world networks tend to exhibit lower VIFs. Similarly, increasing network density is associated with larger VIFs in scale-free networks. For the most part, specifying a single endogenous variable is associated with lower VIFs, but VIFs tend to increase as more endogenous variables are 19 specified. This is especially true when multiple degree-related parameters are specified or when network density rises. Interpreting ERGM VIFs A final step is to identify when a VIF is large enough to warrant concern. This a problem which requires creative thinking. Since VIFs can be used to detect the degree of inaccuracy in a standard error, the association between a VIF and the size of a standard error will only be moderate at best. One way to detect the relationship is to measure the association between VIFs and fluctuations in an ERGM’s variance estimates. When ERGMs exhibit multicollinearity, the same model can be rerun and yield different parameter estimates (Chandrasekar and Jackson 2014). Similarly, as an ERGM loses accuracy due to multicollinearity, parameter estimates approach randomness, increasing the range of potential fluctuations for variance estimates between model reruns. Thus, when an ERGM is rerun, the size of the change in the variance of the diagonal of the variance/covariance matrix for the parameter estimates between models should be positively correlated with large VIFs.7 To test this, I ran a four condition Monte Carlo simulation experiment. Conditions were fixed to yield an increasing rate of multivariate correlations. As above, each condition generated a 75-vertex random network and assigned three random continuous node-level attributes to the vertices. The variance-mean ratio for the variables was set at 0.5; each ERGM was estimated with three endogenous variables (gwesp, degree popularity, and degree correlation). Since the interaction between dense networks and endogenous variables has been shown to yield higher VIFs, network density is toggled between values of 0.1, 0.2, and 0.3 (3 categories). The final condition fit an ERGM to a 0.1 density network with three random node-level variables and no 20 endogenous variables using MCMC MLE. The purpose of this condition was to generate ERGMs with little-to-no multicollinearity. The simulation draws a random network, fits an ERGM to the network, records the VIFs, parameter variance estimates (diagonal of the covariance matrix), and whether the Hessian Matrix is unsolvable or near unsolvable.8 It then reruns the exact same ERGM on the same network and records the same statistics. This enables comparisons of VIFs with estimates of parameter variance between models. With 500 replications per condition, the sample space for the experiment is the summary statistics of 2,000 ERGM pairs.9 Since VIFs are highly skewed, it is necessary to transform the data to gauge the correlation between variance estimates and VIFs. Table 4 reports the bivariate correlations between average VIFs across both models and absolute differences in the parameter variance estimates in log-log space.10 The tenth root of VIFs is taken to trend the data linearly.11 The mean difference for VIFs is -0.11. Since multicollinearity in an endogenous effect can skew the parameter estimates of other network processes (e.g. Faust 2007), correlations are also reported for the maximum VIF in a model and maximum variance fluctuations in a model. VIFs effectively predict instability in parameter estimates caused by multicollinearity. VIFs are highly correlated with fluctuations in the focal variable’s variance (0.715) (p<0.001). As VIFs increase, the size of parameter fluctuations for a focal variable also increase. Increasing VIFs are also positively correlated with the largest variance (0.873) fluctuations in a model (p<0.001). This reflects that, in many cases, the variable most affected by multicollinearity is not the one causing it. Indeed, a highly collinear variable may skew variance estimates in the entire model. Higher VIFs are also associated with higher average variance (0.853) fluctuations in a model (p<0.001). 21 [Table 4 here] Figure 6a examines (smoothed) trends in the distribution of VIFs. The solid line tracks the average absolute difference in parameter variance estimates for a focal variable’s VIF. The dashed line tracks the average size of the greatest absolute difference in parameter variance estimates for any variable in a model as a function of the model’s highest VIF. Figure 6a shows that differences in parameter variance estimates tend to increase at VIFs of around 20 (untransformed). An unsmoothed version of the plot (Figure 6b) shows considerable overlap between the model and variable-level absolute differences in parameter variance estimates when 40 < VIF < 150. This suggests that the offending variable tends not to be the variable most affected by multicollinearity in models with VIFs lower than 40 and in models with VIFs greater than 150. The size of absolute differences in parameter variance estimates also plateau and trend slightly downward beyond VIFs of 20. This is explored further below. [Figure 6a here] [Figure 6b here] Table 5 reports the average size of absolute difference in parameter variance estimates. When a variable’s VIF is between 20 and 150, the standard error for the variable fluctuates by an average of 0.23. When a model has at least one VIF between 20 and 150, the average maximum change in standard errors is 0.51. This means that, in an average model rerun where at least one VIF is between 20 and 150, the least stable standard error in the model fluctuates by 0.51. In the same range, the average difference in standard errors across all variables in the model is 0.09. When VIFs are above 150, the average change in standard error for a focal variable is 0.10. For a model, VIFs above 150 are associated with a mean maximum standard error change of 0.38 and a mean change of 0.06 across all variables in the model. Variance estimates tend not to fluctuate 22 at any level of observation for VIFs below 20. Despite large VIFs and unstable standard errors, the Hessian matrix was solvable in every ERGM. [Table 5 here] The results of Table 5 and Figures 6a and 6b imply that variance instability tends to be equally problematic for any VIF above 20, and slightly less problematic when VIFs pass 150. However, this is not the case. As VIFs increase, multicollinearity problematizes the MCMC procedure. Consequently, the accuracy of parameter estimates plummets and standard errors grow increasingly random. The result of an unstable Markov Chain and increasingly random parameter estimates is that the absolute difference in parameter variance estimates will not always be large in models exhibiting severe multicollinearity. They may be quite small by pure chance. Thus, the decrease in average absolute difference in parameter variance estimates when VIFs > 150 reflects heteroskedastic fluctuations in a sparse sample space (only 7.6% of VIFs were between 150 and 1,686, the maximum VIF). Figure 7 exhibits this trend for the VIF of a focal variable and the absolute difference in parameter variance estimates for that focal variable across model reruns. Figure 8 shows the same trend for the maximum absolute difference in parameter variance estimates in a given model. As VIFs increase, the size of the absolute difference in parameter variance estimates between model reruns grows inconsistent, indicating that the ERGM has difficulty reaching the target distribution. Absolute differences in parameter variance estimates are especially inconsistent when VIFs pass 150. This shows that, in some severe cases, standard errors in ERGMs with a VIF above 150 are essentially random. Thus, although the mean size of absolute differences in parameter variance estimates is largest when 20<VIF<150 (Table 5), variance estimates are particularly unreliable when VIFs pass 150 (Figure 7, Figure 8). 23 [Figure 7 here] [Figure 8 here] Results indicate that VIFs are strongly and positively correlated with the size of the absolute difference in parameter variance estimates. Figures 6a and 6b identified critical regions for parameter instability. VIFs greater than 20 are associated with unstable variance estimates. Similarly, if there is a VIF over 150 in a model, then it drives instability in other covariates in the model. Figures 7 and 8 demonstrate that when VIFs exceed 150, standard errors are often especially unreliable, and in some cases essentially random. With these results in mind, it is recommended that researchers recognize the risk of multicollinearity when VIFs exceed 20, and should be especially concerned if any VIF is above 150. Empirical applications on five networks To build intuition regarding the application of this measure, I provide empirical examples on five networks. Four of the networks have been used in prior research or online ERGM tutorials. First, I provide an in-depth empirical example that estimates all models twice. The benefit of this approach is to illustrate parameter and variance instability when multicollinearity exists in an ERGM. Next, I fit four additional ERGMs to four empirical networks and discuss how the VIFs for these ERGMs can be interpreted. Finally, I plot the mean VIFs from all models and discuss what a researcher can infer about the model using the VIF. The Faux Mesa high school network is a simulated network designed to reflect real-world social networks. The network is based on the characteristics of the friendship network in one rural high school in the AddHealth dataset (see Hunter et al. 2008). Below, I replicate the ERGM specifications used by Shore and Lubin (2015) to model the Faux Mesa network, predicting 24 homophily attributes for sex, grade, and race, and the geometrically weighted degree distribution (gwdegree) for the network (see Hunter and Handcock 2006).12 The final model adds a gwesp term. Each ERGM forces MCMC MLE estimation instead of pseudo-likelihood. VIFs are evaluated across models. Each model is rerun under the same model specifications to evaluate the stability of the standard errors. Model 1 predicts the Faux Mesa network using homophily attributes. Each of these parameters remains relatively uncorrelated, with VIFs staying well below a value of 3. Model 2 adds the gwdegree statistic for the network. VIF values increase noticeably, above levels that most researchers would be comfortable with in GLM, but which are not problematic in ERGM. [Table 6 here] Model 3 includes the gwesp parameter. The VIFs for all variables increase, except for gwdegree, which diminishes. Particularly, VIFs for homophily (Grade) and gwesp jump significantly to values over 80. This suggests that these two terms are highly correlated. This is reflected in the correlation matrix of the posterior distribution of the ERGM: for gwesp and grade homophily, r = 0.989 (Table 7). Notably, standard errors are less stable in the third model compared to Models 1 and 2. The high correlation between gwesp and grade homophily affects the stability of standard errors for other covariates in the model. In the case of racial homophily and gwdegree, the level of significance changes from p<0.05 to p<0.01. Similarly, the standard errors for sex homophily and racial homophily—both of which are highly correlated with gwesp and grade homophily (Table 7)—fluctuate by roughly 40%. This points to multicollinearity in the third model and indicates that the cause of multicollinearity is the association between gwesp and grade homophily. In this case, where the source of the problem is a bivariate correlation, either of the offending terms can be removed from the model to reduce collinearity. 25 [Table 7 here] To provide further intuition on how the method can be applied in substantive research, I present VIFs for ERGMs fit to four networks that are often used as example cases in ERGM tutorials: the Faux Magnolia network, Sampson’s monk network, the Florentine Marriage network, and the Local Health Department Services network (LHDS). Like the Faux Mesa network, the Faux Magnolia network is a simulated network designed to mirror real-world high school friendship networks. It is frequently used by the statnet team—developers of the leading ERGM statistical software—in online and conference ERGM tutorials (Goodreau et al. 2008). The variable specifications available for this network are identical to the Faux Mesa network. The Sampson Monks network is also a classic network study used in a popular online ERGM tutorial (Levy 2016). The Sampson network is a directed network of affiliations in a monastery. Monks are classified in the network by their group membership, which indicates whether or not they belong to a certain faction within the monastery. The LHDS network is a network compiled by Harris (2014) in her introductory book on ERGM. The LHDS network is a network of affiliations between leaders of health department offices. Actors in this network are classified by how long they have been in their leadership position and by the population in millions of the jurisdiction over which they preside. The final network, the Florentine marriage (Medici) network, is the network studied by Padgett and Ansell (1993) in their classic research. Ties in the network are marital associations between powerful families in 15th century Florence. Members in this network are classified by their wealth and whether they have some pre-existing business relationship. Model building for these networks proceeds as follows. In Model 1, I fit an ERGM to each network with three explanatory variables. This is the minimum number of variables 26 necessary for the VIF to yield insight beyond a simple Pearson’s correlation. In Model 2, ERGM specifications are identical to the respective tutorials mentioned above, with the exception of the Florentine marriage network. In Model 3, I add additional network statistics to all models to force multicollinearity in some models and not in others. The goal of this step is to help delineate when a researcher may or may not be concerned with multicollinearity. Variables in these models were chosen to represent a wide range of commonly used ERGM parameters. Table 8 presents VIFs for the four ERGM examples. In the Faux Magnolia network, the largest VIF is 3.72 for the grade homophily statistic, 3.53 for race homophily, and 2.88 for sex homophily. These are all extremely low correlations by ERGM standards. From these results, a researcher can infer that multicollinearity is not a problem. The same conclusion can be reached for the Sampson network, where the largest VIF is 3.8 for the reciprocity parameter. As a dyadindependent ERGM, the LHDS network also yields extremely low VIFs. Here, the largest VIF is 1.1, indicating that correlations between explanatory variables are almost not existent. This is not surprising, since the variable categories here are a priori mutually exclusive. Model 1 for the Florentine marriage network also exhibits extremely low VIFs. Here the largest VIF is 3.34 for the node-level wealth covariate, and the lowest VIF is for the gwdegree statistic. [Table 8 here] In Model 2, where ERGM specifications in the first three models are identical to existing tutorials, VIFs increase noticeably across the networks. The largest VIF in the Faux Magnolia network is 8.66 for the grade homophily parameter, while race homophily VIF is 6.93, and the sex homophily VIF is 4.78. The new term, gwesp, has a VIF of 4.19. Although VIFs double in size in this model, they are still well below what ERGMs can tolerate. Thus, a researcher can 27 conclude that multicollinearity is not a problem in this ERGM, even though VIFs of this size should raise concern in GLM. The trends in Model 2 for the Faux Magnolia network are comparable to the Sampson Monk ERGM. Here, the group homophily term yields the largest VIF of 8.67, followed by gwesp, reciprocity, and finally gwdegree. As above, collinearity does not appear to be affecting model estimates. VIFs in the LHDS network remain extremely low, with no VIF exceeding 1.15. As a dyad-independent ERGM, this is not surprising. Finally, of the four empirical networks, Model 2 for the Florentine marriage network yields the largest overall VIF at 9.09 for the 2-star parameter, followed by wealth, gwesp, and then gwdegree. Since the VIFs for wealth and 2-stars are much larger than the VIFs for either of the gw terms, a researcher may conclude that the 2star and wealth parameters are correlated. However, as above, the levels of correlation are well below what an ERGM can be reasonably expected to tolerate. In Model 3, I add terms beyond what is specified in current tutorials to force collinearity in some models. The Faux Magnolia network maintains low VIFs by ERGM standards. The grade homophily and sex homophily terms show small gains in the VIFs, indicating that the two terms are likely correlated with the gwdegree statistic added in Model 3. Turning to the Sampson Monk network, VIFs remain small and unconcerning even after adding gwodegree. This is also true of the Florentine Marriage network, where the largest VIF is 8.60 for the 2-star parameter after controlling for whether a Florentine marriage is correlated with a business relationship. These trends do not hold, however, for the LHDS network. Adding gwesp to the ERGM increases VIFs across all variables in the model. The newly added gwesp term is the largest VIF at a value of 327.9, followed by the jurisdiction population at 144.5. A VIF of this size clearly indicates multicollinearity in Model 3. Further evidence for multicollinearity in this model is the 28 jurisdiction parameter, where the coefficient changes direction (from positive to negative) between Model 2 and Model 3, as well as noticeable change in the strength and size of coefficients and standard errors for the leadership position parameters in the model. Since correlations were extremely low in prior models and no other VIF is of comparable size to the gwesp parameter, a researcher can conclude that gwesp is the source of multicollinearity. A researcher examining this network should remove the leading VIF—gwesp—to reduce multicollinearity. To summarize these trends graphically, Figure 9 plots the mean VIF across all three models for the five empirical networks: Faux Mesa, Faux Magnolia, Sampson, LHDS, and the Florentine Marriage (Medici) network. The dashed lines plot the three networks where multicollinearity did not affect ERGM estimates. In these models, the mean VIF stays below 8 and the largest VIF stays below 10. In the Faux Mesa network—a network with a moderate collinearity problem—the largest VIF reaches 80 and the mean VIF reaches 29.7. In this case, researchers taking a cautious approach should omit the offending variable (gwesp) to reduce collinearity. This is the preferred approach. However, in some cases, researchers may be especially interested in the offending variable on theoretical grounds. In cases of light to moderate collinearity, it may be possible to extend the length of the Markov Chain or the size of the MCMC sample and to rerun the model multiple times to evaluate the extent of collinearity in the ERGM. If p-values, standard errors, and parameter estimates stay relatively stable with these adjustments, a researcher may be able to interpret the parameters anyways. However, this approach is not recommended for a VIF above 50, and should be treated with scrutiny in general. [Figure 9 here] 29 Turning to the LHDS network Model 3, the largest VIF is over 300 and the mean model VIF is 134. This is well beyond the threshold for severe multicollinearity (discussed above) and should be raise concern. The severity of the problem is reflected in the unstable parameter and variance estimates for Model 3 in Table 8, where one coefficient changes direction and three others dramatically change in strength. In this case, a researcher cannot interpret any parameter estimates reliably; gwesp should be removed from the ERGM. In summary, these results demonstrate the utility of the method. VIFs pinpoint the sources of multicollinearity by capturing multivariate correlations. In cases where the VIF is relatively small to moderate in size (e.g. 20-50), the problem may be subverted by increasing the length of the Markov Chain or the size of the MCMC sample. However, in moderate or severe cases, the offending variable will usually need to be omitted from the model entirely. Theoretical reasoning should be used to inform which variable to remove in cases where multiple variables exhibit high VIFs. DISCUSSION This article proposed a method to diagnose multicollinearity in ERGM. The method for calculating VIFs in ERGM is robust to both network data as well as contextual network effects. Thus, it resolves the problems outlined in the introduction by allowing researchers to pinpoint which predictors exhibit high multicollinearity and to gauge the severity of multicollinearity in an ERGM. The method has been shown to perform comparably to well-known methods for GLM, yielding similar results and predicting unstable variance estimates with high reliability. As such, it outperforms other approaches to detecting multicollinearity in ERGM, such as relying on bivariate correlations or the Hessian matrix. Moreover, the method requires little computational 30 power or specialization in statistical network analysis to execute (outlined under Eq. (4)). Thus, the approach is relatively parsimonious and should be easily applied by researchers who may not be familiar with network analysis. Since the properties of VIFs in ERGM are different from VIFs observed in GLM, a 216 condition Monte Carlo simulation experiment was conducted to evaluate ERGM VIFs. The VIFs showed significant kurtosis and extremely high values. Results from multilevel regression give insight to ERGM conditions which increase the risk of multicollinearity. First, VIFs tend to increase as more endogenous variables are included in the model. Thus, researchers should be cautious when specifying endogenous effects to improve model fit in ERGM, as the overspecification of endogenous effects may induce collinearity-type problems. Second, ERGMs specifying numerous endogenous effects on dense networks tend to exhibit high VIFs. Thus, researchers modelling dense networks should be particularly aware of how endogenous effects may introduce collinearity problems. Third, networks with certain global properties are at greater risk for multicollinearity. Scale-free networks tend to generate larger VIFs than either random or small-world networks. Researchers should be aware of the increased risk of multicollinearity when modelling networks that exhibit preferential attachment or as the degree distribution of the network approaches a power-law (e.g. Barabasi and Albert 1999), especially in dense networks. When interpreting VIFs, it is recommended that researchers consider multicollinearity a possibility when VIFs rise above 20. VIFs above 150 should be considered especially concerning. Moreover, researchers using ERGM should note that as VIFs rise, there is a greater likelihood that parameter instability will not be located in the focal variable. In GLM, if two variables are not correlated, multicollinearity may not warrant concern. However, network data 31 is usually highly correlated and high collinearity may affect MCMC estimation. Thus, as VIFs rise, researchers should grow skeptical of the standard errors for all covariates in the model. If a researcher is wed to a variable exhibiting a high VIF, multicollinearity can be investigated more closely by observing fluctuations in standard errors with model reruns. However, a single model rerun may not capture parameter instability because the margin of error for variance estimates grows as multicollinearity becomes increasingly problematic. Thus, if a researcher resorts to rerunning models, they should rerun the model numerous times. Of course, reruns are not always feasible when fitting ERGMs to large networks or when specifying complicated models. If an offending variable is not specified on theoretical grounds, it should simply be removed. This advice comes with a caveat. Although multicollinearity can often be detected through unstable standard errors, it can still affect relatively stable ERGM estimates. Large VIFs should raise concern even when standard errors do not change across model reruns. For this reason, researchers should note that rerunning the same model multiple times may not approximate ‘true’ standard errors when VIFs are high, even if standard errors only exhibit mild fluctuations. One benefit of the proposed approach to calculating VIFs in ERGM is that the general method may be translatable to dynamic network models with little modification (Hanneke, Fu, and Xing 2010; Snijders et al. 2010; Krivitsky and Handcock 2014). Namely, the stochastic actor-oriented model (SAOM) and temporal exponential random graph model (TERGM) both utilize a similar MCMC estimation procedure as ERGM and face comparable collinearity issues. The proposed method can potentially be adapted to evaluate SAOM or TERGM, though what the properties of VIFs will be in the longitudinal context is unclear. 32 Future research may also consider the effects of other endogenous specifications. Due to constraints on computational power and processing time, the number of endogenous terms included in the ERGMs for these simulations has been limited. The terms provided in this paper have been selected due to their widespread popularity and common usage (Hunter 2007; Robins et al. 2007; Snijders et al. 2010). A promising direction for future research is to investigate how variance estimates change with the inclusion of less common endogenous effects, such as the arc tangent or k-cores. These effects are sometimes applied as specific measures in research (e.g. Hipp et al. 2013), but may have less clear behaviors when included with other network terms. Similarly, selective-sorting parameters such as homophily and heterophily are widely used by networks scholars, but may have distinct behaviors. APPENDIX Table A1 presents additional information on how VIFs vary by ERGM specification and variable type. As demonstrated in the multilevel regression, ERGMs with three endogenous variables tend to generate the largest VIFs, followed by ERGMs specifying degree popularity and degree correlation, ERGMs specifying gwesp and degree popularity, and ERGMs specifying gwesp and degree correlation. Moving down the “Overall” column, results indicate that dyadindependent ERGMs tend to generate low VIFs. Notably, the average VIF tends to be lower in models specifying a single endogenous parameter compared to dyad-independent ERGMs. However, beyond a single endogenous variable, adding additional endogenous effects increases the mean VIF in an ERGM. [Table A1 here] This is consistent across variable types as well. The lowest observed mean VIFs are for endogenous effects specified in ERGMs with only a single endogenous variable. Inversely, the 33 largest mean VIFs are for endogenous effects in complex (three endogenous variable) ERGMs. Even in ERGMs with only two endogenous variables, the mean VIF for these variables is sometimes quite large (e.g. exp(1.16^10)=82.39). Moving down the column for actor-level attributes, there is relatively little variation in the mean VIF for actor-level variables as more endogenous effects are added to the model. Consistent with the endogenous effects, the largest mean VIF for actor-level attributes is among ERGMs specified with three endogenous covariates. Figure A1 provides violin plots examining how VIFs vary by variable type and the variance-mean ratio of exogenous variables. The top panel shows VIFs in ERGMs with highly correlated endogenous variables. Unsurprisingly, the actor-level covariates in these ERGMs have the largest modes and medians of the three panels. Interestingly, the density of the endogenous covariates is also widely dispersed in this condition. The panel for the equidispersed (variancemean ratio = 1) actor-level variables shows a similar pattern as the under-dispersed panel. One notable difference is that the violin for the equidispersed actor-level covariates has a lower median than the under-dispersed violin for actor-level covariates. This is also distinct from the over-dispersed panel, where actor-level covariates have the lowest median and the lowest mode of any panel. [Figure A1] Table A2 explores how VIFs vary by variable type, ERGM specification, and network density. For the most part, results in this table are consistent with Table A1 in low density (0.1) networks. The lowest VIFs in this model are for endogenous effects in ERGMs with only one endogenous covariate, while the largest VIFs are for endogenous variables in ERGMs specified with three endogenous covariates. Notably, in low density networks, specifying both degree 34 popularity and degree correlation in a model yields a mean VIF for each of those effects larger than the mean VIF for gwesp in an ERGM specified with three endogenous effects. [Table A2] Turning to ERGMs specified on networks with moderate density (0.2), results for less complicated models—dyad-independent models and those with a single endogenous covariate— are consistent with low density networks and the aggregate results in Table A1. Here, however, the mean VIF for the gwesp statistic in ERGMs specified with two or three covariates is noticeably larger. Moreover, the size of the VIF for endogenous covariates specified alongside gwesp in ERGMs with only two structural effects are noticeably larger in moderate density networks compared to low density networks. The mean VIFs for degree effects specified in the “Deg. Pop. + Deg. Cor.” models are only slightly larger than in low density networks. This is also true of the three-endogenous-effect ERGMs, where the gwesp term is noticeably larger, but the degree statistics only increase slightly. These trends continue into the high density (0.3) networks. Again, actor-level covariates remain relatively unaffected by increasing network density; neither do single endogenous variables in less complex models. In more complex models, however, the gwesp statistic again shows noticeable increases. The degree statistics also show gains, but they are smaller. Here, the largest VIF is for the degree popularity statistic. In the simulated models with three endogenous covariates, the mean degree popularity score is 1487.72 (exp(1.22^10)). Figure A2 breaks down variation in specific variables by network density. The top panel plots low density networks. Variable effects in this network have the strongest central tendencies, with medians and modes being closest in value in these models. In the moderate density network, values tend to be more dispersed, with a higher concentration of VIFs at higher 35 values and weaker central tendency. This trend comes to a head in the third plot, where medians are noticeably below modes in high density networks. Another notable trend is the steady increase in the gwesp VIF as network density increases. The degree statistics also show a similar pattern, but it is less dramatic. These results indicate that gwesp—and possibly other triangle terms—may generate high correlations when specified with other endogenous effects in dense networks. Moreover, parameterizing multiple degree effects may increase the risk of multicollinearity regardless of network density. [Figure A2] 36 1 Originally, this was accomplished using a Gibbs sampler, but more recent approaches rely on variants of the Metropolis Hastings algorithm. 2 This can be reduced to Eq. (2) for a more parsimonious equation in the regression context. However, this reduction changes the input values on the right-hand side of the equation from correlations (parameter space) to observations and is therefore inappropriate for ERGM. 3 A distribution of 1,000 networks is sufficient for exploratory purposes, but the number of networks should be increased in larger networks or ones with potentially problematic coefficients. 4 The gwesp term is a geometrically weighted term for edgewise shared partnership. It falls under the ‘curved’ family of ERGM terms and is generally a more robust estimate than linear triadic closure. See Hunter (2007) for an introduction and discussion. 5 All dyad-independent models are estimated with MCMC MLE instead of pseudo-likelihood. 6 Since the purpose of this simulation was to force varying degrees of collinearity, many of the more complex ERGMs did not converge (presumably due to multicollinearity). Since these ERGMs did not yield estimates, results from these simulations were not stored or recorded. Results are restricted to only those ERGMs which yielded estimates, as per the assumptions of the method outline in Eq. (4). 7 Of course, this correlation will not be perfect since models will also be inconsistent in their degree of instability. In other words, the rate of change in variance estimates between model reruns will approach randomness as variance estimates grow increasingly unreliable. Therefore, differences in parameter variance estimates will not necessarily be larger as when they grow less reliable. 8 Since VIFs detect inflation in the variance estimates of the model and only evaluate standard errors by extension, I focus on instability in variance estimates. 9 To ensure that parameter instability was not an artifact of insufficient MCMC sampling, a smaller experiment was run with the same conditions as outlined in this section, but with a longer Markov Chain (MCMC interval = 5,000) and larger MCMC sample size (5,000). Each condition fit 100 ERGM pairs and yielded almost identical results. 10 Absolute differences in parameter variance estimates are total values between models. Rate of change is not reported because small fluctuations may yield large percent change. For example, a variance estimate shifting between 0.001 and 0.002 would be a 100% increase but would not be cause for concern. 11 Absolute values are used so that negative fluctuations trend in the same direction as VIFs, which have a lower bound of 1. All analyses were also performed using VIFs in the first ERGM and VIFs in the second ERGM. Results were almost identical. 12 Like the gwesp term, the gwdegree decay parameter is fixed at 0.7. 37 References An, Weihua. 2016. “Fitting ERGMs on big networks.” Social Science Research 59: 107-119. Barabasi, Albert-Laszlo and Reka Albert. 1999. “Emergence of Scaling in Random Networks.” Science 286: 509-512. Chandrasekhar, Arun G., and Matthew O. Jackson. 2014. “Tractable and Consistent Random Graph Models.” Physics and Society. DOI: arXiv:1210.7375v4 Chatterjee, Samprit and Bertram Price. 1991. Regression Analysis by Example. New York: Wiley. Faust, Katherine. 2007. “Very Local Structure in Social Networks.” Sociological Methodology 37(1): 209-256. Fox, John, and Georges Monette. 1992. “Generalized Collinearity Diagnostics.” Journal of the American Statistical Association 87(417): 178-183. Fox, John. 2008. Applied Regression Analysis and Generalized Linear Models. Thousand Oaks, CA: Sage Inc. Frank, Ove, and David Strauss. 1986. “Markov graphs.” Journal of the American Statistical Association 81: 832-842. Geyer, Charles J., and Elizabeth A. Thompson. 1992. “Constrained Monte Carlo Maximum Likelihood for Dependent Data.” Journal of the Royal Statistical Society Series B 54(3): 657-699. Goodreau, Steven M., Mark S. Handcock, David R. Hunter, Carter T. Butts, and Martina Morris. 2008. “A statnet Tutorial.” Journal of Statistical Software 24(9): 1-26. 38 Goodreau, Steven M., James A. Kitts, and Martina Morris. 2009. “Birds of a Feather, or Friend of a Friend? Using Exponential Random Graph Models to Investigate Adolescent Social Networks.” Demography 46(1): 103-125. Hanaki, Nobuyuki, Alexander Peterhansl, Peter S. Dodds, Duncan J. Watts. 2007. “Cooperation in Evolving Social Networks.” Management Science 53(7):1036-1050. Handcock, Mark S. 2003. “Statistical models for social networks: degeneracy and inference.” Pp. 229-240 in Dynamic Social Network Modeling and Analysis, ed. Ronald Breiger and Phillipa Pattison. National Academies Press. Handcock, Mark S., Garry Robins, Tom Snijders, Jim Moody, and Julian Besag. 2003a. “Assessing Degeneracy in Statistical Models of Social Networks.” Journal of the American Statistical Association 76:33-50. Handcock, Mark S., David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2003b. statnet: Software tools for the Statistical Modeling of Network Data. Retrieved May 22nd, 2017 (http://statnetproject.org). Hanneke, Steve, Wenjie Fu, and Eric P. Xing. 2010. “Discrete temporal models of social networks.” Electronic Journal of Statistics 4:585-605. Harrell, Franke E. 2001. Regression Modeling Strategies. Springer Series in Statistics. Harris, Jenine K. 2014. An Introduction to Exponential Random Graph Modeling. Thousand Oaks: Sage. Hipp, John R., Carter T. Butts, Ryan Acton, Nicholas N. Nagle, and Adam Boessen. 2013. “Extrapolative simulation of neighborhood networks based on population spatial distribution: Do they predict crime?” Social Networks 35:614-625, 39 Holland, Paul W., and Samuel Leinhardt. 1981. “An Exponential Family of Probability Distributions for Directed Graphs.” Journal of the American Statistical Association 76(737): 33-50. Hunter, David R. 2007. “Curved exponential family models for social networks.” Social Networks 29(2): 216-230. Hunter, David R., Steven M. Goodreau, and Mark S. Handcock. 2008. “Goodness of Fit of Social Network Models.” Journal of the American Statistical Association 103(481): 248258. Hyojoung, Kim, and Peter S. Bearman. 1997. “The Structure and Dynamics of Movement Participation.” American Sociological Review 62(1): 70-93. Koskinen, Johan, and Tom A. B. Snijders. 2013. “Simulation, estimation, and goodness of fit.” In Ed. Dean Lusher, Johan Koskinen, and Garry Robins. Exponential Random Graph Models for Social Networks: Theory Methods and Applications. Cambridge University Press: Cambridge. Kreager, Derek A., Jacob T. N. Young, Dana L. Haynie, Martin Bouchard, David R. Schaefer, and Gary Zajac. Forthcoming. “Where ‘Old Heads’ Prevail: Inmate Hierarchy in a Men’s Prison Unit.” American Sociological Review. Krivitsky, Pavel N., and Mark S. Handcock. 2014. “A separable model for dynamic networks.” Journal of the Royal Statistical Society B 76(1): 29-46. Levy, Michael. 2016. “ERGM Tutorial.” Retrieved October 11th, 2017 http://michaellevy.name/blog/ERGM-tutorial/ Lusher, Dean, Johan Koskinen, and Garry Robins. 2013. Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Cambridge, UK: Cambridge Press. 40 Mooney, Christopher Z. 1997. Monte Carlo Simulation. Thousand Oaks, CA: Sage. Newman, Mark E. J., Steven H. Strogatz, and Duncan J. Watts. 2001. “Random graphs with arbitrary degree distributions.” Phyiscal Reviews E 64: 026118. Newman, Mark E. J. 2002. “Assortative mixing in networks.” Physical Review Letters 89: 208701. O’Brien, Robert M. 2007. “A Caution Regarding Rules of Thumb for Variance Inflation Factors.” Quality and Quantity 41: 673-690. O’Malley, James A, and Jukka-Pekka Onnela. 2014. “Topics in Social Network Analysis and Network Science.” Physics and Society. DOI: arXiv:1404.0067v1. Padgett, John F., and Christopher K. Ansell. 1993. “Robust Action and the Rise of the Medici, 1400-1434. American Journal of Sociology 98(6): 1259-1319. Papachristos, Andrew V. 2009. “Murder by Structure: Dominance Relations and the Social Structure of Gang Homicide.” American Journal of Sociology 115(1): 74-128. Papachristos, Andrew V., David Hureau, and Anthony Braga. 2013. “The Corner and the Crew: The Influence of Geography and Social Networks on Gang Violence.” American Sociological Review 78(3): 417-447. Rencher, Alvin C. 2002. Methods of Multivariate Analysis. Second edition. John Wiley & Sons Inc. Ripley, Ruth M., Tom A. B. Snijders, Zsofia Boda, Andras Voros, and Paulina Preciado. 2017. Manual for RSiena. Retrieved May 22nd, 2017 (https://www.stats.ox.ac.uk/~snijders/siena/RSiena_Manual.pdf). 41 Robins, Garry, Pip Pattison, Yuval Kalish, and Dean Lusher. 2007. “An Introduction to Exponential Random Graph (p*) Models for Social Networks.” Social Networks 29(2): 173-191. Salter-Townshend, Michael, and Thomas Brendan Murphy. 2015. “Role Analysis in Networks using Mixtures of Exponential Random Graph Models.” Journal of Computational Graph Statistics 24(2): 520-538. Shore, Jesse, and Benjamin Lubin. 2015. “Spectral goodness of fit for network models.” Social Networks 43:16-27. Snijders, Tom A. B. 2002. “Markov Chain Monte Carlo estimation of exponential random graph models.” Journal of Social Structure 3. Snijders, Tom A. B., Phillipa E. Pattison, Garry L. Robins, Mark S. Handcock. 2006. “New specifications for exponential random graph models.” Sociological Methodology 36(1): 99-153. Snijders, Tom A. B., and Roel J. Bosker. [1999] 2008. Multilevel Analysis: An introduction to basic and advanced multilevel modeling. Third edition. London: Sage. Snijders, Tom A. B., Gerhard G. van de Bunt, and Christian E.G. Steglich. 2010. “Introduction to actor-based models for network dynamics.” Social Networks 32:44-60. Steglich, Christian, Tom A. B. Snijders, and Michael Pearson. 2010. “Dynamic Networks and Behavior: Separating Selection from Influence.” Sociological Methodology 40(1): 329393. Thiemichen, Stephanie, and Goeran Kauermann. 2017. “Stable exponential random graph models with non-parametric components for large dense networks.” Social Networks 49 (1): 67-80. 42 Watts, Duncan J. and Steven H. Strogatz. 1998. “Collective dynamics of ‘small-world’ networks.” Nature 393: 440-442. Wimmer, Andreas, and Kevin Lewis. 2010. “Beyond and Below Racial Homophily: ERG Models of a Friendship Network Documented on Facebook.” American Journal Sociology 116(2): 583-642. 43 Table 1: Conditions of the experiment Conditions Values Network density .1 .2 .3 Variance mean ratios of exogenous variables .5 1 5 Topology Random Scale-free* Small-world Endogenous parameters gwesp (0.7)** Degree popularity Degree correlation Network size 75 *All scale-free networks were simulated by rewiring a random network with a scaling parameter of 2. ** Decay (α) parameter of the gwesp term is fixed at 0.7. See Hunter (2007) for an introduction. 44 Table 2: Multilevel regression analysis of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 Independent Variables Model 1 Model 2 Network density - - 0.2 0.013 (0.000) -0.000* (0.002) 0.3 0.022 (0.000) 0.009 (0.002) - - 1 -0.075 (0.000) -0.074 (0.000) 5 -0.174 (0.000) -0.174 (0.000) - - Scale-free 0.020 (0.000) 0.020 (0.000) Small-world -0.007 (0.000) -0.007 (0.000) - - gwesp -0.001 (0.001) 0.010 (0.001) Deg. Pop. -0.024 (0.001) -0.024 (0.001) Deg. Cor. -0.021 (0.001) -0.016 (0.001) gwesp + Deg. Pop. 0.105 (0.001) 0.077 (0.001) gwesp + Deg. Cor. 0.101 (0.001) 0.073 (0.001) Deg. Pop. + Deg. Cor. 0.143 (0.001) 0.140 (0.001) gwesp + Deg. Pop. + Deg. Cor. 0.200 (0.001) 0.187 (0.001) - - 0.2*gwesp - -0.016 (0.001) 0.2*Deg. Pop. - 0.001 (0.001) 0.2*Deg. Cor. - -0.002* (0.001) 0.2*gwesp + Deg. Cor. - 0.035 (0.001) 0.2*gwesp + Deg. Pop. - 0.038 (0.001) 0.2*Deg. Cor. + Deg. Pop. - 0.007 (0.001) 0.2*Deg. Cor. + Deg. Pop. + gwesp - 0.016 (0.001) 0.3*gwesp - -0.030 (0.001) 0.3*Deg. Pop. - 0.002* (0.001) 0.3*Deg. Cor. - -0.011 (0.001) 0.3*gwesp + Deg. Cor. - 0.049 (0.001) 0.3*gwesp + Deg. Pop. - 0.048 (0.001) 0.3*Deg. Cor. + Deg. Pop. - 0.002* (0.001) Variance mean ratios Topology Endogenous parameters Endogenous parameters*density 45 0.3*Deg. Cor. + Deg. Pop. + gwesp Intercept - 0.024 (0.001) 0.917 (0.001) 0.927 (0.001) 0.000* (0.000) 0.000* (0.000) -1.29 x⁡106 -1.30 x⁡106 Variance components (std. dev.)*** Level 2 intercept AIC 𝑅 2 **** 0.44 0.44 *Not statistically significant at the 0.1 level; N for count of VIFS is 972,000, N for models is 216,000. **Reference category for network density is 0.1; reference category for variance mean ratios is 0.5 reference category for endogenous parameters is whether the VIF observation was an exogenous variable; reference category for topology is random networks; reference category for interaction is density (0.1)*exogenous variable. ***In the null model, the variance component for the model level intercept is equal to 0.104 with a standard deviation of 0.102. Thus, all regressions are constrained with a random second level intercept. A likelihood ratio test confirms that this is the correct modelling choice compared to a standard linear model (𝜒 2 = 199.01). ****𝑅 2 is Snijder and Bosker’s ([1999]2008) total variance method. 46 Table 3: Multilevel regression for three different network topologies Independent Variables Random Scale-free Small-world Network density - - - 0.2 -0.002* (0.002) 0.002* (0.002) -0.001* (0.002) 0.3 0.001* (0.002) 0.023 (0.002) 0.002* (0.002) - - 1 -0.078 (0.001) -0.070 (0.001) -0.077 (0.001) 5 -0.176 (0.001) -0.164 (0.001) -0.184 (0.001) - - - gwesp 0.013 (0.002) -0.011 (0.002) 0.027 (0.002) Deg. Pop. -0.040 (0.002) -0.007 (0.002) -0.025 (0.002) Deg. Cor. -0.006 (0.002) -0.007 (0.001) -0.034 (0.002) gwesp + Deg. Pop. 0.084 (0.002) 0.084 (0.002) 0.061 (0.002) gwesp + Deg. Cor. 0.084 (0.002) 0.084 (0.002) 0.059 (0.002) Deg. Pop. + Deg. Cor. 0.136 (0.002) 0.136 (0.002) 0.149 (0.002) gwesp + Deg. Pop. + Deg. Cor. 0.140 (0.002) 0.278 (0.002) 0.143 (0.002) - - - 0.2*gwesp -0.022 (0.002) -0.000* (0.002) -0.026 (0.000) 0.2*Deg. Pop. 0.032 (0.002) -0.004* (0.002) 0.001* (0.002) 0.2*Deg. Cor. -0.003* (0.002) -0.006* (0.002) 0.002* (0.002) 0.2*gwesp + Deg. Cor. 0.033 (0.002) 0.031 (0.002) 0.041 (0.002) 0.2*gwesp + Deg. Pop. 0.038 (0.002) 0.034 (0.002) 0.038 (0.002) 0.2*Deg. Cor. + Deg. Pop. 0.008 (0.002) 0.006* (0.002) 0.006* (0.002) 0.2*Deg. Cor. + Deg. Pop. + gwesp 0.027 (0.002) 0.001* (0.002) 0.020 (0.002) 0.3*gwesp -0.020 (0.002) -0.021 (0.002) -0.049 (0.002) 0.3*Deg. Pop. 0.030 (0.002) -0.026 (0.002) -0.000* (0.002) 0.3*Deg. Cor. -0.005* (0.002) -0.027 (0.002) 0.015 (0.002) 0.3*gwesp + Deg. Cor. 0.050 (0.002) 0.030 (0.002) 0.067 (0.002) 0.3*gwesp + Deg. Pop. 0.056 (0.002) 0.034 (0.002) 0.064 (0.002) 0.3*Deg. Cor. + Deg. Pop. 0.009 (0.002) -0.016* (0.002) 0.008 (0.002) 0.3*Deg. Cor. + Deg. Pop. + gwesp 0.041 (0.002) -0.007 (0.002) 0.037 (0.002) 0.933 (0.001) -0.927 (0.001) 0.936 (0.001) Variance mean ratios Endogenous parameters Endogenous parameters*density Intercept 47 Variance component Level 2 intercept 𝑅 2 *** 0.000* (0.000) 0.000* (0.000) 0.000* (0.000) 0.41 0.52 0.43 *Not statistically significant at the 0.1 level; N for count of VIFS is 324,000, N for models is 72,000. **Reference category for network density is 0.1; reference category for variance mean ratios is 0.5 reference category for endogenous parameters is whether the VIF observation was an exogenous variable; reference category for topology is random networks; reference category for interaction is density (0.1)*exogenous variable. ***𝑅 2 is Snijder and Bosker’s ([1999]2008) total variance method. 48 Table 4. Correlations between 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 and absolute difference of log(variance) estimates VIF Variance Focal variable 0.715 Total model (maximum) 0.873 Total model (mean) 0.853 49 Table 5. Mean absolute difference in parameter variance estimates and standard errors for ranges of VIFs VIF value range Variance Std. Error VIF < 20 0.00 0.00 20 < VIF 0.55 0.20 20 < VIF < 150 0.65 0.23 150 < VIF 0.20 0.10 VIF < 20 0.00 0.00 20 < VIF 1.29 0.47 20 < VIF < 150 1.51 0.51 150 < VIF 0.83 0.38 VIF < 20 0.00 0.00 20 < VIF 0.22 0.08 20 < VIF < 150 0.25 0.09 150 < VIF 0.14 0.06 Focal variable Total model (maximum) Total model (mean) 50 Table 6: Variance inflation factors in the Faux Mesa ERGM Independent Variables Model 1 Model 1 Rerun Model 2 Model 2 Rerun Model 3 Model 3 Rerun Edges -6.602*** (0.201) -6.601*** (0.197) -5.695*** (0.237) -5.686*** (0.234) -7.135*** (0.305) -7.235*** (0.269) Homophily (Race) 0.431** (0.149) 0.437** (0.141) 0.356** (0.135) 0.355* (0.141) 0.298* (0.139) 0.285** (0.101) Homophily (Sex) 0.647*** (0.154) 0.643*** (0.149) 0.652*** (0.152) 0.644*** (0.146) 0.504*** (0.139) 0.512*** (0.096) Homophily (Grade) 2.835*** (0.177) 2.830*** (0.175) 2.632*** (0.178) 2.624*** (0.179) 2.021*** (0.177) 2.048*** (0.174) gwdegree - - -1.043*** (0.154) -1.041*** (0.152) 0.526* (0.208) 0.615** (0.196) gwesp - - - - 1.340*** (0.122) 1.334*** (0.114) VIFs - - - - - - Homophily (Race) 1.711 1.714 2.929 2.756 7.840 11.150 Homophily (Sex) 2.113 2.119 4.701 4.213 9.337 14.902 Homophily (Grade) 2.419 2.381 5.191 4.686 83.869 64.468 gwdegree - - 3.603 3.366 2.837 3.920 gwesp - - - - 80.198 62.724 1919 1919 1885 1885 1729 1730 BIC 1951 1951 ***p<0.001; **p<0.01; *p<0.05 1925 1925 1777 1778 AIC 51 Table 7. Posterior correlation matrix for Model 3 in Faux Mesa ERGM Homophily (Race) Homophily (Sex) Homophily (Grade) gwdegree Homophily (Race) 1.00 Homophily (Sex) 0.882 1.00 Homophily (Grade) 0.930 0.938 1.00 gwdegree 0.602 0.648 0.617 1.00 gwesp 0.926 0.929 0.989 0.545 52 gwesp 1.00 Table 8. VIFs in ERGMs fit to four networks. Independent Variables Model 1 Model 2 Model 3 θ (SE) VIF θ (SE) VIF θ (SE) VIF Edges -10.01*** (.115) - -9.78*** (.105) - -9.90*** (.134) - Homophily (Race) 1.20*** (.081) 3.53 .906*** (.073) 6.93 .939*** (.079) 6.41 Homophily (Grade) 3.23*** (.088) 3.72 2.74*** (.086) 8.66 2.75*** (.087) 9.50 Homophily (Sex) .884*** (.071) 2.88 .778*** (.064) 4.78 .765*** (.062) 4.49 gwesp (.25) - - 1.80*** (.051) 4.19 1.85*** (.061) 5.13 gwdegree (.25) - - - - .120 (.083) 2.95 Edges -1.96*** (.257) - -1.13* (.448) - -3.44** (1.06) - Homophily (Group) 2.09*** (.318) 3.73 2.65*** (.459) 8.67 2.51*** (.498) 4.65 gwidegree (1) -1.83* (.750) 2.25 -3.07** (.961) 2.32 -3.31** (1.02) 2.36 Reciprocity 1.42** (.475) 3.80 1.40** (.469) 4.69 1.56** (.523) 2.39 gwesp (.5) - - -.445* (.172) 6.72 -.303 (.224) 5.82 gwodegree (1) - - - - 1.48* (.666) 3.85 Edges -6.13*** (.063) - -6.23*** (.064) - -8.39*** (.075) - Leader in position 3 – 5 years .136** (.045) 1.07 .144** (.045) 1.11 .076*** (.015) 65.70 Leader in position 6 – 10 years .261*** (.042) 1.09 .279*** (.042) 1.08 .076*** (.010) 53.74 Faux Magnolia Sampson LHDS 53 Leader in position 11+ years .322*** (.040) 1.10 .337*** (.040) 1.11 .107*** (.013) 80.59 Jurisdiction population - - .197*** (.014) 1.15 -.176*** (.024) 144.53 gwesp (.7) - - - - 2.80*** (.042) 327.86 Edges -3.24* (1.35) - -.439 (3.67) - -1.60 (3.67) - Wealth .012* (.006) 3.34 .016* (.008) 6.49 .015 (.008) 5.99 gwesp (.7) .073 (.320) 2.36 .124 (.315) 2.24 .097 (.312) 2.25 gwdegree (.7) .751 (.137) 1.83 -1.12 (2.57) 2.00 -.514 (2.56) 1.83 kstar (2) - - -.448 (.557) 9.09 -.369 (.528) 8.60 Business network - - - - Florentine Marriage 2.22*** 1.24 (.641) ***p<0.001; **p<0.01; *p<0.05. The gwidegree and gwodegree statistics are the directed equivalent of gwdegree score for indegree and outdegree, respectively 54 Table A1. Means (standard deviations) of VIFs across ERGM specifications. VIFs are calculated as 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1. All ERGMs include at least three actor-level covariates specified as maineffects; dyad-independent ERGMs are those that only have three actor-level effects. ERGM specifications Actor-level attributes gwesp Deg. Pop. Deg. Cor. Overall Dyadindependent .85 (.14) - - - .85 (.14) gwesp .88 (.12) .75 (.08) - - .85 (.12) Deg. Pop. .87 (.12) - .73 (.07) - .83 (.13) Deg. Cor. .87 (.12) - - .72 (.08) .83 (.13) gwesp + Deg. Cor .87 (.12) 1.07 (.06) - 1.07 (.06) .95 (.14) gwesp + Deg. Pop. .87 (.12) 1.08 (.07) 1.08 (.06) - .96 (.14) Deg. Pop. + Deg. Cor. .88 (.11) - 1.16 (.02) 1.16 (.02) .99 (.16) gwesp + Deg. Pop. + Deg. Cor. .93 (.13) 1.12 (.10) 1.20 (.05) 1.20 (.05) 1.05 (.16) Overall .88 (.12) 1.01 (.17) 1.04 (.20) 1.04 (.20) .93 (.17) 55 Table A2. Means (standard deviations) of VIFs across ERGM specifications by network density. VIFs are calculated as 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 . All ERGMs include at least three actor-level covariates specified as main-effects; dyad-independent ERGMs are those that only have three actor-level effects. Actor-level attributes gwesp Deg. Pop. Deg. Cor. Overall Dyadindependent .85 (.14) - - - .85 (.14) gwesp .89 (.11) .77 (.09) - - .86 (.11) Deg. Pop. .87 (.12) - .73 (.07) - .84 (.13) Deg. Cor. .87 (.12) - - .72 (.08) .83 (.13) gwesp + Deg. Cor .87 (.12) 1.00 (.04) - 1.00 (.04) .93 (.12) gwesp + Deg. Pop. .87 (.12) 1.00 (.03) 1.01 (.03) - .93 (.11) Deg. Pop. + Deg. Cor. .88 (.11) - 1.15 (.02) 1.15 (.02) .99 (.16) gwesp + Deg. Pop. + Deg. Cor. .93 (.13) 1.06 (.13) 1.19 (.06) 1.19 (.06) 1.04 (.15) Total .88 (.12) .96 (.14) 1.02 (.19) 1.02 (.19) .92 (.15) Dyadindependent .85 (.14) - - - .85 (.14) gwesp .87 (.12) .75 (.08) - - .84 (.12) Deg. Pop. .87 (.12) - .73 (.07) - .83 (.13) ERGM specifications Density = .1 Density = .2 56 Deg. Cor. .87 (.12) - - .72 (.08) .83 (.13) gwesp + Deg. Cor .87 (.12) 1.09 (.03) - 1.09 (.03) .96 (.13) gwesp + Deg. Pop. .85 (.12) 1.10 (.02) 1.10 (.02) - .95 (.15) Deg. Pop. + Deg. Cor. .88 (.11) - 1.17 (.02) 1.17 (.02) 1.00 (.16) gwesp + Deg. Pop. + Deg. Cor. .93 (.13) 1.12 (.09) 1.20 (.05) 1.20 (.05) 1.05 (.16) Total .87 (.13) 1.01 (.17) 1.05 (.20) 1.04 (.20) .93 (.17) Dyadindependent .86 (.13) - - - .86 (.13) gwesp .87 (.12) .73 (.07) - - .84 (.13) Deg. Pop. .87 (.12) - .73 (.07) - .83 (.13) Deg. Cor. .87 (.12) - - .72 (.08) .83 (.13) gwesp + Deg. Cor .88 (.11) 1.13 (.02) - 1.13 (.01) .98 (.15) gwesp + Deg. Pop. .89 (.11) 1.15 (.02) 1.15 (.02) - 1.00 (.15) Deg. Pop. + Deg. Cor. .88 (.11) - 1.17 (.02) 1.17 (.02) 1.00 (.17) gwesp + Deg. Pop. + Deg. Cor. .93 (.13) 1.17 (.07) 1.22 (.05) 1.21 (.05) 1.07 (.17) Total .88 (.12) 1.05 (.19) 1.07 (.20) 1.06 (.20) .94 (.17) Density = .3 57 Figure 1: Distribution of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 58 Figure 2: Frequency of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 by count of endogenous variables in an ERGM. y-axis is frequency of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 values; x-axis is 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 values. Lighter shades correspond with denser concentration of values. 59 Figure 3: Interactions for variable type*density. y-axis is mean 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 for the interaction; x-axis is the density of the network. Black dashed lines are actor-level attributes, red solid lines are the endogenous parameter specifications for the model. 60 Figure 4: Interactions for variable type*density by network topology. The y-axis is the mean 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 value for the interaction; x-axis is the network density. *Note: The graphs for a single endogenous variable are plotted over a smaller range of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 than the other graphs in the figure. 61 Figure 5. Violin plots of VIFs for different variables by network topology. Y-axis is 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1. 62 Figure 6a. Log(variance) fluctuations for 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 of a focal variable (solid line) and maximum log(variance) fluctuation of a variable in a model for the maximum 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 of a variable in a model (dashed line). Smoothed Lowess lines. *Note: the x-axis was plotted to scale of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 , but untransformed VIF values are marked to make interpretation easier. Similarly, the y-axis was plotted to scale log(variance) fluctuations, but untransformed values are marked. **Note: the linear increase when 10 < VIF < 20 connects the control group of dyad-independent ERGMs to ERGMs with three endogenous effects. The largest VIF in dyad-independent ERGMs was < 10, and the smallest VIF for ERGMs with endogenous effects was >20. 63 Figure 6b. Log(variance) fluctuations for 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 of a focal variable (solid line) and maximum log(variance) fluctuation of a variable in a model for the maximum 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 of a variable in a model (dashed line). Unsmoothed Lowess lines. *Note: the x-axis was plotted to scale of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 , but untransformed VIF values are marked to make interpretation easier. Similarly, the y-axis was plotted to scale log(variance) fluctuations, but untransformed values are marked. **Note: the linear increase when 10 < VIF < 20 connects the control group of dyad-independent ERGMs to ERGMs with three endogenous effects. The largest VIF in dyad-independent ERGMs was < 10, and the smallest VIF for ERGMs with endogenous effects was >20. 64 Figure 7. Heteroskedasticity plot of absolute difference in parameter variance estimates as 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 increases for a focal variable. *Note: the x-axis was plotted to scale of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 , but untransformed VIF values are marked to make interpretation easier. y-axis is untransformed. 65 Figure 8. Heteroskedasticity plot of maximum absolute difference in parameter variance estimates in a model as the highest 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 in the model increases. *Note: the x-axis was plotted to scale of 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 , but untransformed VIF values are marked to make interpretation easier. y-axis is untransformed. 66 Figure 9. Mean VIFs for ERGMs of five empirical networks. In Model 3, the Faux Mesa network had a mean VIF of 29.8 and the LHDS network had a mean VIF of 134.8. 67 Figure A1. Violin plots of VIFs for different variables by variance mean ratio of exogenous actor-level variables. Y-axis is 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1 . 68 Figure A2. Violin plots of VIFs for different variables by network density. Y-axis is 𝑙𝑜𝑔(𝑉𝐼𝐹)0.1. 69