Clustering ensembles of social networks

  • Tracy M. Sweet, Abby Flynt and David Choi


Recent work in the social sciences has increasingly involved ensembles of social networks, that is, multiple independent social networks, such as those among students within schools or employees within organizations. There remains, however, very little methodological work on exploring these types of data structures. We present methods for clustering social networks with observed nodal class labels, based on statistics of walk counts between the nodal classes. We extend this method to consider only non-backtracking walks, and introduce a method for normalizing the counts of long walk sequences using those of shorter ones. We then present a method for clustering networks based on these statistics to explore similarities among networks. We demonstrate the utility of this method on simulated network data, as well as on advice-seeking networks in education.
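To make the idea of walk counts between nodal classes concrete, the following is a minimal sketch and not the authors' implementation. It assumes an undirected network stored as a 0/1 adjacency matrix (a list of lists) and one integer class label per node. The function names `class_walk_counts` and `class_nonbacktracking_counts` are hypothetical, and the normalization and clustering steps mentioned in the abstract are not reproduced here because the abstract does not specify them.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def class_walk_counts(adj, labels, n_classes, length):
    """Count walks with `length` steps, grouped by (start class, end class).

    Illustrative only: entry [a][b] sums the (i, j) entries of adj**length
    over all nodes i in class a and j in class b.
    """
    n = len(adj)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(length):
        power = matmul(power, adj)
    counts = [[0] * n_classes for _ in range(n_classes)]
    for i in range(n):
        for j in range(n):
            counts[labels[i]][labels[j]] += power[i][j]
    return counts


def class_nonbacktracking_counts(adj, labels, n_classes, length):
    """Same grouping, but a walk may not immediately reverse its last step.

    Assumes length >= 1. For each start node, the state maps the last
    traversed step (prev, cur) to the number of walks ending with that step.
    """
    n = len(adj)
    neighbors = [[j for j in range(n) if adj[i][j]] for i in range(n)]
    counts = [[0] * n_classes for _ in range(n_classes)]
    for start in range(n):
        state = {(start, v): 1 for v in neighbors[start]}
        for _ in range(length - 1):
            nxt = {}
            for (prev, cur), c in state.items():
                for w in neighbors[cur]:
                    if w != prev:  # forbid stepping straight back
                        nxt[(cur, w)] = nxt.get((cur, w), 0) + c
            state = nxt
        for (_, end), c in state.items():
            counts[labels[start]][labels[end]] += c
    return counts


# Toy example (made up for illustration): a path 0 - 1 - 2,
# with nodes 0 and 2 in class 0 and node 1 in class 1.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
labels = [0, 1, 0]
all_walks = class_walk_counts(adj, labels, 2, 2)
nb_walks = class_nonbacktracking_counts(adj, labels, 2, 2)
```

On this toy path, the two-step counts include walks such as 0-1-0 that return along the same edge, while the non-backtracking version keeps only 0-1-2 and 2-1-0. Flattening such a count matrix into a vector gives one feature vector per network, which is the kind of input a standard clustering routine could take.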


Network Science
  • ISSN: 2050-1242
  • EISSN: 2050-1250
  • URL: /core/journals/network-science