Statistical evaluation of spectral methods for anomaly detection in static networks

  • Tomilayo Komolafe (a1), A. Valeria Quevedo (a1) (a2), Srijan Sengupta (a1) and William H. Woodall (a1)

Abstract

Anomaly detection in networks has attracted considerable attention in recent years, especially with the rise of connected devices and social networks. Its applications range widely, from detecting terrorist cells in counter-terrorism efforts to identifying unexpected mutations during ribonucleic acid transcription. Accordingly, numerous algorithmic techniques for anomaly detection have been introduced. To date, however, little work has been done to evaluate these algorithms from a statistical perspective. This work aims to address that gap in the literature by carrying out a statistical evaluation of a suite of popular spectral methods for anomaly detection in networks. Our investigation of the statistical properties of these algorithms reveals several critical shortcomings, which we address through methodological improvements. Further, we evaluate the performance of these algorithms using simulated networks and extend the methods from binary to count networks.

Corresponding author

*Corresponding author. Email: tomilayo@vt.edu

