Point-process models of social network interactions: Parameter estimation and missing data recovery

JOSEPH R. ZIPKIN; FREDERIC P. SCHOENBERG; KATHRYN CORONGES; ANDREA L. BERTOZZI

doi:10.1017/S0956792515000492

Published online by Cambridge University Press: 08 October 2015

Affiliations

JOSEPH R. ZIPKIN: Department of Mathematics, University of California, Los Angeles, CA 90095, USA. Email: zipkinj@acm.org; bertozzi@math.ucla.edu
FREDERIC P. SCHOENBERG: Department of Statistics, University of California, Los Angeles, CA 90095, USA. Email: frederic@stat.ucla.edu
KATHRYN CORONGES: Network Science Institute, Northeastern University, Boston, MA 02115, USA. Email: k.coronges@neu.edu
ANDREA L. BERTOZZI: Department of Mathematics, University of California, Los Angeles, CA 90095, USA. Email: zipkinj@acm.org; bertozzi@math.ucla.edu

Abstract

Electronic communications, as well as other categories of interactions within social networks, exhibit bursts of activity localised in time. We adopt a self-exciting Hawkes process model for this behaviour. First we investigate parameter estimation of such processes and find that, in the parameter regime we encounter, the choice of triggering function is not as important as getting the correct parameters once a choice is made. Then we present a relaxed maximum likelihood method for filling in missing data in records of communications in social networks. Our optimisation algorithm adapts a recent curvilinear search method to handle inequality constraints and a non-vanishing derivative. Finally we demonstrate the method using a data set composed of email records from a social network based at the United States Military Academy. The method performs differently on this data and data from simulations, but the performance degrades only slightly as more information is removed. The ability to fill in large blocks of missing social network data has implications for security, surveillance, and privacy.
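As an illustration of the kind of model the abstract refers to, the following is a minimal sketch of a univariate self-exciting Hawkes process. It is not the authors' relaxed maximum likelihood method or their curvilinear search. It assumes an exponential triggering function, so that the conditional intensity is lambda(t) = mu + sum over past events of alpha * beta * exp(-beta * (t - t_i)); the parameter names mu, alpha and beta and the function names are hypothetical choices made for this example. Events are simulated by thinning, and the log-likelihood that a maximum likelihood fit would maximise is evaluated with the standard recursion for exponential kernels.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning.

    Assumed intensity: mu + sum_i alpha * beta * exp(-beta * (t - t_i)).
    alpha is the expected number of events triggered by one event,
    so alpha < 1 keeps the process from exploding.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the self-exciting sum
    while True:
        # Between events the intensity only decays, so its current
        # value is a valid upper bound for the next candidate time.
        bound = mu + excitation
        wait = rng.expovariate(bound)
        if t + wait >= horizon:
            break
        t += wait
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= mu + excitation:
            events.append(t)
            excitation += alpha * beta
    return events


def log_likelihood(events, mu, alpha, beta, horizon):
    """Log-likelihood of sorted event times observed on [0, horizon]."""
    log_sum = 0.0
    decay_sum = 0.0  # sum over earlier events of exp(-beta * (t - t_j))
    previous = None
    for t in events:
        if previous is not None:
            decay_sum = math.exp(-beta * (t - previous)) * (1.0 + decay_sum)
        log_sum += math.log(mu + alpha * beta * decay_sum)
        previous = t
    # Integral of the intensity over the observation window.
    compensator = mu * horizon + alpha * sum(
        1.0 - math.exp(-beta * (horizon - t)) for t in events
    )
    return log_sum - compensator


if __name__ == "__main__":
    times = simulate_hawkes(mu=0.5, alpha=0.6, beta=2.0, horizon=200.0, seed=1)
    print(len(times), log_likelihood(times, 0.5, 0.6, 2.0, 200.0))
```

Passing the simulated times to a numerical optimiser over (mu, alpha, beta) would give a basic parameter estimate; a different triggering function would change only the intensity and compensator terms.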

Keywords

Hawkes processes; maximum likelihood; missing data; constrained optimization; social networks

Type: Papers
Information: European Journal of Applied Mathematics, Volume 27, Issue 3: Mathematical Modelling of Crime and Security, June 2016, pp. 502–529

DOI: https://doi.org/10.1017/S0956792515000492
Copyright: Copyright © Cambridge University Press 2015
