Chandramowlishwaran, Aparna, Madduri, Kamesh, and Vuduc, Richard. Diagnosis, tuning, and redesign for multicore performance: A case study of the fast multipole method. In Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–12. IEEE Computer Society, 2010.
Cheng, H., Greengard, L., and Rokhlin, V. A fast adaptive multipole algorithm in three dimensions. Journal of Computational Physics, 155(2):468–498, 1999.
Engquist, Björn and Ying, Lexing. A fast directional algorithm for high frequency acoustic scattering in two dimensions. Communications in Mathematical Sciences, 7(2):327–345, 2009.
Ethridge, Frank and Greengard, Leslie. A new fast-multipole accelerated Poisson solver in two dimensions. SIAM Journal on Scientific Computing, 23(3):741–760, 2001.
Fong, William and Darve, Eric. The black-box fast multipole method. Journal of Computational Physics, 228(23):8712–8725, 2009.
Fu, Yuhong, Klimkowski, Kenneth J., Rodin, Gregory J., Berger, Emery, Browne, James C., Singer, Jürgen K., van de Geijn, Robert A., and Vemaganti, Kumar S. A fast solution method for three-dimensional many-particle problems of linear elasticity. International Journal for Numerical Methods in Engineering, 42:1215–1229, 1998.
Fu, Yuhong and Rodin, Gregory J. Fast solution method for three-dimensional Stokesian many-particle problems. Communications in Numerical Methods in Engineering, 16(2):145–149, 2000.
Gimbutas, Zydrunas and Rokhlin, Vladimir. A generalized fast multipole method for nonoscillatory kernels. SIAM Journal on Scientific Computing, 24(3):796–817, 2003.
Greengard, L. and Rokhlin, V. A fast algorithm for particle simulations. Journal of Computational Physics, 73(2):325–348, December 1987.
Greengard, Leslie. Fast algorithms for classical physics. Science, 265(5174):909–914, 1994.
Greengard, Leslie F. and Huang, Jingfang. A new version of the fast multipole method for screened Coulomb interactions in three dimensions. Journal of Computational Physics, 180(2):642–658, 2002.
Hamada, T., Narumi, T., Yokota, R., Yasuoka, K., Nitadori, K., and Taiji, M. 42 TFlops hierarchical N-body simulations on GPUs with applications in both astrophysics and turbulence. In Proceedings of SC09, The SCxy Conference series, Portland, Oregon, November 2009. ACM/IEEE.
Hu, Qi, Gumerov, Nail A, and Duraiswami, Ramani. Scalable fast multipole methods on distributed heterogeneous architectures. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, page 36. ACM, 2011.
Jetley, Pritish, Wesolowski, Lukasz, Gioachin, Filippo, Kalé, Laxmikant V, and Quinn, Thomas R. Scaling hierarchical N-body simulations on GPU clusters. In Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–11. IEEE Computer Society, 2010.
Langston, Harper, Greengard, Leslie, and Zorin, Denis. A free-space adaptive FMM-based PDE solver in three dimensions. Communications in Applied Mathematics and Computational Science, 6(1):79–122, 2011.
Lashuk, Ilya, Chandramowlishwaran, Aparna, Langston, Harper, Nguyen, Tuan-Anh, Sampath, Rahul, Shringarpure, Aashay, Vuduc, Richard, Ying, Lexing, Zorin, Denis, and Biros, George. A massively parallel adaptive fast multipole method on heterogeneous architectures. Communications of the ACM, 55(5):101–109, May 2012.
Lindsay, Keith and Krasny, Robert. A particle method and adaptive treecode for vortex sheet motion in three-dimensional flow. Journal of Computational Physics, 172(2):879–907, 2001.
Makino, Junichiro, Fukushige, Toshiyuki, and Koga, Masaki. A 1.349 Tflops simulation of black holes in a galactic center on GRAPE-6. In ACM/IEEE 2000 Conference on Supercomputing, pages 43–43. IEEE, 2000.
Malhotra, Dhairya and Biros, George. pvfmm: A distributed memory fast multipole method for volume potentials, 2014. submitted.
Malhotra, Dhairya, Gholami, Amir, and Biros, George. A volume integral equation Stokes solver for problems with variable coefficients. In SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 92–102. IEEE, 2014.
Rahimian, A., Lashuk, I., Veerapaneni, S.K., Chandramowlishwaran, A., Malhotra, D., Moon, L., Sampath, R., Shringarpure, A., Vetter, J., Vuduc, R., Zorin, D., and Biros, G. Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures. In SC ’10: Proceedings of the 2010 ACM/IEEE conference on Supercomputing, pages 1–12, Piscataway, NJ, USA, 2010. IEEE Press.
Song, Jiming, Lu, Cai-Cheng, and Chew, Weng Cho. Multilevel fast multipole algorithm for electromagnetic scattering by large complex objects. IEEE Transactions on Antennas and Propagation, 45(10):1488–1493, 1997.
Takahashi, Toru, Cecka, Cris, Fong, William, and Darve, Eric. Optimizing the multipole-to-local operator in the fast multipole method for graphical processing units. International Journal for Numerical Methods in Engineering, 89(1):105–133, 2012.
Warren, Michael S and Salmon, John K. Astrophysical N-body simulations using hierarchical tree data structures. In Proceedings of the 1992 ACM/IEEE Conference on Supercomputing, pages 570–576. IEEE Computer Society Press, 1992.
Warren, Michael S and Salmon, John K. A parallel hashed oct-tree N-body algorithm. In Proceedings of the 1993 ACM/IEEE conference on Supercomputing, pages 12–21. ACM, 1993.
Ying, Lexing, Biros, George, and Zorin, Denis. A kernel-independent adaptive fast multipole method in two and three dimensions. Journal of Computational Physics, 196(2):591–626, 2004.
Ying, Lexing, Biros, George, Zorin, Denis, and Langston, Harper. A new parallel kernel-independent fast multipole method. In 2003 ACM/IEEE Conference on Supercomputing, pages 14–14. IEEE, 2003.
Yokota, R., Bardhan, J.P., Knepley, M.G., Barba, L.A., and Hamada, T. Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns. Computer Physics Communications, 2011.