  • Print publication year: 2016
  • Online publication date: December 2015

5 - Big data analytics systems

from Part II - Big data over cyber networks


Performing timely analysis on huge datasets is the central promise of big data analytics. To cope with the high volumes of data to be analyzed, computation frameworks have resorted to "scaling out" – parallelizing analytics so that they execute seamlessly across large clusters. These frameworks automatically decompose analytics jobs into a DAG of small tasks, and then aggregate the intermediate results of the tasks to obtain the final result. Their ability to do so relies on an efficient scheduler and on a reliable storage layer that distributes the datasets across different machines.
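As a deliberately simplified sketch of this pattern (not the API of any particular framework), the following Python fragment splits a dataset into partitions, runs one task per partition in parallel, and aggregates the intermediate results into a final answer; the names `run_job`, `task_fn`, and `aggregate_fn` are hypothetical, introduced here only for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def run_job(dataset, num_tasks, task_fn, aggregate_fn):
    """Split the dataset into partitions, run one task per
    partition in parallel, then aggregate the intermediate
    results from the tasks into the final result."""
    partitions = [dataset[i::num_tasks] for i in range(num_tasks)]
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        intermediate = list(pool.map(task_fn, partitions))
    return aggregate_fn(intermediate)

# Example: count records in parallel across 4 tasks.
total = run_job(list(range(1000)), 4, len, sum)
print(total)  # 1000
```

In a real system the partitions live on different machines and the scheduler decides where each task runs; the shape of the computation – partition, compute, aggregate – is the same.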

In this chapter, we survey these two aspects, scheduling and storage, which are the foundations of modern big data analytics systems. We describe their key principles, and how these principles are realized in widely deployed systems.


Analyzing large volumes of data has become a major source of innovation behind large Internet services as well as scientific applications. Examples of such "big data analytics" occur in personalized recommendation systems, online social networks, genomic analyses, and legal investigations for fraud detection. A key property of the algorithms employed for such analyses is that they produce better results as the amount of data processed increases. In fact, in certain domains (like search) there is a trend toward using relatively simple algorithms and instead relying on more data to produce better results.

While the amount of data to be analyzed is growing on the one hand, the acceptable time to produce results is shrinking on the other. Timely analyses have significant ramifications for revenue as well as productivity: low-latency results in online services lead to improved user satisfaction and revenue, and the ability to crunch large datasets in short periods enables faster iteration and progress on scientific theories.

To cope with the dichotomy of ever-growing datasets and shrinking times to analyze them, analytics clusters have resorted to scaling out. Data are spread across many different machines, and the computations on them are executed in parallel. Such scaling out is crucial for fast analytics and allows coping with the trend of datasets growing faster than Moore's-law increases in processor speed.

Many data analytics frameworks have been built for such large-scale parallel execution. Some of the widely used frameworks are MapReduce [1], Dryad [2], and Apache YARN [3].
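To give a flavor of the programming model popularized by MapReduce [1], the classic word-count example can be sketched on a single machine as follows. This is an illustration of the map/shuffle/reduce pattern, not the actual API of any of these systems:

```python
from collections import defaultdict
from itertools import chain

def map_fn(line):
    # Map phase: emit an intermediate (word, 1) pair per word.
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    # Reduce phase: aggregate all counts emitted for one word.
    return word, sum(counts)

def mapreduce(lines):
    # Shuffle: group intermediate pairs by key.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(l) for l in lines):
        groups[key].append(value)
    # Each group can be reduced independently, hence in parallel.
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(mapreduce(["big data", "big clusters"]))
# {'big': 2, 'data': 1, 'clusters': 1}
```

The frameworks above differ in generality (Dryad allows arbitrary DAGs, not just a map stage followed by a reduce stage) but share this division into user-supplied per-record logic and framework-managed parallel execution.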

[1] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[2] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, "Dryad: distributed data-parallel programs from sequential building blocks," in ACM EuroSys, 2007.
[3] V. Vavilapalli et al., "Apache Hadoop YARN: yet another resource negotiator," in ACM SoCC, 2013.
[4] B. Hindman, A. Konwinski, M. Zaharia, et al., "Mesos: a platform for fine-grained resource sharing in the data center," in USENIX NSDI, 2011.
[5] M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit round-robin," IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 375–385, 1996.
[6] A. Demers, S. Keshav, and S. Shenker, "Analysis and simulation of a fair queueing algorithm," in ACM SIGCOMM Computer Communication Review, vol. 19, no. 4, ACM, 1989, pp. 1–12.
[7] A. Ghodsi, M. Zaharia, B. Hindman, et al., "Dominant resource fairness: fair allocation of multiple resource types," in NSDI, vol. 11, 2011, pp. 24–24.
[8] M. Zaharia, D. Borthakur, J. S. Sarma, et al., "Job scheduling for multi-user MapReduce clusters," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-55, 2009.
[9] R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella, "Multi-resource packing for cluster schedulers," in ACM SIGCOMM, 2014, pp. 455–466.
[10] A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica, "Choosy: max-min fair sharing for datacenter jobs with constraints," in Proceedings of the 8th ACM European Conference on Computer Systems, ACM, 2013, pp. 365–378.
[11] B. Sharma, V. Chudnovsky, J. L. Hellerstein, R. Rifaat, and C. R. Das, "Modeling and synthesizing task placement constraints in Google compute clusters," in Proceedings of the 2nd ACM Symposium on Cloud Computing, ACM, 2011, p. 3.
[12] M. Zaharia, D. Borthakur, J. S. Sarma, et al., "Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling," in Proceedings of the 5th European Conference on Computer Systems, ACM, 2010, pp. 265–278.
[13] M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg, "Quincy: fair scheduling for distributed computing clusters," in Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, ACM, 2009, pp. 261–276.
[14] P. Bodík, I. Menache, M. Chowdhury, et al., "Surviving failures in bandwidth-constrained datacenters," in Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ACM, 2012, pp. 431–442.
[15] A. D. Ferguson, P. Bodik, S. Kandula, E. Boutin, and R. Fonseca, "Jockey: guaranteed job latency in data parallel clusters," in Proceedings of the 7th ACM European Conference on Computer Systems, ACM, 2012, pp. 99–112.
[16] C. Curino, D. E. Difallah, C. Douglas, et al., "Reservation-based scheduling: if you're late don't blame us!" in Proceedings of the ACM Symposium on Cloud Computing, ACM, 2014, pp. 1–14.
[17] N. Jain, I. Menache, J. Naor, and J. Yaniv, "Near-optimal scheduling mechanisms for deadline-sensitive jobs in large computing clusters," in SPAA, 2012, pp. 255–266.
[18] B. Lucier, I. Menache, J. Naor, and J. Yaniv, "Efficient online scheduling for deadline-sensitive jobs: extended abstract," in SPAA, 2013, pp. 305–314.
[19] P. Bodík, I. Menache, J. S. Naor, and J. Yaniv, "Brief announcement: deadline-aware scheduling of big-data processing jobs," in Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, ACM, 2014, pp. 211–213.
[20] M. Zaharia, A. Konwinski, A. D. Joseph, R. H. Katz, and I. Stoica, "Improving MapReduce performance in heterogeneous environments," in OSDI, vol. 8, no. 4, 2008, p. 7.
[21] G. Ananthanarayanan, S. Kandula, A. G. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris, "Reining in the outliers in map-reduce clusters using Mantri," in OSDI, vol. 10, no. 1, 2010, p. 24.
[22] G. Ananthanarayanan, A. Ghodsi, S. Shenker, and I. Stoica, "Effective straggler mitigation: attack of the clones," in NSDI, vol. 13, 2013, pp. 185–198.
[23] "POSIX,"
[24] G. Ananthanarayanan, A. Ghodsi, A. Wang, et al., "PACMan: coordinated memory caching for parallel jobs," in USENIX NSDI, 2012.
[25] G. Ananthanarayanan, S. Agarwal, S. Kandula, et al., "Scarlett: coping with skewed content popularity in MapReduce clusters," in ACM EuroSys, 2011.
[26] M. Zaharia, M. Chowdhury, T. Das, et al., "Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing," in USENIX NSDI, 2012.
[27] S. Melnik, A. Gubarev, J. J. Long, et al., "Dremel: interactive analysis of web-scale datasets," in Proceedings of the 36th International Conference on Very Large Data Bases, 2010, pp. 330–339.
[28] R. Xin, J. Rosen, M. Zaharia, et al., "Shark: SQL and rich analytics at scale," in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, 2013.
[29] S. Agarwal, B. Mozafari, A. Panda, et al., "BlinkDB: queries with bounded errors and bounded response times on very large data," in Proceedings of the 8th ACM European Conference on Computer Systems, ACM, 2013.
[30] J. Liu, W.-K. Shih, K.-J. Lin, R. Bettati, and J.-Y. Chung, "Imprecise computations," Proceedings of the IEEE, 1994.
[31] S. Lohr, Sampling: Design and Analysis. Thomson, 2009.
[32] Y. Chen, S. Alspaugh, D. Borthakur, and R. Katz, "Energy efficiency for large-scale MapReduce workloads with significant interactive analysis," in Proceedings of the 7th ACM European Conference on Computer Systems, ACM, 2012, pp. 43–56.
[33] Z. Liu, Y. Chen, C. Bash, et al., "Renewable and cooling aware workload management for sustainable data centers," in ACM SIGMETRICS Performance Evaluation Review, vol. 40, no. 1, ACM, 2012, pp. 175–186.
[34] A. Beloglazov, R. Buyya, Y. C. Lee, et al., "A taxonomy and survey of energy-efficient data centers and cloud computing systems," Advances in Computers, vol. 82, no. 2, pp. 47–111, 2011.
[35] A. Gandhi, M. Harchol-Balter, R. Das, and C. Lefurgy, "Optimal power allocation in server farms," in ACM SIGMETRICS Performance Evaluation Review, vol. 37, no. 1, ACM, 2009, pp. 157–168.
[36] A. Gandhi, V. Gupta, M. Harchol-Balter, and M. A. Kozuch, "Optimality analysis of energy-performance trade-off for server farm management," Performance Evaluation, vol. 67, no. 11, pp. 1155–1171, 2010.
[37] N. Buchbinder, N. Jain, and I. Menache, "Online job-migration for reducing the electricity bill in the cloud," in NETWORKING 2011, Springer, 2011, pp. 172–185.
[38] "EC2 pricing,"
[39] I. Menache, O. Shamir, and N. Jain, "On-demand, spot, or both: dynamic resource allocation for executing batch jobs in the cloud," in 11th International Conference on Autonomic Computing (ICAC), 2014.