BigDataBench is a comprehensive big data benchmark suite (published at HPCA’14). The source code of BigDataBench is available from http://prof.ict.ac.cn/BigDataBench. This tutorial presents what BigDataBench is and how to use it. Please feel free to download the BigDataBench Handbook [BigDataBench Handbook].
Prof. Jianfeng Zhan, Chinese Academy of Sciences, and University of Chinese Academy of Sciences, email: email@example.com
Dr. Gang Lu, Beijing Academy of Frontier Sciences and Technology, email: firstname.lastname@example.org
Xinhui Tian, ICT, CAS, and University of CAS, email: email@example.com
BigDataBench is an open-source big data benchmark suite, publicly available from BigDataBench. After identifying diverse data models and representative big data workloads, BigDataBench proposes several benchmark specifications to model five important application domains: search engines, social networks, e-commerce, multimedia data analytics, and bioinformatics. BigDataBench implements the same benchmark specifications using a variety of competitive techniques. The current version, BigDataBench 3.2, adds support for graph frameworks, streaming frameworks, and Flink, and includes 15 real-world data sets with the corresponding scalable big data generation tools, and 34 big data workloads. To allow flexible setting and replaying of mixed workloads, BigDataBench provides a multi-tenancy version. To reduce benchmarking cost, BigDataBench reduces the full workloads to a subset according to workload characteristics from a specific perspective. It also provides both MARSSx86 and Simics simulator versions for the architecture community.
Presenting the real-world data sets, workloads, and parallel data generation tools in BigDataBench;
Giving a guideline on how to use BigDataBench.
Objectives: attendees will be able to use BigDataBench with reference to the BigDataBench user manual. Please feel free to download the user manual from BigDataBench-User-Manual.
BigDataBench methodology and what is BigDataBench?
Characterizing big data workloads / big data dwarfs
Real-world data sets / how to generate large-scale data
Subsetting big data workloads from BigDataBench
How to use the simulator versions?
Multi-tenancy version of BigDataBench
System, architecture, storage, security, and privacy researchers and graduate students. More BigDataBench users are listed at BigDataBench-Users.
8:30-9:00 What is BigDataBench and BigDataBench Methodology [pdf]
9:30-10:00 How to generate large-scale data from small-scale real-world one [pdf]
10:00-10:20 Coffee Break
10:20-10:50 Multi-tenancy version of BigDataBench [pdf]
10:50-11:20 Subsetting big data workloads from BigDataBench [pdf]
11:20-11:50 BigDataBench Dwarfs [pdf]
Jianfeng Zhan is a Professor of Computer Science and Engineering at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. His research interests include computer architecture, operating systems, data management, and parallel and distributed systems. He has published over 100 papers in major journals and international conferences in these research areas, and has filed 40 patents. From 2004 to 2010, he led the R&D efforts on innovative cluster and cloud systems software for the Dawning-series supercomputers (which ranked top 2 and top 10 on the Top 500 list in 2010 and 2004, respectively). Among them, GridView was transferred to Sugon, a premier supercomputing company in China, and became one of its popular software products. Currently, he is leading the research efforts on datacenter and big data software stacks, including BigDataBench, an open-source big data benchmarking project, and RainForest, an operating system for big data and cloud computing. He received the second-class Chinese National Technology Promotion Prize in 2006, the Distinguished Achievement Award of the Chinese Academy of Sciences in 2005, the IISWC Best Paper Award in 2013, and the Huawei Contribution Prize in 2013. More details about Prof. Zhan are available at http://prof.ict.ac.cn/jfzhan
Gang Lu is the executive director of the Beijing Academy of Frontier Science & Technology (BAFST). He received his Ph.D. degree in 2016 from the Institute of Computing Technology, Chinese Academy of Sciences, and his Bachelor’s degree in 2010 from Huazhong University of Science and Technology. His current research interests include operating systems, cloud computing, and distributed and parallel systems.
Xinhui Tian received his Bachelor’s degree in computer science in 2011 from Peking University in China. He is currently working toward a PhD degree in computer science at the Institute of Computing Technology, Chinese Academy of Sciences. His current research interests include distributed systems and data management.
Micro’14. Dec 13, 2014.