Journal of Shanghai University (Natural Science Edition) ›› 2016, Vol. 22 ›› Issue (1): 69-80. doi: 10.3969/j.issn.1007-2861.2015.04.017

• Big Data •

Multilevel hybrid parallel method for big data applications

HUANG Lei1, ZHI Xiaoli1, ZHENG Shengan2

  1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
  2. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2015-11-19 Online: 2016-02-29 Published: 2016-02-29
  • Corresponding author: ZHI Xiaoli (born 1974), female, associate research fellow, Ph.D.; research interests: parallel computing and software-defined networking. E-mail: xlzhi@mail.shu.edu.cn
  • Supported by:

    Research Program of the Science and Technology Commission of Shanghai Municipality (15DZ1100305)

Abstract:

Many big data applications need to process their data in several different parallel ways. This paper proposes a two-layer hybrid parallel method: hybrid parallelism of execution units and hybrid parallelism of computing models. Running a mix of execution units in parallel on the same computing node taps the computing power of the infrastructure more fully and thereby improves data processing performance. Integrating several computing models into the same execution engine accommodates the diverse, heterogeneous processing patterns of applications. Different hybrid parallel methods suit different data and computation characteristics and serve different parallelization goals. The paper introduces the basic ideas of the hybrid parallel method and, building on the previously developed parallel programming model BSPCloud, describes the main implementation mechanisms of hybrid process-thread parallelism and hybrid BSP-MapReduce parallelism.

Key words: bulk synchronous parallel (BSP), hybrid parallelism, MapReduce, programming model
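The idea of mixing execution units on one computing node can be illustrated with a small sketch. The code below is not BSPCloud code and does not use its API; it is a minimal illustration, assuming Python's standard `concurrent.futures` module, in which an outer pool of processes handles data partitions and each process splits its partition across an inner pool of threads. The function names (`count_words`, `process_partition`, `hybrid_word_count`) and the word-count workload are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_words(lines):
    """Thread-level task: count the words in a small slice of lines."""
    return sum(len(line.split()) for line in lines)


def process_partition(partition, threads_per_process=2):
    """Process-level task: split one partition across a pool of threads."""
    # Ceiling division so that every line lands in exactly one slice.
    step = max(1, -(-len(partition) // threads_per_process))
    slices = [partition[i:i + step] for i in range(0, len(partition), step)]
    with ThreadPoolExecutor(max_workers=threads_per_process) as pool:
        return sum(pool.map(count_words, slices))


def hybrid_word_count(lines, processes=2, threads_per_process=2):
    """Outer level: one task per partition, run in separate processes.

    Inner level: each process fans its partition out to threads.
    """
    step = max(1, -(-len(lines) // processes))
    partitions = [lines[i:i + step] for i in range(0, len(lines), step)]
    thread_counts = [threads_per_process] * len(partitions)
    with ProcessPoolExecutor(max_workers=processes) as pool:
        return sum(pool.map(process_partition, partitions, thread_counts))
```

A caller would typically invoke `hybrid_word_count(lines)` from under an `if __name__ == "__main__":` guard, since process pools may re-import the main module on some platforms. In this sketch the process level provides isolation and multi-core scaling, while the thread level shares memory within a partition; how the paper's system actually divides work between the two is not stated in the abstract.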