A coflow is a collection of related parallel flows that typically occur between two stages of a multi-stage computing task in a network, such as shuffle flows in MapReduce. The coflow abstraction allows applications to convey their semantics to the network so that application-level requirements can be better satisfied. In this paper, we study the routing and scheduling of multiple coflows to minimize the total weighted coflow completion time (CCT). We first propose a rounding-based randomized approximation algorithm, called OneCoflow, for single coflow routing and scheduling. The multiple coflow problem is more challenging because coexisting coflows compete for the same network resources, such as link bandwidth. To minimize the total weighted CCT, we derive an online multiple coflow routing and scheduling algorithm, called OMCoflow. We then derive a competitive ratio bound for our problem and prove that the competitive ratio of OMCoflow is nearly tight. To the best of our knowledge, this is the first online algorithm with theoretical performance guarantees that considers routing and scheduling simultaneously for multiple coflows. Compared with existing methods, OMCoflow runs more efficiently and avoids frequent rerouting of flows. Extensive simulations on a Facebook data trace show that OMCoflow significantly outperforms state-of-the-art heuristic schemes, e.g., reducing the total weighted CCT by up to 41.8% and the execution time by up to 99.2% compared with RAPIER.
- Datacenter networks
- Online algorithm