Recently, studying fundamental graph problems in the Massively Parallel Computation (MPC) framework, inspired by the MapReduce paradigm, has gained a lot of attention. An assumption common to a vast majority of approaches is to allow Ω˜(n) memory per machine, where n is the number of nodes in the graph and Ω˜ hides polylogarithmic factors. However, as pointed out by Karloff et al. [SODA'10] and Czumaj et al. [STOC'18], it might be unrealistic for a single machine to have linear or only slightly sublinear memory. In this paper, we thus study a more practical variant of the MPC model which only requires substantially sublinear or even subpolynomial memory per machine. In contrast to the linear-memory MPC model and also to streaming algorithms, in this low-memory MPC setting, a single machine will only see a small number of nodes in the graph. We introduce a new and strikingly simple technique to cope with this imposed locality. In particular, we show that the Maximal Independent Set (MIS) problem can be solved efficiently, that is, in O(log 3logn) rounds, when the input graph is a tree. This constitutes an almost exponential speed-up over the low-memory MPC algorithm in O˜(logn) rounds in a concurrent work by Ghaffari and Uitto [SODA'19] and substantially reduces the local memory from Ω˜(n) required by the recent O(loglogn)-round MIS algorithm of Ghaffari et al. [PODC'18] to n ε for any ε>0, without incurring a significant loss in the round complexity. Moreover, it demonstrates how to make use of the all-to-all communication in the MPC model to almost exponentially improve on the corresponding bound in the LOCAL and PRAM models by Lenzen and Wattenhofer [PODC'11].