The rapid penetration of photovoltaic generation reduces power grid inertia and increases the need for intelligent energy resources that can cope in real time with the imbalance between power generation and consumption. Virtual power plants are a technology for coordinating such resources and monetizing them, for example on electricity markets with real-time pricing or on frequency reserve markets. Accurate short-term photovoltaic generation forecasts are essential for such virtual power plants. Although significant research has been done on medium- and long-term photovoltaic generation forecasting, the short-term forecasting problem requires special attention to sudden fluctuations due to the high variability of cloud cover and related weather events. Solar irradiance nowcasting aims to resolve this variability by providing reliable short-term forecasts of the expected power generation capacity. Sky images captured in proximity to the photovoltaic panels are used to determine cloud behavior and solar intensity. This is a computationally challenging task for conventional computer vision techniques, and only a handful of Artificial Intelligence (AI) methods have been proposed. In this paper, a novel multimodal approach to irradiance nowcasting is proposed, based on two Long Short-Term Memory (LSTM) networks that receive two inputs: a temporal image modality consisting of a stream of sky images, and a temporal numerical modality consisting of a time series of past solar irradiance and cloud cover readings. The proposed nowcasting pipeline consists of a preprocessing module and an irradiance augmentation module that implements methods for cloud detection, Sun localization and mask generation. The complete approach was empirically evaluated on a real-world solar irradiance case study across the four seasons of the northern hemisphere, with multimodality yielding a mean improvement of 39%.