On Optimizing Resources for Real-time End-to-End Machine Learning in Heterogeneous Edges

Research output: Working paper, Preprint, Scientific



Employing Machine Learning (ML) services within applications at the edge has become a viable way to meet performance requirements and comply with data regulations. With the microservice architecture, these applications can scale dynamically, improving service availability under dynamic workloads. However, this poses numerous challenges in provisioning shared resources for multiple end-to-end ML applications. Prevalent orchestration tools and frameworks that support edge ML serving provision resources inefficiently, because edge resources are constrained and the resource demands of services are diverse. In this work, we present a provisioning method to optimize resource utilization for end-to-end ML applications on a heterogeneous edge. By profiling all microservices within the application, we estimate their scales and allocate them to suitable hardware platforms with sufficient resources, considering their runtime utilization patterns. We also provide several practical analyses of runtime monitoring metrics to detect and mitigate resource contention, so that performance is maintained. Experiments with three real-world ML applications demonstrate the practicality of our method on a heterogeneous edge cluster of Raspberry Pis and Jetson Developer Kits.
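The profile-then-allocate idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the algorithm, so the greedy first-fit placement, the `Profile` and `Node` types, the functions `estimate_scale` and `allocate`, and all numbers below are hypothetical. The sketch assumes that each microservice's profile yields a per-replica throughput and CPU demand, that the required scale is the target request rate divided by that throughput (rounded up), and that GPU-dependent services may only run on GPU nodes such as a Jetson.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profiling result for one microservice."""
    name: str
    throughput: float        # requests/s one replica sustains
    cpu: float               # CPU cores one replica needs
    needs_gpu: bool = False  # True if the service requires a GPU node


@dataclass
class Node:
    """Hypothetical edge node with its remaining capacity."""
    name: str
    free_cpu: float
    has_gpu: bool = False
    placed: list = field(default_factory=list)


def estimate_scale(profile: Profile, target_rate: float) -> int:
    """Replicas needed so total throughput covers the target rate."""
    return math.ceil(target_rate / profile.throughput)


def allocate(profiles, nodes, target_rate):
    """Greedy first-fit placement (an assumption, not the paper's algorithm)."""
    for p in profiles:
        for _ in range(estimate_scale(p, target_rate)):
            # CPU-only services try non-GPU nodes first to keep GPU nodes free.
            candidates = sorted(nodes, key=lambda n: n.has_gpu)
            node = next(
                (n for n in candidates
                 if n.free_cpu >= p.cpu and (n.has_gpu or not p.needs_gpu)),
                None,
            )
            if node is None:
                raise RuntimeError(f"no node has enough resources for {p.name}")
            node.free_cpu -= p.cpu
            node.placed.append(p.name)
    return {n.name: list(n.placed) for n in nodes}


# Made-up example: a two-stage pipeline on one Raspberry Pi and one Jetson.
profiles = [
    Profile("preprocess", throughput=10.0, cpu=1.0),
    Profile("inference", throughput=4.0, cpu=2.0, needs_gpu=True),
]
nodes = [Node("rpi-1", free_cpu=4.0), Node("jetson-1", free_cpu=6.0, has_gpu=True)]
placement = allocate(profiles, nodes, target_rate=10.0)
```

A real provisioner would also account for memory, runtime utilization patterns, and contention between co-located services, which the abstract mentions but this sketch leaves out.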
Status: Submitted - 29 Jan 2024
OKM (Finnish Ministry of Education and Culture) publication type: Not eligible


Name: Software: Practice and Experience
Publisher: John Wiley & Sons
ISSN (print): 0038-0644


