Delivering Augmented Reality to the Edge: An Approach Toward Model Partitioning Across Heterogeneous Devices

Grant number: 25/08976-4
Support Opportunities: Scholarships abroad - Research Internship - Doctorate
Start date: November 15, 2025
End date: November 14, 2026
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: Fabio Luciano Verdi
Grantee: Washington Rodrigo Dias da Silva
Supervisor: Omer Farooq Rana
Host Institution: Centro de Ciências em Gestão e Tecnologia (CCGT). Universidade Federal de São Carlos (UFSCAR). Campus de Sorocaba. Sorocaba, SP, Brazil
Institution abroad: Cardiff University, Wales
Associated to the scholarship: 23/04760-1 - Delivering Augmented Reality to the Edge: An Approach Toward Object Recognition through the In-Network Computing Paradigm, BP.DR

Abstract

The emergence of In-Network Machine Learning (ML) has brought a paradigm shift: ML applications are moving from the control plane to the data plane. However, implementing large-scale Deep Learning (DL) models on programmable network devices poses significant challenges due to their computing constraints. A potential solution is to decompose the model into components and distribute them across target devices. Given the diversity and heterogeneity of devices in an edge server, this approach introduces its own challenges in ensuring compatibility and efficient communication. In this PhD project, we investigate how to leverage programmable network devices to run object detection tasks from Augmented Reality (AR) applications. To this end, we propose a resource-aware model partitioning workflow. In the proposed approach, a DL model is converted into a Directed Acyclic Graph (DAG) Intermediate Representation (IR) to ensure compatibility across different devices. Each DAG node is then profiled by estimating its computing time and energy cost on the target devices, including FPGAs, DPUs, and GPUs. These profiling results enable us to optimally split the DAG into subgraphs and execute them on heterogeneous devices according to each device's computing capacity. Throughout this PhD project, we investigate linear and non-linear splitting methods; in the current phase, we address model partitioning with linear programming, aiming to minimize total energy consumption while meeting latency requirements. Preliminary results indicate that this approach allows an edge server equipped with an FPGA using a single DPU core and a BlueField-2 to run an object recognition task within 20 milliseconds without needing a GPU. These results also indicate that distributed inference consumes less energy than offloading the entire DAG to the GPU.
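
The partitioning objective described above (minimize total energy subject to a latency bound) can be illustrated with a minimal sketch. It is not the project's method: it assumes a linear chain of three DAG nodes, uses exhaustive search in place of a linear programming solver, and every name and number in it (node names, per-device latency and energy, transfer costs) is hypothetical.

```python
from itertools import product

# Hypothetical profiling table: (latency in ms, energy in mJ) per node and device.
# All values are illustrative and are NOT measurements from the project.
PROFILE = {
    "conv1": {"fpga": (4.0, 2.0), "dpu": (6.0, 1.5), "gpu": (1.0, 9.0)},
    "conv2": {"fpga": (5.0, 2.5), "dpu": (7.0, 2.0), "gpu": (1.5, 10.0)},
    "head":  {"fpga": (3.0, 1.5), "dpu": (2.0, 0.8), "gpu": (0.5, 6.0)},
}
ORDER = ["conv1", "conv2", "head"]   # assumed linear chain (a simple DAG)
DEVICES = ["fpga", "dpu", "gpu"]
TRANSFER_MS = 1.0                    # assumed cost of moving data between devices
TRANSFER_MJ = 0.5


def cost(assignment):
    """Return (total latency, total energy) for one node-to-device assignment."""
    latency = 0.0
    energy = 0.0
    for i, (node, device) in enumerate(zip(ORDER, assignment)):
        node_ms, node_mj = PROFILE[node][device]
        latency += node_ms
        energy += node_mj
        # Pay a transfer penalty whenever consecutive nodes run on different devices.
        if i > 0 and assignment[i - 1] != device:
            latency += TRANSFER_MS
            energy += TRANSFER_MJ
    return latency, energy


def best_partition(deadline_ms):
    """Exhaustively pick the lowest-energy assignment that meets the deadline."""
    best = None
    for assignment in product(DEVICES, repeat=len(ORDER)):
        latency, energy = cost(assignment)
        if latency <= deadline_ms and (best is None or energy < best[2]):
            best = (assignment, latency, energy)
    return best  # None when no assignment meets the deadline


if __name__ == "__main__":
    for deadline in (20.0, 13.0, 3.0):
        print(deadline, best_partition(deadline))
```

With these made-up numbers, a 20 ms deadline is met without the GPU, and tightening the deadline pushes the search toward faster but more energy-hungry devices. Exhaustive search grows exponentially with the number of nodes, which is one reason a real workflow would use an optimization formulation such as the linear programming approach named in the abstract.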
