| Grant number: | 25/08976-4 |
| Support Opportunities: | Scholarships abroad - Research Internship - Doctorate |
| Start date: | November 15, 2025 |
| End date: | November 14, 2026 |
| Field of knowledge: | Physical Sciences and Mathematics - Computer Science - Computer Systems |
| Principal Investigator: | Fabio Luciano Verdi |
| Grantee: | Washington Rodrigo Dias da Silva |
| Supervisor: | Omer Farooq Rana |
| Host Institution: | Centro de Ciências em Gestão e Tecnologia (CCGT), Universidade Federal de São Carlos (UFSCAR), Campus de Sorocaba, Sorocaba, SP, Brazil |
| Institution abroad: | Cardiff University, Wales |
| Associated with the scholarship: | 23/04760-1 - Delivering Augmented Reality to the Edge: An Approach Toward Object Recognition through the In-Network Computing Paradigm, BP.DR |
Abstract

With the emergence of In-Network Machine Learning (ML), we have seen a paradigm shift in moving ML applications from the control plane to the data plane. However, implementing large-scale Deep Learning (DL) models on programmable network devices poses significant challenges due to their computing constraints. A potential solution is to decompose the model into components and distribute them across target devices. Given the diversity and heterogeneity of devices in an edge server, this approach introduces unique challenges in ensuring compatibility and efficient communication. In this PhD project, we investigate how to leverage programmable network devices to run object detection tasks from Augmented Reality (AR) applications. To address these challenges, we propose a resource-aware model partitioning workflow. In the proposed approach, a DL model is converted into a Directed Acyclic Graph (DAG) Intermediate Representation (IR) to ensure compatibility across different devices. Each DAG node is then profiled by estimating its computing time and energy cost on each target device, including FPGAs, DPUs, and GPUs. These profiling results enable us to optimally split the DAG into subgraphs and execute them on heterogeneous devices according to each device's computing capacity. Throughout this PhD project, we investigate linear and non-linear splitting methods; in the current phase, we address model partitioning with linear programming, aiming to minimize total energy consumption while meeting latency requirements. Preliminary results indicate that this approach allows an edge server equipped with an FPGA and a BlueField-2, using only a single DPU core, to run an object recognition task within 20 milliseconds without needing a GPU. These results also indicate that distributed inference consumes less energy than offloading the entire DAG to the GPU.
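As an illustration of the latency-constrained, energy-minimizing split described in the abstract, the sketch below shows one possible way such a partitioning could be computed from per-node profiling results on a simple chain-shaped DAG. All layer names, device labels, and cost numbers are hypothetical placeholders, and the brute-force search merely stands in for the project's linear-programming formulation; this is not the project's implementation.

```python
from itertools import product

# Hypothetical per-layer profiling results: (latency in ms, energy in mJ)
# for each candidate device, analogous to the profiling step in the abstract.
PROFILE = {
    "conv1": {"fpga": (2.0, 5.0), "dpu": (3.5, 3.0), "gpu": (1.0, 12.0)},
    "conv2": {"fpga": (2.5, 6.0), "dpu": (4.0, 3.5), "gpu": (1.2, 14.0)},
    "pool":  {"fpga": (0.5, 1.0), "dpu": (0.8, 0.8), "gpu": (0.3, 2.0)},
    "fc1":   {"fpga": (3.0, 7.0), "dpu": (5.0, 4.0), "gpu": (1.5, 15.0)},
    "fc2":   {"fpga": (1.0, 2.5), "dpu": (1.5, 1.5), "gpu": (0.5, 6.0)},
}
LAYERS = list(PROFILE)
DEVICES = ["fpga", "dpu", "gpu"]

# Hypothetical cost of moving an intermediate tensor between two devices.
TRANSFER_MS, TRANSFER_MJ = 1.0, 1.5


def split_chain(latency_budget_ms: float):
    """Assign each layer of a linear (chain) DAG to a device, keeping the
    assignment with the lowest total energy that meets the latency budget."""
    best = None  # (energy, latency, assignment)
    for assignment in product(DEVICES, repeat=len(LAYERS)):
        latency = energy = 0.0
        for i, (layer, dev) in enumerate(zip(LAYERS, assignment)):
            lat, en = PROFILE[layer][dev]
            latency += lat
            energy += en
            # Pay a communication penalty whenever the subgraph boundary
            # crosses from one device to another.
            if i > 0 and assignment[i - 1] != dev:
                latency += TRANSFER_MS
                energy += TRANSFER_MJ
        if latency <= latency_budget_ms and (best is None or energy < best[0]):
            best = (energy, latency, assignment)
    return best


if __name__ == "__main__":
    result = split_chain(latency_budget_ms=20.0)
    if result:
        energy, latency, assignment = result
        print(f"energy={energy:.1f} mJ, latency={latency:.1f} ms")
        print(dict(zip(LAYERS, assignment)))
```

For a real model the exhaustive search would be replaced by the linear-programming formulation mentioned in the abstract, but the shape of the problem is the same: minimize total energy across subgraph placements subject to an end-to-end latency budget.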