Dynamic Management of Last-Mile On-Demand Transit: Designing Transferable Models for Deep Reinforcement Learning Algorithms

Grant number: 24/18526-3
Support Opportunities: Scholarships in Brazil - Master
Start date: April 01, 2025
End date: July 31, 2026
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Fabio Kon
Grantee: Gustavo Henrique Santos Rodrigues
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Associated research grant: 23/00811-0 - EcoSustain: computer and data science for the environment, AP.TEM

Abstract

Technological advances have enabled mobility-on-demand (MoD) systems, which are used mostly by ride-hailing platforms. There is increasing interest in applying this concept to public transportation, particularly given sustainability concerns such as reducing carbon footprints. One innovative approach is on-demand transit (ODT). For instance, research is exploring the use of medium-sized vehicles, such as small buses or vans, to improve first- and last-mile connectivity to public transport hubs. However, dynamically managing these systems poses significant challenges due to varying demand patterns, uncertain traffic conditions, and the need for effective vehicle assignment, routing, and fleet rebalancing. A possible approach is to use deep reinforcement learning (deep RL) and trajectory transformer algorithms to select target routes for ODT vehicles. The main drawback is that such algorithms are trained specifically for a target region, which limits their applicability, especially given their high demand for training data. In this project, we will propose and evaluate transferable state representations for transit hub integration with last-mile ODT. We will use these representations to train trajectory transformer models that generate actions from a sequence of past states and rewards, and we will evaluate them on three metrics: the accuracy of the selected actions, the improvement in accuracy when training on data from multiple regions, and the transferability of the learned models. Finally, we will deploy the trajectory transformer model as an agent controlling a fleet of ODT vehicles in synthetic and real-world scenarios and evaluate its performance with and without online fine-tuning, comparing it to existing assignment and repositioning algorithms.
