| Grant number: | 12/50468-6 |
| Support Opportunities: | Research Grants - Research Partnership for Technological Innovation - PITE |
| Start date: | August 01, 2012 |
| End date: | July 31, 2014 |
| Field of knowledge: | Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques |
| Agreement: | Microsoft Research |
| Principal Investigator: | Siome Klein Goldenstein |
| Grantee: | Siome Klein Goldenstein |
| Host Institution: | Instituto de Computação (IC). Universidade Estadual de Campinas (UNICAMP). Campinas, SP, Brazil |
| City of the host institution: | Campinas |
| Partner institutions: | Microsoft |
| Associated scholarship(s): | 13/21349-1 - Vision for the blind: translating 3D visual concepts into 3D auditory clues, BP.MS; 12/22653-3 - Vision for the blind: translating 3D visual concepts into 3D auditory clues, BP.MS |
Abstract
The goal of this project is to construct and validate a complete proof-of-concept assistive device for blind and low-vision users. The device is based on translating visual information into auditory information. The key problem in this translation is one of bandwidth, which is orders of magnitude higher in the visual system than in the auditory system. We believe this is, in essence, what has made most previous sensory substitution proposals fail. In this project, we propose to circumvent that by using two key concepts: 1) using computer vision to "simplify" the visual scene, and 2) using 3D audio to exploit the inherent spatial sense of the auditory system. This system will use computer vision algorithms to extract high-level information and will communicate this information using different codification approaches, exploring 3D audio capabilities to provide spatial localization. The hardware component of this system will combine an off-the-shelf image + depth camera (Microsoft Kinect), an accelerometer/gyroscope, a headphone, and a notebook. The software component will be modular and extensible. The system will have distinct modes of operation, providing specialized functionalities such as navigation, people localization, and textual information translation, including signs and currency identification. Each of these modes has different requirements, both on the computer vision side, to extract the desired high-level information from the environment, and on the 3D audio side, to best communicate that information. A fully operational system presents significant scientific and technological challenges, from the development and proper integration of computer vision algorithms to the design and user validation of the audio interfaces. (AU)
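To illustrate the visual-to-auditory translation the abstract describes, the sketch below shows one possible way a detected object's direction and distance could be rendered as a spatialized stereo cue. This is a minimal illustration, not the project's actual pipeline: the function and class names are hypothetical, and it uses basic interaural level and time differences rather than the full 3D (e.g. HRTF-based) audio the project envisions.

```python
# Hypothetical sketch: map a detected object's azimuth and distance to a
# stereo cue using interaural level/time differences (ILD/ITD).
# All names here are illustrative; the real system may differ substantially.
import math
import numpy as np

SAMPLE_RATE = 44100        # audio sample rate in Hz
SPEED_OF_SOUND = 343.0     # m/s, used for the interaural time difference
HEAD_RADIUS = 0.0875       # approximate human head radius in metres


def render_cue(azimuth_rad: float, distance_m: float,
               freq_hz: float = 880.0, duration_s: float = 0.2) -> np.ndarray:
    """Return a stereo (N, 2) buffer encoding direction and distance.

    Direction is conveyed by interaural level and time differences;
    distance is conveyed by overall attenuation.
    """
    n = int(SAMPLE_RATE * duration_s)
    t = np.arange(n) / SAMPLE_RATE
    tone = np.sin(2 * math.pi * freq_hz * t)

    # Interaural time difference (Woodworth approximation).
    itd = HEAD_RADIUS * (abs(azimuth_rad) + math.sin(abs(azimuth_rad))) / SPEED_OF_SOUND
    shift = int(itd * SAMPLE_RATE)

    # Interaural level difference: louder in the ear facing the source.
    pan = (math.sin(azimuth_rad) + 1.0) / 2.0        # 0 = hard left, 1 = hard right
    gain = 1.0 / max(distance_m, 1.0)                # crude distance cue
    left = tone * (1.0 - pan) * gain
    right = tone * pan * gain

    # Delay the ear farther from the source.
    if azimuth_rad >= 0:       # source on the right: delay the left ear
        left = np.concatenate([np.zeros(shift), left])[:n]
    else:
        right = np.concatenate([np.zeros(shift), right])[:n]
    return np.stack([left, right], axis=1)


# Example: an obstacle detected 2 m away, 30 degrees to the right.
cue = render_cue(azimuth_rad=math.radians(30), distance_m=2.0)
print(cue.shape)   # (8820, 2) stereo samples ready for playback
```

In a mode-based design like the one described (navigation, people localization, text reading), each mode would supply its own detections, while a shared renderer along these lines would turn their positions into spatial audio.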