
Vision for the blind: translating 3D visual concepts into 3D auditory clues

Abstract

The goal of this project is to construct and validate a complete proof-of-concept assistive device for blind and low-vision users. The device is based on translating visual information into auditory information. The key problem in this translation is one of bandwidth, which is orders of magnitude higher in the visual system than in the auditory system. We believe this is, in essence, what has made most previous sensory substitution proposals fail. In this project, we propose to circumvent the problem with two key concepts: 1) using computer vision to "simplify" the visual scene, and 2) using 3D audio to exploit the inherent spatial sense of the auditory system. The system will use computer vision algorithms to extract high-level information and will communicate this information using different codification approaches, while exploiting 3D audio capabilities to provide spatial localization. The hardware component of the system will combine an off-the-shelf image + depth camera (Microsoft Kinect), an accelerometer/gyroscope, headphones, and a notebook computer. The software component will be modular and extensible. The system will have distinct modes of operation that provide specialized functionalities such as navigation, people localization, and translation of textual information, such as sign and currency identification. Each of these modes has different requirements, both on the computer vision side, to extract the desired high-level information about the environment, and on the 3D audio side, to best communicate that information. A fully operational system presents significant scientific and technological challenges, from the development and proper integration of computer vision algorithms to the design and user validation of the audio interfaces. (AU)


Scientific publications (6)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
SILVA, FERNANDA B.; WERNECK, RAFAEL DE O.; GOLDENSTEIN, SIOME; TABBONE, SALVATORE; TORRES, RICARDO DA S.. Graph-based bag-of-words for classification. PATTERN RECOGNITION, v. 74, p. 266-285, . (16/18429-1, 12/50468-6, 13/11378-4, 13/50155-0, 14/12236-1, 12/16172-2, 13/50169-1)
GRIJALVA, FELIPE; MARTINI, LUIZ CESAR; MASIERO, BRUNO; GOLDENSTEIN, SIOME. A Recommender System for Improving Median Plane Sound Localization Performance Based on a Nonlinear Representation of HRTFs. IEEE ACCESS, v. 6, p. 24829-24836, . (12/50468-6, 14/14630-9, 13/21349-1)
GRIJALVA, FELIPE; MARTINI, LUIZ CESAR; FLORENCIO, DINEI; GOLDENSTEIN, SIOME. Interpolation of Head-Related Transfer Functions Using Manifold Learning. IEEE SIGNAL PROCESSING LETTERS, v. 24, n. 2, p. 221-225, . (12/50468-6, 13/21349-1, 14/14630-9)
NETO, LAURINDO BRITTO; GRIJALVA, FELIPE; MARGARETH LIMA MAIKE, VANESSA REGINA; MARTINI, LUIZ CESAR; FLORENCIO, DINEI; CALANI BARANAUSKAS, MARIA CECILIA; ROCHA, ANDERSON; GOLDENSTEIN, SIOME. A Kinect-Based Wearable Face Recognition System to Aid Visually Impaired Users. IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, v. 47, n. 1, p. 52-64, . (12/50468-6, 15/19222-9, 13/21349-1, 14/14630-9)
ANDALO, FERNANDA A.; TAUBIN, GABRIEL; GOLDENSTEIN, SIOME. PSQP: Puzzle Solving by Quadratic Programming. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, v. 39, n. 2, p. 385-396, . (12/50468-6)
ANDALO, FERNANDA A.; TAUBIN, GABRIEL; GOLDENSTEIN, SIOME. Efficient height measurements in single images based on the detection of vanishing points. COMPUTER VISION AND IMAGE UNDERSTANDING, v. 138, p. 51-60, . (12/50468-6)