Listen "EP04: A Product Developer, Blind Technology Advocate, and Computer Vision Researcher Discuss the Future for Visual Interpretation Technologies (Stephanie, James, Karthik)"
Episode Synopsis
Brief summary of the episode
Stephanie Enyart, James Coughlan, and Karthik Kannan share diverse perspectives on developing visual interpretation technologies that meet the interests and needs of people with vision impairments.
Questions asked in the episode
[02:24] Could you share what has surprised you the most about the progress made over the past 10-20 years in technologies that provide visual assistance to real-world users?
[08:25] What do you see as the current limiting factor or barriers in developing better visual interpretation technologies?
[15:00] Could you describe how you envision technology will work in 10 years for interpreting visual information for real-world users? For example, what skills will the technology have? Also, how will the technology deliver information, such as via a live video feed or augmented reality or something else?
[23:18] Could you discuss how you think we should decide what information to include in a visual description?
[32:28] I next want to dig into one of the issues that is critical for designing vision assistance technology, which is access to large datasets from people with vision impairments to support evaluation and training of computer vision models. What are your expectations about how such datasets can be built responsibly and any experience you have in building such datasets?
[41:55] Could you please share to what extent each of you already talks or collaborates with researchers, industry developers, and blind technology advocates to advance visual assistance products and services? What do you find works well versus does not work well in these collaborations or conversations?
Guest bios
Stephanie Enyart is the Chief Public Policy and Research Officer at the American Foundation for the Blind. Stephanie serves as a strategic leader in developing policy that benefits people who are blind in education, employment, aging, and the intersectional issues of technology and transportation.
James Coughlan is a Senior Scientist at the Smith-Kettlewell Eye Research Institute, with a PhD in Physics from Harvard University. James has been at Smith-Kettlewell since 1998 and over this time has developed a wide array of impactful technologies for the blind and low-vision community.
Karthik Kannan is the co-founder and chief technology officer of Envision, a company that provides technology to help people with visual impairments in their daily lives. His company builds an app and smart glasses that help people with visual impairments learn about their surroundings, including by reading text and recognizing faces.
Danna Gurari is an Assistant Professor at the University of Colorado Boulder, where she also leads the Image and Video Computing research group.
Links to resources mentioned
https://vizwiz.org/workshops/2022-workshop/