EP05: A Product Developer, Blind Technology Advocate, and Computer Vision Researcher Discuss the Future for Visual Interpretation Technologies (Daniel, Andrew, Karthik)

01/08/2022 42 min Season 1 Episode 5


Episode Synopsis

Brief summary of the episode
Daniel Kish, Andrew Howard, and Will Butler share diverse perspectives on developing visual interpretation technologies that meet the interests and needs of people with vision impairments.
Questions asked in the episode

[03:04] What has surprised you the most about the progress made over the past 10-20 years in technologies that provide visual assistance to real-world users?
[08:10] What do you see as the current limiting factor or barriers in developing better visual interpretation technologies?
[12:21] Could you describe how you envision technology will work in 10 years for interpreting visual information for real-world users? For example, what skills will the technology have? Also, how will the technology deliver information, such as via a live video feed, augmented reality, or something else?
[18:34] Could you discuss how you think we should decide what information to include in a visual description?
[26:44] I next want to dig into one of the issues that is critical for designing vision assistance technology, which is access to large datasets from people with vision impairments to support evaluation and training of computer vision models. What are your expectations about how such datasets can be built responsibly and any experience you have in building such datasets?
[35:18] To what extent does each of you already talk or collaborate with researchers, industry developers, and blind technology advocates to advance visual assistance products and services? What works well, and what does not, in these collaborations or conversations?

Guest bios
Daniel Kish is the President of World Access for the Blind. He is a world leader in perceptual navigation and echolocation, having developed his own method of generating vocal clicks and using their echoes to identify his surroundings and navigate.
Andrew Howard is a Senior Staff Software Engineer at Google Research, with a PhD in Computer Science from Columbia University. Andrew is best known for his work on mobile-friendly deep learning models. Starting with MobileNets, then MobileNetV2, MobileNetV3, and MnasNet, his work has been broadly adopted in deep learning packages like PyTorch and TensorFlow as well as across a host of mobile phone platforms and apps.
Will Butler is the Chief Experience Officer at Be My Eyes, a free app with one of the largest online communities, which connects visually impaired individuals with free, on-demand, live video support from around 4 million volunteers and companies. Will has also hosted two podcasts on the topic of vision loss and accessibility.
Ed Cutrell is a Senior Principal Research Manager at Microsoft Research (MSR), where he leads the MSR Ability Team, a group of researchers focused on innovating new technologies for people with a range of disabilities.
Links to resources mentioned

https://vizwiz.org/workshops/2022-workshop/