Computer Vision | Theory: How Computers See in the Real World
Episode Synopsis
In this episode of Big Ideas Only, host Mikkel Svold takes a theoretical deep dive into how computers “see” with Andreas Møgelmose (Associate Professor of AI, Aalborg University; Visual Analysis & Perception Lab). We unpack the neural-network ideas behind modern vision, why 2012 was a turning point, how convolutional networks work, the difference between training, fine-tuning, and adding context, plus explainability, bias traps, multimodality, and what still needs solving.

In this episode, you’ll learn about:
1. How a 2012 vision breakthrough reshaped speech and language research
2. Neural networks explained simply — how they learn patterns from data
3. CNNs: how computers spot shapes and textures in images
4. Training, fine-tuning, and adding context to make models smarter
5. From hand-crafted features to fully data-driven learning
6. Explainability: the “ruler in skin-cancer photos” bias trap and what it teaches us
7. Multimodal systems: models combining text, images, and tools
8. Depth sensing with stereo, lidar, radar, and time-of-flight — and when 3D is essential
9. Privacy and governance: why real risk lies in implementation, not vision itself
10. Open challenges: fine-grained recognition, explainability, and machine unlearning
11. The pace of progress: steady research with headline-making leaps

Episode Content
01:09 How computer vision differs from other AI fields
01:16 The 2012 breakthrough: neural networks in vision that spread to speech and text
04:05 Neural networks 101: neurons, weights, and simple math scaled up to complex decisions
07:06 Training at scale: millions of images, pretraining, and fine-tuning for specific tasks
10:39 Fine-tuning vs. adding context in large language models; backpropagation explained
16:52 Layered learning: from edges to shapes, faces, and full objects
18:22 Before deep learning: feature engineering and why it hit its limits
20:44 How it’s built: data collection, architecture design, training loops, and learning plateaus
22:54 Bias pitfalls: the “ruler in skin-cancer photos” example and why explainability matters
25:23 Regulation and trust: high-risk uses and the demand for transparency
26:13 Connecting vision to action: from black-box outputs to robots with “vision in the loop”
27:41 Ensemble systems: language models coordinating other models (e.g., text-to-image)
29:03 True multimodality: training models jointly on text and images
30:17 AGI reflections: embodiment, experience, and the limits of data
32:44 Human vision vs. computer vision: depth of field, aperture, and why machines see everything in focus
34:40 Is progress slowing or steady? Research milestones versus quiet, continuous work
36:43 Public perception: many versions, but most still see “just ChatGPT”
37:41 Why the research pace feels natural — more people means faster progress

This podcast is produced by Montanus.
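For readers who want a concrete feel for the “neurons, weights, and simple math” and backpropagation topics discussed in the episode, here is a minimal, self-contained Python sketch: one artificial neuron trained by gradient descent on a toy task. The task, numbers, and learning rate are illustrative assumptions, not material from the episode.

```python
import math

def neuron(weights, bias, inputs):
    # A neuron is just a weighted sum passed through an "activation" (here, sigmoid).
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy task (an assumption for illustration): output 1 when the first input is high.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
w, b, lr = [0.0, 0.0], 0.0, 1.0

for _ in range(1000):
    for x, target in data:
        y = neuron(w, b, x)
        # Gradient of the squared error through the sigmoid: this local
        # error signal is what backpropagation chains through deeper layers.
        grad = (y - target) * y * (1.0 - y)
        w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
        b -= lr * grad

print(neuron(w, b, [1.0, 0.0]))  # trains toward 1
print(neuron(w, b, [0.0, 1.0]))  # trains toward 0
```

Modern vision networks stack millions of such neurons in layers (convolutional ones share weights across image positions), but the learning step is this same simple math scaled up.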