Data Lakes

22/11/2021 24 min Season 2 Episode 12


Episode Synopsis

This week we talk about data lakes. Essentially, a data lake is a mechanism for storing large quantities of (typically) raw data, both structured and unstructured, bringing together data from across an organisation.

In a "traditional" data warehouse solution, we tend to think about an "Extract, Transform and Load" process: extracting the data from source, transforming it for analysis, and loading it into the data warehouse. With a data lake, the approach tends to be "Extract, Load and Transform": data is extracted from source, loaded into the data lake, then transformed when needed. This can simplify the process, as there is no need to transform the data for every scenario at build time, so we can speed up implementation. The downside, of course, is that we have to do more work at run time. As such, it is probably not an either/or choice between data lakes and more structured systems.

The flexibility of data lakes makes it tempting to dump anything and everything into them. If this starts to happen without any curation, you are likely to end up with more of a data swamp. Data lakes are not a way to avoid governance.

The main cloud players all offer some sort of data lake:

Azure Data Lake
AWS Data Lake
Google Data Lake

If you already use Power BI, or are considering it, we strongly recommend you join your local Power BI user group here.

To find out more about our services and the help we can offer, contact us at one of the websites below:

UK and Europe: https://www.clearlycloudy.co.uk/
North America: https://www.clearlysolutions.net/
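The contrast between "Extract, Transform and Load" and "Extract, Load and Transform" described in the synopsis can be illustrated with a small sketch. This is only a toy model, not how any particular warehouse or data lake product works: the sample records, the in-memory lists standing in for a warehouse and a lake, and every function name below are hypothetical.

```python
import json

# Hypothetical raw records as they might arrive from a source system.
SOURCE_ROWS = [
    {"order_id": 1, "amount": "19.99", "country": "uk"},
    {"order_id": 2, "amount": "5.00", "country": "US"},
    {"order_id": 3, "amount": "12.50", "country": "uk"},
]


def transform(row):
    """Clean one raw record into an analysis-ready shape."""
    return {
        "order_id": row["order_id"],
        "amount": float(row["amount"]),
        "country": row["country"].upper(),
    }


def etl(rows):
    """ETL: transform at build time and store only the cleaned rows."""
    return [transform(r) for r in rows]


def elt_load(rows):
    """ELT, load step: store the raw records untouched (as JSON strings)."""
    return [json.dumps(r) for r in rows]


def elt_total_by_country(lake):
    """ELT, transform step: clean the raw records at read time, per question."""
    totals = {}
    for line in lake:
        row = transform(json.loads(line))
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals


warehouse = etl(SOURCE_ROWS)         # cleaned once, up front
lake = elt_load(SOURCE_ROWS)         # raw data kept as-is
totals = elt_total_by_country(lake)  # transformation cost paid at query time
```

In this sketch the "lake" keeps the original values (for example the lowercase country code), so a later question can apply a different transformation, while the "warehouse" only ever holds the cleaned form. The trade-off is that every query against the lake repeats the cleaning work.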

More episodes of the podcast The Clearly Podcast

So You Wanna be a BI Consultant? (2024 Edition) 01/07/2024

The 2024 IT Consulting Job Market 24/06/2024

Azure vs Fabric 17/06/2024

Power Apps and Power Pages- Demystifying the Options 10/06/2024

Data Quality Dashboards - aka The Snitch Report 03/06/2024

Dealing With a Mess of Data Systems and Products 27/05/2024

Working Outside the Microsoft Stack 20/05/2024

Choosing a Cloud Provider 13/05/2024

Getting Off the Excel Mindset 06/05/2024

The Democratisation of Data - A Revolution in Analytics, or Marketing BS? 19/02/2024

View all episodes


