Listen "#60 - How to input text into your model? Understanding tokenizers."
Episode Synopsis
Hello everyone, in this episode I explain how tokenizers work. They are what lets us feed text into an NLP model such as BERT or GPT. I cover three types of tokenizers: word-based, character-based, and sub-word-based.
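To make the three approaches concrete, here is a minimal Python sketch. It is only an illustration, not the code used by BERT or GPT: the function names and the tiny TOY_VOCAB are made up for this example, and the sub-word function uses a simple greedy longest-match rule with a "##" prefix for continuation pieces, in the style of WordPiece.

```python
def word_tokenize(text):
    """Word-based: split the text on whitespace."""
    return text.split()


def char_tokenize(text):
    """Character-based: every character becomes a token."""
    return list(text)


def subword_tokenize(word, vocab):
    """Sub-word: greedy longest-match-first against a fixed vocabulary.

    Pieces that continue a word carry a '##' prefix (WordPiece-style).
    Returns ['[UNK]'] if the word cannot be fully covered by the vocabulary.
    """
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]
        tokens.append(piece)
        start = end
    return tokens


# Hypothetical toy vocabulary, invented for this example only.
TOY_VOCAB = {"token", "##izer", "##s", "un", "##related"}

print(word_tokenize("tokenizers are useful"))
print(char_tokenize("token"))
print(subword_tokenize("tokenizers", TOY_VOCAB))
```

The word-based version keeps whole words, so any word missing from the vocabulary is lost. The character-based version never meets an unknown word but produces long sequences. The sub-word version sits in between: a word the vocabulary has never seen whole can still be built from known pieces.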
Instagram: https://www.instagram.com/podcast.lifewithai/
LinkedIn: https://www.linkedin.com/company/life-with-ai
Hugging Face blog about tokenizers: https://huggingface.co/docs/transformers/tokenizer_summary
More episodes of the podcast Life with AI
#99- GraphRAG.
05/12/2024
#98- On-device AI with SmolLM.
07/11/2024
#96- Maritaca AI, the Brazilian LLM company.
24/10/2024
#95- Why Chain of Thought works?
26/09/2024
#94- OpenAI o1
19/09/2024
#93- Different types of AI.
12/09/2024
#92- Llama3 benchmarks, vision and speech.
22/08/2024
#91- Llama 3 training.
15/08/2024
#90- Llama 3 paper overview.
25/07/2024