Andrej Karpathy Launches “nanochat”: A ChatGPT-Like Language Model on a $100 Budget
Andrej Karpathy, founder of Eureka Labs and a founding member of OpenAI, has caused a major stir with the release of nanochat: a full-stack implementation of a ChatGPT-style language model in a small, simplified codebase!
This isn’t just a language model — it’s a complete pipeline that covers all stages of building a conversational AI:
✴️ Tokenizer training, implemented in Rust for speed.
✴️ Pretraining on open datasets, followed by further training on dialogue and tool-use data.
✴️ Fine-tuning, then efficient inference with KV cache support.
✴️ A simple web-based UI to interact with your custom model — just like ChatGPT.
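To make the "KV cache" item in the list above more concrete, here is a minimal sketch of the general idea in plain Python. It is not nanochat's code: the `KVCache` class, the `attend` helper, and the toy vectors are all hypothetical names and values chosen for illustration. The sketch assumes single-head attention and that the query, key, and value vectors for each token are already computed.

```python
import math


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


class KVCache:
    """Hypothetical cache: keeps the keys and values of all earlier tokens."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Append only the new token's key and value, then attend over everything stored.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


def decode_without_cache(queries, keys, values):
    """Reference path: rebuild the full prefix at every decoding step."""
    return [attend(queries[t], keys[: t + 1], values[: t + 1]) for t in range(len(queries))]


# Toy vectors standing in for per-token projections (made-up numbers).
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.3]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

cache = KVCache()
cached_outputs = [cache.step(q, k, v) for q, k, v in zip(queries, keys, values)]
reference_outputs = decode_without_cache(queries, keys, values)
print(cached_outputs == reference_outputs)
```

In a real transformer the saving comes from not re-running the key and value projections for earlier tokens at each step. Here the vectors are supplied directly, so the sketch only shows that the cached path produces the same outputs as recomputing over the full prefix.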
💡 The project emphasizes clean, minimalist code: roughly 8,000 lines, mostly Python and PyTorch, which makes it well suited as an educational tool. Karpathy plans to use nanochat as the capstone project for his upcoming course LLM101n.
Forget massive budgets — a chat-capable version of the model can be trained on an 8xH100 GPU node in just 4 hours, costing only around $100 😍
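The arithmetic behind the roughly $100 figure can be sketched as follows. The hourly node price is an assumption for illustration (cloud prices vary by provider and over time); only the 8 GPUs and the 4 hours come from the text above.

```python
GPUS_PER_NODE = 8                # from the post: one 8xH100 node
TRAINING_HOURS = 4               # from the post: about 4 hours
NODE_PRICE_PER_HOUR_USD = 24.0   # ASSUMPTION: illustrative rental price for the whole node


def training_cost(hours, node_price_per_hour):
    """Total rental cost of one node for the given number of hours."""
    return hours * node_price_per_hour


total_cost = training_cost(TRAINING_HOURS, NODE_PRICE_PER_HOUR_USD)
gpu_hours = GPUS_PER_NODE * TRAINING_HOURS
cost_per_gpu_hour = total_cost / gpu_hours

print(f"total: ${total_cost:.0f}, GPU-hours: {gpu_hours}, per GPU-hour: ${cost_per_gpu_hour:.2f}")
```

Under that assumed price the run costs $96 for 32 GPU-hours, which is consistent with "around $100". A different hourly rate scales the total linearly.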
This exciting development opens the door for developers and researchers to build and experiment with their own ChatGPT-style models end to end, on rented hardware and at low cost, advancing the democratization of AI technology!
GitHub link 🔗👇
https://github.com/karpathy/nanochat