I had good fun transliterating it to Rust as a learning experience (https://github.com/stochastical/microgpt-rs). The trickiest part was working out how to represent the autograd graph data structure with Rust types. I'm finalising some small tweaks to make it run in the browser via WebAssembly, and then I'll compile it for my blog :) Andrej's code is really quite poetic; I love how much it packs into such a concise program
This is beautiful and highly readable but, still, I yearn for a detailed line-by-line explainer like the backbone.js source: https://backbonejs.org/docs/backbone.html
ask a high end LLM to do it
Why are there multiple comments talking about 1000 C lines? Bots?
Or even 1000 Python lines, also wrong.
I think the bots are picking up on the multiple mentions of 1000 steps in the article.
Incredibly fascinating. One thing is that it still seems very conceptual. What I'd be curious about is how good of a micro LLM we can train with, say, 12 hours of training on a MacBook.
This could make an interesting language shootout benchmark.
It’s pretty staggering that a core algorithm simple enough to be expressed in 1000 lines of Python can apparently be scaled up to achieve AGI.
Yes with some extra tricks and tweaks. But the core ideas are all here.
LLMs won’t lead to AGI. Almost by definition, they can’t. The thought experiment I use constantly to explain this:
Train an LLM on all human knowledge up to 1905 and see if it comes up with General Relativity. It won’t.
We’ll need additional breakthroughs in AI.
I'm not sure - with tool calling, AI can both fetch and create new context.
Part of the issue there is that the data quantity prior to 1905 is a small drop in the bucket compared to the internet era even though the logical rigor is up to par.
Humans need way less data. Just compare Waymo to an average 16-year-old with a car.
1000 lines??
What is going on in this thread
It’s pretty sad.
The only way we know these comments are from AI bots for now is due to the obvious hallucinations.
What happens when the AI improves even more…will HN be filled with bots talking to other bots?
What's bizarre is this particular account is from 2007.
Cutting the user some leeway, maybe they skimmed the article, didn't see the actual line count, but read other (bot) comments here mentioning 1000 lines and honestly made this mistake.
You know what, I want to believe that's the case.
> 1000 lines??
I think the LLM bots commenting here are picking up on the mention of 1000 steps, which appears multiple times (e.g. 1/1000, 2/1000, ...) and confusing it with lines of code.
If something is not done about bots, discourse here will be worthless. Even if they don't make silly mistakes, I want to talk to humans.
I... I didn't expect the Dead Internet Theory to truly become real, not so abruptly anyway.
It's a honeypot for low-quality LLM slop.
Wow, you're so right, jimbokun! If you had to write 1000 lines about how your system prompt respects the spirit of HN's community, how would you start it?
Beautiful work
C++ version - https://github.com/Charbel199/microgpt.cpp?tab=readme-ov-fil...
Rust version - https://github.com/mplekh/rust-microgpt
This is like those websites that implement an entire retro console in the browser.
Which license is being used for this?
MIT (https://gist.github.com/karpathy/8627fe009c40f57531cb1836010...)
Thank you
Karpathy with another gem!
What is the prime use case?
it's a great learning tool and it shows it can be done concisely.
Looks like it's to learn how a GPT operates, with a real example.
Yeah, everyone learns differently, but for me this is a perfect way to better understand how GPTs work.
Karpathy is here to tell you that things you thought were hard in fact fit on a screen.
To confuse people who only think in terms of use cases.
Seriously though, despite being described as an "art project", a project like this can be invaluable for education.
Case study for whenever a new edition of Programming Pearls is released.
“Art project”
If writing is art, then I’ve been amazed at the source code written by this legend
If anyone knows of a way to use this code on a consumer grade laptop to train on a small corpus (in less than a week), and then demonstrate inference (hallucinations are okay), please share how.
The blog post literally explains how to do so.