Astro - Hacker News

13 comments

kraddypatties an hour ago

I feel like most of this recent Autoresearch trend boils down to reinventing hyperparameter tuning. Is the SOTA still Bayesian optimization when given a small cluster? It was ~3 years ago when I was doing this kind of work; I haven't kept up since then.
Also, shoutout to SkyPilot! It's been a huge help for going multi-cloud with our training and inference jobs (getting GPUs is still a nightmare...)!
- karpathy 9 minutes ago
  
  Wrong and short-sighted take, given that the LLM explores serially, learning along the way, and can use tools and change code arbitrarily. It seems to currently default to something resembling hyperparameter tuning in the absence of more specific instructions. I briefly considered calling the project “autotune” at first but I think “autoresearch” will prove to be the significantly more appropriate name.
- ipsum2 30 minutes ago
  
  Hyperparam tuning that has better intuition and can incorporate architecture changes automatically. It won't invent something completely new though.
  - kraddypatties 14 minutes ago
    
    Hm, that's fair. It does feel like there's low-hanging fruit in combining "old school" methods for conducting a hyperparameter sweep efficiently _with_ the higher-level architecture edit ability of Autoresearch.
    Probably would cut the number of runs down by a significant number (as far as I can tell it's doing a grid search once it decides to mess with a knob or section of the architecture).
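To make the run-count point above concrete, here is a minimal sketch comparing a full grid with a fixed-budget random search. Everything in it is an assumption for illustration: `toy_loss` is a made-up stand-in for a training run, the knob values are invented, and random search is used only as the simplest "old school" alternative (it is not Bayesian optimization, and nothing here comes from Autoresearch itself).

```python
import itertools
import random


def toy_loss(lr, wd):
    """Hypothetical stand-in for one training run; lower is better."""
    return (lr - 3e-3) ** 2 * 1e4 + (wd - 0.1) ** 2


# Invented knob values for the grid (assumption, not from the thread).
LRS = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]
WDS = [0.0, 0.01, 0.05, 0.1, 0.3]


def grid_search():
    """Evaluate every combination; cost grows multiplicatively per knob."""
    runs = [(toy_loss(lr, wd), lr, wd) for lr, wd in itertools.product(LRS, WDS)]
    return min(runs), len(runs)


def random_search(budget, seed=0):
    """Evaluate a fixed number of sampled points, whatever the knob count."""
    rng = random.Random(seed)
    runs = []
    for _ in range(budget):
        lr = 10 ** rng.uniform(-4, -2)  # log-uniform learning rate
        wd = rng.uniform(0.0, 0.3)
        runs.append((toy_loss(lr, wd), lr, wd))
    return min(runs), len(runs)


grid_best, grid_runs = grid_search()
rand_best, rand_runs = random_search(budget=10)
print(f"grid:   {grid_runs} runs, best loss {grid_best[0]:.4f}")
print(f"random: {rand_runs} runs, best loss {rand_best[0]:.4f}")
```

The grid costs 5 x 5 runs here and would cost 5^k for k knobs, while the sampled search costs whatever budget it is given. Which one finds the better point depends on the objective, so the sketch only illustrates the cost side of the argument.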
zhwu an hour ago

The most surprising part: the agent had access to both H100s and H200s. Without being told, it noticed H200s scored better and started screening ideas on H100s, then promoting winners to H200s for validation. That strategy emerged entirely on its own.
- rogerrogerr an hour ago
  
  Why do we think this emerged “on its own”? Surely this technique has been discussed in research papers that are in the training set.
- hhh 27 minutes ago
  
  Why?… The experiment.yaml shows that it is calling H100/H200 explicitly, and it’s pretty common for humans to say “number bigger more gooder” for anything… Lie and reverse the values and see what happens. I would put money on a rabbit hole of complaining about it being misconfigured.
  - ed 8 minutes ago
    
    Models are familiar with H100s. They even predate ChatGPT.
- Aboutplants an hour ago
  
  Yeah I thought that was a particularly neat part
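The screen-then-promote pattern discussed in this thread can be sketched in a few lines. This is a hedged illustration, not the agent's actual logic: `screen_then_promote`, the idea names, and all scores are hypothetical, and lower scores are assumed to be better (for example, a validation loss).

```python
def screen_then_promote(candidates, cheap_eval, costly_eval, top_k):
    """Rank all candidates on the cheap tier, re-run only the top_k on the costly tier."""
    screened = sorted(candidates, key=cheap_eval)  # lower score is better
    finalists = screened[:top_k]
    validated = {c: costly_eval(c) for c in finalists}
    best = min(validated, key=validated.get)
    return best, validated


# Made-up scores standing in for runs on a cheaper and a scarcer GPU tier.
cheap_scores = {"idea_a": 1.10, "idea_b": 1.02, "idea_c": 1.05, "idea_d": 1.20}
costly_scores = {"idea_a": 1.00, "idea_b": 0.97, "idea_c": 0.95, "idea_d": 1.15}

best, validated = screen_then_promote(
    candidates=list(cheap_scores),
    cheap_eval=cheap_scores.get,
    costly_eval=costly_scores.get,
    top_k=2,
)
print(best, validated)
```

Only `top_k` candidates ever touch the scarce tier, which is the whole point of the pattern. The trade-off is that an idea ranked poorly by the cheap screen is never validated, even if it would have won on the costly tier.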
ipsum2 35 minutes ago

A cluster is 2 nodes? That's technically true, but not very exciting.
covi an hour ago

This feels like the chimpanzee with a power drill. An agent is honestly just brute-force search, but guided.
- chaos_emergent 11 minutes ago
  
  Human-driven research is also brute-force but with a more efficient search strategy. One can think of a parameter that represents research-search-space-navigation efficiency. RL-trained agents will inevitably optimize for that parameter. I agree with your statement insofar as the value of that efficiency parameter is lower for agents than humans today.
  It's really hard to imagine that they __won't__ exceed the human value for that efficiency parameter rather soon given that 1. there are plenty of scalar value functions that can represent research efficiency, of which a subset will result in robust training, and 2. that AI labs have a massive incentive to increase their research efficiency overall, along with billions of dollars and really good human researchers working on the problem.
- robotresearcher 18 minutes ago
  
  > An agent is honestly just brute-force search, but guided.
  Heuristic search, then.
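For readers who want the "brute force, but guided" distinction spelled out, here is a small sketch of greedy best-first search on a toy problem. The problem (reach a target integer using `+1` and `*2`), the function names, and the heuristic are all assumptions chosen for illustration; they say nothing about how any agent actually searches.

```python
import heapq


def best_first_search(start, goal, neighbors, heuristic, max_expansions=10_000):
    """Expand the most promising node first; return the number of expansions used."""
    frontier = [(heuristic(start), start)]
    seen = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, node = heapq.heappop(frontier)
        expansions += 1
        if node == goal:
            return expansions
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), nxt))
    return None  # goal not reached within the expansion budget


GOAL = 37


def neighbors(n):
    # Toy moves: add one or double, bounded so the space stays finite.
    return [m for m in (n + 1, n * 2) if m <= 2 * GOAL]


# Guided: prefer states closer to the goal. Blind: a constant heuristic,
# so the search just sweeps states in order with no guidance.
guided = best_first_search(1, GOAL, neighbors, lambda n: abs(GOAL - n))
blind = best_first_search(1, GOAL, neighbors, lambda n: 0)
print(f"guided expansions: {guided}, blind expansions: {blind}")
```

Both variants walk the same state space with the same moves. The only difference is the ordering function, which is the sense in which a guided brute-force search and a heuristic search are the same thing.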