> The reason I put off starting the series for so long is one of the same reasons blocking the writing of the paper: some of the introductory material is some of the most difficult to write. It has been such a long time that I no longer know how to adequately explain why the problem is so difficult.
So the model generates code, and let's say it is wrongly typed, we then take the rightly typed version and use cross entropy between them? Is that right? That just sounds like the typical training, unless you can somehow take arbitrary code that the model generated and automatically find the rightly typed version, so you won't need a dataset for it
Rather than letting the model generate arbitrary code and type-checking it afterward, the author wants to pre-restrict the output with templates that are well-typed by construction and only let the model make choices between valid alternatives in that restricted output space.
Related:
https://cybercat.institute/2025/05/07/neural-alchemy/
https://cybercat.institute/2026/02/20/categorical-semantics-...
https://cybercat.institute/2025/10/16/dependent-optics-ii/
> The reason I put off starting the series for so long is one of the same reasons blocking the writing of the paper: some of the introductory material is some of the most difficult to write. It has been such a long time that I no longer know how to adequately explain why the problem is so difficult.
My sympathies to Jules
So the model generates code, and let's say it is wrongly typed, we then take the rightly typed version and use cross entropy between them? Is that right? That just sounds like the typical training, unless you can somehow take arbitrary code that the model generated and automatically find the rightly typed version, so you won't need a dataset for it
Rather than letting the model generate arbitrary code and type-checking it afterward, the author wants to pre-restrict the output with templates that are well-typed by construction and only let the model make choices between valid alternatives in that restricted output space.