I think this post does a really good job of covering how multi-pronged performance is: it certainly doesn't hurt uv to be written in Rust, but it benefits immensely from a decade of thoughtful standardization efforts in Python that lifted the ecosystem away from needing `setup.py` on the hot path for most packages.
Someone once told me that a benefit of staffing a Haskell project was that it made it easy to select for the types of programmers who go out of their way to become experts in Haskell.
Tapping the Rust community is a decent reason to do a project in Rust.
It's an interesting debate. The flip side of this coin is getting hires who are more interested in the language or approach than the problem space and tend to either burn out, actively dislike the work at hand, or create problems that don't exist in order to use the language to solve them.
With that said, Rust was a good language for this in my experience. Like any "interesting" thing, there was a moderate bit of language-nerd side quest thrown in, but overall, a good selection metric. I do think it's one of the best Rewrite it in X languages available today due to the availability of good developers with Rewrite in Rust project experience.
The Haskell commentary is curious to me. I've used Haskell professionally but never tried to hire for it. With that said, the other FP-heavy languages that were popular ~2010-2015 were absolutely horrible for this in my experience. I generally subscribe to a vague notion that "skill in a more esoteric programming language will usually indicate a combination of ability to learn/plasticity and interest in the trade," however, using this concept, I had really bad experiences hiring both Scala and Clojure engineers; there was _way_ too much academic interest in language concepts and way too little practical interest in doing work. YMMV :)
Paul Graham said the same thing about Python 20 years ago [1], and back then it was true. But once a programming language hits mainstream, this ceases to be a good filter.
[1] https://paulgraham.com/pypar.html
This is important. The benefit here isn't the language itself. It's the fact that you're pulling from an esoteric language. People should not overfit and feel that whichever language is achieving that effect today is special in this regard.
In my experience this is definitely where Rust shined. The language wasn't really what made the project succeed so much as having relatively curious, meticulous, detail-oriented people on hand who were interested in solving hard problems.
Sometimes I thought our teams would be a terrible fit for more cookie-cutter applications where rapid development and deployment were the primary objectives. We got into the weeds all the time (sometimes because of Rust itself), but it happened to be important to do so.
Had we built those projects with JavaScript or Python I suspect the outcomes would have been worse for reasons apart from the language choice.
I think a lot of Rust rewrites have this benefit; if you start with hindsight, you can do better more easily. Of course, Rust is also often beneficial for its own sake, so it's a one-two punch :)
Succinctly, perhaps with some loss of detail:
"Rewrite" is important as "Rust".
as important as
> I think a lot of Rust rewrites have this benefit
I think Rust itself has this benefit
Completely agreed!
Rust rewrites are known for breaking (compatibility with) working software. That's all there is to them.
In Python's case, as the article describes quite clearly, the issue is that the design of "working software" (particularly setup.py) was bad to the point of insane (in much the same way as the NPM characteristics that enabled the recent Shai Hulud supply chain attacks, but even worse). At some point, compatibility with insanity has got to go.
Helpfully, though, uv retains compatibility with newer (but still well-established) standards in the Python community that don't share this insanity!
> uv is fast because of what it doesn’t do, not because of what language it’s written in. The standards work of PEP 518, 517, 621, and 658 made fast package management possible. Dropping eggs, pip.conf, and permissive parsing made it achievable. Rust makes it a bit faster still.
Isn't assigning out what all made things fast presumptive without benchmarks? Yes, I imagine a lot is gained by the work of those PEPs. I'm more questioning how much weight is put on the dropping of compatibility compared to the other items. There is also no coverage of decisions influenced by language choice, which likely influences "Optimizations that don’t need Rust".
This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed, but TOML parse times do show up in benchmarks in Cargo, and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
> Isn't assigning out what all made things fast presumptive without benchmarks?
We also have the benchmark of "pip now vs. pip years ago". That has to be controlled for pip version and Python version, but the former hasn't seen a lot of changes that are relevant for most cases, as far as I can tell.
> This also doesn't cover subtle things. Unsure if rkyv is being used to reduce the number of times that TOML is parsed, but TOML parse times do show up in benchmarks in Cargo, and Cargo/uv's TOML parser is much faster than Python's (note: Cargo team member, `toml` maintainer). I wish the TOML comparison page was still up and showed actual numbers to be able to point to.
This is interesting in that I wouldn't expect that the typical resolution involves a particularly large quantity of TOML. A package installer really only needs to look at it at all when building from source, and part of what these standards have done for us is improve wheel coverage. (Other relevant PEPs here include 600 and its predecessors.) Although that has also largely been driven by education within the community, things like e.g. https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... and https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pure-... .
> Zero-copy deserialization. uv uses rkyv to deserialize cached data without copying it. The data format is the in-memory format. This is a Rust-specific technique.
This (zero-copy deserialization) is not a rust-specific technique, so I'm not entirely sure why the author describes it as one. Any good low level language (C/C++ included) can do this from my experience.
Given the context of the article, I think "Rust specific" here means that "it couldn't be done in python".
For example "No interpreter startup" is not specific to Rust either.
I think the framing in the post is that it's specific to Rust, relative to what Python packaging tools are otherwise written in (Python). It's not very easy to do zero-copy deserialization in pure Python, from experience.
(But also, I think Rust can fairly claim that it's made zero-copy deserialization a lot easier and safer.)
I suppose it can fairly claim that now every other library and blog post invokes "zero-copy" this and that, even in the most nonsensical scenarios. It's a technique for when you literally cannot afford the memory bandwidth, because you are trying to saturate a 100Gbps NIC or handle 8k 60Hz video, not for compromising your data serialization scheme's portability for marketing purposes while all applications hit the network first, disk second, and memory bandwidth never.
Many of the hot paths in uv involve an entirely locally cached set of distributions that need to be loaded into memory, very lightly touched/filtered, and then sunk to disk somewhere else. In those contexts, there are measurable benefits to not transforming your representation.
(I'm agnostic on whether zero-copy "matters" in every single context. If there's no complexity cost, which is what Rust's abstractions often provide, then it doesn't really hurt.)
It's Rust vs Python in this case.
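For readers unfamiliar with the term, here is a rough Python-flavored sketch of the idea (the cache file name and record layout are made up for illustration): the on-disk format is the in-memory format, so loading the cache is just mapping the file and reading fields in place, with no parse step and no copy of the buffer.

```python
import mmap
import struct

# Hypothetical fixed-width cache record: two u16 version fields and a u64 size,
# little-endian. The file is simply a flat array of these records.
RECORD = struct.Struct("<HHQ")

with open("cache.bin", "rb") as f:  # made-up file name
    buf = memoryview(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ))

def record(n: int) -> tuple[int, int, int]:
    # Read the n-th record directly out of the mapped file; nothing is decoded
    # or allocated until a field is actually requested.
    return RECORD.unpack_from(buf, n * RECORD.size)
```

rkyv takes the same idea much further in Rust (nested structures, validation, real types instead of tuples), but the core trick is the same: skip the deserialize-into-new-objects pass entirely.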
> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically since upper bounds are almost always wrong. Packages declare python<4.0 because they haven’t tested on Python 4, not because they’ll actually break. The constraint is defensive, not predictive.
This is kind of fascinating. I've never considered runtime upper bound requirements. I can think of compelling reasons for lower bounds (dropping version support) or exact runtime version requirements (each version works for exact, specific CPython versions). But now that I think about it, it seems like upper bounds solve a hypothetical problem that you'd never run into in practice.
If PSF announced v4 and declared a set of specific changes, I think this would be reasonable. In the 2/3 era it was definitely reasonable (even necessary). Today though, it doesn't actually save you any trouble.
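For concreteness, this is how such a constraint behaves under a strict specifier check (a small sketch using the third-party `packaging` library; the version strings are just examples):

```python
from packaging.specifiers import SpecifierSet

declared = SpecifierSet(">=3.9,<4.0")  # the "defensive" form packages ship
relaxed = SpecifierSet(">=3.9")        # what uv effectively checks against

print("3.13" in declared)  # True: every released CPython satisfies it today
print("4.0" in declared)   # False: only a hypothetical Python 4 is rejected
print("4.0" in relaxed)    # True: dropping the cap removes that rejection
```

In other words, the upper bound only ever changes the answer for a Python that does not exist yet, which is why ignoring it prunes resolver backtracking without changing real-world results.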
The content is nice and insightful! But God I wish people stopped using LLMs to 'improve' their prose... Ironically, some day we might employ LLMs to re-humanize texts that had been already massacred.
The author’s blog was on HN a few days ago as well, for an article on SBOMs and Lockfiles. They’ve done a lot of work on the supply-chain security side and are clearly knowledgeable, and yet the blog post got similarly “fuzzified” by the LLM.
Interestingly I didn’t catch this, I liked it for not looking LLM written!
“Why this matters” being the final section is a guaranteed give away, among innumerable others.
I have reached a point where any AI smell (of which this article has many) makes me want to exit immediately. It feels torturous to my reading sensibilities.
I blame fixed AI system prompts - they forcibly collapse all inputs into the same output space. Truly disappointing that OpenAI et al. have no desire to change this before everything on the internet sounds the same forever.
You're probably right about the latter point, but I do wonder how hard it'd be to mask the default "marketing copywriter" tone of the LLM by asking it to assume some other tone in your prompt.
As you said, reading this stuff is taxing. What's more, this is a daily occurrence by now. If there's a silver lining, it's that the LLM smells are so obvious at the moment; I can close the tab as soon as I notice one.
> do wonder how hard it'd be to mask the default "marketing copywriter" tone of the LLM by asking it to assume some other tone in your prompt.
Fairly easy, in my wife's experience. She repeatedly got accused of using ChatGPT in her original writing (she's not a native English speaker, and was taught to use many of the same idioms that LLMs use) until she started actually using ChatGPT with about two pages of instructions for tone to "humanize" her writing. The irony is staggering.
I remain baffled by these posts getting excited about uv’s speed. I’d like to see a real poll, but I personally can’t imagine people listing speed as one of their top ten concerns about Python package managers. What are the common use cases where the delay due to package installation is at all material?
Edit to add: I use Python daily
At a previous job, I recall updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem. Was not an enjoyable experience.
uv has been a delight to use
> updating a dependency via poetry would take on the order of ~5-30m. God forbid after 30 minutes something didn’t resolve and you had to wait another 30 minutes to see if the change you made fixed the problem
I'd characterize that as unusable, for sure.
Working heavily in Python for the last 20 years, it absolutely was a big deal. `pip install` has been a significant percentage of the deploy time on pretty much every app I've ever deployed and I've spent countless hours setting up various caching techniques trying to speed it up.
As a multi-decade Python user, I find uv's speed "life changing". It's a huge devx improvement. We lived with what came before, but now that I have it, I would never want to go back, and it's really annoying to work on projects now that aren't using it.
`poetry install` on my dayjob’s monolith took about 2 minutes, `uv sync` takes a few seconds. Getting 2 minutes back on every CI job adds up to a lot of time saved
I can run `uvx sometool` without fear because I know that it'll take a few seconds to create a venv, download all the dependencies, and run the tool. uv's speed has literally changed how I work with Python.
Docker builds are a big one, at least at my company. Any tool that reduces wait time is worth using, and uv is an amazing tool that removes that wait time. I take it you might not use Python much; uv solves almost every pain point, and it's fast, which feels rare.
I avoided Python for years, especially because of package and environment management. Python is now my go to for projects since discovering uv, PEP 723 metadata, and LLMs’ ability to write Python.
Setting up a new dev instance took 2+ hours with pip at my work. Switching to uv dropped the Python portion down to <1 minute, and the overall setup to 20 minutes.
A similar, but less drastic speedup applied to docker images.
The biggest benefit is in CI environments and Docker images and the like where all packages can get reinstalled on every run.
The speed is nice, but I switched because uv supports "pip compile" from pip-tools, and it is better at resolving dependencies. Also, pip-tools uses (used?) internal pip methods and breaks frequently because of that; uv doesn't.
for me it's being able to do `uv run whatever` and always know I have the correct dependencies
(also switching python version is so fast)
> Every code path you don’t have is a code path you don’t wait for.
No, every code path you don't execute is that. Like
> No .egg support.
How does that explain anything if the egg format is obsolete and not used?
Similar with spec strictness fallback logic - it's only slow if the packages you're installing are malformed, otherwise the logic will not run and not slow you down.
And in general, instead of a list of irrelevant and potentially relevant things, it would be great to understand the actual time savings per item (at least for those that deliver the most speedup)!
But otherwise great and seemingly comprehensive list!
There's an interesting psychology at play here as well: if you are a programmer who chooses a "fast language", that is indicative of your priorities already. It's often not so much the language as the fact that the programmer has decided to optimize for performance from the get-go.
> PEP 658 went live on PyPI in May 2023. uv launched in February 2024. The timing isn’t coincidental. uv could be fast because the ecosystem finally had the infrastructure to support it. A tool like uv couldn’t have shipped in 2020. The standards weren’t there yet.
How/why did the package maintainers start using all these improvements? Some of them sound like a bunch of work, and getting a package ecosystem to move is hard. Was there motivation to speed up installs across the ecosystem? If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
> If setup.py was working okay for folks, what incentivized them to start using pyproject.toml?
It wasn't working okay for many people, and many others haven't started using pyproject.toml.
For what I consider the most egregious example: Requests is one of the most popular libraries, under the PSF's official umbrella, which uses only Python code and thus doesn't even need to be "built" in a meaningful sense. It has a pyproject.toml file as of the last release. But that file isn't specifying the build setup following PEP 517/518/621 standards. That's supposed to appear in the next minor release, but they've only done patch releases this year and the relevant code is not at the head of the repo, even though it already caused problems for them this year. It's been more than a year and a half since the last minor release.
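For anyone who hasn't seen what "specifying the build setup following PEP 517/518/621 standards" looks like in practice, here is a minimal sketch (the package name and pins are invented) of why it matters: the build requirements and project metadata become plain data that any tool can read with a TOML parser, no code execution required.

```python
import tomllib  # standard library since Python 3.11

# An inlined, hypothetical pyproject.toml following PEP 518 ([build-system])
# and PEP 621 ([project]).
PYPROJECT = """
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "example-pkg"
version = "1.0.0"
requires-python = ">=3.9"
dependencies = ["urllib3>=2"]
"""

data = tomllib.loads(PYPROJECT)
print(data["build-system"]["requires"])  # what to install before building
print(data["project"]["dependencies"])   # runtime deps, without running setup.py
```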
Because static declaration was clearly safer and more performant? My question is why pip isn't fully taking advantage
Because pip contains decades of built-up code and lacks the people willing to work on updating it.
Mmm I don't buy it. Not many projects use setup.py now anyway and pip is still super slow.
> Plenty of tools are written in Rust without being notably fast.
This also hasn't been my experience. Most tools written in Rust are notably fast.
> pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn’t, largely because backwards compatibility with fifteen years of edge cases takes precedence.
pip is simply difficult to maintain. Backward compatibility concerns surely contribute to that but also there are other factors, like an older project having to satisfy the needs of modern times.
For example, my employer (Datadog) allowed me and two other engineers to improve various aspects of Python packaging for nearly an entire quarter. One of the items was to satisfy a few long-standing pip feature requests. I discovered that the cross-platform resolution feature I considered most important is basically incompatible [1] with the current code base. Maintainers would have to decide which path they prefer.
[1]: https://github.com/pypa/pip/issues/13111
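To make the quoted claim concrete, here is roughly what "parallel downloads plus metadata-only resolution" looks like with nothing but the standard library. PEP 658 publishes a wheel's METADATA file at the wheel's own URL with `.metadata` appended, so a resolver can learn a package's dependencies without downloading the wheel itself. The wheel URLs below are placeholders, not real files; a real tool would take them from the simple index.

```python
import email
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Placeholder wheel URLs for illustration only.
WHEEL_URLS = [
    "https://files.pythonhosted.org/packages/.../example_a-1.0-py3-none-any.whl",
    "https://files.pythonhosted.org/packages/.../example_b-2.0-py3-none-any.whl",
]

def requires_dist(wheel_url: str) -> list[str]:
    # PEP 658: the core metadata is served next to the wheel, so only a few
    # kilobytes need to be fetched to see the dependency list.
    with urlopen(wheel_url + ".metadata") as resp:
        msg = email.message_from_bytes(resp.read())  # METADATA is RFC 822-style
    return msg.get_all("Requires-Dist") or []

with ThreadPoolExecutor(max_workers=8) as pool:
    for url, deps in zip(WHEEL_URLS, pool.map(requires_dist, WHEEL_URLS)):
        print(url.rsplit("/", 1)[-1], deps)
```

(This is a sketch, not how pip or uv actually structure their resolvers; it just shows that the building blocks are reachable from the standard library today.)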
> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install. You can opt in if you want it.
Are we losing out on performance of the actual installed thing, then? (I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?)
No, because Python itself will generate bytecode for packages once you actually import them. uv just defers that to first-import time, but the cost is amortized in any setting where imports are performed over multiple executions.
That sounds like yes? Instead of doing it once at install time, it's done once at first use. It's only once so it's not persistently slower, but that is a perf hit.
My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off if any significant percentage of the files being compiled might not be used ever so the overall cost is lower if you defer to run time.
> It's only once so it's not persistently slower, but that is a perf hit.
Sure, but you pay that hit either way. Real-world performance is always usage based: the assumption that uv makes is that people run (i.e. import) packages more often than they install them, so amortizing at the point of the import machinery is better for the mean user.
(This assumption is not universal, naturally!)
Ummm, your comment is backwards, right?
Which part? The assumption is that when you `$TOOL install $PACKAGE`, you run (i.e. import) `$PACKAGE` more than you re-install it. So there's no point in slowing down (relatively less common) installation events when you can pay the cost once on import.
(The key part being that 'less common' doesn't mean a non-trivial amount of time.)
Probably for any case where an actual human is doing it. On an image you obviously want to do it at bake time, so I feel default off with a flag would have been a better design decision for pip.
I just read the thread and use Python; I can't comment on what share of the speedup attributed to uv comes from this optimization.
Images are a good example where doing it at install time is probably best, yeah, since every run of the image starts 'fresh', losing the compilation which happened the last time the image was started.
If it were an optional toggle, it would probably become best practice to activate compilation in Dockerfiles.
You are right. It depends on how often this first start happens and whether it's bad or not. For most use cases, I'd guess (total guess, I have limited experience with Python projects professionally) it's not an issue.
This optimization hits serverless Python the worst. At Modal we ensure users of uv are setting UV_COMPILE_BYTECODE to avoid the cold start penalty. For large projects .pyc compilation can take hundreds of milliseconds.
Yes, uv skipping this step is a one time significant hit to start up time. E.g. if you're building a Dockerfile I'd recommend setting `--compile-bytecode` / `UV_COMPILE_BYTECODE`
Historically, the practice of producing .pyc files on install started with system-wide installed packages, I believe, when the user running the program might lack privileges to write them. If the installer can write the .py files, it can also write the .pyc, while the user running them might not be able to in that location.
If you have a dependency graph large enough for this to be relevant, it almost certainly includes a large number of files which are never actually imported. At worst the hit to startup time will be equal to the install time saved, and in most cases it'll be a lot smaller.
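For reference, the install-time step being discussed is essentially the following; uv's `--compile-bytecode` / `UV_COMPILE_BYTECODE` opt-in restores it, and you can get the same effect yourself in an image build. This is a minimal sketch; the site-packages path is an assumption for illustration.

```python
import compileall
import sys

# Walk site-packages and pre-populate the __pycache__ directories so the first
# import of each module skips the source-to-bytecode step at runtime.
SITE_PACKAGES = "/app/.venv/lib/python3.12/site-packages"  # hypothetical path

ok = compileall.compile_dir(SITE_PACKAGES, quiet=1, workers=0)  # workers=0: use all CPUs
sys.exit(0 if ok else 1)
```

Equivalently, run `python -m compileall <dir>` from the command line. Without it, CPython writes the same `.pyc` files lazily on first import, which is the cold-start cost the serverless and Docker comments above are describing.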
So... will uv make Python a viable cross-platform utility solution?
I was going to learn Python for just that (file-conversion utilities and the like), but everybody was so down on the messy ecosystem that I never bothered.
I write all of my scripts in Python with PEP 723 metadata and run them with `uv run`. Works great on Windows and Linux for me.
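For anyone who hasn't seen PEP 723 inline metadata, here is a minimal example of the kind of script described above (the dependency and the script's purpose are just illustrative):

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "requests",
# ]
# ///
"""Fetch a URL and print the status code. Run with: uv run fetch.py https://example.com"""
import sys

import requests

print(requests.get(sys.argv[1]).status_code)
```

`uv run` reads that comment block, provisions an isolated environment with a matching interpreter and the listed dependencies, and then executes the script; the same single file runs unchanged on Windows and Linux.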
It has been viable for a long time, and the kinds of projects you describe are likely well served by the standard library.
The article info is great, but why do people put up with LLM tics and slop in their writing? These sentences add no value and treat the reader as stupid.
> This is concurrency, not language magic.
> This is filesystem ops, not language-dependent.
Duh, you literally told me that in the previous sentence, and 50 million other times.
This kind of writing goes deeper than LLMs, and reflects a decline in reading ability, patience, and attention. Without passing judgement, there are just more people now who benefit from repetition and summarization embedded directly in the article. The reader isn't 'stupid', just burdened.
Indeed, I have come around in the past few weeks to the realization and acceptance that the LLM editorial voice is a benefit to an order of magnitude more HN readers than to those (like us) for whom it is ice-pick-in-the-nostril stuff.
Oh well, all I can do is flag.
I've talked about this many times on HN this year but got beaten to the punch on blogging it seems. Curses.
... Okay, after a brief look, there's still lots of room for me to comment. In particular:
> pip’s slowness isn’t a failure of implementation. For years, Python packaging required executing code to find out what a package needed.
This is largely refuted by the fact that pip is still slow, even when installing from wheels (and getting PEP 600 metadata for them). Pip is actually still slow even when doing nothing. (And when you create a venv and allow pip to be bootstrapped in it, that bootstrap process takes in the high 90s percent of the total time used.)
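If you want to see the fixed overhead being described, a quick, unscientific sketch is enough (both tools must be on PATH; absolute numbers will vary by machine, the point is the gap on a no-op invocation):

```python
import subprocess
import time

def wall_seconds(cmd: list[str]) -> float:
    # Time a command that does essentially nothing: no resolution, no network.
    start = time.perf_counter()
    subprocess.run(cmd, capture_output=True, check=True)
    return time.perf_counter() - start

for cmd in (["pip", "--version"], ["uv", "--version"]):
    print(" ".join(cmd), f"{wall_seconds(cmd):.3f}s")
```

A similar comparison of `python -m venv env` against `python -m venv --without-pip env` shows where most of the venv-creation time mentioned above goes.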
> When a package says it requires python<4.0, uv ignores the upper bound and only checks the lower.
I will bring popcorn on the Python 4 release date.
It would be popcorn-worthy regardless, given the rhetoric surrounding the idea in the community.
If it's really not doing any upper bound checks, I could see it blowing up under more mundane conditions; Python includes breaking changes on .x releases, so I've had e.g. packages require (say) Python 3.10 when 3.11/12 was current.
I always bring popcorn on major version changes for any programming language. I hope Rust's never 2.0 stance holds.
> Some of uv’s speed comes from Rust. But not as much as you’d think. Several key optimizations could be implemented in pip today: […] Python-free resolution
Umm…
I don't have any real disagreement with any of the details the author said.
But still, I'm skeptical.
If it is doable, the best way to prove it is to actually do it.
If no one implements it, was it ever really doable?
Even if there is no technical reason, perhaps there is a social one?
What are you talking about? This all exists.
Very nice article; it's always good to get a review of what a "simple"-looking tool does behind the scenes.
About Rust, though:
Some say a nicer language helps with finding the right architecture (I heard that about a C++ veteran dropping it for OCaml: any attempted idea would take weeks in C++ but only a few days in OCaml, so they could explore more).
Also, the parallelism might be a benefit of the language's orientation.
Enough semi-fanboyism.
This is great to read because it validates my impression that Python packaging has always been a tremendously overengineered mess. Glad to see someone finally realized you just need a simple standard metadata file per package.
Great post, but the blatant ChatGPT-esque feel hits hard… Don’t get me wrong, I love Astral (and the content!), but…
Reading the other replies here makes it really obvious that this is some LLM’s writing. Maybe even all of it…
This shit is ChatGPT-written and I'm really tired of it. If I wanted to read ChatGPT, I would have asked it myself. Half of the article is nonsensical repeated buzzwords thrown in for absolutely no reason.
uv seems to be a pet peeve of HN. I always thought pipenv was good but yeah, seems like I was being ignorant
> uv seems to be a pet peeve of HN.
Unless I've been seeing very different submissions than you, "pet peeve" seems like the exact opposite of what is actually the case?
Indeed; I don't think he knows what "peeve" means...
I too use pipenv unless there's a reason not to. I hope people use whatever works best for them.
I feel that sometimes there's a desire on the part of those who use tool X that everyone should use tool X. For some types of technology (car seat belts, antibiotics...) that might be reasonable but otherwise it seems more like a desire for validation of the advocate's own choice.