jshell> "πfs".toUpperCase()
$1 ==> "ΠFS"
Welcome to Node.js v26.3.0.
Type ".help" for more information.
> "πfs".toUpperCase()
'ΠFS'
Python 3.14.5 (main, May 10 2026, 10:21:34) [Clang 21.0.0 (clang-2100.0.123.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> "πfs".upper()
'ΠFS'
echo 'πfs' | awk '{print toupper($0)}'
ΠFS
Related. Others?
πfs – A data-free filesystem - https://news.ycombinator.com/item?id=36357466 - June 2023 (107 comments)
πfs – A data-free filesystem - https://news.ycombinator.com/item?id=28699499 - Sept 2021 (30 comments)
PiFS – The Data-Free Filesystem - https://news.ycombinator.com/item?id=26208704 - Feb 2021 (1 comment)
Πfs: Never worry about data again - https://news.ycombinator.com/item?id=21359338 - Oct 2019 (1 comment)
The π Filesystem for FUSE: Store Your Data in π - https://news.ycombinator.com/item?id=19223032 - Feb 2019 (1 comment)
pifs - Avoid disk space usage by saving your files in the digits of Pi - https://news.ycombinator.com/item?id=18687275 - Dec 2018 (1 comment)
πfs – A data-free filesystem - https://news.ycombinator.com/item?id=13869691 - March 2017 (105 comments)
Πfs: Stores your data in π - https://news.ycombinator.com/item?id=10856108 - Jan 2016 (1 comment)
Πfs: Never worry about data again - https://news.ycombinator.com/item?id=10847693 - Jan 2016 (1 comment)
File system that stores location of file in Pi - https://news.ycombinator.com/item?id=8018818 - July 2014 (98 comments)
100% Compression Using Pi - https://news.ycombinator.com/item?id=6698852 - Nov 2013 (32 comments)
(Reposts are fine after a year or so; links to past threads are just to satisfy extra-curious readers)
Reminds me of when I tried to use the library of babel as a data compression tool. It led me down a fun rabbit hole and was my first introduction to information theory.
The conclusion was that you basically need the same amount of data to represent the address of your data as the data itself, so it's not really effective at compression, just a fun thought experiment.
The cool part in modern times is that LLMs are basically a form of lossy compression that actually achieves the gist of what these tools fail at, although they require a massive substrate. This is related to the idea of AI/LLMs being a form of language compression.
It is worth noting that as the length of data increases it becomes extremely unlikely that the index and length of the sequence within pi would actually be smaller than the data.
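A rough way to see this for yourself, as a sketch rather than anything from the πfs project: the Python below computes the first 2000 decimal digits of π with Machin's formula and reports how many digits it takes to write down the offset of a given digit string. The helper names (`arctan_inv`, `pi_decimal_digits`, `address_cost`) and the 2000-digit window are my own choices for illustration.

```python
def arctan_inv(x, scale):
    """Return arctan(1/x) * scale using the Taylor series in integer arithmetic."""
    power = scale // x  # scale / x^(2k+1), starting with k = 0
    total = power
    x2 = x * x
    k = 0
    while power:
        power //= x2
        k += 1
        term = power // (2 * k + 1)
        total += -term if k % 2 else term  # signs alternate: -, +, -, ...
    return total


def pi_decimal_digits(n):
    """Return the first n decimal digits of pi after the decimal point."""
    guard = 10  # extra digits to absorb truncation error
    scale = 10 ** (n + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5, scale) - arctan_inv(239, scale))
    return str(pi_scaled // 10 ** guard)[1:]  # drop the leading "3"


def address_cost(data, digits):
    """Return (offset, digits needed to write the offset), or None if not found."""
    offset = digits.find(data)
    if offset < 0:
        return None
    return offset, len(str(offset))


DIGITS = pi_decimal_digits(2000)

for data in ("14", "999999", "2718", "12345"):
    print(data, address_cost(data, DIGITS))
```

The run of six 9s is the well-known lucky case: it starts at the 762nd decimal place, so its address is shorter than the data. Strings that print `None` simply don't occur in the first 2000 digits, and for arbitrary strings of length k you should expect to search on the order of 10^k digits, which means the offset is about as long as the string itself.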
That seems easy enough to solve. Simply record the index and length in pi of the index and length in pi.
See also: Recursion
yes I believe that's the joke
I vaguely remember an entry to a compression-benchmark that gamed the benchmark by treating the filename as part of the input to the decompression-algorithm, thus beating the metric that only measured the size of the file.
Reminds me of: https://www.spronck.net/sloot.html
Further reading: https://en.wikipedia.org/wiki/Sloot_Digital_Coding_System
Never heard of that one, that's amazing! Love it.
https://cs.stackexchange.com/a/53737/1704
> Matches that occur early enough in π to attain significant compression will not be varied. That is, it isn't possible to use π to compress interesting, real-world data because real-word strings are unlikely to arise early.
Finally, someone is doing something about the rising prices of storage!
So, π has been Boltzmann's Brain, this whole time?
It is disturbing to realize that pi then contains all past and future knowledge, including when I'll pass away.
So does every other random infinite sequence of bits. The unintuitive part comes from infinity, not pi.
It also doesn't contain all past and future knowledge because it also contains all possible falsehoods about the past and future in a way that's indiscernible from the truth.
Encoding information as an offset into a pseudorandom sequence is no more storage efficient than storing the information directly.
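If anyone wants to check that empirically, here is a small sketch (my own, with an arbitrary seed and stream length, nothing to do with πfs itself): it generates a seeded pseudorandom bit stream with Python's `random` module, looks up random k-bit messages in it, and averages the number of bits needed to write down the first offset where each message appears.

```python
import random


def average_offset_bits(k, trials=200, seed=0):
    """Average bit length of the first offset at which a random k-bit
    message appears in a seeded pseudorandom bit stream."""
    rng = random.Random(seed)
    n = 64 * 2 ** k  # assumption: long enough that a miss is very unlikely
    stream = format(rng.getrandbits(n), f"0{n}b")
    sizes = []
    for _ in range(trials):
        message = format(rng.getrandbits(k), f"0{k}b")
        offset = stream.find(message)
        if offset >= 0:
            # An offset of 0 still needs one bit to store.
            sizes.append(max(1, offset.bit_length()))
    return sum(sizes) / len(sizes)


for k in (4, 8, 12):
    print(k, round(average_offset_bits(k), 2))
```

The averages should come out close to k, so the offset costs roughly as many bits as the message, and that is before you also store the message length.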
You are aware this is meant as a joke, right?
And also all the days you don’t, so, by itself not very meaningful. Especially since you can’t tell which one is right in advance. In some sense, so does a calendar.
The worst part is that it contains Star Wars 4-6 from an alternate timeline where Disney did a reboot casting Chris Pratt as Han Solo.
(Fun fact: "Chrispratt" is an ancient Californian word that means "Joel McHale didn't want the role.")
Thank you for this Prattfall
Fear not! It’s probably so deep in pi that you’d pass away listening to someone tell you where!
So does a calendar, if you buy them enough years in advance.
So does a random number generator
This statement would follow from "pi is a normal number" (normality is sufficient, though not strictly required). While almost all real numbers are normal and pi is suspected to be so, it isn't known.
https://en.wikipedia.org/wiki/Normal_number
It isn't actually proven true.
I... I can't tell if this is an elaborate troll or pure genius. I love it.
Both.
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
This is probably a dumb question, but do we actually know that pi has an infinite number of decimal digits or are we assuming that it does because we haven’t developed a sufficiently powerful computer to calculate the last digit of pi?
I’m guessing this is something that could be formally proven?
Here is a one page proof that pi is irrational - https://heuklyd.github.io/papers/pdf/Niven-1947.pdf
Thanks for the PDF. I feel like I understand even less now than I did before.
Thanks for sharing. That’s a nice read. I’m glad I asked :)
It's amazing how inscrutable calculus can be when you return to reading it after not doing so for a period of time, much like lisp or forth. I don't think I've actually done an integral or taken a derivative in years. I can see the elegance of that proof but I'll be damned if I can actually follow the mathematics from one step to the next.
We definitely know that Pi is irrational, we just don't know if it's normal (i.e. if the PiFS joke even works).
Well, that should get GPT-5.5 extended thinking going for a few weeks.
I'm intrigued that π was capitalized to Π presumably automatically in the HN headline.
Why does your Python terminal report May 10th? Today is June 10th.
Posted many, many times before https://news.ycombinator.com/from?site=github.com/philipl
My favourite issue being about GDPR compliance https://github.com/philipl/pifs/issues/56
At what point is the metadata larger than the actual file?
Part of the joke is that, in this implementation, the metadata is guaranteed to be larger than the file:
> Now, we all know that it can take a while to find a long sequence of digits in π, so for practical reasons, we should break the files up into smaller chunks that can be more readily found.
> In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.
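To make the per-byte scheme concrete, here is a minimal sketch of the idea as the README describes it, not the project's actual code. It assumes hex digits of π computed up front with Machin's formula, a fixed window of 8192 hex digits, and offsets stored as unsigned 16-bit integers; all function names and the storage width are my own assumptions.

```python
import struct


def arctan_inv(x, scale):
    """Return arctan(1/x) * scale using the Taylor series in integer arithmetic."""
    power = scale // x
    total = power
    x2 = x * x
    k = 0
    while power:
        power //= x2
        k += 1
        term = power // (2 * k + 1)
        total += -term if k % 2 else term
    return total


def pi_hex_digits(n):
    """Return the first n hexadecimal digits of pi after the point."""
    guard = 10  # extra hex digits to absorb truncation error
    scale = 16 ** (n + guard)
    pi_scaled = 4 * (4 * arctan_inv(5, scale) - arctan_inv(239, scale))
    return format(pi_scaled >> (4 * guard), "x")[1:]  # drop the leading "3"


PI_HEX = pi_hex_digits(8192)


def encode(data):
    """Replace each byte with the offset of its two hex digits in PI_HEX.
    Raises ValueError if a byte value does not occur in the window."""
    return [PI_HEX.index(f"{byte:02x}") for byte in data]


def decode(offsets):
    """Read each byte back from its offset in PI_HEX."""
    return bytes(int(PI_HEX[i:i + 2], 16) for i in offsets)


def metadata_bytes(offsets):
    # Assumption: each offset is stored as a little-endian unsigned 16-bit integer.
    return struct.pack(f"<{len(offsets)}H", *offsets)


data = b"pifs"
offsets = encode(data)
assert decode(offsets) == data
print(len(data), "bytes of data ->", len(metadata_bytes(offsets)), "bytes of metadata")
```

With one offset per byte, and offsets that don't all fit in a single byte, the metadata in this sketch is always twice the size of the file.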
Half the time it should be larger, right?
Love it! This feels very much in the spirit of Tom7's Harder Drive [1]
[1] https://www.youtube.com/watch?v=JcJSW7Rprio
Short Storage Number - SSN
0x123456789ABCDEF0
use this number as a shorter nibble storage alternative...
What a brilliant idea! Of course, it’s not in the repository so I can’t apt-get install it. Debian...always so far behind.
absolutely genius