Maybe I’m missing it but the page is really light on technical information. Is this a quantized / distilled model of a larger LLM? Which one? How many parameters? What quantization? What T/s can I expect? What are the VRAM requirements? Etc etc
You can see what it uses here - https://github.com/ente-io/ente/blob/main/web/apps/ensu/src/...
Either LFM2.5-1.6B-4bit or Qwen3.5-2B-8bit or Qwen3.5-4B-4bit
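For anyone mapping those model names to download sizes: a back-of-the-envelope estimate is parameters × bits-per-weight ÷ 8, ignoring metadata and any unquantized layers, so real files run somewhat larger (which lines up with the ~1.3 GB and ~2.5 GB downloads reported elsewhere in this thread). A quick sketch:

```python
def approx_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Rough quantized-model file size in GB: params * bits / 8."""
    return params_billions * bits_per_weight / 8

# The three models listed above:
for name, params, bits in [
    ("LFM2.5-1.6B-4bit", 1.6, 4),
    ("Qwen3.5-2B-8bit", 2.0, 8),
    ("Qwen3.5-4B-4bit", 4.0, 4),
]:
    print(f"{name}: ~{approx_size_gb(params, bits):.1f} GB")
```

So the 4-bit 1.6B model lands under a gigabyte and the other two near 2 GB before overhead.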
Hmm, the Mac app downloaded gemma-3-4b-it-Q4_K_M.gguf for me (on an Apple M4) - maybe the desktop apps download different models?
Though, I don't see any references to Gemma at all in the open source code...
Huh, 1.6B/2B/4B models, I guess they weren't joking when they said "not as powerful as ChatGPT or Claude Code". Also unsure why they said "Claude Code", it's not a CLI agent AFAIK?
> Also unsure why they said "Claude Code", it's not a CLI agent AFAIK?
Claude Code is a Desktop app as well.
I don’t think so. IIRC the desktop app is called Claude and it has a code option in the UI.
The confusing way AI companies name their products is something to be studied.
This seems to be a general chat app, but otherwise small models can be very effective within the right use cases and orchestration.
> otherwise small models can be very effective within the right use cases and orchestration
A very limited number of use cases, perhaps. As a generalized chat assistant? I'm not sure you'd get anything of value out of them, but happy to be proven otherwise. I have all of those models locally already, without fine-tuning; what use case could I try right now where any of them is "very effective"?
I tried it on my iPhone 13 mini. I believe the model you get changes depending on your phone specs. For me it downloaded a ~1.3GB model which can speak in complete sentences but can’t do much beyond that. Can’t blame them though—that model is tiny, and my device wasn’t designed for this.
I have the same questions. After installing the app, it downloads 2.5 GB of data. I presume this is the model.
First heard about them (Ente) yesterday in a discussion about "which 2FA are you using?". Switched straight to https://ente.com/auth/ on Android and Linux desktop and I'm very happy with it.
Going to give this a try...
You presumably had a working 2FA app already, but off the cuff decided to switch to a new, unvetted variant X: a basically unknown auth system, after reading a few paragraphs of text in an afternoon?
Does this seem sound?
While I would have the same reaction, in this case I think it is a sane decision. Ente is cornering the privacy market and I think they're doing a great job. They have a lot to lose (trust) and it would be stupid if they did something shady with the data entered in the 2FA app.
Not knowing them, how could OP trust them instantly? Whether or not that trust is warranted, you have to know a company for a while, and through many different trustworthy sources. The story is a bit strange.
> cornering the privacy market
this seems self-contradictory
Ente is extremely well known in privacy circles, so this is not just some random company with a random app out of nowhere. Check PrivacyGuides, for example.
I ended up picking them because they were the only open source one that worked on all my devices IIRC.
https://en.wikipedia.org/wiki/Comparison_of_OTP_applications
if it helps, I've used ente for a year and I really like it.
This sounds like an ad.
As do most of the associated comments. I think we're surrounded by bots.
I'm not a bot. Check my comment history and account age.
Oh, wow, thanks for posting that. I switched to Ente for my photos recently, had no idea they also have a 2FA app. I was looking for a replacement for Aegis (after a switch to iOS), and this can even import from Aegis backup files. Neat. This means I can finally ditch my old phone I still had to have around just for 2FA :)
Weird hype going on here in comments.
I just tried it. It downloaded Qwen3.5 2B on my phone, and it's pretty coherent in its sentences, but it's really annoying how often it mentions Ente products. Other than that it's fast enough to talk to and definitely an easy way to run a model locally on your phone.
There are literally 1000s of these types of apps. Why is this on the Front Page?
Have you tried WebLLM? Or this wrapper: CodexLocal.com. Basically, you get a rather simple but capable LLM right in your browser, using WebLLM and the GPU.
There are dozens of local inference apps that basically wrap llama.cpp and someone else's GGUFs. The decentralized sync-history part seems new? Not much else. But the advertising copy is so insufferably annoying in how it presents this wrapper as a product.
Have a comparison chart against Ollama, LMStudio, LocalAI, Exo, Jan.AI, GPT4ALL, PocketPal, etc.
I like Ente, but isn't their core product a photos application? Its offshoots like this and 2FA feel incongruous.
If you are into local LLMs, check out apfel:
https://github.com/Arthur-Ficial/apfel
Apple AI on the command line.
Looks excellent -- thanks; shame older Intel Macs don't get it (a lot of those are still around).
I've been having fun with this one https://github.com/osaurus-ai/osaurus
There is also another app called Off Grid, which lets you run any model from Hugging Face (of course you need to choose one your phone can handle).
https://github.com/alichherawalla/off-grid-mobile-ai
Had used Cactus before - https://news.ycombinator.com/item?id=44524544
Then moved to PocketPal for local LLMs.
The (HN) title is misleading (unlike the actual title): it's an LLM _app_, not an LLM.
> This is not the beginning, nor is this the end. This is just a checkpoint.
Come onnnnnn. I would rather read a one-line "Check out our offline LLM" than a whole press release of slop.
This looks very neat. I'm not familiar with the nitty gritty of AI so I really don't understand how it can reply so quickly running on an iPhone 16. But I'm not even going to bother searching for details because I don't want to read slop.
How is this any different from Ollama plus Open Web UI?
This looks amazing! As I learn and experiment more with local LLMs, I'm becoming more of a fan of local/offline LLMs. I believe there's a huge gap between local LLM based apps and commercial models like Claude/ChatGPT. Excited to see more apps leveraging local LLMs.
Please god stop letting LLMs write your copy. My brain just slides right over this slop. Perhaps you have a useful product but christ almighty I cannot countenance this boring machine generated text.
I'm working on a rather simple idea: a WordPress plugin that lets you use a local LLM inside your WordPress CMS.
It requires a Firefox add-on to act as a bridge: https://addons.mozilla.org/en-US/firefox/addon/ai-s-that-hel...
There is honestly not much to test just yet, but feel free to check it out and give feedback on the idea here: https://codeberg.org/Helpalot/ais-that-helpalot
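To make the idea concrete: stripped of the WordPress and Firefox-add-on plumbing, "connecting to a local LLM" boils down to a POST against Ollama's standard REST endpoint on the user's own machine. A minimal sketch (the model name is just an example; swap in whatever you have pulled locally):

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default local port

def build_payload(prompt: str, model: str = "llama3.2") -> dict:
    """Request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3.2") -> str:
    """Send a prompt to a locally running Ollama and return the reply text."""
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the full completion in "response".
        return json.loads(resp.read())["response"]

# e.g. generate("Summarize this post in one sentence: ...")
```

Everything stays on the user's machine; the plugin's job is really just shuttling this request/response between the CMS and the local server.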
The essence works: I was able to have it make a simple summary of CMS content. So next up is making it do something useful, and making it clear how other plugins could use it.
Spam? Ad?
Also: "Your AI agent can now create, edit, and manage content on WordPress.com" https://wordpress.com/blog/2026/03/20/ai-agent-manage-conten...
Spam for what? This is Hacker News; I'm "hacking something" to push more control to users.
I'm talking about connecting Ollama to your WordPress, not via MCP or something else that's complicated for a relatively normal user. But thanks for the link.
It seems your link about the WordPress variation validated my idea :).
If the new WordPress feature allowed connecting to Ollama, there would be no need for my plugin anymore. But I don't see that in the current documentation.
So for now, I see my solution as superior for anyone who doesn't have a paid subscription but has a decent laptop and would like to use an LLM 'for free' (apart from power usage) with 100% privacy on their website.
> use a local LLM inside your wordpress CMS
For when WordPress doesn't have enough exploits and bugs as it is. Also, why bother with WordPress in the first place if you're already having an LLM spit out content for you?
What's your point? Don't use an LLM for CMS content? That my code is buggy? Or that people shouldn't trust the LLM they run on their own computer with their own website?
You can check the code for exploits yourself. Beyond that, it's just your LLM talking to your own website.
> Also why bother with wordpress in the first place
Weird question, but sure: I use WordPress because I have a website that I want to run with a simple CMS that can also run my custom WordPress plugins.