Hacker News | digitalbase's comments

Cool stuff. You should make it OSS and charge a one-time fee for it. I would run it on my own infra but pay you once(.com)

tmux too.

Trying workmux with claude. Really cool combo


Was searching for this this morning and settled on https://handy.computer/


Big fan of Handy, and it's cross-platform as well. Parakeet V3 gives the best experience, with very fast and accurate-enough transcriptions when talking to AIs that can read between the lines. It does have stuttering issues, though. My primary use of these is when talking to coding agents.

But a few weeks ago someone on HN pointed me to Hex, which also supports Parakeet V3, and incredibly enough, it is even faster than Handy because it's a native macOS-only app that leverages CoreML/the Neural Engine for extremely quick transcriptions. Long ramblings transcribed in under a second!

It’s now my favorite fully local STT for macOS:

https://github.com/kitlangton/Hex


I installed a few different STT apps using Parakeet at the same time, and I think they conflicted with each other. Otherwise Hex would've won for me, I think. I want to reformat the Mac & try again (it's been a while anyway).

My comment on this from a month back: https://news.ycombinator.com/item?id=46637040


Hex is great, and I'm not trying to pull you away from them - I'd love to get your POV when you give these a spin next time. Email or DM me.


I was having the same journey but landed on https://github.com/hoomanaskari/mac-dictate-anywhere


I just learned about Handy in this thread and it looks great!

I think the biggest difference between FreeFlow and Handy is that FreeFlow implements what Monologue calls "deep context", where it post-processes the raw transcription with context from your currently open window.

This fixes misspelled names if you're replying to an email / makes sure technical terms are spelled right / etc.
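As a toy sketch of that correction idea (purely illustrative: FreeFlow's real pipeline uses an LLM for this step, not fuzzy matching, and the function name here is made up), you can think of it as snapping transcribed words to the exact spellings of terms found in the active window:

```python
import difflib

def correct_with_context(transcript: str, context_terms: list[str],
                         cutoff: float = 0.8) -> str:
    """Toy 'deep context' pass: snap words that fuzzily match a term seen
    in the active window (a name, a technical term) to its exact spelling."""
    by_lower = {t.lower(): t for t in context_terms}
    out = []
    for word in transcript.split():
        core = word.strip(".,!?")  # keep trailing punctuation intact
        hit = difflib.get_close_matches(core.lower(), list(by_lower),
                                        n=1, cutoff=cutoff)
        out.append(word.replace(core, by_lower[hit[0]]) if hit else word)
    return " ".join(out)
```

So `correct_with_context("ping symfonie about it", ["Symfony"])` snaps the misheard word to "Symfony". An LLM-based step does the same thing much more robustly, at the cost of the latency discussed below.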

The original hope for FreeFlow was for it to use all local models like Handy does, but with the post-processing step the pipeline took 5-10 seconds instead of <1 second with Groq.


There's an open PR in the repo, which will be merged, that adds this support. Post-processing is an optional feature, and when enabled, end-to-end latency can still easily stay under 3 seconds.


That’s awesome! The specific thing that was causing the long latency was the image LLM call to describe the current context. I’m not sure if you’ve tested Handy’s post-processing with images or if there’s a technique to get image calls to be faster locally.

Thank you for making Handy! It looks amazing and I wish I found it before making FreeFlow.


Could you go into a little more detail about the deep context - what does it grab, and which model is used to process it? Are you also using a groq model for the transcription?


It takes a screenshot of the current window and sends it to Llama on Groq, asking it to describe what you're doing and pull out any key info, like names with their exact spelling.

You can go to Settings > Run Logs in FreeFlow to see the full pipeline run for each request, with the exact prompt and LLM response, so you can see exactly what is sent and returned.
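A rough sketch of what such a vision request might look like, assuming an OpenAI-compatible chat completions API (the model name, prompt text, and function name are illustrative, not FreeFlow's actual values):

```python
import base64

def build_context_request(screenshot_png: bytes,
                          model: str = "example-vision-model") -> dict:
    """Build a vision chat request asking the model to describe the active
    window and extract exact spellings of any names/terms it shows."""
    b64 = base64.b64encode(screenshot_png).decode()
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": ("Describe what the user is doing and list any "
                          "names or technical terms with exact spelling.")},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
```

The model's text response then gets folded into the prompt for the post-processing step.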


You can try ottex for this use case - it has both context capture (app screenshots) and native LLM support, meaning it can send the audio AND a screenshot directly to Gemini 3 Flash to produce a bespoke result.


As a very happy Handy user: indeed, it doesn't do that. It will be interesting to see if FreeFlow works better; I'll give it a shot, thanks!


Yes, I also use Handy. It supports local transcription via Nvidia's Parakeet TDT2, which is extremely fast and accurate. I also use Gemini 2.5 Flash Lite for post-processing via the free AI Studio API (post-processing is optional and can also use a locally hosted LM).


I didn't try Handy, but I've been using Whisper-Key. It's super simple, gets out of your way, all local, a single-file executable (portable, so zero install too). That's for Windows; I don't know about a Mac version.

[1] https://github.com/PinW/whisper-key-local


The astroturfing here, off topic from the OP's post, is unbearable.


I think you need to refresh your understanding of 'off topic' and 'astroturfing'.


Handy rocks. I recently had minor surgery on my shoulder that required me to be in a sling for about a month, and I thought I'd give Handy a try for dictating notes and so on. It works phenomenally well for most speech-to-text use cases - homonyms included.


Thanks for the recommendation! I picked the smallest model (Moonshine Base @ 58MB), and it works great for transcribing English.

Surprisingly, it produced a better output (at least I liked its version) than the recommended but heavy model (Parakeet V3 @ 478 MB).


Great feedback :) Also, support for the v2 versions of the Moonshine models should be out today!


Handy's great! I find the latency to be just a bit too much for my taste. Like half the people in this thread, I built my own, but with a bit more emphasis on speed:

https://usetalkie.com


Not sure if it's just me, but Handy crashes on my Arch setup, no matter which version I run. Could be something with Wayland or PipeWire, but I didn't see anything obvious in the logs.


https://github.com/goodroot/hyprwhspr - have you tried this? I have a nice new 64GB Linux machine waiting to be set up so I can kick the tires on it.

Pretty sure it's awesome - sorry, OP, for mentioning another project; we're all learning here :)


Thanks, will take a look.


Handy is genuinely great and it supports Parakeet V3. It’s starting to change how I "type" on my computer.


I use handy as well, and love it.


Handy is nothing short of fantastic, really brilliant when combined with Parakeet v2!


Prezly | Remote (Europe ±2h CET) | Full-time | https://www.prezly.com

We run a platform that handles story publishing, media delivery and newsroom workflows for hundreds of organizations, with millions of readers hitting it every month. PHP on the backend (APIs/app), TS/React on the frontend.

Hiring:
• Senior Backend Engineer (PHP, Symfony, PostgreSQL)
• Full-Stack Engineer (React, TypeScript, PHP)

Why join Prezly?
• Work on systems that deal with high traffic, large media libraries and evolving data models
• Improve and modernize a long-lived PHP codebase without breaking production
• Own features end-to-end: database → API → UI → production
• Direct line to product and customers, no layers of PMs or process
• Profitable, remote-first company with high autonomy

Stack: PHP (Symfony), PostgreSQL, React/TypeScript, Kubernetes on AWS. We actively use AI coding tools like Claude Code and Cursor.

Apply: https://careers.prezly.com


Same point as https://news.ycombinator.com/item?id=46282964

Disagree, though: people manage keys just fine, or they can be taught.

But even if there are people in the world who never get it, it could be outsourced to a central identity provider that manages your key and messages. For the end user, it would be a user/password combo they can reset.

If the network becomes more popular someone will definitely build something like that.

The technical capabilities (remote signers, bunkers, ...) already exist.


rglullis wrote that they "do not want to". I went a step further, expressing that they couldn't even if they wanted to. Not necessarily from lack of understanding so much as poor computing habits--malware, crashes without backups, forgetfulness, post-it notes in the same household as untrustworthy relatives, etc. Normies need the administrative solution, but then we're back to Sauron.


Incorrect.

Everyone can announce to the network where they read/write from. Clients can figure out (based on the people you follow) from which relays to get the content.

I've been using it like this for nearly a year. It works.
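That "announce where you write, and clients figure out where to read" lookup can be sketched roughly like this (a greedy set-cover over NIP-65-style relay announcements; the data shapes and function name are illustrative, not any particular client's code):

```python
from collections import Counter

def pick_read_relays(relay_lists: dict[str, list[str]],
                     max_relays: int = 3) -> list[str]:
    """Given each followed pubkey's announced write relays, greedily pick
    a small set of relays that covers every follow."""
    uncovered = set(relay_lists)
    chosen = []
    while uncovered and len(chosen) < max_relays:
        # count how many still-uncovered follows each relay would serve
        counts = Counter(r for pk in uncovered for r in relay_lists[pk])
        if not counts:
            break  # some follow announced no relays at all
        best, _ = counts.most_common(1)[0]
        chosen.append(best)
        uncovered -= {pk for pk in uncovered if best in relay_lists[pk]}
    return chosen
```

The point is that no single relay has to carry everyone: the client just connects to whichever small set covers the people you actually follow.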


The key difference is that if one relay operator becomes "oligarchical", the notes just route around it (through different relays).


+1, the user owning the ID is a step in the right direction compared to the "homeserver" owning the key, and it's what makes this possible.

That said - maybe (total hypothetical) the reason one relay becomes really big is because a lot of people think it provides really good service, and maybe it's difficult to convince the majority of the network to route around it. This would create a similar problem to what we see in more well established federated chat networks.


It doesn't work like that.

Your followers fetch the note from your relays. You tell the network where your notes can be found (e.g. a self-hosted relay), and their clients will take the effort to find your content.


I'm building a Nostr app (~2 million notes). There is a lot of spam and much worse content.

But it's kind of a solved problem - not through PoW, but through Web of Trust and not having algorithms. You see what the people/communities you follow post.

> I tried Nostr but like a lot of people here have been saying, it falls short in many ways due to the way it is structured. Relays are not really relays, they are more but also less. They are like community servers. Sure you can connect to many, have the same UI, but they are still disjoint and feels lonely.

I'd like to know more. Imho the fact that relays are dumb is a feature.

> You keep saying you can sign your messages and there is value there to people who are saying it is censorable in the ways they described.

All messages are signed; there is no way NOT to sign a message. This comes with the advantage that you don't need to trust the relays/pipes that messages travel through, which is an immense benefit.
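That trustlessness falls out of NIP-01's design: an event's id is the sha256 of a canonical serialization of its fields, and the signature is over that id, so any client can verify a note no matter which relay delivered it. A minimal sketch of the id computation (the signature itself, a schnorr signature over this id, is omitted):

```python
import hashlib
import json

def nostr_event_id(pubkey: str, created_at: int, kind: int,
                   tags: list, content: str) -> str:
    """NIP-01: the event id is the sha256 of the serialization
    [0, pubkey, created_at, kind, tags, content] with no extra whitespace."""
    payload = json.dumps([0, pubkey, created_at, kind, tags, content],
                         separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(payload.encode()).hexdigest()
```

Because the id is deterministic, tampering with any field anywhere along the pipe changes the id and invalidates the signature.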

> This is not a personal thing, I want to like Nostr and I tried using it. I can and would probably get some use out of using it as a pubsub or message delivery infrastructure for two things I want to connect but what if the relay goes down? It is like a centralized pubsub messagebox thing. But can't even do that fully.

Relays go down all the time. There was an experiment where a major relay (Damus) just deleted its entire dataset. People barely noticed. And since any client (not just the author's) and other relays can re-broadcast events, the relay eventually recovers.

> There needs to be some sort of relay to relay communication (actual relaying) that needs to go on. And that wouldn't scale, even if it would work for now.

There are three mechanisms that do that:

- clients post to multiple relays
- clients/followers can rebroadcast notes (to other relays)
- quite a few relays sync with each other (negentropy sync)


I'm building a Q&A/community on top of Nostr and using those same concepts:

The original author posts a kind:1 note with a question.

A bot sends a kind:1985 note (NIP-32 https://github.com/nostr-protocol/nips/blob/master/32.md) that labels the content.

It can be done by the author (self-label), by an app, or by third parties (moderators/curators), depending on the trust model.

Other clients can then decide to use that classification/label.
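A minimal sketch of such a kind:1985 label event, following NIP-32's "L" (namespace) / "l" (label) tag convention (unsigned; the namespace and function name here are illustrative, not my app's actual values):

```python
import time

def build_label_event(target_event_id: str, label: str,
                      namespace: str = "app.example.topic") -> dict:
    """Build an unsigned NIP-32 label event: 'L' declares the namespace,
    'l' carries the label within it, 'e' points at the labeled note."""
    return {
        "kind": 1985,
        "created_at": int(time.time()),
        "tags": [
            ["L", namespace],
            ["l", label, namespace],
            ["e", target_event_id],
        ],
        "content": "",
    }
```

The event would still need an id and signature (per NIP-01) before being published to relays.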

--

For moderation purposes: if the behavior is closer to abuse (spam, scams, harassment...), use NIP-56 (Reporting) instead, to report harmful / should-be-moderated content.

