
> I think there's potential here.

But how? Even if those interfaces were actually working, it's still extremely inconvenient to talk when you can click. You have to be somewhere where talking out loud doesn't disturb the people around you. That excludes most situations: open-plan offices, restaurants, coffee shops, public transport, cars with passengers, and most places in the home except maybe the bathroom.

And even if you're all alone in a silent place, giving instructions out loud takes more time than configuring a screen, and will always be error-prone, because the feedback will always be ambiguous and imprecise.

Except maybe if the feedback is on a screen, but then, if there's already a screen, why not use it?



I think the best use cases for voice assistants are when you don't have free hands. I have two scenarios where I use voice assistants: setting a timer while cooking and changing the music while showering. Both could be done by other means as well, but they wouldn't be more convenient.


Exactly. For instance, in the mornings Google Assistant has been really useful: when I say "OK Google, good morning", it runs through and tells me:

* Current time, and weather forecast for the day

* Upcoming meetings today

* My current commute time to work, including traffic

* NPR news podcast

So during my morning routine of letting the dogs out, starting the coffee, etc., I get the daily "essential" info.


Also when driving, but Siri / Google Assistant are more applicable for that use case.


Asking the time whilst getting ready.


Seems like a perfect fit for a clock?


Or a watch?


The Apple Watch does have Siri, I suppose. They could be really bold and remove the screen.


> They could be really bold and remove the screen.

Then it would be called AirPods.


Both or either would suffice.


> But how? Even if those interfaces were actually working, it's still extremely inconvenient to talk when you can click. You have to be somewhere where talking out loud doesn't disturb the people around you. That excludes most situations: open-plan offices, restaurants, coffee shops, public transport, cars with passengers, and most places in the home except maybe the bathroom.

I would separate out the two, actually. There's a "natural language control system for the entire OS", and then there's the actual voice part. Voice is mostly useful for accessibility purposes -- hands full, running, driving, etc. However, the other side is that a text-based NL assistant would also be profoundly useful. On iOS, you can enable "Type to Siri" and just type sentences; Siri will respond back in text.

If we make progress on NL-driven command lines, we can actually make progress on voice assistants, and vice versa. The catch is that the voice side still needs recognition work.
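
A minimal sketch of what that shared NL layer could look like, assuming a toy intent table (the patterns and handlers here are hypothetical, purely to show the shape): the same dispatcher serves typed input today and transcribed speech once recognition catches up.

    import re

    # Toy handlers standing in for real OS actions (hypothetical).
    def set_timer(minutes):
        print(f"Timer set for {minutes} minutes")

    def add_item(item):
        print(f"Added '{item}' to the shopping list")

    # Pattern -> handler table; a real system would use a proper
    # parser or model, but the interface stays the same.
    INTENTS = [
        (re.compile(r"set a timer for (\d+) minutes?"),
         lambda m: set_timer(int(m.group(1)))),
        (re.compile(r"add (.+) to my shopping list"),
         lambda m: add_item(m.group(1))),
    ]

    def dispatch(utterance):
        # Input can come from a keyboard (Type to Siri style)
        # or from a speech recognizer; the dispatcher doesn't care.
        for pattern, handler in INTENTS:
            match = pattern.search(utterance.lower())
            if match:
                return handler(match)
        print("Sorry, I didn't understand that.")

    dispatch("Set a timer for 10 minutes")
    dispatch("Add milk to my shopping list")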


Well, you are not trying to operate heavy machinery with an Amazon Echo - hopefully. Voice as a common interface - I agree with all of that, but to me the everyday utility of being able to add something to my shopping list or my TODO list without having to fire up an app greatly increases my quality of life. That part is magical, but I don't expect a lot more from it.


I used to use Alexa for my shopping list. I guess over time I came to the conclusion that adding something to a steno pad or my whiteboard was even easier.


If the assistant AI were advanced enough for pleasant conversations to occur, it would be useful.

It would be trivial to use the interface on screen when appropriate, and a truly smart assistant should be able to follow the context and be aware of your preferences and mood.

This is not fundamentally impossible, we're simply not there yet.


> But how? Even if those interfaces were actually working, it's still extremely inconvenient to talk when you can click.

Smart home lights, etc., while your hands are occupied, like with a baby. But the use cases are quite limited.


> But how? Even if those interfaces were actually working, it's still extremely inconvenient to talk when you can click

Working from home changes that. I can see many more opportunities for a multimodal input interface. Examples:

1. My fingertips are now closer to the "reply" button below this text area than they are even to the touchpad. Touching "reply" takes half a second; moving one hand to the touchpad, aiming the pointer at the button, and clicking takes longer. With a mouse: much longer. Anyway, my screen is not a touchscreen. I'll click.

2. Or, with an assistant, I could have said "Click reply", provided that the assistant knows where the focus is and that it can read the form I'm typing in.


Your fingertips while typing are even closer to the Tab and Enter keys on your keyboard, which, if pressed in sequence, have the exact same effect. Much simpler and much faster than either of your options.


Faster, I don't know. Simpler, I didn't even think about it. However, I'm doing it now. Thanks.


Wow, the second point is really interesting. Binding a voice command to a key, in this case Enter.
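
As a rough illustration of that binding, here's a sketch using two common Python packages, SpeechRecognition and pyautogui; the phrase table is hypothetical, and a real assistant would still need to know where focus is, as noted above:

    import speech_recognition as sr
    import pyautogui

    # Hypothetical phrase -> key bindings.
    COMMANDS = {
        "click reply": "enter",  # assumes focus is already on the button
        "next field": "tab",
    }

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)

    try:
        phrase = recognizer.recognize_google(audio).lower()
        if phrase in COMMANDS:
            pyautogui.press(COMMANDS[phrase])  # synthesize the keypress
        else:
            print(f"No binding for: {phrase}")
    except sr.UnknownValueError:
        print("Could not understand audio")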



