
The problem isn't voice, it's natural language.

Natural language is a fundamentally wrong vehicle to convey information to a computer. It can be useful for some specific tasks, automated Q/A, simple interfaces to databases, stuff where I can't be properly f_ed to remember the syntax or the shortcut like IDE commands.

But the idea it can replace formal language is fundamentally and dangerously incorrect. I agree with Dijkstra's quip, we shouldn't regard formal language as a burden, but rather as a privilege.



I'd be perfectly happy with a list of Siri commands that I would have to learn to be able to do things. I don't care if I ended up sounding like:

Hey Siri

Turn lights on 50 percent

For one hour

Dim over that time

Play music.

I can learn what I need to do; JUST LET ME KNOW THE MAGIC WORDS!
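A fixed "magic words" interface like this is basically a dispatch table. Here is a minimal sketch of the idea; the phrases, handler names, and responses are all invented for illustration, not real Siri commands:

```python
# Toy dispatch table for a fixed-phrase voice interface.
# Every phrase and handler here is hypothetical.

def set_lights(level):
    # Stand-in for a real smart-home call.
    return f"lights at {level}%"

def play_music():
    return "music playing"

COMMANDS = {
    "turn lights on 50 percent": lambda: set_lights(50),
    "play music": play_music,
}

def handle(utterance):
    """Exact-match lookup: either the magic words work, or you get told so."""
    action = COMMANDS.get(utterance.lower().strip())
    return action() if action else "unknown command (see the magic list)"

handle("Play music")       # -> "music playing"
handle("dim the lights")   # -> "unknown command (see the magic list)"
```

The appeal is exactly the predictability: the set of valid utterances is finite and publishable, like a Zork verb list.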


It's like playing Zork all over again.


A lisp compiler in a voice assistant would seem like an improvement in that the user could define objects and then express the actions to be performed in the same room. But these assistants seem to drop objects between commands making them hard to program conversationally.

I guess a Lisp-like language would be ideal, and the pauses would be like parentheses.
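A rough sketch of the pauses-as-parentheses idea: assume a hypothetical speech front end that emits words plus explicit PAUSE markers, and group each pause-delimited clause into its own s-expression-style list. The token format is entirely made up:

```python
# Hypothetical: a speech front end emits words plus "PAUSE" markers;
# each PAUSE-delimited clause becomes one list, like a parenthesized form.

def clauses(tokens, pause="PAUSE"):
    """Group a flat token stream into clause lists, splitting on pauses."""
    out, cur = [], []
    for tok in tokens:
        if tok == pause:
            if cur:
                out.append(cur)
                cur = []
        else:
            cur.append(tok)
    if cur:
        out.append(cur)
    return out

stream = ["turn", "lights", "on", "PAUSE", "for", "one", "hour",
          "PAUSE", "dim", "over", "that", "time"]
clauses(stream)
# -> [['turn', 'lights', 'on'], ['for', 'one', 'hour'],
#     ['dim', 'over', 'that', 'time']]
```

The assistant would then need to keep these parsed clauses (and any objects they define) alive between commands, which is exactly what current assistants fail to do.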


But with the added complexity that sometimes the speech-to-text will just crap out completely.


Alexa, turn on lights

...I don't know how to do that

Alexa, turn lights on

...What do I turn the lights with?

Alexa, activate lights

...I don't know what you mean

...It is pitch black. You are likely to be eaten by a grue.

ALEXA TURN ON THE DAMN LIGHTS

...I don't know the word "lights"

...Oh no! You have walked into the slavering fangs of a grue!

** You have died **


Siri, turn on bathroom lights.

Downstairs or upstairs bathroom?

Downstairs.

Sorry, I didn’t understand. Downstairs or upstairs bathroom?

Downstairs bathroom.

Sorry, I didn’t understand. Downstairs or upstairs bathroom?

Cancel.

Ok. Cancelling.

Siri turn on downstairs bathroom lights.

(Turns off all lights)


For me, about once a week it's

"hey siri?"

(no response, no icon),

"hey siri?"

(no response, no icon),

"hey siri?" (louder)

(no response, no icon),

"hey siri?" (louder and slower)

(no response, no icon),

(reboot iPhone 13 Pro)

"hey siri?"

works


“Did you mean ‘bathroom LED’ or ‘bathroom’?”

Because god help you if your device names are similar to your room names…


I’ve taken to naming my lights things like Greg, The Beacons, etc.

And I added scenes so I can say “Gondor calls for aid” and the beacons will light.


Yes. And it may be worth noting that Zork is literally something like 50-year-old parser technology.


Not to take away from your point (I'd like the magic list too), but to some degree this can be worked around using Shortcuts. If you use inputs, Siri will prompt for them, which is a bit slow, but you could even use a Dictate Text action and parse the result yourself if desired.


I highly doubt there is "a" magic list. I'll bet the magic list changes constantly.


I noticed a drop in usability about the time they went with ML.


Same with the predictive keyboard, it feels more random now.


I don't know that you can do exactly all these things, but isn't this the use case for custom routines in the Amazon ecosystem?

You create the prompt and add one or more actions to take.


On the other side, humans have been fine using natural language to delegate commands to each other.

So maybe it's just that the subfield of natural language understanding is still too early to be really useful. Speech recognition itself has gotten really good but then understanding the context, the intent, etc, all that is natural language understanding, and that is often the problem.


> have been fine

Citation needed, there's a lot of disagreements and misunderstandings (some have cost lives) that could've been avoided if we didn't have 10 different ways to say the same vague thing that can be interpreted in 20 ways. You think the military uses a phonetic alphabet and specifically structured communications for fun? Or the way planes talk to ATC for example. Where precision and unambiguity is crucial, natural language always gets ditched for something more formal.


This is actually an interesting point. In the Army, we used terms that limited ambiguity thereby increasing efficiency. Even if one eliminates the complexity of language, there's still a specification problem.

I only use voice assistants to set alarms. I cannot imagine voice as a primary input. Then again, many have opted out of owning desktops and laptops in favor of mobile phones. That also seems terribly inefficient.


>Then again, many have opted out of owning desktops and laptops in favor of mobile phones. That also seems terribly inefficient

A lot of people don't need computers in the general purpose sense. I admit my mind boggles a bit when co-workers tell me their kids don't want a computer to do their school papers because their phone is fine. But, then, I'm used to keyboards and what we think of as a "computer" and have been using one for decades--and grab one when I can for any remotely complex or input-heavy task.


> A lot of people don't need computers in the general purpose sense. I admit my mind boggles a bit when co-workers tell me their kids don't want a computer to do their school papers because their phone is fine.

I grew up in the 1980s, when handwritten papers were still the norm. I do see the advantages of using a word-processor for writing papers, but don't see why it would be a necessity (at least, until University).


I think the implication is that the kids use a word processor on their phone.


It sounds ridiculous, but I'll admit that when you've got something like Samsung DeX, which lets you dock the phone for USB and HDMI out and gives you close to a full desktop OS, I'd imagine it really is enough for the casual user.


I certainly know colleagues in the industry who travel with just a tablet and external keyboard. No, they're not running IDEs etc., but they find it OK for emails, editing docs, taking notes, etc. Personally I'll spend the extra few pounds to also carry along a laptop. But I can imagine not needing/wanting a dedicated laptop when I travel at some point.


Is a tablet and keyboard really much lighter than a laptop?

https://www.theverge.com/2020/4/20/21227741/apple-ipad-pro-m...

Suggests a keyboard and large tablet is heavier than a laptop


I'm usually carrying a tablet anyway though for entertainment/reading purposes. So it's usually a choice of tablet + laptop vs. tablet + keyboard. (I admittedly don't really have a weight optimized travel laptop these days either.)

I actually do wish there were good Mac or Chromebook choices for a travel 11" or so laptop but the market seems to have settled on a thin 13" as the floor and, admittedly, the weight/size difference isn't huge.


While I am mostly a Mac person, for travel I often prefer a tiny and cheap Lenovo Chromebook that does everything (a bit poorly): Linux containers for lightweight programming and writing, and consuming media like books, audiobooks, and streaming.

In response to a grandparent comment about weight for tablets: I prefer Apple’s folio old style of cases/keyboards because of weight. I have one for both my small and large iPad Pros. Whenever I travel, I usually just take one of my iPads if I don’t need a dev environment [1].

[1] but with GitHub Codespaces and Google Colab, development on an iPad is sort of OK.


I still don't see the point of tablets. It's just a smartphone with a larger screen, and practically all people already carry phones.

Might as well go for the laptop at that point given that it can actually do far more imo, unless you ditch the phone and go for one of those half phone half tablets I guess.


I'd rather watch movies, read, play certain games, etc. on my tablet than on a phone. (Obviously there are also specific use cases like digital art.) That said, I mostly use my tablet when traveling and it's a distant third in necessity compared to either a laptop or a phone--and only somewhat more useful than a smartwatch.


Watching movies on a tablet is terrible, though. All methods for propping the device up so you can watch the movie are inferior to the way a laptop screen props itself up via hinges and a base.


On a plane I'd rather use the tablet in my lap than have to put the tray table down. And in a hotel room I'm watching on the couch if there is one. (I do also have an attachment for my tablet that will let you prop it up on a table but I mostly don't use it because it adds weight.)

For reading, I'm probably bringing my Kindle along if I don't bring my tablet.


I bought a surface for that reason. I like the portability, and it is just a normal PC with a pretty bad keyboard.


If you do not have one, buy a dock! I have an SP6 and an SP4, and having the dock makes it quite the device. Speakers, multiple external monitors, keyboard, mouse: a full desktop setup. I can grab it and either stick a keyboard cover on or just use it as a reading device on the couch.

Back to work? Set it on the table, plug in one cable, and it's back to being a desktop and charging up again.

Makes the whole thing make far more sense.


How old are you? Because larger screens become really nice as your eyes go bad. And I don't need the full size of a laptop for things I'd want to do on a tablet.


The obsession with being lighter definitely has diminishing returns. At some point another few ounces doesn't make any difference in a real, practical sense. I think people have just started to associate "lightness" == "better" despite there being no actual benefit past a certain threshold.


Right, at some point. But at the current point my tablet is too heavy to hold in hand for more than 20 seconds or so. The phone is OK. The tablet is not (for me). I only use the tablet by placing it on a table or a stand, and then actually using a laptop is much better than a tablet.

The killer tech will be a tablet that is as light as a phone.


Thanks for that. A lot of energy is currently sunk because of natural language, and I'd argue the gains from employing software (instead of human processes) for various tasks are in part due to scaling up the results of many confusing natural-language discussions about what a specific process actually comprises.


This is part of the reason Google search sucks more and more.

Around when Android appeared, and the first voice searches began, Google suddenly started to alias everything.

Search for 'Andy', 'Andrew' appears. Search for 'there', and 'they're' appears.

This has been taken further; now silly aliases such as 'debian' ↔ 'ubuntu' exist, and since Google happily drops words from your search to find a match, precision becomes impossible.

But, that's the only way to make voice search remotely work, so...


I don't think this is to support voice search: Google generally knows whether a query was initiated by voice or typing. Instead, I think it's because most users find what they're looking for faster with it.

If you have terms you don't want interpreted broadly you can put them in quotes.


Google "helpfully" ignores the quotes sometimes too. They're not the hard and fast rule they used to be.

I preached the Gospel of Google when the competition was composed of web rings and Altavista, but Google in its infinite wisdom has abandoned the advanced user with changes of this nature.


Pretty sure quote support has improved recently.

https://blog.google/products/search/how-were-improving-searc...


Considering the article lies, and tries to claim quotes always are respected, I wouldn't put much faith in it.


So what is the gospel du jour, or are we forsaken in these benighted times?


Most people are not precise enough in their terminology.


I often find the voice assistant useful for operating the phone itself, such as opening a given setting, say, making the display brighter. Trying to navigate the settings pages is very error-prone. There seems to be no universal standard as to where each setting should be found.


The real problem is people keep reorganizing where the settings are found.


There is a widely accepted and straightforward view that humans have ideas, which are expressed in languages, and that languages being ambiguous is problematic: this I'm starting to have doubts about.

Maybe we don't have clear intentions in the first place. Maybe languages are not just ambiguous, but are only meant to narrow the realm of valid interpretations down to a desired precision, rather than intended to form logically fully constrained statements. Maybe this is why intelligent entities are needed to "correctly" interpret natural language statements, because the act of interpretation is itself decision making and action.

Just my thoughts, but I do think there is more to be said than "natural languages are ambiguous".


> On the other side, humans have been fine using natural language to delegate commands to each other.

Using language to instruct humans goes wrong all the time. Just a short while ago on British Bakeoff I saw 2 of the contestants make white chocolate feathering on their biscuits by making actual feathers out of white chocolate and placing them on their biscuits. And I'm sure that will confuse quite a few people reading this too. It certainly confuses image searches. Language is a fuzzy interface. Compare to interface like clicking on a button that does the thing I want done.


How would you (easily) describe the concept of chocolate feathering to a computer without using natural language? (e.g. if you wanted the computer to generate an image, or search for an image of / recipe with chocolate feathering).


> On the other side, humans have been fine using natural language to delegate commands to each other.

And that's why all of aviation has moved to a tight phraseology, such that delegated commands are universally understood and their meaning is set in stone.

Natural language has cost many lives.


> humans have been fine using natural language to delegate commands to each other.

Not always resulting in unambiguous instructions:

"Lord Raglan wishes the cavalry to advance rapidly to the front, follow the enemy, and try to prevent the enemy carrying away the guns." ~Lord Raglan, Balaclava

"I wish him to take Cemetery Hill if practicable." ~Robert E. Lee, Gettysburg


> On the other side, humans have been fine using natural language to delegate commands to each other.

On the other hand, legalese exists and is the lingua franca of telling people what to do, and math exists.


> On the other side, humans have been fine using natural language to delegate commands to each other.

I think this is really a characterization. Mostly human communication is full of errors and problems.

What is true is that when it is important enough, humans have come up with ways that minimize communication errors and frameworks to deal with ambiguity - mostly these involve training and effort though, it really doesn't come naturally.


"really a problematic characterization"...


> humans have been fine using natural language to delegate commands to each other.

Every time we try to minimize errors, we formalize a language. I don't even think people use natural language to issue commands often. Commanding people is often considered rude.


I agree with this. We have evidence that natural language works well enough to run most of the world. AI will eventually get there.


The problem is that it's not actually a conversation. To significantly improve it, you'd want to:

- identify users by voice

- ask them clarifying questions

- remember the answers on a per-user basis

- understand "no, that was the wrong answer"

If you're going to provide a formal interface to the computer, you also have to provide teaching in that formal interface, which is far more of a burden to the user than the cost of the device. And we've completely moved away from that model (not necessarily a good thing, but that's what the market has chosen).


Calling it a burden is an assumption that ignores and belittles the end user. Sure, there are people who won't want to train their personal ai.

But I imagine there are significantly more who would appreciate clarifying requests by a teachable assistant capable of interacting with the entire digital world on their behalf, efficiently and intelligently.


I think you're right. There are glimpses of this in the voice interfaces right now. For example, Alexa will distinguish between voices and preferentially take actions for me, saying "Play Music" plays Spotify, and for my kids, it plays Amazon music.


An example backing this is voice assistants that DO work, e.g. Talon voice. But these require defining a language, and then they are very accurate and powerful.

I don't see why a voice assistant for the masses couldn't "train its own users", for example by suggesting the language it does expect. But even then, most of the time people are talking in noisy environments, or talk too fast, or don't have an understanding of how the machine might work. Regardless, who cares. They ruin the audio environment of a home. They're good for setting timers while you're cooking, that's about it.
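The "train its own users" idea can be sketched with fuzzy matching over a fixed phrase list: if the utterance isn't a known command, suggest the closest phrase the assistant does understand. The command list here is invented for illustration:

```python
import difflib

# Sketch of a fallback that teaches the user the expected phrasing:
# unknown utterances get a "did you mean" suggestion from the known list.
KNOWN = ["turn on lights", "turn off lights", "set timer", "play music"]

def respond(utterance):
    if utterance in KNOWN:
        return f"ok: {utterance}"
    # Suggest the single closest known phrase, if any is similar enough.
    close = difflib.get_close_matches(utterance, KNOWN, n=1, cutoff=0.5)
    if close:
        return f'did you mean "{close[0]}"?'
    return "unknown command"

respond("turn lights on")   # close to "turn on lights", so it suggests it
```

This nudges users toward the formal phrasing over time instead of silently failing, which is roughly what systems like Talon achieve by making the grammar explicit up front.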


Car voice assistants do this, but they're still clunky and it takes them forever to list their options. Voice interfaces just like CLI suffer from extremely bad discoverability and presentation compared to GUIs and thus will always be limited to specialty applications. CLIs at least have a league of try-hards and hobby linux users to keep them alive.


They're also fantastic at playing soothing music while your hands are busy holding a crying baby.


Only thing I use Siri for as well.


Right - natural language works for people because we have minds that are communicating. A virtual assistant has a list of things it can do, and uses language as an interface to them. So the language just becomes obfuscation instead of allowing clarification.

I've said before, I would prefer a voice assistant that optimized for traversing its menu system, in response to unambiguous noises (could be high and low pitch hums or whatever) that lets me bypass the guessing game and use the menu it's hiding


Like this: https://www.youtube.com/watch?v=8SkdfdXWYaI ? Here you traverse the AST, but the idea is similar, I think.


The problem is that it doesn't make money.

Otherwise, it works great :-) We love the hands-off usage mode because we cook a lot, so adding things to shopping lists or looking stuff up doesn't require cleaning hands in the middle of prep. Also the speakers are pretty darn good for the size and work well for music.

Doing complicated things is right out though. But the simple stuff works fine.


I'm just waiting for someone to finally release a voice assistant built around an actual language model, like GPT-3 or LaMDA.

It would be more error prone in a lot of ways, which is probably why nobody's done it yet, but it would also be a _lot_ more powerful, and fulfill the vision of conversational AI in a way the current rules-based assistants do not.

I think if powerful language models were easily accessible to normal people (in an inexpensive and completely unrestricted fashion, like with Stable Diffusion) we'd already see this happening in the open source world. Companies are going to be a lot more hesitant to try it though until they have a way to 100% prevent the models from making mistakes that could reflect poorly on the company, which is going to take _way_ longer to achieve.
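The plumbing for such an assistant might look like the sketch below: prompt the model to emit a structured intent, then validate it before acting, refusing anything malformed or hallucinated. The model call is stubbed out here; a real GPT-3- or LaMDA-class model would sit behind `model()`, and the prompt and action names are invented:

```python
import json

# Sketch of wiring a language model into a voice assistant.
PROMPT = (
    "Convert the user's request into JSON with keys "
    '"action" and "target". Request: {utterance}'
)

def model(prompt):
    # Stub standing in for an actual LLM call.
    return '{"action": "turn_on", "target": "downstairs bathroom lights"}'

KNOWN_ACTIONS = {"turn_on", "turn_off", "dim"}

def interpret(utterance):
    """Ask the model for a structured intent; validate before acting."""
    raw = model(PROMPT.format(utterance=utterance))
    try:
        intent = json.loads(raw)
    except json.JSONDecodeError:
        return None                      # model produced non-JSON: refuse
    if intent.get("action") not in KNOWN_ACTIONS:
        return None                      # unrecognized action: refuse
    return intent

interpret("Siri, turn on downstairs bathroom lights")
# -> {'action': 'turn_on', 'target': 'downstairs bathroom lights'}
```

The validation layer is the part companies are nervous about: the model handles the open-ended language, but only a whitelisted, well-formed intent is ever executed.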


Are you trying to say, Alexa should be funding the synthetic language nerds over at Lojban[0] or the Universal Networking Language[1]???

That would be a fun universe.

[0] https://mw.lojban.org/index.php?title=Lojban&setlang=en-US

[1] https://en.wikipedia.org/wiki/Universal_Networking_Language


Natural language conveys information to other people just fine. So the problem isn't that "Natural language is a fundamentally wrong vehicle to convey information to a computer". The problem is getting the computer to understand natural language to the same level as a human.


The problem is both


> we shouldn't regard formal language as a burden, but rather as a privilege

What the hell? Is riding public transport or riding a bike either a burden or a privilege? Is driving a car?

I am trying to control shit in my home, it should be neither.


Dijkstra's full essay[1] is a bit more illuminating, but essentially it's about how, for example, developing a system of symbols and formal language around mathematics has allowed "school children [to] learn to do what in earlier days only genius could achieve".

1: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...


I think his argument even generalizes to literacy in general. Remember that reading and writing skills don't develop naturally (as opposed to spoken language). They require a large educational investment, and used to be reserved for the wealthy and the privileged.



