
Isn't ARIA there to describe the structure of the page, so that, say, visually impaired users get the same information as any other user? I.e. the interpretation of what the page then does, and so the appropriate action to take, is largely left to the human user after the description - just as when you load a page and look at it, the human brain works out what to do based on those visual and textual clues.

This leaves agents trying to work out page intent and the allowed values for text fields, parsing returned pages to work out success or failure, etc.

I'm assuming that's why they want what is effectively an in-page API - that massively improves machine accessibility and can piggyback on browser authentication systems so the agent can operate on the user's behalf.



The website is the API though. HTML is one of the few RESTful systems people still use today. Build semantics into the page and both humans and LLMs can understand how to use it.

A11y specs and APIs are just a way of presenting those semantics differently, often for those who can't see the page, whether visually impaired or in this case an LLM.

At least in my view, we should expect anything claimed to be artificial intelligence to be able to interact with things much like a human would. I'm not going to build an MCP for a CLI tool, for example; I'll just make sure it has a useful man page or `--help` command.


I think you are confusing two things.

- the semantics of a form and a button and the resulting HTTP POST/GET
- what the page actually does!

So I can have two pages, both with HTML forms, and what they actually do on submission might be completely different: one buys a potted plant, the other submits a tax return.

I.e. the meaning of the action is in the non-semantic elements - the free text, the images, the context.

This is the stuff that's hard for the agent to easily determine - is this a form for submitting a tax return or not?
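To make the two-forms point concrete, here is a minimal sketch using only Python's standard `html.parser`. Both pages, the `FormSkeleton` class and the `analyse` helper are made up for illustration. It strips each page down to its form markup and shows that the two skeletons are identical, while the only thing that distinguishes them is the free text around the form.

```python
from html.parser import HTMLParser

# Two hypothetical pages: identical form markup, different surrounding text.
FORM = """
<form method="post" action="/submit">
  <label for="ref">Reference</label>
  <input id="ref" name="ref" type="text" required>
  <button type="submit">Submit</button>
</form>
"""
PLANT_PAGE = "<h1>Buy a potted plant</h1>" + FORM
TAX_PAGE = "<h1>Submit your tax return</h1>" + FORM


class FormSkeleton(HTMLParser):
    """Collects form-related tags with their attributes, and free text separately."""

    FORM_TAGS = {"form", "input", "button", "label"}

    def __init__(self):
        super().__init__()
        self.skeleton = []  # (tag, sorted attributes) for form-related elements
        self.text = []      # non-empty text nodes, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.FORM_TAGS:
            ordered = tuple(sorted(attrs, key=lambda pair: pair[0]))
            self.skeleton.append((tag, ordered))

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def analyse(html):
    """Return (form skeleton, free text) for a page."""
    parser = FormSkeleton()
    parser.feed(html)
    parser.close()
    return parser.skeleton, parser.text


plant_skeleton, plant_text = analyse(PLANT_PAGE)
tax_skeleton, tax_text = analyse(TAX_PAGE)

print(plant_skeleton == tax_skeleton)  # the form markup alone cannot tell them apart
print(plant_text[0], "|", tax_text[0])  # only the heading text differs
```

In this toy case a single heading carries the whole meaning. On a real page that signal is spread across copy, images and context, which is the part an agent has to interpret.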

If what you said were true, there would already be agents out there that use ARIA info to seamlessly operate the web. As far as I can see, people have tried to use that information to improve agents' use of the web but have met with limited success, and that's on well-annotated sites, so the problem isn't that sites aren't ARIA-enabled.


A human needs to be able to distinguish the buttons though, both visually and via accessibility tools.

I would hope those two buttons and forms include labels, description text, indicators for required fields, etc. All of that should live in the HTML and include attributes as needed for a11y. LLMs can use that; they don't need yet another API to describe it.


> they don't need yet another API to describe it.

WebMCP isn't accessibility support for humans, it's accessibility support for agents, which, despite all the hype, are less capable than humans at working out what's going on, and find functions and data schemas easier to understand than a web page designed for a human (whether that's a partially sighted human or not).
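As a rough illustration of why "functions and data schemas" are easier for an agent, here is a minimal Python sketch. The descriptor shape, the `buy_potted_plant` name and the `validate_call` helper are all hypothetical and are not the WebMCP API. The point is only that a declared schema lets an agent check its arguments before acting, instead of submitting a form and parsing the returned page for success or failure.

```python
# Hypothetical tool descriptor: the shape and names are illustrative
# assumptions, not taken from WebMCP or any real specification.
BUY_PLANT_TOOL = {
    "name": "buy_potted_plant",
    "description": "Order one or more potted plants for delivery.",
    "parameters": {
        "species": {"type": str, "enum": ["fern", "ficus", "cactus"], "required": True},
        "quantity": {"type": int, "minimum": 1, "required": True},
        "gift_note": {"type": str, "required": False},
    },
}


def validate_call(tool, args):
    """Return a list of problems with args; an empty list means the call is well-formed."""
    problems = []
    params = tool["parameters"]

    # Reject arguments the tool never declared.
    for name in args:
        if name not in params:
            problems.append(f"unknown parameter: {name}")

    for name, spec in params.items():
        if name not in args:
            if spec.get("required"):
                problems.append(f"missing required parameter: {name}")
            continue
        value = args[name]
        if not isinstance(value, spec["type"]):
            problems.append(f"{name}: expected {spec['type'].__name__}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}: must be one of {spec['enum']}")
        if "minimum" in spec and value < spec["minimum"]:
            problems.append(f"{name}: must be >= {spec['minimum']}")
    return problems


good = validate_call(BUY_PLANT_TOOL, {"species": "fern", "quantity": 2})
bad = validate_call(BUY_PLANT_TOOL, {"species": "rose", "quantity": 0, "colour": "red"})
print(good)
print(bad)
```

The intent ("order a plant") and the allowed values are stated up front in the descriptor, so the agent does not have to infer them from page copy.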



