IO handling is just one of the uses of monads in Haskell. List comprehensions (like in Python) are another use. They are also used for parsers, asynchronous programming, exception handling and continuations.
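To make the list comprehension point concrete, here is a minimal sketch (the names pairsComp, pairsDo and pairsBind are made up for illustration) showing the same computation written as a comprehension, in do-notation, and with explicit binds in the list monad:

```haskell
-- Pairs (x, y) drawn from 1..4 with x + y == 5, written three ways.

-- 1. As a list comprehension.
pairsComp :: [(Int, Int)]
pairsComp = [ (x, y) | x <- [1 .. 4], y <- [1 .. 4], x + y == 5 ]

-- 2. In do-notation, using the Monad instance for lists.
pairsDo :: [(Int, Int)]
pairsDo = do
  x <- [1 .. 4]
  y <- [1 .. 4]
  if x + y == 5 then return (x, y) else []

-- 3. With explicit binds, which is what the do-block stands for.
pairsBind :: [(Int, Int)]
pairsBind =
  [1 .. 4] >>= \x ->
  [1 .. 4] >>= \y ->
  if x + y == 5 then [(x, y)] else []

main :: IO ()
main = do
  print pairsComp
  -- All three definitions produce the same list.
  print (pairsComp == pairsDo && pairsDo == pairsBind)
```

Nothing here touches IO except the final print; the monad in play is the list type itself.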
You also do a good job summarizing why IO is hard in Haskell, in that you need to understand a lot about the language to understand how IO works. I guess it's up to the individual to decide whether FP is worth getting past that initial hump.
I found the way Clean deals with IO simpler to learn than Haskell's Monads. (But then, I learned Clean before I learned Haskell. They have a nice Jump-n-Run in their example directory.)
Clean employs uniqueness typing, which is closely related to linear logic: you can only have one copy of the 'World' in your program. Look at the printing function as an example: it takes your old World and transforms it into a new World that has something printed to the screen.
Of course, such a view of the outside world is quite self-centered. But it works as long as you manage to have only one World around in your program.
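The world-passing idea can be sketched in Haskell rather than Clean. This is only an illustration under loud assumptions: World, printLine, program and forked are hypothetical names, the "world" is just a list of printed lines, and plain Haskell does not check that the World is used only once, which is exactly the part Clean's type system adds.

```haskell
-- Hypothetical stand-in for the outside world: the lines printed so far.
newtype World = World [String]
  deriving (Eq, Show)

-- Takes the old world and returns a new one with one more line printed.
printLine :: String -> World -> World
printLine s (World out) = World (out ++ [s])

-- Each step consumes the world produced by the previous step.
program :: World -> World
program w0 =
  let w1 = printLine "hello" w0
      w2 = printLine "world" w1
  in  w2

-- Nothing in plain Haskell stops us from using w0 twice and ending up
-- with two diverging worlds. This is the situation that has to be ruled out.
forked :: World -> (World, World)
forked w0 = (printLine "a" w0, printLine "b" w0)

main :: IO ()
main = do
  -- Run the pure program from an empty world, then replay its output.
  let World out = program (World [])
  mapM_ putStrLn out
```

The forked function compiles fine here, which shows why the scheme only makes sense when something guarantees there is a single World.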
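The IO monad in Haskell could have been implemented with the same world-passing scheme. (And in GHC it is, I think, though the world token is hidden inside the IO type rather than checked by linear types.)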
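As a rough sketch of what an IO-like monad built on world passing could look like (this is not GHC's actual definition; WorldIO, World, printLine and runLines are hypothetical names, and the world is again just a list of printed lines):

```haskell
-- Hypothetical world: the lines printed so far.
newtype World = World [String]

-- An action is a function from the old world to a result and a new world.
newtype WorldIO a = WorldIO { runWorldIO :: World -> (a, World) }

instance Functor WorldIO where
  fmap f (WorldIO g) = WorldIO $ \w ->
    let (a, w') = g w in (f a, w')

instance Applicative WorldIO where
  pure a = WorldIO $ \w -> (a, w)
  WorldIO f <*> WorldIO g = WorldIO $ \w ->
    let (h, w')  = f w
        (a, w'') = g w'
    in  (h a, w'')

instance Monad WorldIO where
  -- Bind threads the world from the first action into the second.
  WorldIO g >>= k = WorldIO $ \w ->
    let (a, w') = g w
    in  runWorldIO (k a) w'

-- The only primitive: append one line to the world's output.
printLine :: String -> WorldIO ()
printLine s = WorldIO $ \(World out) -> ((), World (out ++ [s]))

program :: WorldIO ()
program = do
  printLine "hello"
  printLine "world"

-- Lines printed by an action started in an empty world.
runLines :: WorldIO a -> [String]
runLines act = let (_, World out) = runWorldIO act (World []) in out

main :: IO ()
main = mapM_ putStrLn (runLines program)
```

Because user code can only combine actions through the monad operations and never sees a World value directly, it has no way to duplicate one, which is how the monadic interface gets by without a uniqueness check.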
I wrote an answer on Stack Overflow that tries to explain monads with some practical examples: http://stackoverflow.com/questions/44965/what-is-a-monad#194...