If we're only using unit tests to see if line N gets called, I think we're doing it wrong. Instead, we want to use unit tests to tell us if the answer we get is correct -- while exercising line N.
This lets you verify that this particular branch (when bar=12) is executed, and that your results are as you expect. If you change some of the underlying calculations, things can break (as you get different answers), but then you at least have a test that lets you ensure that changing the answers is what you want to do. Sometimes, you want to change the way you calculate something and get the same answer, after all.
So the question is, what if bar = 12 means "retrieve foo with ID 12 from the FooService", and we want to make sure that get_foo does something sensible when the server returns an error code.
So what do you do? Typically one writes a mocked up service that returns the expected results, pass it in somehow, and write your tests like you did above. So if you have a lot of services, you end up doing what I said - testing lines of code. Here you're testing that FooService gets called. If it also called BarService, you'd be testing that BarService gets called. And so on.
But then later, we decide that FooService is no good - we want, nay demand - a FooService2, with a new API. However, get_foo is to behave the same. So what now? Only one choice - update all of the tests to have a mocked up FooService2.
As the code base grows, I find this becomes annoying to maintain (although to some extent a necessary evil, of course).
The alternative I'm saying is that instead of spending a ton of energy unit testing get_foo, hammer on it and write good state validators. Your unit test shows that get_foo for bar = 12 returns a unit_price of 97.12. Great, but I don't think that's as interesting as building a fake database (so we know what the expected returns are), and then making 50,000 requests per second with random data for 12 hours, and then through runtime assertions and as a post-process validating that the service actually did what it was supposed to.
If writing tests (or production code) is boring, then you probably have design problems. Boredom is a sign of duplication--not cut-and-paste duplication, but the more insidious duplication caused by poor abstractions.
In the example you gave (of a mocked up service), what would happen if you didn't mock the service? That would mean you couldn't call the service in your tests, right? (Except the tests that tested the service integration explicitly.) How could you change your design to make your code testable under that constraint?
The answer depends on your situation, but one way I've seen this play out in the past is to make more value objects and create a domain layer that operates on value objects rather than the service. The service is responsible for creating and persisting the value objects, but the domain layer handles the real work of the application.
As a result, testing the important part of the application can be done completely independently of the service. No mocks required. Now I can upgrade to FooService2 without changing any tests except for FooServiceConnection's. My value objects and domain layer remain unchanged.
If you're bored while programming, look for the duplication hiding in your design. In the case you're describing, it's hiding in the oft-repeated dependencies on FooService.
Interesting? Testing isn't about programmer enjoyment, it is about correctness.
A 12 hour stress test is going to catch different things than a unit test. A simple suite of regression tests can be automatically run by a continuous integration tool to flag if the build is broken and alert the dev/team so things can be fixed quickly.
In this example it seems you are thinking of writing tests as a separate step from writing the code, which is part of why it seems like a chore. Make small changes, update the tests until everything passes & new code is exercised, commit, repeat. (Or if you are better disciplined than I, update your tests first and then write code until tests pass, repeat) Monster refactors, while sometimes necessary, are best avoided when possible.
What I meant by interesting is that they tell me more about the health of the system, not that they're more interesting to write.
Suppose there's a race condition in some library that you're using. No unit test in the world is going to catch that. Now, unit tests certainly have their place - but my point is that from what I have seen, unit tests catch the boring bugs, while integration tests catch the interesting ones(by interesting here I mean obscure or subtle - deadlocks, invalid state, etc), while at the same time inferring the boring bugs (ie. add(5, 3) returns 7 instead of 8), so that hour for hour, especially with limited resources (ie. a small startup), integration testing has the potential to give you a lot more value.
However I would still add that simple suite of regression tests (in my CRUD app these are almost entirely integration tests), often speed up development by more than the time it takes to write the tests in the first place. So to say a startup doesn't have time for them seems shortsighted.
e.g.: def test_foo(self): result = get_foo(bar=12) self.assertEquals(result.category, 'FOO') self.assertEquals(result.name, 'my-foo-12') self.assertEquals(result.unit_price, 97.12)
This lets you verify that this particular branch (when bar=12) is executed, and that your results are as you expect. If you change some of the underlying calculations, things can break (as you get different answers), but then you at least have a test that lets you ensure that changing the answers is what you want to do. Sometimes, you want to change the way you calculate something and get the same answer, after all.