If you've been away from Ocaml for a while, this would be a great time to check out what's changed. A package manager, a new optimizer, and we'll soon have modular implicits (think Haskell-style type classes, but better) and multicore. Ocaml is really a first-rate programming language and I wish more people knew about it.
If I'm not mistaken, this release is not the multicore release[0]. Also, if anyone's keen on reading the paper behind modular implicits in OCaml, it's a fun read - http://arxiv.org/abs/1512.01895
Any time I want to look further into OCaml, it's because the purist in me wants a better language than the ones I'm already using. And it always comes down to Haskell vs OCaml for me, and Haskell always wins out because OCaml has lots of edge cases built into the language and the standard library that they justify and excuse rather than deprecating and cleaning up. Then again, Haskell also has Prelude. But it feels like OCaml has way more legacy cruft.
I was prompted to try an OCaml tutorial by this HN post. I soon saw that strings and numbers each have a different print function. Is this part of the cruft you mention? Why have two print functions?
When it comes to functions, there's ad hoc polymorphism, and there's parametric polymorphism.
Parametric polymorphism is when you can define a function without knowing the precise type of its parameters. However, this limits what you can actually do with the parameters.
For instance, a function concat([a], [a]) -> [a] which takes two lists containing items of type "a" knows enough about the type of its parameters (they're lists) to do its job, but it doesn't know everything about them. This is the type of polymorphism OCaml supports. Think of it as "one implementation, many types". That's why OCaml has more than one print function, and why it has integer and floating point versions of operators like + and *.
Ad hoc polymorphism is when you can define implementations for a single function for whatever types you like. So you can implement add(Int, Int) -> Int, as well as add(Float, Float) -> Float. So this is "many implementations, many types." Haskell supports a version of this through typeclasses. OCaml doesn't currently support it, but the new "modular implicits" feature, as I understand it, will provide largely the same functionality.
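To make the distinction concrete, here is a minimal OCaml sketch. The names concat, int_sum and float_sum are only illustrative. It shows one parametric function that works for any element type, next to the per-type operators and print functions that OCaml uses in place of ad hoc overloading.

```ocaml
(* Parametric polymorphism: one implementation, any element type. *)
let rec concat (xs : 'a list) (ys : 'a list) : 'a list =
  match xs with
  | [] -> ys
  | x :: rest -> x :: concat rest ys

(* No ad hoc overloading: each numeric type has its own operator. *)
let int_sum = 1 + 2            (* ( + )  : int -> int -> int *)
let float_sum = 1.5 +. 2.25    (* ( +. ) : float -> float -> float *)

(* Likewise, each type has its own print function. *)
let () =
  print_int int_sum;
  print_newline ();
  print_float float_sum;
  print_newline ();
  print_string (String.concat ", " (concat ["a"; "b"] ["c"]));
  print_newline ()
```

The same concat is used here on a string list and would work unchanged on an int list, whereas writing 1.5 + 2.25 would be rejected by the type checker.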
This is incorrect. Haskell and OCaml both support parametric polymorphism, but Haskell also supports ad-hoc polymorphism, whereas OCaml does not. Modular implicits, it seems, are OCaml's solution to the problem of reducing the set of valid types in a polymorphic function.
Haskell: concat :: [a] -> [a] -> [a]
OCaml: val concat : 'a list -> 'a list -> 'a list = <fun>
Both are parametrically polymorphic. However, in Haskell, I can use a type class to support ad-hoc polymorphism:
concat :: (Num a) => [a] -> [a] -> [a]
Now concat works forall a, as long as a is an instance of Num. This is important for arithmetic operators:
(+) :: (Num a) => a -> a -> a
Now I can have (+) be defined in a meaningful way, since Int, Integer, Float, etc. are all instances of Num, and implement (+). An easy way to think of this (since it's all it is, really) is operator/method overloading. In Haskell, (+) is overloaded for every type that you would want it to work for (the details of this are actually probably different, but are not important for learning).
In OCaml, if I have
val (+) : 'a -> 'a -> 'a = <fun>
There are actually only two implementations of that function:
let (+) a b = a
or
let (+) a b = b
Since I cannot restrict the set of types to the function, I can't do anything with the arguments. I don't know what the types are! Ad-hoc polymorphism allows for restricting the set of types, which allows for Haskell to use + for any Num-like thing.
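A small sketch of that point in OCaml. The names first, second and add_with are hypothetical, and passing the operation in as an argument is just one way to illustrate the idea, not something proposed above. A function of type 'a -> 'a -> 'a can only hand back one of its arguments, but once the caller supplies the operation for the type, the generic code can do real work.

```ocaml
(* With no knowledge of 'a, the only total options are to return an argument. *)
let first (a : 'a) (_ : 'a) : 'a = a
let second (_ : 'a) (b : 'a) : 'a = b

(* Assumption for illustration: the caller passes the operation explicitly.
   This is roughly the job a type class does implicitly in Haskell. *)
let add_with (plus : 'a -> 'a -> 'a) (a : 'a) (b : 'a) : 'a = plus a b

let three = add_with ( + ) 1 2          (* int addition *)
let three_f = add_with ( +. ) 1.0 2.0   (* float addition *)
```

As I understand it, type classes (and modular implicits) let the compiler pick the plus argument from the types instead of making the caller write it out.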
Hm, after re-reading a couple times, I think I merely misinterpreted how you said "OCaml supports parametric polymorphism" and "Haskell supports ad-hoc polymorphism." Based on your wording I thought you were saying that Haskell only supports ad-hoc polymorphism and OCaml only supports parametric polymorphism. Surely there was something in my head at 2am that made me think you were off in your description, but I don't know what it was. Sorry! I would edit it if I could.
EDIT: I think the line that stood out to me as being odd was "That's why OCaml has more than one print function, and why it has integer and floating point versions of operators like + and *." which was after you explained what parametric polymorphism was. So my interpretation was "OCaml supports parametric polymorphism, which is why there are multiple addition functions for different types." Of course, the real reason isn't due to supporting parametric polymorphism, but specifically because OCaml lacks support for ad-hoc polymorphism.
Interestingly, OCaml does have sort of pseudo-ad-hoc polymorphic implementations for comparison operators.
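For example, the standard compare, = and related operators accept values of any type and work structurally. A short sketch (the binding names are only illustrative):

```ocaml
(* compare and ( = ) have type 'a -> 'a -> _ but inspect the runtime value. *)
let int_order = compare 1 2                  (* negative: 1 sorts before 2 *)
let string_order = compare "b" "a"           (* positive: "b" sorts after "a" *)
let pairs_equal = (1, "x") = (1, "x")        (* structural equality on tuples *)
let sorted = List.sort compare [3; 1; 2]

(* The price: some values are rejected only at run time. Comparing
   functions type-checks but raises Invalid_argument. *)
let functions_comparable =
  let f x = x + 1 and g x = x * 2 in
  try ignore (f = g); true with Invalid_argument _ -> false
```

So it behaves like an overloaded operator from the caller's side, but it is a single built-in implementation rather than one definition per type.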
It's annoying I guess, but you get over it in a few hours. Interestingly it's only annoying from an ideological standpoint; it's never bothered me when reading or writing code.
Anyway, the reason, along with having separate sets of arithmetic operators, is basically because Ocaml "generics" can't be specialized like C++ templates can. Doing it this way keeps the whole language fully type-inferable.
It is actually annoying from a very practical point of view. You cannot, for example, write an algorithm that will work on any sequence type, and then apply it to a list. This is why, for instance, you have List.map and so on.
You can, that's precisely why we have functors, and for this kind of use case (abstracting a module over another), they are much nicer than type classes.
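A minimal sketch of what that can look like. The FOLDABLE signature and the Algorithms functor are hypothetical names made up for this example, not part of the standard library. The algorithms are written once against the signature and then instantiated for lists and arrays.

```ocaml
(* Hypothetical signature: anything that can be folded from the left. *)
module type FOLDABLE = sig
  type 'a t
  val fold : ('acc -> 'a -> 'acc) -> 'acc -> 'a t -> 'acc
end

(* Algorithms written once, against the signature only. *)
module Algorithms (S : FOLDABLE) = struct
  let length s = S.fold (fun n _ -> n + 1) 0 s
  let sum s = S.fold ( + ) 0 s
  let to_list s = List.rev (S.fold (fun acc x -> x :: acc) [] s)
end

module ListSeq = struct
  type 'a t = 'a list
  let fold = List.fold_left
end

module ArraySeq = struct
  type 'a t = 'a array
  let fold = Array.fold_left
end

(* Instantiate the same algorithms for two container types. *)
module ListAlgos = Algorithms (ListSeq)
module ArrayAlgos = Algorithms (ArraySeq)

let list_total = ListAlgos.sum [1; 2; 3]
let array_count = ArrayAlgos.length [| "a"; "b" |]
```

The trade-off compared with type classes is that the instantiation is explicit: you choose ListAlgos or ArrayAlgos yourself instead of having the compiler infer it.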
The Ocaml std library is a bit lackluster and people often replace it with something more featureful.
BTW, I think https://github.com/c-cube/sequence is closer to the generic iterable datatype you are looking for. You still need to convert to and from the seq type but it's pretty close...
Multicore will take more time before it's stable, but it will be available via opam as another alternative OCaml toolchain until then. It's basically the result of KC's research: http://kcsrk.info/
Note that multicore includes two separate "features": the algebraic effects abstractions, and the multi-core runtime (domain heaps). From what I understand, OCaml's multi-core effort will include both of these as well as some improvements to garbage collection.
It's hard to say which is more exciting for me, but it probably comes down to effect handlers in a production-capable (if not production-ready) language that can work across different compilation targets. Lots of interesting tooling that can be built on top of that!
OCaml is a great language. I'm so glad I was forced to learn it in university. Apart from the language itself, the ecosystem surrounding it is very mature as well.
- OCamlbuild is the bee's knees. Configuring complex projects requires just one line in the Makefile
- OPAM works wonderfully. No configuration required. Everything just works.
- Merlin (https://github.com/the-lambda-church/merlin) is incredible. If you use Vim / Emacs, you owe it to yourself to set it up and have better syntax highlighting, indentation and type information right within your editor.
- Approachable online guides to get started. RWO[1] is a fantastic resource and teaches you the good parts right from the start.
If you use Core, after reading RWO, and don't like the large executables, rest assured that there is an effort by one of the flambda devs to improve dead code elimination as well, but that will also take a while before it's available. I wish it would happen soon, now that flambda has landed.
To provide an alternate viewpoint, OPAM isn't that great of a package manager if you've come from other languages or have used a Linux distro's package manager. It's incredibly slow, partly because you're supposed to use an external constraint solver. When it crashes, it leaves lots of things in an inconsistent state. Its environment handling is really picky and gives warnings about not being set up properly even when it is, because it used to (maybe still does) expect itself to be the last thing in a lot of environment variables. It is something I try to avoid when possible.
EDIT: Sorry, realized I replied to the wrong comment. Meant to reply to the grandparent.
I'm really wondering to which package manager you are comparing, because except cargo, all the other language package managers are either worse, or just straight crap.
Linux distro package managers are usually decent, but most of them are binary package managers and don't offer the same set of features, so it's a bit different (and they are a fair bit older too).
Better to not cause the damage in the first place, consolidate the parts of the transaction that were usefully completed into an isolated location, and provide a way to resolve the issue and resume the transaction. I think at least Pacman and Yum do this.
Calling failed builds "damage" is rather overstating it. If building a specific package fails, OPAM continues with as many other packages as it can. Arguably that's not even "transactional" in the atomic sense, but the behavior is pretty much what you said.
Then it gives you the error, and gives you a command to roll back everything to the previous state -- something I cannot do with eg, apt, as far as I know. But that's a bit apples and oranges -- apt, yum, pacman, etc are all binary package managers, with a different set of tradeoffs.
I really miss the possibility of configuring dependencies per project in an easy way. OPAM will allow this using OPAM switches, but it is complex and error-prone, while almost any other package manager (sbt, leiningen, maven, nimble, cargo, npm...) will do this by default.
For some reason, OPAM defaults to installing dependencies globally and does not provide an easy way to support different workflows.
> Any programming language is well-suited for some programming tasks and awkward for others.
Well ... with 50 shades of gray in between. You can do pretty much anything with Python, for example. But can you write a web service in Haskell without having to hire a team of computer science PhDs? Not so sure ... but luckily your nephew will happily realize it with PHP during holidays for a couple of bucks :E
I hope after this release they will start improving OCaml support for Windows too (mostly a smooth opam experience, like the package managers for other languages) https://github.com/ocaml/opam/issues/2191
The first part is major changes, like the addition of GADTs in 3.12 -> 4.00.
The second number is also kinda major features, but not as large. Kinda similar to the Python release cycle. New features get introduced here, so things like multicore, flambda, effects, modular implicits would land here. It has a rather long release cycle but I heard there are plans to shorten the release cycle to something like roughly 6 months. These are 2 digits zero-padded for no real reason except maybe easier string sorting, so 4.03 would translate to 4.3 and 3.10 would be 3.10 and not 3.1.
Then the last digit is for patches and regression fixes that do not introduce new features.
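As a rough sketch of how the zero padding plays out, here is a hypothetical parse_version helper. It assumes a plain MAJOR.MINOR.PATCH string with no suffix and uses String.split_on_char, which is only available in newer standard libraries.

```ocaml
(* Hypothetical helper: split "4.03.0" into its three numeric parts.
   Assumes exactly MAJOR.MINOR.PATCH with no extra suffix. *)
let parse_version (s : string) : int * int * int =
  match List.map int_of_string (String.split_on_char '.' s) with
  | [major; minor; patch] -> (major, minor, patch)
  | _ -> invalid_arg "parse_version: expected MAJOR.MINOR.PATCH"

let v403 = parse_version "4.03.0"   (* minor "03" is read as 3 *)
let v310 = parse_version "3.10.0"   (* minor "10" is 10, not 1 *)

(* With zero padding, plain string order agrees with numeric order... *)
let padded_ok = compare "3.10.0" "3.09.0" > 0

(* ...whereas without padding "3.9" would sort after "3.10". *)
let unpadded_wrong = compare "3.9" "3.10" > 0
```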
Don't want to start a flamewar but... why choose OCaml over Haskell nowadays?
Want purity, elegance and best community resources? Go with Haskell. Want Functional+OOP+kitchensink? Go with Scala. Want .NET? Go with F#. Want a strict/eager Haskell? Apply for a job at Standard Chartered (they say they have an in-house strict and better optimizing Haskell compiler).
What would X be, so that you would be able to say "Want X? Go with OCaml"?
On a practical level, I'm very fond of the strict evaluation and the lack of purity. It's easier to write "traditional" imperative algorithms when you don't need to use monads or monad transformers everywhere.
When it comes to resources, I don't think Haskell is strictly superior. opam, merlin, and utop are pretty nice... I'm also very fond of the menhir parser generator.
On a theoretical level, Ocaml does have some neat things missing from Haskell, like its module system (with functors), polymorphic variants, etc...
---
BTW, I wouldn't dump Ocaml in the "object oriented" side of things together with Scala. Nowadays the OO infrastructure is more commonly used for polymorphic variants than for OO, to be honest.
I would hesitate to call polymorphic variants "part of the OO infrastructure". They are really orthogonal to OO, but they do make OO less necessary. Ironically, OCaml has a really nice OO system but it's rarely used because with modules and functors there is little need for it. Put differently, a lot of the conventional use for OO is as a poor man's module system, and when you have a really good module system, you only use OO when you absolutely need the OO kind of polymorphism, and those cases are surprisingly rare (like existential types in Haskell).
Well, maybe "writing a compiler"? or "building a multimillion-dollar Wall Street financial services firm"? :-)
In fact, I love and use both Haskell and Ocaml, and I view the two languages as cousins. Both are functional. Both have features the other doesn't. Ocaml has the best module system in the business, with functors for module abstraction (unbelievably useful). This is much better than the current Haskell module system, though Backpack may narrow the gap. Ocaml also has polymorphic variants, which are surprisingly useful. Ocaml has some convenience features like named/optional arguments that Haskell doesn't. And oddly enough, the lack of purity in Ocaml (if used judiciously) can be a real win. You can do imperative programming without jumping through a lot of hoops. You can have code which is purely functional from the outside but which uses imperative idioms internally for efficiency (yes, I know about unsafePerformIO in Haskell, but it's much easier to do this sort of thing in Ocaml). And Ocaml is usually faster both in compilation time and run time.
My overall take on it is that I prefer Haskell for small-scale programming but I prefer Ocaml for large-scale programming. YMMV.
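To illustrate two of those points, here is a small sketch with made-up function names: a function that is pure from the caller's side but uses a mutable reference and a loop internally, and a function with an optional labelled argument.

```ocaml
(* Pure from the outside: same input, same result, no visible side effects.
   Internally it uses a mutable accumulator and a for loop, and nothing in
   the type (int array -> int) has to mention that. *)
let sum_array (xs : int array) : int =
  let total = ref 0 in
  for i = 0 to Array.length xs - 1 do
    total := !total + xs.(i)
  done;
  !total

(* Optional labelled argument with a default value. *)
let join ?(sep = ", ") (words : string list) : string =
  String.concat sep words

let total = sum_array [| 1; 2; 3; 4 |]
let default_join = join ["a"; "b"; "c"]
let dashed_join = join ~sep:"-" ["a"; "b"; "c"]
```

The flip side is that the type checker won't stop such a function from leaking effects, so keeping it pure from the outside is a matter of discipline.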
You're right, but ST is also implemented using unsafePerformIO. The larger point is that you don't need to dress up a type signature with ST or IO just to use imperative code internally which can't leak out to the user level. Whether this is good or bad is a matter for debate, but used judiciously I think it's good.