I think the most important point here is that you should hire engineers from your domain, but not necessarily your exact technology stack -- in their case it's simple because almost no engineers know Erlang well, so they have to hire "database people" and convert them, but I think it's more generally applicable.
Personally, I would much prefer to hire a "web person" regardless of the stack they're used to than, say, a "Java person" who has never written a website. But even more specifically, I would look to find somebody whose interests are in my domain, as they will always have better ideas and motivation than a strictly mercenary engineer.
Not to mention the fact that most experts will spend their time on concept and design rather than cutting code. I think Rob Pike said it well when he said he didn't even have commit privileges at Google while working on Go because he still coded in K&R C.
It's a pity they didn't talk about Erlang's strengths and weaknesses in a bit more detail.
I like Erlang a lot, and pushed to use it in a project I'm working on - it is the right tool for the job in this case.
But programming languages aren't like hammers or saws. You can do a ton of things with any one of them. This particular project probably could have been made to work one way or the other with Java, Javascript, Tcl, Ruby, Python, Perl, PHP, Haskell or any number of other things. And this is, in some ways, a problem for Erlang. It's really, really good at a few things, ok at a number of things, but feels a bit spotty for other things where other languages have something in place. At least that is my impression.
I saw the article as more of a reaction to all of the technical (and oftentimes inane) blog posts about why company x uses technology y.
The main points I took away from the article were:
- They did their homework, and
- They knew that, even though it is not the most widely known technology stack, they were seeking the kind of engineer who would not only be able to pick it up, but also understand their rationale for choosing it.
And it seems as though they've been successful! As engineers, most of us are interested in the many interesting technologies available to us, so much so that, when examining a company like Basho, many of us ask "Why Erlang?" before we ask "What are you making and why?"
Without putting words in their mouth, it is my impression that Basho sought to create an incredibly powerful and easily scalable database. They chose Erlang because it was, in their opinion, the tool for the job. As I see it, 10gen got the ball rolling towards approachable scaling/development and now Basho and the team behind RethinkDB are trying to improve by creating systems free of MongoDB's many limitations.
I'm the founder @ Rethink. Just to clarify, we don't see approachability and scalability as a dichotomy, we think one could do both really well. There are of course some tradeoffs involved (which is why the [awesome] team @ Basho chose a dynamo-style system and we chose a master-slave type system), but in the grand scheme of things, high scale and approachability are definitely possible and we (and possibly others) are going to make it happen.
Yeah! Exactly! Sorry if it was unclear, but what I meant was that I saw the latest wave of DBs (Rethink + Riak) as finally letting the administration & scalability side catch up to the programmer approachability side. Anyone with significant experience knows sharding MongoDB can be a massive pain. Sharding Cassandra is easier, but it certainly lacks the juicy programmer interfaces provided by more modern solutions. I'm very excited about the work you guys are doing.
I think the biggest pluses for Erlang as a language are that fault tolerance (Erlang "let it crash" + OTP supervision trees) and distribution of tasks (processes interact via message passing, be it across processes or nodes) are built-in language features. They aren't libraries or afterthoughts. This subtle shift manifests itself in all Erlang code. There's rarely a fear that some esoteric library you want to use has a memory leak because it's keeping state in some static hashtable, or that semaphores are used incorrectly, causing a race condition.
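For anyone who hasn't seen "let it crash" in practice, here is a toy sketch of the restart idea. It is written in Python rather than Erlang purely for illustration, and it is not how OTP supervisors are implemented; `run_isolated`, `supervise` and `flaky_worker` are made-up names, and a thread stands in (loosely) for an isolated process.

```python
import queue
import threading


def run_isolated(worker, attempt):
    """Run worker(attempt) in its own thread and report how it ended."""
    outcome = queue.Queue(maxsize=1)

    def target():
        try:
            outcome.put(("ok", worker(attempt)))
        except Exception as exc:  # the crash stays inside this thread
            outcome.put(("crashed", exc))

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
    return outcome.get()


def supervise(worker, max_restarts=3):
    """Restart the worker until it succeeds or the restart budget is spent."""
    for attempt in range(max_restarts + 1):
        status, value = run_isolated(worker, attempt)
        if status == "ok":
            return value, attempt  # attempt == number of restarts that were needed
    raise RuntimeError(f"worker kept crashing after {max_restarts} restarts")


def flaky_worker(attempt):
    """Hypothetical worker that crashes on its first two runs."""
    if attempt < 2:
        raise ValueError("simulated crash")
    return "done"
```

Calling `supervise(flaky_worker)` restarts the worker twice before it succeeds. The worker itself contains no error handling at all; recovery policy lives entirely in the supervisor, which is the part of the idea that carries over from Erlang.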
Quick question: I don't know much about Erlang compared to other functional languages, but how is Erlang superior to another language which claims to have some of these traits, such as Go?
There was a really good post earlier this year about how Erlang does scheduling (http://jlouisramblings.blogspot.fr/2013/01/how-erlang-does-s...). I think one of the important takeaways is this: "[Erlang] values low latency over raw throughput, which is not common in programming language runtimes". It's worth a quick read.
Ultimately I find that Erlang's strength is not one thing or another, it's a combination of all of its traits: from the functional nature, to OTP for providing a consistent application structure, to its VM; all of these things put together make it interesting and useful.
I've yet to get into Haskell (I will eventually) but I really do find that constructing applications in a functional manner, with state being stored in independent processes within the running application, is a superior way to build systems.
One final note: I really like Go as well, for very specific reasons. Go's resource usage (both in terms of CPU consumption and RAM) is soooo nice, especially coming from Ruby. I was never a C programmer but I can wrap my head around Go and use it to build low resource usage apps with specific purposes, and sometimes this is exactly what is required.
It's been a few years since I've done any Erlang work, but there are a couple of places off the top of my head where Erlang takes fault tolerance to a higher level than Go.
While Go's goroutines give you a similar sort of concurrency, they don't work in the distributed sense in the same way as Erlang's processes. Go lets me fiddle around with the goroutines on my computer, but that's it. With Erlang, I can natively talk to the processes on any of the computers in my cluster. Just address a function call to another node by its node name (for example with `rpc:call/4`) and I can call a function on a completely different computer. While you could emulate this with XML-RPC or JSON, it's not baked into the language and it's not treated as a first-class feature the way it is in Erlang.
Also, Erlang's basic libraries were built with hot code loading in mind. You shouldn't have to stop and restart your software just to fix a bug. Now, I've seen discussion on how it's possible to do this with Go, but Erlang keeps the idea so pervasive that almost every beginner's guide includes how to do it. As a side note, being a dynamic language has some advantages with hot code loading, though this can be a matter of taste.
I think the Netchan project showed how that particular limitation -- distributed goroutines -- will be overcome in time and built in to the language runtime.
This is just idle speculation but it seemed to me, reading the mailing list ages ago when this was discussed, that netchan worked in many use cases, but not others, and it just was not the right time in the development of the language to be distracted by a tertiary feature like that. Now, I could be mis-reading the whole thing and maybe it will never be a language feature. But if I were a betting man...
Adding on to what others have said, I'll mention that while Go's concurrency features match Erlang's concurrency primitives pretty closely (`go routine()` compared to `spawn(fun routine/0)`, Go channels compared to `Pids ! {message}` and `receive`), almost nobody who does work in Erlang uses those primitives directly. Most concurrent/distributed systems in Erlang are built on OTP, the set of libraries built on those primitives, which has its own set of higher-level tools and abstractions (supervisors, gen_servers, FSMs) that provide a lot of the failsafes and profiling built-in. Basic Actor concurrency is simple in principle but (like most things in distributed computing) extremely hairy in practice, and OTP's libraries and templates take care of almost all of it (a reasonable analogy might be naked C++ vs. C++ with the standard libraries and Boost -- you CAN implement your own raw pointer-munging data structures, but almost nobody does).
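To make the "primitives vs. abstractions built on them" distinction concrete, here is a rough sketch in Python (not Erlang or Go, and not OTP's actual API). A thread plus a queue plays the role of a process with a mailbox, and a small wrapper hides the raw sends behind `cast` and `call`, loosely in the spirit of a gen_server. `CounterServer` and `demo` are hypothetical names.

```python
import queue
import threading


class CounterServer:
    """Hypothetical gen_server-style wrapper: one thread, one mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()  # plays the role of the process mailbox
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        count = 0  # state lives only inside the loop, never shared
        while True:
            msg, reply_to = self._mailbox.get()  # roughly `receive`
            if msg == "increment":
                count += 1
            elif msg == "get":
                reply_to.put(count)
            elif msg == "stop":
                reply_to.put("stopped")
                return

    def cast(self, msg):
        """Fire-and-forget send, roughly `Pid ! Msg`."""
        self._mailbox.put((msg, None))

    def call(self, msg, timeout=1.0):
        """Send a message and block until the server replies."""
        reply_to = queue.Queue(maxsize=1)
        self._mailbox.put((msg, reply_to))
        return reply_to.get(timeout=timeout)


def demo():
    server = CounterServer()
    for _ in range(3):
        server.cast("increment")
    value = server.call("get")  # mailbox is FIFO, so all increments are seen
    server.call("stop")
    return value
```

Callers only ever see `cast` and `call`; the mailbox, the reply plumbing and the timeout are written once in the wrapper. That is the kind of boilerplate OTP takes off your hands, on top of the supervision and tooling this sketch leaves out.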
The Wikipedia article has some confusion about whether Erlang uses the actor model for concurrency. If you have some spare time, it would be nice to clean it up.
* Fault tolerance comes first. Built in from ground up.
* High concurrency.
Yep, that is it. OK, it has many other features, but amazingly they all fall out of those two. Here are the secondary features and how they relate to the primary ones.
* Fault isolation. If a system is to be fault tolerant, faults must be isolated. If one process crashes it shouldn't bring down the whole system.
* Easily parallelizable. Because of isolated processes and the desire for high concurrency, it was easy to build in a sane scheduling algorithm that can spread the load across multiple CPU cores.
* Functional. Functional programming discourages handling large states. Think of a large Java class with 50 instance variables that could be modified by 20 different methods. Functional programming encourages passing the state along explicitly.
* Built-in distribution. It is hard to make a fault-tolerant system (OK, impossible in practice, to be more precise) without redundant hardware. Servers will fail but your service must not. You must have more than one server ready to take over. Distributing your application across multiple physical machines is built in. You send a message to a local Erlang process like this:
Pid ! Msg.
Here is how you send a message to a process running in another data center. Maybe half way across the world.
Pid ! Msg.
That is pretty nice.
* The system is responsive. A non-responsive system can be considered a failed system in some domains. Think about a mail server. If the user clicks on a message and it takes 5s to return a response and open it, maybe the person would consider the system broken. This also comes out of concurrency and fault isolation. As the load increases, instead of throwing errors everywhere, the system gracefully absorbs the load while still staying responsive.
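On the "Functional" point in the list above, here is a small sketch of what passing the state along explicitly looks like. It is in Python rather than Erlang, and the account example and the names `handle` and `run` are made up for illustration. The handler is a pure function from (state, message) to a new state, which is the same shape as the loop of an Erlang process.

```python
from functools import reduce


def handle(state, msg):
    """Pure handler: takes the current state and a message, returns a new state."""
    kind, amount = msg
    if kind == "deposit":
        return {**state, "balance": state["balance"] + amount}
    if kind == "withdraw" and amount <= state["balance"]:
        return {**state, "balance": state["balance"] - amount}
    return state  # unknown or invalid messages leave the state unchanged


def run(initial_state, messages):
    """Thread the state through every message, the way a receive loop does."""
    return reduce(handle, messages, initial_state)
```

Because nothing is mutated in place, the caller's original state dictionary is left untouched, and every state transition is visible in one function instead of being spread over 20 methods.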
There are at least 2 major differences relevant to message processing:
1) Erlang runs in a VM which provides preemptive scheduling and built-in message flow control (the cost of sending a message is proportional to the size of the receiver's mailbox).
2) Unlike other VM languages, the Erlang VM provides a per-process heap, which helps to eliminate VM pauses. None of the JVM-based solutions can provide low-latency processing unless you manage the heap by hand.
As people said before, Erlang combines several traits in a nice package, which is hard to beat.
There are other differences in the language as well. Being a functional language with tail-call optimization (TCO), it allows you to implement a lot of algorithms that would be cumbersome in imperative languages. Powerful pattern matching capabilities allow one to structure the program in a clean way, and pattern matching on binaries makes most binary protocol parsers one-liners.
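To give a feel for the binary-matching claim: in Erlang a frame with a made-up layout (1-byte version, 1-byte type, 2-byte length, then payload) can be taken apart with a single pattern such as `<<Version:8, Type:8, Len:16, Payload:Len/binary, Rest/binary>> = Data`. A rough Python equivalent of the same hypothetical format needs explicit unpacking and bounds checks; `HEADER` and `parse_frame` are illustrative names, not part of any real protocol.

```python
import struct

# Made-up header: version (1 byte), type (1 byte), payload length (2 bytes), big-endian.
HEADER = struct.Struct("!BBH")


def parse_frame(data):
    """Split data into (version, type, payload, rest) for the made-up frame format."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    version, msg_type, length = HEADER.unpack_from(data)
    end = HEADER.size + length
    if len(data) < end:
        raise ValueError("truncated payload")
    return version, msg_type, data[HEADER.size:end], data[end:]
```

In the Erlang version a failed match covers both "truncated" cases, so the length checks above come for free with the pattern.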
I've barely touched either language myself, but my understanding is that you can do almost anything you could do in C in Go, whereas Erlang is much more restricted in how it lets you communicate between processes, so there's more of a guarantee of fault-tolerance in Erlang. I could be completely wrong, though, so please don't take this at face value.
"It's a pity they didn't talk about Erlang's strengths and weaknesses in a bit more detail."
I feel the exact same way. Erlang is a very peculiar language, quite different from pretty much everything else out there. More details about why Erlang is or is not a good choice for a specific job are always very welcome.
I'll be presenting on using Erlang for an authoritative DNS server at StrangeLoop this year (https://thestrangeloop.com/sessions/erlang-for-authoritative...). If you can make it hopefully you'll get some good tidbits on why I selected Erlang and what kinds of benefits and drawbacks became evident as the system developed. If not then perhaps the sessions will be recorded (I believe they are supposed to be, but I'm not 100%).
This is a phenomenon worth understanding, if you don't already:
"I had an entertaining and ironic conversation about this recently with a manager at a large database company. He explained to me that we had clearly made the wrong choice, and that we should have chosen Java (like his team) in order to expand the recruiting pool. Then, without breaking a stride, he asked if I could send any candidates his way, to fill his gaps in finding talented people."
I found Erlang to be much easier to digest the second time through. The first time I looked at Erlang, I knew of its benefits, but didn't have much experience with functional programming. The unusual syntax and unfamiliar paradigm led me to put it aside.
Later, after I learned me some foldr and flatMaps (mostly thanks to Martin Odersky's Scala/FP course on Coursera), I revisited the language when I saw an article here about an Erlang-based CMS (http://zotonic.com/). It seemed pretty cool, and I host some websites for local businesses, so I set it up to evaluate it and realized that I now had no trouble at all with Erlang and the syntax made more sense. Searching around for some other web stuff I found Chicago Boss (http://chicagoboss.org/), a Rails-inspired web framework with ridiculously easy Comet/WebSocket/"real time" support.
Now I'm really excited about it and I can't wait to do some cool things with it.
My experience was quite similar, although it took me three tries to get it. I think it was http://learnyousomeerlang.com that finally helped me understand enough to really start to enjoy Erlang.
I've found something similar - I wonder if there's something about functional programming that means the language syntax /has/ to be obtuse...
(Similar with Haskell too -- learning that at university, I loved the concepts, and I'm sure it's technically great, but the APIs seem almost deliberately horrible; one-character variable names, array append and prepend being called "cdr" and "car" or something like that, stuff like that all throughout the standard library and tutorials. I don't care that there's a historic reason for that based in the PDP's Lisp-to-assembly compiler, if the function's job is to append it should be called "append()" -_-;; </rant>)
The large database company example is especially telling. The company I work for chose Java mainly for the benefit of a large talent pool, but we have great difficulty hiring people. I think it's because we're in the middle of nowhere and don't pay Silicon Valley wages. I think if we had chosen a more "offbeat" language (to borrow mosburger's adjective) we would have had a novelty factor that would increase interest in the job. After all, we're doing interesting stuff, but our technology decisions make us seem very dull.
Edit: Please note I don't have the authority to make a technology change and see what the effect would be on our hiring, despite the good ideas of those replying to me.
You could start migrating to Scala, which is simple with a Java-based stack. Scala is sufficiently hip at the moment to attract programming enthusiasts.
Similarly Clojure is JVM based and it draws interest of many Java gurus, because they do not need to learn the platform from scratch (the Java library is available).
It's a slightly different conclusion (use the best tool for the job, earn the respect of developers vs. use an esoteric tool for the job, attract developers eager to learn something off the beaten path), but with similar ends. Using an offbeat language can have its benefits.
it would be interesting if they covered some of the technical problems they've had with using erlang in a database. writing a db in a VM language like erlang or java you inevitably end up fighting with the VM for some workloads. i noticed that recently one of their engineers tried to submit a patch to erlang to force it to not put schedulers to sleep when there is work to be done (http://erlang.org/pipermail/erlang-patches/2013-June/004109....). it had an interesting flag name '+zdnfgtse' which looks like z-do-not-f-go-to-sleep-ever :)
I do believe that Erlang is the right tool for the job when you're building a distributed database, especially for the distribution part. For a company like Basho it's understandably a good investment to take expert distributed systems/database engineers and let them take their time learning Erlang.
If, however, you, like most of us, are not building a distributed database, think carefully before you choose Erlang, especially if you're in a startup.
Most startups don't have a distributed database as their core technology, and for many startups, it's important that developers can push production code from day one.
What distributed application cannot be "philosophized" into essentially being a distributed database? I feel like I'm just repeating your question.
With that in mind, I believe it boils down to where the logic is located: with the data, or not. A lot of "databases" are just used as a datastore.
So let me try to create some scenarios:
1. To push out a product (startup/prototype/POC): develop in the technology you know, and let the stakeholders know you are acquiring technical debt.
After the startup/prototype product is a success, explain to the stakeholders why you need to move to a technology stack that supports distributed systems "better".
2. If you have a design or product, develop directly in a distributed technology stack.
It boils down to use the correct tool for the correct job, but you will mostly only know if you used the correct tool later in the development of the product.
Jeez, and here I was hoping someone would have an answer ready to hand. Instead, I get 13 upvotes (at time of this writing) from people who are presumably wondering the same thing. I guess this might be something worth doing a bit of research and writing an article about. :)
We use Erlang at 2600hz and I feel like this is an Echo Chamber of the kinds of conversations we have around the table. Erlang is a beast, but if you're doing hard work it's more about the idea than the tools.
Lots of wisdom on this blog post. The truth is that if you're passionate about doing something hard, and communicate that passion, people are drawn to it.
I like that they focus on experience with DBs and distributed systems instead of hiring engineers who can program in X. You can pick up a language relatively quickly, whereas the other skills have to be built slowly through real-world experience.
"While it’s theoretically a nice bonus for someone to bring knowledge of all the tools we use, we’ve hired a significant number of engineers that had no prior Erlang experience and they’ve worked out well."
Nice to hear a company understand that talent and ability do not equate to n years using a specific toolset ('not “expert hammer wielder”').
The only thing I didn't like about Erlang is its syntax - but I am sure I could adapt. I just don't like too much syntax in general, one reason why I like Scheme.
Now, Erlang's runtime seems fantastic.
I wonder if they (Basho) investigated Elixir recently and their opinion of it.
Elixir has some good things about it (hygienic macros, protocols, and so on) but the syntax is not enough by itself to make us switch. "Ugh, Erlang syntax sucks" is a refrain heard around the Internet, but it has not hindered us from getting good engineers who are happy to develop with it. I daresay that the syntax is the least of your problems when learning Erlang; the message-passing concurrency and fault-tolerance models, and functional style are more difficult to understand. Those problems don't go away when you use Elixir instead.
If you want something Scheme-like on the Erlang runtime, there's LFE (Lisp-Flavored Erlang): https://github.com/rvirding/lfe
> I daresay that the syntax is the least of your problems when learning Erlang; the message-passing concurrency and fault-tolerance models, and functional style are more difficult to understand.
That's my feeling as well. I hear tirades about commas and periods. What are they going to do when they hit a net-split?
Bashing Erlang and stopping at syntax is like bashing a new battle tank because it doesn't have a leather interior. Yeah, it would be nice if it had leather seats and a mini fridge, but if that is really the main criterion used in picking it, one has to wonder...
Do other people have much experience cross-training people? I am struggling to find decent Senior Perl developers at the moment in London. I am open to the idea of hiring a Senior X Dev and cross-training them ... I'm just not sure how well it'd work out.
I would rather have a wise and experienced developer X than a developer with Y experience in Z.
As long as you are hiring people with skills in solving problems similar to the ones you are having or anticipate having.
Edit: However that may not work for you. I think that a senior Perl developer would be more than competent in any language you point towards. While the opposite may hardly hold true. The presumption is that one believes that a senior Perl developer really exists.
There are some really good Perl coders who are not working with Perl because it's hard to get a Perl job.
This situation happens in Brazil.
Consider telecommuting =)