I would argue that *all* deployments (no matter how small) should have configura...

csdvrx · on Nov 2, 2021

> I would argue that all deployments (no matter how small) should have configuration management.

I would argue that in most cases, they don't need anything but some documentation explaining what was installed, and why.

I will take a word file with screenshots over a broken script in an obscure language every single time.

> but there's no reason not to use something like Ansible that's designed for infrastructure automation,

There's a big one: my time isn't free.

If someone is willing to waste money on that, sure, I'll be happy to bill them for their extravagant tastes (but only after having done my best to explain to them that it's a waste of money).

And still, I will think about the next person that may have to maintain or tweak what I wrote, so I will also leave a document full of screenshots in case they don't know ansible or whatever new fashionable tool that the client may have specifically requested.

> it's faster and the resulting system is of higher quality.

Not everything needs to be of high quality.

Forgive me if I'm assuming your gender, but I see a lot of black-and-white thinking among male sysadmins/devops: it's good or it's bad, it's high quality or it's not.

I prefer to have a "sufficient" degree of quality: if a checklist is enough, I will not waste time writing a script. If a shell script is enough, I will not waste time writing proper code - and so on.

> Over time you will also accumulate a library of bits and pieces that you can copy over to new setups, further improving your speed and quality

Except you assume a continuous progress, without any change of scope or tools, and with the tools themselves never evolving. It doesn't work like that: over time, you will accumulate a bunch of useless code for old versions.

Even small inconsequential changes (like unbound in debian 11 requesting spaces before some options, which wasn't the case before) will take some time and effort. Why waste your energy on one-shots?

The do-nothing approach argues that you should avoid premature optimization, which strikes me as a good approach in software in general.

brulard · on Nov 3, 2021

Such a needlessly dismissive use of the word "male" here. Devops guys I work with are very reasonable and I have not seen a female one in my whole career.

loxias · on Nov 3, 2021

> Forgive me if I'm assuming your gender, but I see a lot of black-and-white thinking among male sysadmins/devops: it's good or it's bad, it's high quality or it's not.

Male lifetime linux nerd here, who started as a sysadmin, checking in just to say that I agree with everything "policy related" in your comments on this article. Knowing where to tune the knob between "high quality"/"good architecture" vs "can i just get this done now and move on?" is difficult, at least I don't know how it could be taught other than experientially.

IME, the predilection to see things as black-or-white is more correlated with age, than gender.

Anyway, "not all men". :P

csdvrx · on Nov 3, 2021

> checking in just to say that I agree with everything "policy related" in your comments on this article

Your nick seemed familiar - now I remember, I read your great comment in "I just want to serve 5TB" earlier today!

I also agree with everything you wrote about simplicity in software development: I'll take almost every time some dirty php scripts running baremetal over Docker + Golang + Kubernetes + Terraform + Gitlab + Saltstack + Prometheus + the new fashionable tool because with so many parts now begging for attention, nothing will get done quickly - if we're lucky and something gets done.

Knowing where to tune the knob is indeed very difficult, and I'm afraid most people now are just doing a cargo cult of whatever google does, except they are not google, and they don't understand the tradeoffs or the possible alternatives.

But at the scale of most companies, it's a folly to sacrifice flexibility and simplicity to some unachievable desire for software perfection!

It's also a very costly hubris: I have been asked way too often to improve performance after very expensive hardware had already been thrown at the problem, and it still performed miserably because the big picture was missed.

The solution was almost always removing the useless parts, or when trying to disentangle the architecture astronaut fancy mess would have been too costly, start from scratch with a saner design: most recently, I replaced a few hundred java files (and tests and stuff) with about 10 lines of bash, and 20 lines of awk.

My work is not fancy, but it works, unlike the previous solution that was going to be ready the next month, every month, for almost a year...

To all those who want to do things like google, maybe apply there instead of over engineering/polishing your CV with fancy keywords at your employer or client expense?

> IME, the predilection to see things as black-or-white is more correlated with age, than gender.

I had noticed this weird pattern, and it was my best explanation even if I didn't like it much, because it's sexist.

But your version seems more plausible (Occam's Razor!), so thanks a lot for taking the time to post!

chousuke · on Nov 2, 2021

I do not think automation is "premature optimization", nor do I think that everything needs to be high-quality; I did not say that. I do think, however, that everything you do should be of acceptable quality.

And for me, having configuration management is the minimum level of acceptable quality. It's simply not possible to have acceptable quality of a system without some form of configuration management. I can't recall a single instance where I (or anyone else involved) ever said "wow, this unmanaged mess sure works well" :P

In some cases, the management can be as simple as a comment in some script explaining some part of the process was done manually, or simply a periodic snapshot backup of the server that can be restored when the configuration is broken, but the point is that a process must exist and it must be consciously chosen.

Free-form documentation is not an alternative to configuration management either; if you can document your configuration in a wiki, you might as well put it in a git repository in the form of a script or a template.

When done properly, it's the exact same amount of effort, except when you use automation tools, the documentation is in a format that's not ad-hoc and can actually help do the things it documents instead of requiring a human to interpret them (possibly introducing mistakes). "Setting up" Ansible requires literally nothing but a text file containing your server's hostname, and SSH access, which you already will have.

Also, I don't know where you got the idea that I would somehow assume unchanging scope? I am the first person to throw away useless code and tools; I consider code my worst enemy and it's practically my favourite activity to delete as much of it as I can. If some piece of automation is no longer fit for purpose, it gets rewritten if necessary. Throwing away code is no big deal, because the tools I use allow me to get things done efficiently enough that I can spend time refactoring and making sure the computer properly handles the tedious parts of whatever I'm working on.

Your unbound example is something that is trivially solved with configuration management. After an upgrade, you notice your configuration does not work, navigate into your git repository, update the template, and then deploy your changes to any server that happened to be running unbound using that same template (because you might have redundancy, if you're running DNS). If you make a mistake, you revert and try again. There is no manual ad-hoc process that comes even close to allowing you to manage change like this, but it is trivially enabled by existing, well-understood automation tools.
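To illustrate the "update the template once, deploy everywhere" idea, here is a minimal Python sketch using only the standard library. It is not Ansible and not anyone's actual setup: the host names, addresses, and the `render_all` helper are made up for illustration, and the git and deployment steps are left out.

```python
from string import Template

# One template shared by every host. The option names are ordinary unbound
# settings, but the values and host names below are hypothetical.
UNBOUND_TEMPLATE = Template(
    "server:\n"
    "    interface: $listen_address\n"
    "    access-control: $allowed_net allow\n"
)

HOSTS = {
    "dns1.example.org": {"listen_address": "192.0.2.1", "allowed_net": "192.0.2.0/24"},
    "dns2.example.org": {"listen_address": "192.0.2.2", "allowed_net": "192.0.2.0/24"},
}


def render_all(template, hosts):
    """Render the shared template once per host and return {hostname: config}."""
    return {name: template.substitute(params) for name, params in hosts.items()}


configs = render_all(UNBOUND_TEMPLATE, HOSTS)
print(configs["dns2.example.org"])
```

If a new release wants different whitespace or syntax, only `UNBOUND_TEMPLATE` changes, and every host picks up the fix on the next render.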

csdvrx · on Nov 3, 2021

Your definition of "acceptable quality" is my definition of "overengineering".

It does not take the same amount of effort, if only because you mention how for unbound, you have to update the template.

For one shots, this is overkill.

randomswede · on Nov 3, 2021

For "truly one-shot", you're right. But a "truly one-shot" is not a production machine, it is a test bed, informing what the eventual production machine should look like.

Because even if you will only ever have a single production machine, it will have something go horribly wrong with it and need recreating from fresh hardware (or from a fresh VM or whatever).

I guess, if you're cloud-based, you could turn your finely tuned test box into a template, then you have something that is (effectively) scripted.

chousuke · on Nov 4, 2021

Leaving aside all the other benefits and even if you never need to rebuild your system, having some sort of IaaC automation in place allows for extremely powerful change management. When your system is defined as code[0], change over time can be reviewed with a "git log -p", which definitely beats searching through ticket comments or ad-hoc documentation and attempting to reconstruct the history of change.

It's a no-brainer nowadays that software should be developed with version control. I don't see why infrastructure should be treated differently.

[0] Ansible playbooks are code, no matter what some people may think. It's a declarative'ish programming language with a silly syntax.

chousuke · on Nov 3, 2021

There's no such thing as a oneshot if you're creating a system that someone will actually use and depend on.

All systems have a lifecycle, and even on a "trivial" system you have backups, access, monitoring, logging and security maintenance to worry about even before you consider how installing any useful software affects those things.

There are exceptions to any rule, of course, and I did in fact create a system where the configuration management is a snapshot backup just two weeks ago; but that system has no data on it, its lifecycle is expected to last for less than a year, and if/when it breaks, a backup restore can be performed without any additional considerations. It was also an emergency installation into a network that's not easily accessible with SSH, which is why I did not just use Ansible from the start.

I thought it would be a oneshot, but I did end up having to create a second instance of the system a few days later, fortunately with less emergency :P

Still even ignoring that, I fail to see what could possibly be overkill about literally 3 small files in a git repository. You call "overengineering" what is to me "5 minutes of effort with extremely relevant upsides". That's literally how much time it would take me to create a playbook for unbound if I already know what the configuration needs to look like; probably less, but most of the time will be lost to context-switching overhead.

My point being, most of the time will be spent actually configuring the software and the automation overhead is nothing compared to the value you get from it, and that's why I generally automate things by default: it provides more value than I put in effort.

When you start off learning configuration management and infra automation tools, there's a learning curve; in the beginning, you will be "wasting" time learning (what a silly statement) how to use your tools effectively, but with practice, you will learn how to effectively use the tools and where to apply them and how to approach managing specific kinds of systems such that over time, using the automation tools is simply easier and faster than doing it manually. That's what I meant when I mentioned "higher quality" earlier; you get it for free, with no effort, once you've put in a bit of practice first. It just sounds to me like you're arguing against doing things well in favour of doing things with strictly inferior tools.

rgj · on Nov 3, 2021

> Not everything needs to be of high quality.

But stuff connected to the internet needs to be or it will be compromised before you even finished installing it.

minetest2048 · on Nov 3, 2021

Rant/question:

The term "software configuration management" can mean 2 related but different things:

1. Configuration management in the systems engineering sense, which is a process to systematically manage, organize, and control the changes in the documents, codes, and other entities during the Software Development Life Cycle (guru99).

2. Something to manage your config files, from something as simple as python/bash scripts to full infrastructure-as-code solutions such as terraform and ansible

When I think about configuration management, I (and the parent) think about the second meaning, but if I google that, all of the search results point to the first meaning

chousuke · on Nov 3, 2021

Both are important. A good CMDB is key in finding your documentation that points you at the configuration management used for the actual system.

Let me tell you, just a wiki with a search is not enough beyond a certain size, and you hit that faster than you'd think.

crispyambulance · on Nov 3, 2021

> [...] all deployments (no matter how small) should have configuration management [...] Ansible [...] with zero effort [...]

No.

It's a powerful tool, Ansible, but let's not get carried away. There's a ton of complexity behind the scenes. If you over-do it you end up with a ream of ugly yaml and you're fighting with the tool as much as you are any real problems.

Arch-TK · on Nov 2, 2021

This is assuming you already know Ansible.

chousuke · on Nov 2, 2021

Sure, but then again, typing shell commands into text files is assuming you already know shell commands. You have to spend time to learn your tools at some point.

For simple configuration management, Ansible is a straight upgrade to most shells because of idempotency alone, never mind the fancier features like the more advanced modules, multi-node orchestration, or encrypted vaults. The YAML syntax is dumb and it has its issues, for sure, but it still does even the simple things much better than plain old shell.

Anyone who has any familiarity at all with UNIXy systems can learn Ansible from zero well enough in a day or two for it to start becoming truly useful, and if you don't have the foundation for that... why on Earth are you setting up a web server? I mean, it's of course fine to tinker with things for learning, but I was assuming a real deployment scenario.

Arch-TK · on Nov 2, 2021

> Sure, but then again, typing shell commands into text files is assuming you already know shell commands. You have to spend time to learn your tools at some point.

Don't I still need to know shell programming for Ansible? Or at least know all the systems I want to manage with it inside out?

Yes, I need to learn tools at some point. But as I see it, I am not a system administrator of anything but my own network of 8 infrastructure hosts. The effort required to recreate this with ansible (and I don't think ansible can actually idempotently handle ALL of these devices, not without serious limitations) seems far greater than maintaining a few scripts and keeping backups. Also, I already know bash (unlike ansible).

> Ansible is a straight upgrade to most shells because of idempotency alone

So, as I said, I know nothing about Ansible. But idempotency implies that Ansible always starts from nothing and builds from there. Does this mean that every time I want to change my server I have to wait 15 minutes for it to re-install the distro and re-configure everything? Do I have to keep my state on a different server? I don't see how this can't be achieved with just as much hassle with a script?

Surely I misunderstand this. But if I did, then surely it's not THAT idempotent.

> Anyone who has any familiarity at all with UNIXy systems can learn Ansible from zero well enough in a day or two for it to start becoming truly useful

My problem with this is that every time I've looked into Ansible, it didn't look like a day of work. It looked like a week of work converting my entire infrastructure to it, for very little benefit, in addition to having to change the way I do a lot of things to fit the Ansible blessed method of doing them. It may take a day to learn Ansible but it probably takes even more time than that to learn it to a standard where I would consider the knowledge reliable. It would require making mistakes and lots of practice before I felt like I could quickly recover from any mistake I could make using it as well as avoid those mistakes. Not just that, but because of my nonstandard setup I would likely have to spend extra time learning Ansible well enough that I can actually replicate my nontrivial setup.

jjnoakes · on Nov 3, 2021

> idempotency implies that Ansible always starts from nothing and builds from there

No, it doesn't. In Ansible you say something like "make sure apache is installed" and if apache is installed, nothing happens. If it isn't, it gets installed. Then you say "make sure apache is running" and if apache is running, nothing happens. If it isn't, it is started.
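To make that concrete, here is a minimal Python sketch of the "make sure X" pattern. It is not Ansible code: the `Host` class and both `ensure_*` helpers are hypothetical stand-ins that only simulate state in memory.

```python
# Toy model of a host: which packages are installed, which services run.
# This only simulates state in memory; it does not touch a real system.
class Host:
    def __init__(self):
        self.installed = set()
        self.running = set()


def ensure_installed(host, package):
    """Install the package only if it is missing. Returns True if changed."""
    if package in host.installed:
        return False
    host.installed.add(package)
    return True


def ensure_running(host, service):
    """Start the service only if it is not running. Returns True if changed."""
    if service in host.running:
        return False
    host.running.add(service)
    return True


host = Host()
first_run = [ensure_installed(host, "apache"), ensure_running(host, "apache")]
second_run = [ensure_installed(host, "apache"), ensure_running(host, "apache")]
print(first_run)   # [True, True]: both steps changed something
print(second_run)  # [False, False]: nothing left to do
```

The second run starts from the existing state rather than from nothing, which is all that idempotency means here.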

Arch-TK · on Nov 3, 2021

Okay, this is a rather limited form of idempotency. I don't see the advantage. My system's package manager and service manager already perform this function.

jjnoakes · on Nov 3, 2021

You should really spend a little time learning ansible before you critique it. Ansible isn't perfect, but the things you describe aren't how ansible works in general, so they aren't even valid criticism.

For example, it has idempotent modules for all sorts of things - contents in files, files and directories in the file system, etc - things that you COULD script in an ad-hoc and verbose way, but things which come built-in as one-liners in ansible.

It's quite convenient.
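For a rough idea of what such file-level checks involve when written by hand, here is a Python sketch. The helper names, the file name, and the setting are all hypothetical, and this is an illustration of the idea rather than how any Ansible module is implemented.

```python
import os
import tempfile


def ensure_directory(path):
    """Create the directory if it is missing. Returns True if it was created."""
    if os.path.isdir(path):
        return False
    os.makedirs(path)
    return True


def ensure_line(path, line):
    """Append `line` to the file unless it is already present. Returns True if changed."""
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    if line in content.splitlines():
        return False
    with open(path, "a") as f:
        # Avoid gluing the new line onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True


# Demo in a throwaway directory; the file name and setting are made up.
workdir = tempfile.mkdtemp()
conf_dir = os.path.join(workdir, "conf.d")
conf_file = os.path.join(conf_dir, "example.conf")

results = [
    ensure_directory(conf_dir),
    ensure_line(conf_file, "option = value"),
    ensure_directory(conf_dir),
    ensure_line(conf_file, "option = value"),
]
print(results)  # [True, True, False, False]
```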

Arch-TK · on Nov 5, 2021

There are no resources which are seemingly suitable for my environment. If you're going to claim that I'm missing something, rather than telling me that I have things to learn (no shit sherlock), you could tell me specifically which initial impressions are wrong.

jjnoakes · on Nov 8, 2021

I did, a few comments up. This:

> idempotency implies that Ansible always starts from nothing and builds from there

...is wrong. It might be true that Ansible is unusable in your environment for some reason, but that's quite different from this specific false claim.

Here are a few more quotes that imply you should learn about Ansible before critiquing it for your use case (or, if you don't have time, then refrain from critiquing it in general):

> Don't I still need to know shell programming for Ansible?

No, Ansible uses a custom non-shell syntax and python modules. You can dip into shell scripts but you don't have to. Examples are everywhere in the Ansible documentation.

> Does this mean that every time I want to change my server I have to wait 15 minutes for it to re-install the distro and re-configure everything?

No. Ansible will examine your existing system and apply the changes you configure. Idempotency does not imply or require a functional-like OS or rebuilding from scratch; Ansible is more imperative.

Too · on Nov 3, 2021

> Don't I still need to know shell programming for Ansible?

No. Ansible has its own built in functions for creating files, managing systemd, docker and so on. These are built with idempotency in mind.

You can however call out to shell for situations where there is no built in. There are a lot of people who only ever use this role, and just see ansible as the fleet orchestration layer. Which imo defeats most of the benefits of using it, you might as well ssh a full script in that case.

As a side note I wouldn’t actually recommend Ansible for server management. Like you say learning all these blessed roles feels like relearning basics you already know and the syntax and directory structure is messy. It has no place if you use containers.

Arch-TK · on Nov 3, 2021

> Ansible has its own built in functions for creating files, managing systemd, docker and so on. These are built with idempotency in mind.

Do I still get idempotency if I do not use systemd or docker?

> You can however call out to shell for situations where there is no built in. There are a lot of people who only ever use this role, and just see ansible as the fleet orchestration layer. Which imo defeats most of the benefits of using it, you might as well ssh a full script in that case.

So it sounds like I wasn't entirely wrong in my first impressions that it would be useless for my situation where I don't think any of the "built ins" would really be suitable. Of the 8 machines on my network, only one has systemd (and I'm in the process of phasing it out because systemd seriously struggles to deal with services with dependencies on specific network interfaces being "UP", these issues are documented by freedesktop[0]).

> As a side note I wouldn’t actually recommend Ansible for server management.

Given that my infrastructure is a mixture of FreeBSD, OpenBSD, non-systemd Linux and systemd Linux machines, what would you recommend?

[0]: https://www.freedesktop.org/wiki/Software/systemd/NetworkTar...

rgj · on Nov 3, 2021

You seem to have a very strong opinion about Ansible while you keep emphasizing that you don’t know anything about Ansible at the same time.

As a result, all your arguments against Ansible seem to be based upon assumptions, some of them completely false.

Arch-TK · on Nov 3, 2021

I have opinions based on my limited experience of trying to look into ansible once.

Why don't you tell me which "arguments" (correction, they're my opinions) are based on false "assumptions" (correction, they're my impressions) rather than just giving me this blanket statement to work from?

chousuke · on Nov 3, 2021

If you already have something that works, by all means, stick with it.

If you want to learn Ansible, you don't even have to throw away your scripts; Ansible is a perfectly good way to run ad-hoc scripts if that solves your problem better than writing a full-blown playbook or even a custom module.

Ansible is weird and annoying in the beginning, but it's still a good tool to learn on top of your existing knowledge, because it provides extremely useful features beyond what's possible with plain old shell, and more importantly, it's a common language for system administration tasks that anyone can learn and understand without having to figure out how your specific scripts accomplish the things that Ansible gives you for free. The same applies to any management tool like Terraform, Puppet or even Kubernetes manifests. I put my expertise in my Ansible scripts and provide an easy interface to them such that a more junior person can, say, upgrade an Elasticsearch cluster by issuing a documented "make upgrade" command that does everything correctly even though they have no idea how to actually upgrade it manually. (I like to use Makefiles to provide a neat interface for "standard" operations: "make help" and anyone can get going.) If they wanted to learn, they have all the resources required to read and understand my playbooks and figure it out without me being there to teach them the particulars of whatever unholy custom script setup I might have used instead.

Ansible is also mostly useful once you already have a server up and running but with 0 configuration; it's pretty bad at actually installing new servers, and I'd recommend using better tools for that part (Terraform, kickstart, or maybe just a script that clones an image). Just a manual next-next-next install is also a perfectly acceptable way to get the base OS installed if the defaults are fine, though beyond a few servers it's a good idea to have a better process.

My perspective is that of someone who works with very varied systems daily, ranging in size from one to hundreds of nodes. I can manage that kind of scale alone because I use automation, and Ansible in particular is a tool that fits extremely well in the 1-20 "size range" for an environment; it is extremely lightweight and low-investment and can be used for even single nodes to great effect; once you get beyond a couple dozen, something more "heavyweight" like Puppet will start showing its usefulness.

As for idempotency, it's a very useful feature for automation: basically "Only do something if it is required". With a shell script, you have to implement manual checks for everything you run such that if you re-run a script on a system where it's already been run once, you won't accidentally break things by applying some things twice. A side benefit of this is that you can run your playbooks in "check mode", ie. "Tell me what you would do, but don't actually do it". Extremely useful and very error-prone to implement manually (Ansible doesn't always get it right either).
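A small Python sketch of the "only do something if it is required" idea, with a dry-run flag standing in for check mode. The `ensure_setting` helper and the settings are hypothetical, and a plain list stands in for a config file; real check mode covers far more cases than this.

```python
def ensure_setting(lines, line, check=False):
    """Make sure `line` is in `lines` (a list standing in for a config file).

    Returns True if a change was (or, in check mode, would be) needed.
    In check mode the list is left untouched.
    """
    if line in lines:
        return False
    if not check:
        lines.append(line)
    return True


config = ["port = 8080"]

# Dry run: reports the pending change without applying it.
would_change = ensure_setting(config, "debug = off", check=True)
after_check = list(config)

# Real run applies it; a repeat run is a no-op.
changed = ensure_setting(config, "debug = off")
changed_again = ensure_setting(config, "debug = off")

print(would_change, after_check)  # True ['port = 8080']
print(changed, changed_again)     # True False
print(config)                     # ['port = 8080', 'debug = off']
```

Even in this toy version, every operation needs its own "is it already done?" test, which is the part that gets tedious and error-prone across a whole script.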

csdvrx · on Nov 3, 2021

> With a shell script, you have to implement manual checks for everything you run such that if you re-run a script on a system where it's already been run once, you won't accidentally break things by applying some things twice

Using tools like grep + basic logic like || and && goes a long way...

I'm not saying there is no place for ansible, but in my personal experience, it's a very small one.

> Ansible is also mostly useful once you already have a server up and running but with 0 configuration; it's pretty bad at actually installing new servers, and I'd recommend using better tools for that part (Terraform, kickstart, or maybe just a script that clones an image).

Agreed!

More recently, I've found zfs clones of base installs surprisingly flexible.

Now I only wish there was a way to do some kind of merge or reconciliation of zfs snapshots from a common ancestor that haven't diverged much in practice, spawning the differences into separate datasets per subdirectory (ex: if /a/ hasn't changed but only /a/b/c/d1 and /a/b/c/d2 differ, move d1 and d2 off to create a separate d dataset mounted in /a/b/c/ so you can keep the common parts identical).