Journey to Infrastructure as Code
Thu, 23 Mar 2023 14:32:49 -0000
|Read Time: 0 minutes
Infrastructure as code (IaC) has become a key enabler of today’s highly automated software deployment processes. In this blog, let’s see how today’s fast growing IaC paradigms evolved from traditional programming.
On the surface IaC might just imply using coding languages to provision, configure, and manage infrastructure components. While the code in IaC itself looks very different, it is the way code works to manage the infrastructure that is more important. In this blog, let’s first look at how storage automation evolved over the last two decades to keep up with changing programming language preferences and modern architectural paradigms. And then more importantly differentiate and appreciate modern day declarative configurations and state management.
First came the CLI
The command line interface (CLI) was the first step to faster workflows for admins of almost any IT infrastructure or electronic machinery in general. In fact, a lot of early electronic interfaces for “high tech” machinery in 80s and 90s were available only as commands. There were multiple reasons for this. The range of functionality and the scope of use for general computer controlled machinery was pretty limited. The engineers who developed the interfaces were more often domain experts than software engineers, but they still had the responsibility of today’s full stack developers, so naturally the user experience revolved more around their own preferences. Nevertheless, because the range of functionality was well covered by the command set, the interfaces left very little to be desired in terms of operating the infrastructure.
By the time storage arrays became common in enterprise IT in the mid-90s (good read: brief history of EMC), the ‘storage admin’ role evolved. The UNIX heavy workflows of admins meant a solid CLI was the need of the hour, especially for power users managing large scale, mission critical data storage systems. The erstwhile (and nostalgic to admins of the era) SYMCLI of the legendary Symmetrix platform is a great example of storage management simplicity. In fact, the compactness of symcli of the legendary Symmetrix platform easily beats any of today’s scripting languages. It was so popular that it had to be continued for the modern day PowerMax architecture, just to provide the same user experience. Here is what a storage group snapshot command looks like:
symsnapvx -sid ### -nop -sg mydb_sg establish -name daily_snap -ttl -delta 3
Then came scripts (more graphical than GUI!)
In the 90s, when graphical user interfaces became the norm for personal computers, enterprise software still was lagging personal computing interfaces in usability, a trend that persists even today. For a storage admin who was managing large scale systems, the command line was (and is today) still preferred.
While it is counterintuitive, many times for a complex, multi-step workflow, code blocks make the workflow more readable than documenting steps in the form of GUI snapshots, mouse clicks, and menu selections.
Although typing and executing commands was inherently faster than mouse driven graphical user interfaces, the main advantage is that commands can be combined into a repeatable workflow. In fact, the first form of IT automation was the ability to execute a sequence of steps using something like a shell script or Perl. (Here's some proof that this was used as late as 2013. Old habits die hard, especially when they work!) A script of commands was also a great way to document workflows, enforce best practices, and avoid manual errors - all the things that were essential when your business depended on the data operations on the storage system. Scripting itself evolved as more programming features appeared, such as variables, conditional flow, and looping.
Software started eating the world, RESTfully
REST APIs are easily one of the most decisive architectural elements that enabled software to eat the world. Like many web-scale technologies, it also cascaded to everyday applications as an essential design pattern. Storage interfaces quickly evolved from legacy CLI driven interfaces to a REST API server, with its functionality served with well-organized and easily discoverable API end-points. This also meant that developers started building all kinds of middleware to plug infrastructure into complex DevOps and ITOps environments, and self-service environments for developers to consume infrastructure. In fact, today GUIs and CLIs for storage systems use the same REST API that is offered to developers. Purpose built libraries in popular programming environments were also API consumers. Let’s look at a couple of examples.
Python phenomenon
No other programming language has stayed as relevant as Python has, thanks to the vibrant community that has made it applicable to a wide range of domains. For many years in a row, Python was at the top of the chart in Stack Overflow surveys as the most popular language developers either were already using or planning to. With the run-away success of PyU4V library for PowerMax, Dell invested in building API-bound functionality in Python to bring infrastructure closer to the developer. You can find Python libraries for Dell PowerStore, PowerFlex, and PowerScale storage systems on GitHub.
Have you checked out PowerShell recently?
Of late, PowerShell by Microsoft has been less about “shell” and more about power! How so? Very well-defined command structure, a large ecosystem of third-party modules, and (yes) cross-platform support across Windows, Linux, and Mac - AND cross-cloud support! This PowerShell overview documentation for Google Cloud says it all. Once again, Dell and our wonderful community has been at work to develop PowerShell modules for Dell infrastructure. Here are some of the useful modules available in the PowerShell Gallery: PowerStore, PowerMax, OpenManage.
Finally: infrastructure as code
For infrastructure management, even with the decades of programming glory from COBOL, to Perl, to Python, code was still used to directly call infrastructure functions that were exposed as API calls, and it was easier to use commands that wrapped the API calls. There was nothing domain-centric about the programming languages that made infrastructure management more intuitive. None of the various programming constructs like variables and control flows were any different for infrastructure management. You had to think like a programmer and bring the IT infrastructure domain into the variables, conditionals, and loops. The time for infrastructure-domain-driven tools has been long due and it has finally come!
Ansible
Ansible, first released in 2013, was a great benefit in programming for the infrastructure domain. Ansible introduced constructs that map directly with the infrastructure setup and state (configuration): groups of tasks (organized into plays and playbooks) that need to be executed on a group of hosts that are defined by their “role” in the setup. You can define something as powerful and scalable like “Install this security path on all the hosts of this type (by role)”. It also has many desirable features such as
- Not requiring an agent
- The principle of idempotency, which ensures task execution subject to the actual configuration state that the code intends to achieve
- Templating to reuse playbooks
The Ansible ecosystem quickly grew so strong that you could launch any application with any infrastructure on any-prem/cloud. As the go-to-market lead for the early Ansible modules, I can say that Dell went all in on Ansible to cover every configurable piece of software and hardware infrastructure we made. (See links to all sixteen integrations here.) And here are a couple of good videos to see Ansible in action for Dell storage infrastructure: PowerScale example, PowerMax example.
Also, Ansible integrations from other ITOps platforms like ServiceNow help reuse existing workflow automation with minimal effort. Check out this PowerStore example.
Terraform
Terraform by HashiCorp is another powerful IaC platform. It makes further in-roads into infrastructure configuration by introducing the concept of resources and the tight binding they have with the actual infrastructure components. Idempotency is implemented even more tightly. It’s multi-cloud ready and provides templating. It differs the most from other similar platforms in that it is purely “declarative” (declares the end state or configuration that the code aims to achieve) than “imperative” (just a sequence of commands/instructions to run). This means that the admin can focus on the various configuration elements and their state. It has to be noted that the execution order is less intuitive and therefore requires the user to enforce dependencies between different component configurations using “depends on” statements within resources. For example, to spin up a workload and associated storage, admins may need the storage provisioning step to be completed first before the host mapping happens.
And what does Dell have with Terraform?…We just announced the availability of version 1.0 of Terraform providers for PowerFlex and PowerStore platforms, and tech previews of providers for PowerMax and OpenManage Enterprise. I invite you to learn all about it and go through the demos here.
Author: Parasar Kodati