Careers in Operations
Taking a step into the world of Operations can be daunting. At present there are few professional degrees, and the ones that exist focus on specialized topics such as systems administration, network engineering or security.
However, the Operations landscape is significantly larger than that. This page aims to provide a detailed view of the careers you can move into as you enter the field.
Deciding a career path
Around the end of the 101 level of Ops School, you will want to start thinking about which career path you want to take. The paths tend to overlap in places, and both paths can lead to similar places down the road.
Imagine two roads that run straight, and occasionally get close or overlap, but generally head in the same direction.
Your two options are usually:
- Operations generalist
- Operations specialist
The one you choose should be the one you feel most comfortable with. Over time, as your skills grow, both paths should lead to similar opportunities: team leadership, senior technical leadership, and management.
The way you travel each path is considerably different, and it isn’t uncommon to switch from one path to another once or twice.
Generalized career paths
People in generalized careers are often in high demand by employers. While the adage states “Jack of all trades, master of none”, an operations generalist is very much expected to be “Master of almost all trades.”
It’s quite common to see a good Operations Engineer who is well-versed in systems administration, database administration, storage engineering, network engineering, security engineering, and other topics.
Similarly, Operations managers are expected to know a wide variety of subjects at a reasonable enough level to make good technical decisions. Often, people who choose to become Operations managers begin their careers as either generalists, or specialists in one or more fields.
Generalist roles are often most highly prized in technical companies and startups, where it is beneficial to be able to work with multiple technologies and groups well.
Operations engineers are expected to be able to wear any of the following hats (and sometimes all of them, at the same time):
- Database administrator
- Systems administrator
- Network engineer
- Security engineer
- Performance engineer
- Part-time software engineer
- Storage engineer
- High Performance Computing engineer
The role can be summed up as follows: when somebody wants to know about any production system, no matter what it is, they will ask the Operations engineer. Your job as the Operations engineer is to know the system well enough to be able to answer any question, or to know how to find the answer quickly.
In the medical field, you would be a Doctor of Internal Medicine. In culinary, you would be an Iron Chef.
Operations managers are similar in many ways to operations engineers. If you have read The 21 Irrefutable Laws of Leadership, you will recognize this as a leadership role rather than a management role. An Operations manager works to bring their team and other teams closer together. A good reference on managing an operations team is Michael Rembetsy’s PICC 2012 talk on DevOps Management. It covers moving from traditional Operations to DevOps, and then developing and growing an Operations team.
Specialized career paths
Unlike generalists, specialists are often hired to take care of certain components of larger systems. While generalists tend to focus on increasing both the breadth and depth of their knowledge over time, specialists work to become deep subject matter experts. Common areas of focus are databases, networking, security, storage, capacity planning, project management, training, and more. Almost all of these require at least the 101 level of understanding, and a full understanding through the 201 level is better.
The Systems Administrator is the classic and probably most recognized Operations role. Key responsibilities usually include managing desktops, servers, operating systems, databases, middleware and applications.
Systems Administrators can range from “jack of all trades” with knowledge of multiple systems and platforms to specialists who focus on one system or platform, for example Microsoft Windows or Linux.
Whilst perhaps more “general” than some of the other specialist roles, Systems Administrators tend to focus on managing individual hosts, usually desktops or servers, rather than looking more broadly at infrastructure as Operations generalists do.
Database Administrators, or DBAs, are specialists in managing the performance, security and stability of database systems. Once a common role, they are less frequently seen today, and much of the work involved in this role has been taken over by more advanced database systems, automation and the growth of these skills in related roles.
DBAs usually have specialized skills in managing the performance of database systems, are often experts in database features like stored procedures, and are called upon to improve the performance of database systems using techniques like query analysis.
Network Engineers are Operations people who focus on network devices and network management. Network Engineers manage the provisioning, configuration, security and availability of networking infrastructure.
Network Engineers are able to architect and design networks both internal to organizations and between organizations and their customers, for example Internet-facing infrastructure. As a result their skills often overlap with Security Engineers in technologies such as firewalls, proxies and gateway services like Virtual Private Networks (VPN).
They are expected to have a deep understanding of the OSI model and its components, especially physical networking technologies like Ethernet and transport and session components like TCP/IP, UDP, and SSL. They are often called on to identify and fix problems with applications and their connectivity, and hence have strong skills in diagnosis, log and data analysis, and troubleshooting.
Whilst seen by many as a separate discipline, Security Engineers are Operations people with a focus on security and security technology. Security Engineering roles can include:
- Traditional Systems Administrators who maintain security equipment like firewalls and proxies
- Specialists who design and manage complex cryptographic systems
- Penetration testers who attempt to identify security vulnerabilities in infrastructure and applications
- Engineers with a focus on Identity Management who manage complex authorization, access control and authentication systems
- Analysts and incident response personnel who respond to security events and incidents
Security Engineers usually have many of the same skills as their more mainstream Operations colleagues but often have deeper skills in fields such as compliance management (ensuring companies maintain compliance with industry and government regulations), risk management (identifying, documenting and managing risk), education (teaching people how to stay secure), cryptography, and related areas.
Seen largely in enterprise-scale organizations, Storage Engineers focus on managing storage technologies such as disk arrays, Network Attached Storage (NAS) devices, Storage Area Networks (SANs), tape and media management systems and related backup technologies.
Storage Engineers provision, configure and manage this infrastructure, which then provides storage for web and file servers, database systems, applications and backups.
They usually have strong skill overlaps with Network Engineers (with so much modern storage being network-attached in some manner) and usually have strong skills in capacity planning and performance management of infrastructure.
High Performance Computing (HPC) involves large-scale computing infrastructure at the cutting edge of presently available technology, often earning a place on the list at TOP500.ORG.
Typically, HPC platforms are used for scientific computing, big data and complex models, and may have applications in fields as diverse as physics, finance, medicine, defence and economics. Meteorology and climate prediction, for instance, may at first appear to be a tiny slice of the bigger picture, yet they tend to be applicable to several aspects of human life and to scientific efforts to improve it.
HPC engineers are expected to master an array of high-end technologies in fields such as networking (InfiniBand, multi-Gigabit Ethernet), computing (several computer architectures), parallel storage/filesystems/filers (Lustre, GPFS, Isilon, NetApp, Panasas), as well as be able to give advice on a number of software components (GNU/Intel/PGI compilers, debuggers, MPI stacks, FFTW, linear algebra and other optimized math libraries, etc.).
Most importantly, HPC engineers should be able to interface with other operations engineers, each a specialist in their own field, in order to let all systems run both at top performance and within the maximum range of their reliability envelope, since HPC downtimes come at a high cost.
How to become an operations engineer
Employers look for a number of things when hiring junior engineers and admins:
- An understanding of the basics of Unix-style and/or Microsoft Windows operating systems, including installing the operating system, installing and configuring packages and editing files. You can find these in the Unix fundamentals 101 and MS Windows fundamentals 101 sections.
- Knowledge of common internet protocols and systems, and how to implement and manage them, including DNS 101, SMTP 101 and Networking 101.
- A solid grasp of how to troubleshoot problems.
- Repeated success in completing the Labs exercises.
These are only the beginning, and the bare minimum you should expect to know as a junior level engineer. While demand for operations engineers continues to grow at a fast pace, you will still find there is competition for positions. The more you know, the stronger your chances of finding a job.
Simply reading the 101 sections of Ops School is not sufficient; you must understand them. As an example: the DNS section explains there are 13 root name servers. In addition to knowing this fact, you have to understand why there are 13 root name servers and be able to explain it confidently to others.
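One commonly given explanation for the number 13 is the classic 512-byte limit on DNS messages sent over UDP: the response listing every root server and its IPv4 address had to fit in a single packet. The arithmetic can be sketched roughly as below. The function names are hypothetical, and the sketch assumes `<letter>.root-servers.net` naming, full name compression, one A record per server and no EDNS0; it is an illustration of the reasoning, not a reconstruction of how the decision was actually made.

```python
# Rough size estimate of a DNS "priming" response (the answer to ". NS")
# over plain UDP. All layout assumptions are stated in the text above.

UDP_LIMIT = 512             # classic maximum DNS message size over UDP
HEADER = 12                 # fixed DNS header
QUESTION = 1 + 2 + 2        # root name (single zero byte) + QTYPE + QCLASS
RR_FIXED = 2 + 2 + 4 + 2    # TYPE + CLASS + TTL + RDLENGTH


def encoded_name_length(name: str) -> int:
    """Uncompressed wire length: a length byte per label, plus the root byte."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def priming_response_size(servers: int) -> int:
    """Estimated bytes for `servers` NS records plus one A record each."""
    # First NS record: owner "." (1 byte) + fixed fields + full server name.
    first_ns = 1 + RR_FIXED + encoded_name_length("a.root-servers.net.")
    # Later NS records: one-letter label (2 bytes) + 2-byte compression pointer.
    other_ns = 1 + RR_FIXED + (1 + 1 + 2)
    # A record: 2-byte pointer as owner name + fixed fields + 4-byte IPv4 address.
    a_record = 2 + RR_FIXED + 4
    return (HEADER + QUESTION + first_ns
            + (servers - 1) * other_ns + servers * a_record)


if __name__ == "__main__":
    for count in (13, 16):
        size = priming_response_size(count)
        verdict = "fits" if size <= UDP_LIMIT else "does not fit"
        print(f"{count} servers: {size} bytes, {verdict} in {UDP_LIMIT}")
```

Under these assumptions, 13 servers come to 436 bytes, which leaves some headroom below 512 rather than sitting exactly at the limit. Being able to walk through a calculation like this, and to say which assumptions it depends on, is the kind of understanding that goes beyond memorizing the number.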