Engineering Software

Posted 10 months, 3 weeks ago | Originally written on 14 Jun 2023

My thoughts on software are taking a new direction. I now think that producing high-quality software involves three key stages:

  • Design
  • Engineering
  • Operations, which can further be broken down into Production Operations and Support Operations

Design

The design stage results in a design: an architectural representation of the proposed system that takes into account the required use cases together with the project design. This is purely a discovery phase with the goal of accurately understanding the users' needs. It involves heaps of anthropological study in which the design team immerses itself in the user environment to understand intimately both the needs and any actual constraints. The design itself is represented by fine-grained requirements, mockups, service metrics, infrastructure diagrams and so on, which together fully describe the system to be built. However, not a single line of code should be written during this stage. The design phase should end with a Go-NoGo meeting based on the risk analysis of the project design.

Engineering

With a clear scope of what should be built, the next important step is to break the constraints down into actual implementation plans, possibly involving experimentation. This is where limited implementation of core functionality should be carried out. For example, if the design calls for a computation step which has an upper time limit of 30s, then an experiment will need to be carried out to determine whether and how this can be accomplished given the proposed infrastructure and underlying technology. The result will be a set of experimental code artefacts, each demonstrating working non-production code, which will be relied on during the actual production phase as part of the operations stage. Alternatively, it may prove infeasible to deliver on the intended constraint, in which case the team may need to return to the design phase to assess any necessary changes, followed by another Go-NoGo meeting.
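A feasibility experiment for the 30s example might look something like the sketch below. It is only an illustration: `run_experiment`, `FeasibilityResult` and `candidate_computation` are hypothetical names, and the stand-in workload would be replaced by the real computation step running on the proposed infrastructure.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FeasibilityResult:
    """Timings from one experiment, judged against the design constraint."""
    limit_seconds: float
    durations: List[float]

    @property
    def worst_case(self) -> float:
        return max(self.durations)

    @property
    def feasible(self) -> bool:
        # The constraint counts as met only if every trial stayed within the limit.
        return self.worst_case <= self.limit_seconds


def run_experiment(step: Callable[[], object],
                   limit_seconds: float,
                   trials: int = 5) -> FeasibilityResult:
    """Time `step` over several trials and compare against the limit."""
    if trials < 1:
        raise ValueError("trials must be at least 1")
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        step()
        durations.append(time.perf_counter() - start)
    return FeasibilityResult(limit_seconds, durations)


def candidate_computation() -> int:
    # Hypothetical stand-in for the computation step named in the design.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    result = run_experiment(candidate_computation, limit_seconds=30.0)
    print(f"worst case: {result.worst_case:.4f}s, feasible: {result.feasible}")
```

A result with `feasible` set to `False` would be the signal to take the constraint back to the design phase.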

Operations

This stage is supposed to be routine. In fact, the root word for operations (opera) simply means work. The work carried out in this stage should not require specialised skills and should be linearly scalable. There are two parts of operations:

  • Production operations produce new output and are done only once per product progression: the work should never need to be repeated. In other words, production operations create output. Production operations tasks depend on the engineering stage: if a production task required an engineering assessment, then the goal during production operations is to integrate the experimental output at production standard, i.e. with proper factoring, testing and so on. Effectively, the work of production operations is purely technical and the skill requirements to fulfil production operations tasks can be rudimentary.
  • Support operations are all those activities that are required for production operations to succeed (e.g. provisioning infrastructure) and that are sustaining in nature: they are carried out for the lifetime of any code in production. Personnel who fulfil support operations require substantial skill in tasks such as administering servers, installing and configuring services, performing analytics and so on.

Some principles to bear in mind:

  • There is a bank of workers who perform operations, both production and support operations. They have a minimum skill level required to do their work but should be able to work on any task. However, because support operations tasks are task-specific, it may take a while before most operations staff can be considered expert in them.
  • There are two work queues within operations, one for each type of operations. Operations staff pull tasks from the work queues and are assessed on how well they complete them to well-defined standards.
  • There is a bank of workers who perform engineering tasks. These should be individuals highly skilled in engineering because they will often need to work to tight constraints and tight deadlines.
  • Operations tasks associated with a new project can only move to the operations queues once both a design Go-NoGo meeting and an engineering Go-NoGo meeting have approved it.
  • Project management. Once a project is approved, all the operations tasks (both production and support) associated with it are loaded into the operations queues and work can commence on them.
  • Coupling of tasks. As with any project, operations tasks are coupled to one another, defining a sequence in which the tasks must be completed. For example, the task to create the necessary infrastructure (VMs, DBs etc.) will need to precede development tasks. The pace at which tasks are completed will determine the project progression.
  • Generation of implied tasks. Good designs will be exhaustive but for new or challenging projects, it may be necessary to generate implied tasks.
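
The two work queues and the coupling of tasks can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not a prescribed tool: `Task`, `OperationsQueues` and the task names are hypothetical, and the only rule modelled is that a task cannot be pulled until the tasks it depends on are complete.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Iterable, Optional, Set, Tuple

PRODUCTION = "production"
SUPPORT = "support"


@dataclass(frozen=True)
class Task:
    name: str
    kind: str                          # PRODUCTION or SUPPORT
    depends_on: Tuple[str, ...] = ()   # names of tasks that must finish first


class OperationsQueues:
    """Two queues, one per type of operations, with coupled tasks."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[Task]] = {
            PRODUCTION: deque(),
            SUPPORT: deque(),
        }
        self._done: Set[str] = set()

    def load(self, tasks: Iterable[Task]) -> None:
        # Called once a project is approved: every task goes into its queue.
        for task in tasks:
            self._queues[task.kind].append(task)

    def pull(self, kind: str) -> Optional[Task]:
        # Return the oldest task of this kind whose prerequisites are all done.
        queue = self._queues[kind]
        for task in list(queue):
            if all(dep in self._done for dep in task.depends_on):
                queue.remove(task)
                return task
        return None  # nothing is ready yet

    def complete(self, task: Task) -> None:
        self._done.add(task.name)


if __name__ == "__main__":
    queues = OperationsQueues()
    queues.load([
        Task("implement-api", PRODUCTION, ("provision-vms", "provision-db")),
        Task("provision-vms", SUPPORT),
        Task("provision-db", SUPPORT),
    ])
    print(queues.pull(PRODUCTION))  # None until the infrastructure tasks are done
```

In this sketch the development task stays blocked until both infrastructure tasks have been pulled from the support queue and marked complete, which mirrors the VMs and DBs example above.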