Built-in Quality: An all-rounder's view on hardware, software and embedded systems
The third guest in my interview series on Built-in Quality is my esteemed colleague and a true all-round talent, Christoph Schmitz. He is a quality manager, software/system architect, project manager, and requirements engineer for software, embedded software, and combined hardware/software systems.
(The following conversation was automatically translated from German.)
Welcome, Christoph! I am very pleased that you have taken the time to talk to me about quality today.
Please introduce yourself briefly - who are you, what do you do?
First of all, thank you very much for the invitation. I think it is a very good idea to discuss the topic of quality in this way.
My name is Christoph Schmitz, I studied computer science and also did my doctorate in this field.
During my training, I was mainly concerned with formal methods, i.e. how to develop correctly functioning software and then show that it actually works. At Zuehlke, my focus was and is software and system quality.
Quality is my passion. That became apparent early on in my career.
During my studies, I was the one who took the longest to write my software. But when it came to running everything afterwards, mine ran best. I invested as much time in testing as in developing.
That's when I learned that it pays to test early, already during development, and thus avoid mistakes.
At Zuehlke, I have almost without exception developed safety-relevant systems combining software and hardware. As a result, I have repeatedly come into contact with the quality problems that arise as soon as you combine the two.
In my private life I am married and enjoy my life here in Switzerland (Christoph is originally from Germany).
When I'm not working, I like to relax by cooking, baking, reading or exercising. Reading is probably obvious if you look behind me.
I really own a lot of books, on all subjects. From crime novels to classics, history books, psychology, technology, economics and so on. There is hardly anything that doesn't interest me.
We both know each other through Zuehlke, where we worked together on a quality strategy for a client last year. That project was specifically about the smooth interaction of hardware and software in building automation. I found the collaboration with you super exciting, and so the idea came up to interview you for my blog series.
When you think of quality, regardless of software or hardware, what does that mean to you?
When is a product of good quality?
Quality is not an absolute term. For me, quality is relative and depends a lot on the customer. If we all demanded perfect quality, we would no longer be able to afford many things because they would become far too expensive. So you have to make concessions here. Quality also depends on the area in which the product is used. Personally, I have lower quality standards for a video game than for an insulin pump or a pacemaker. When I travel by plane, I have higher quality expectations of the plane's engines than of the seats.
That means I don't always need maximum quality, but I do need quality that is good enough for the purpose: quality that fulfills the requirements and with which the customers are ultimately satisfied.
You have many projects in the safety-critical area. I'm sure everyone can identify with the aircraft example. Can you share something more from particularly critical areas?
Gladly. When I'm working in a critical environment such as railway technology, there are regulations that determine how often a system may fail. They specify the operating time after which faults are permitted to occur, which can be anywhere from a few thousand to several million operating hours.
If a system is not quite so critical, it may be acceptable for a dangerous failure to occur statistically once every 10 years. For extremely safety-relevant functions, however, statistically only one dangerous failure in 10,000 years may occur. With such orders of magnitude, we are talking about completely different demands on quality. This also means that development and testing have to be done differently. Nor does the process end when development is finished: depending on the area of application, test organisations and authorities have to approve the system for use by end customers. This ensures that the product meets the required quality and that no unnecessary hazards emanate from it.
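To make these orders of magnitude tangible, here is a small back-of-the-envelope calculation. It is only a sketch, assuming continuous operation; functional-safety standards such as IEC 61508 group such rates into safety integrity levels, but no specific classification is implied here.

```python
# Back-of-the-envelope: convert "one dangerous failure per N years"
# into an average failure rate per operating hour, as in the examples above.

HOURS_PER_YEAR = 24 * 365  # 8760, assuming continuous operation

def dangerous_failure_rate(years_between_failures):
    """Average dangerous failures per operating hour."""
    return 1.0 / (years_between_failures * HOURS_PER_YEAR)

# A less critical system: one dangerous failure every 10 years
print(f"{dangerous_failure_rate(10):.1e} per hour")      # ~1.1e-05

# An extremely safety-relevant function: one failure in 10,000 years
print(f"{dangerous_failure_rate(10_000):.1e} per hour")  # ~1.1e-08
```

Three orders of magnitude in the requirement translate directly into three orders of magnitude in the tolerable failure rate, which is why development, testing, and approval differ so radically between the two cases.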
I can see that this is a whole different league from what I'm used to. I work a lot in the financial sector. It's regulated and risk management is very important, but at the end of the day, in the worst case, it's only money that gets lost. It's not directly about human lives.
In transport or medical technology, human lives can be at stake very quickly. I recently read an article about a near accident on a railway line here in Switzerland. At least one system responsible for the warning signals on the line failed. Fortunately, the driver reacted in time. Although all the signals were green, a construction vehicle was standing on the track section the driver was approaching with his train.
With your experience, do you have any suspicions about what might have happened?
That is difficult to say from the outside. I don't want to speculate. I know that all signalling systems have to comply with the highest safety level. Normally, as soon as something is unclear, as soon as something is not working as it is supposed to, the signals are automatically set to red. This is a good example of how important it is that these systems work reliably.
I once worked on software for controlling marshalling yards. I had developed the part that remotely controls the locomotives in the yard. When we were testing it on site one weekend, my colleague said: "So, Christoph, you wrote the control system... And does it work?" "Yes, I'm sure it works." "Good. Are you sure enough to stand here? According to your software, the locomotive should stop just before this point."
Phew, tough question...did you do it?
I did not do it. I wouldn't have been allowed to anyway because of the safety regulations, but the locomotive would have stopped in time. I would have survived (laughs with relief).
These are things that make you realize how important correctly functioning software can be.
Yes, absolutely. I'm glad you can give us this insight, especially into an area that many of us take for granted. We don't even think about how much human work, precision and conscientiousness goes into making the software do what it should, also in combination with the hardware, so that everything works together correctly and no one is run over by a train.
I recently had an exchange with a colleague on the topic of fault tolerance in medical technology. Specifically, we talked about the use of AI (artificial intelligence) in the detection of cancer. He trains AI models to detect cancer on scans. Now the question is whether it is acceptable if the model misses one case that actually is cancer, a false negative. And is it acceptable if the model raises a false alarm three times, where it is not cancer after all? This has to be considered carefully.
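(As an aside, this trade-off can be made precise with the standard confusion-matrix measures. The counts below are invented purely for illustration, not taken from any real study.)

```python
# Illustrative confusion-matrix arithmetic for a screening model.
# All counts are invented for illustration only.

def sensitivity(tp, fn):
    """Share of actual cancer cases the model catches (misses = false negatives)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of healthy cases correctly cleared (false alarms = false positives)."""
    return tn / (tn + fp)

# Say the model misses 1 real case and raises 3 false alarms
# in a batch of 100 actual cancer cases and 400 healthy scans:
tp, fn = 99, 1     # true positives, false negatives
tn, fp = 397, 3    # true negatives, false positives

print(f"sensitivity: {sensitivity(tp, fn):.1%}")   # 99.0%
print(f"specificity: {specificity(tn, fp):.2%}")   # 99.25%
```

Both numbers look excellent in isolation; the hard question is which kind of error is more acceptable for the patients concerned, and that is an ethical judgment, not a statistical one.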
Do you also have experience in this regard?
In general, it is always the case that the manufacturer of a product must consider exactly what the target group is and what is to be treated or diagnosed. The manufacturer must define the corresponding performance characteristics. The rule of thumb for this is: It must not get worse than before.
If I want to bring a medical product onto the market in Europe, I have to prove that my product is at least as good as, or better than, the products already on the market, i.e. that it brings progress.
Questions of judgment often come into play. Take the example of cancer patients: is it justifiable to unnecessarily panic 500 people in order to avoid failing to inform a single person about their cancer? There is a lot of weighing up to do here, and a lot of ethical issues are involved. I am very glad that I don't have to decide this.
That just fits perfectly with my next question. How much room for manoeuvre do you have in a normal project in the medical field? Is everything exactly predefined or do you have a certain flexibility depending on the role?
Things are very clearly specified, so it's different from typical projects in the web environment.
Clearly defined requirements are a must. I can't be vague about that; too much is at stake if a requirement is not precisely defined.
In some cases, it is also specified what may be used for software development. For example, certain programming constructs may not be used because they are not safe enough. Special caution also applies to the choice of development tools. Development tools are in principle freely selectable, but they must be validated. This means it must be ensured that the tools do exactly what they claim to do.
In the actual programming, one typically follows so-called coding guidelines. These are rules for developers that specify what may or may not be done or used. Apart from that, developers are not restricted further, so not everything is prescribed down to the last detail.
A clean architecture and design are created and documented on the basis of which the development can work.
Everything is then checked by verification and formal testing. At the unit test level, this is done by the developers themselves. At the higher test levels, such as integration test or system test, the code must always be checked by another person. This means that no one tests what he or she has written, which reduces the risk of being blind to one's own mistakes.
For me, these are not severe restrictions, but rather guard rails that show me in which area I can move safely without running the risk of falling into the abyss.
I like to say right from the start of a project, especially if it's a safety-related one: in everything you do, remember two things:
1. If you make a mistake, in the worst case it can cost someone their life.
2. Would you have your own child treated with this product?
These two points increase quality awareness immensely. They turn an abstract project into something tangible.
That fascinates and motivates me, especially in the medical field. I always ask myself what good I can do with this product.
Very well said.
In the safety-critical area, how do you prevent mutual blame in the event of errors?
I observe this regularly in far less critical industries. Many make it easy for themselves and blame others, often people in the QA/tester role.
The danger is there, of course. What I do in projects is to work in my team according to the principle that we develop something together and test it as well and extensively as we can. At a certain stage, from a regulatory point of view, the product has to be passed on to independent people for formal verification.
Independently of this, our goal in development is always to ensure that what we deliver also functions fully. We want to avoid errors as far as possible, find them ourselves and not have them reported by the downstream verification.
Of course, it becomes more difficult when several teams work on one system or several systems come together. Here it is important to have a project team that includes all sub-areas, so that everyone sees themselves as part of a whole. It is essential to have a common understanding: we built the whole system together, and together we get it to work as a whole.
Often the technical challenges here are smaller than the human ones. Collaboration and communication in particular often pose hurdles.
Yes, I see it that way too. For me, a lot stands and falls with how the employees can or are allowed to work together.
How do you deal with it when there is formally a joint team, but the big picture is not known because the overall view is not documented? For example, there is no architecture or design that maps the overall context, the process end-to-end.
I have seen this more often and it is a very current problem. Many companies used to have product A, product B and product C. With the advent of the internet and the IoT (internet of things) and smartphones, existing products need to be more connected and additional apps need to be provided for control. All of a sudden, systems that were meant to be independent are thrown together. You then often start working on things and puzzle everything together before you have defined the big picture. What do we actually want? What is our big goal, our vision, our battle plan? Where do we want to go? Often it's just: we're doing this now...everyone's doing it now. Look, our competitors already have something...we have to have that too.
You're referring to companies that used to focus entirely on hardware and are now faced with the increasing complexity of IoT.
Exactly. Here, of course, worlds collide.
Hardware parts have completely different life cycles than smartphone apps. If a smartphone app doesn't receive an update for half a year, for example, I wonder if the company still exists at all.
Whereas with an appliance I buy for my home, I expect it to work for years, not for someone to have to come round and update it all the time.
It's different with smartphones. The software should always be new and up-to-date, and so should the device. When my smartphone is 3 years old, there are already three new generations that are much better.
Today, the traditionally long-lived hardware world meets an almost hyperactive, short-lived digital world. Uniting these two is a great challenge.
The difficulty often lies at the interface of both worlds.
I once experienced in a project that a hardware team wanted to check the integration with a software team and asked for their requirements. However, the software team had no formal requirements, only user stories without any traceability. This is a real problem.
It is a process that many companies are currently going through. You have to find each other and define together how to collaborate in the best and most meaningful way, so that the collaboration is rewarding for everyone and binds everyone together.
And how can I now test such a complex system-of-systems consistently? Who does what? Who is responsible for what?
Do you have any tips on how to organize this, how to distribute responsibilities?
The first thing is, and I take this as a given, that each team tests its own system or its own part as well as possible.
There are different approaches for further integration. One can integrate everything at once, a so-called "big bang". However, with high complexity, this is a recipe for disaster in my view.
In my experience, it is always better to start with smaller units. That is, integrate small units that work closely together. Then connect several of these units and so on until the whole system is integrated.
This has the advantage that I can recognize problems between individual systems or subsystems more quickly and assign them better.
The teams working on a common interface are responsible for integration and verification. It must be clear to everyone that it is no longer just about their own system, but about their own system together with other systems. That is the core idea.
However, the prerequisite for this is that the organisation gives the employees the freedom to work in this way. It is not possible for team A to be located in department A and team B in department B with completely different guidelines. Conflicts of objectives are inevitable here.
Exactly, the organisation as a whole has to adapt. Otherwise you will always have these conflicting goals.
I see you often fulfil different roles, sometimes several in the same project. You develop, you are in quality assurance, you are a tester. I also see you as an advisor on how to improve the overall structure and the collaboration.
Where do you see yourself most likely? If you had to choose, which role do you prefer?
That is a very good question. I could now say, typically German, that I am the "jack of all trades".
Joking aside. I don't want to be purely advisory, I want to get my hands dirty. I want to see if what I recommend really works. What I don't like is to write a nice concept and then close the door and leave. I want to be there. Whether it's setting up a test environment or bringing together the individual disciplines, such as requirements engineering, testing and development.
We are very similar in that respect. I think that's why our cooperation worked out so well from the very first moment.
Finally, I would like to ask you what you think are the 3 most important success factors for quality or built-in quality?
For projects in the hardware and software sector, these are for me:
1. As a developer, always remember that the hardware cannot be changed as flexibly as the software. In addition, the hardware is usually available later than the software. This means I have to build up a lot of infrastructure to be able to test my software well; sometimes it is necessary to create a complete simulation of the target system. These efforts must be taken into account and planned for from the beginning.
2. Whatever your role, look not just at your own backyard, but at the system as a whole. As a developer, take responsibility and proactively address concerns or vulnerabilities.
3. Built-in quality only works if we work together as a team, across roles, and support each other. For example, if I am a developer in the project, I ask myself how I can make life easier for the tester. We have to give up our egoism and see ourselves as a team. Only together can we be successful.
Very true words. Thank you, Christoph.
Is there anything else you would like to share with our readers?
Not all hardware is the same. I once developed on a system that had 32 KB of memory available for the software. Hardware covers an enormous range, from tiny devices with hardly any memory to a system as huge as an aeroplane. There is no single archetype of a combined software and hardware system.
In the embedded sector, one is often forced to write very small programmes. Hardware memory is expensive. Software developers often don't appreciate this; they are not used to such small programmes, because common languages such as Java or C# do not lend themselves to writing them.
It's a different challenge that can be very exciting.
I hope we have made a few people out there curious to learn more about it. I find it extremely exciting and have really enjoyed the project with you.
Thank you very much, Christoph.
Thank you for the interview and the good cooperation.