Nowadays system resources are widely available. Ordinary high-level application software developers have to follow minimum system requirements. So, don't expect them to optimize the code to reduce the resource capacity of an application. Unless the business owner wants this.
We published and translated this article with the copyright holder's permission. The author is Pavel Kirienko (pavel.kirienko@zubax.com). The article was originally published on Habr.
And this makes sense — human resources are the most expensive resources in process control. Moreover, if developers don't have to think about bytes, they pay attention to high-priority tasks, such as ensuring the functional correctness of the program.
There's another rarely discussed problem. It usually occurs among a small number of embedded system developers, including those who work with increased fault tolerance. There's a reason to believe that the early experience of using MCS51/AVR/PIC may be psychologically traumatic for some developers. These "martyrs" continue to count bytes throughout their careers, even when there are no objective reasons for this [RU]. Of course, this does not apply to cases when strict price restrictions control the computing platform (microcontroller) resources. This is reasonable if the computing platform costs less than the whole product, its development, and the verification of its non-trivial software. This usually happens in the transport industry and complex industrial automation. In this article, we are talking about complex industrial automation.
One could argue: "You... bastard! Haven't you heard about MISRA? What about AUTOSAR? Don't you follow HIC++ standards? We have a serious business here. We can't make errors — otherwise, you might get hit by a crane!" Here we need to understand the difference between adequate software design and functional support in corresponding systems. If all your software is designed according to the V-model, perhaps this article is not for you because you are dealing with architecture design — that's a different thing. As for other embedded developers, with this article, I urge them to change their approach to work.
So, getting back to the standards mentioned above. Here are the rules that you have to follow:
All this applies to cases when an unexpected failure costs more than investing in high-quality software and an adequate platform. By the way, standards pay relatively little attention to testing. It is a slightly different discipline.
Do you see the rule "ignore the well-known principles of complex information system design"? Me neither. But repressed memories of a cross-compiler for weak single-chip microcontrollers make developers suffer. This does not allow them to fully comprehend their actions and long-term consequences. As a result, we get spaghetti code without architecture. It's impossible to maintain and test such code. And yet it is MISRA compliant — (dubious) evidence of quality.
Unfortunately, I happened to work with some real-time embedded software that requires high reliability. And I repeatedly felt my hair stand on end. So, I'm no longer surprised by an old story about errors in the Toyota Prius control system, or a slightly newer tale about Boeing 737MAX. Soon every system in our brave new world will become software defined. And that's (truly) great — it opens the door to complex problem solving with fewer resources. However, something must be done with the widespread software quality problem.
If we take a typical embedded system of adequately high complexity, we can highlight the following clots of logic:
And here are the results: module testing is impossible because there are no modules; helpful support is impossible because everything is complicated; it is difficult to discuss functional correctness guarantees — at first, we should learn how to make a reasonable task decomposition instead of dealing with everything in a single loop. This is often presented as an inevitable thing — there is no other way to solve it because it's not a desktop system with unlimited resources, you don't understand — it's different.
Some tool vendors indirectly add fuel to the fire for embedded software developers: Mbed, Arduino, etc. These tool vendors are obsessed with low-level hardware management. Moreover, their marketing strategies encourage beginners to focus on the same. So, here I am, looking at my desktop where I opened CLion. It displays a project for one embedded system; the project contains more than a hundred thousand lines of code. Peripheral drivers occupy about three thousand lines, the rest is used by business logic and some maths. Even my limited experience suggests that the target business logic of the application is much more complex than a part of logic that works with hardware, except for simple devices.
One fine day, I was discussing the details of my open-source project (closely related to embedded) with a developer from a third-party company; let's call him Ilya (the name has not been changed due to deanonymizing purposes). He was working on a feature for my project that the employer asked him to do. Ilya was very accurate — he worked carefully and slowly. Ilya regularly phoned me to discuss the best way to do this and that. Once, his boss came to him, opened his mouth, and said:
Look! I've found a new awesome system! It's called Mbed, so, it must be for embedded developers. Check out how quickly you can build prototypes! A few clicks and an LED is blinking! See? Like on the video. Ilya, you've been working on your CAN filter optimization algorithm for a week! That's not right, let's switch to Mbed.
Of course, I don't mean that these products are totally useless. They are extremely useful when business logic is simple and is focused on the integration of multiple components. To see an example, just click on one of their ads — this is what they are created for. But it drives me nuts when a manager is promoting an embedded framework claiming that the developer will hardly spend 1% on peripherals and debugging.
That's complete and utter disaster for those who work in this industry. Things get even worse when a low-level software developer starts designing distributed systems without proper training.
Earlier, I published a long article about our open-source project UAVCAN (Uncomplicated Application-level Vehicular Computing And Networking). It allows you to build distributed (hard) real-time computing systems in onboard networks over Ethernet, CAN FD, or RS-4xx. This is a publish–subscribe framework. It is like DDS or ROS, but it is focused on predictability, real-time and verification. It also supports bare-metal environments.
To arrange a distributed process, UAVCAN offers a domain-specific language — DSDL (Data Structure Description Language). It helps a developer to specify data types in the system and basic contracts, and then build business logic. It works like REST web endpoints, XMLRPC, and something like that. Imagine an ordinary backend developer — a person exhausted by service-oriented modeling and by complex support of distributed systems. If you explain the idea of real time to them, they will produce good, user-friendly interfaces on UAVCAN in a short time.
A classic example is the integration of an air data computer or at least one airspeed sensor. A backend developer who had a hard time developing and maintaining production, will ponder the following question: "what business task are we solving?"
Let's say the test subject answers something like "measure the airspeed, the barometric height, and the static pressure". And the developer brings the following DSDL lines into the world:
# Calibrated airspeed
uavcan.time.SynchronizedTimestamp.1.0 timestamp
uavcan.si.unit.velocity.Scalar.1.0 calibrated_airspeed
float16 error_variance
# Pressure altitude
uavcan.time.SynchronizedTimestamp.1.0 timestamp
uavcan.si.unit.length.Scalar.1.0 pressure_altitude
float16 error_variance
# Static pressure & temperature
uavcan.time.SynchronizedTimestamp.1.0 timestamp
uavcan.si.unit.pressure.Scalar.1.0 static_pressure
uavcan.si.unit.temperature.Scalar.1.0 outside_air_temperature
float16[3] covariance_urt
# The upper-right triangle of the covariance matrix:
# 0 -- pascal^2
# 1 -- pascal*kelvin
# 2 -- kelvin^2
We get a complete network service that provides data from the air data computer (of course, we can't say that this is an example of a complete service, but it's enough to understand the idea). If users want, for example, to know the barometric height, they simply subscribe to the corresponding topic.
Those who are familiar with the physics of flight will ask: how does the data terminal equipment (which supplies the air data signals) know the calibrated airspeed parameters? After all, this assumes that the sensor is aware of its own position on the aircraft and of its aerodynamic properties. The encapsulation principles and the division of responsibility state that the corresponding parameters are configured on the air data service provider (i.e., the sensor node). This helps to hide the details of the service implementation from the consumers.
In some types of UAVs, autocalibration is used. In these types, a sampling from the pitot-static system is compared with ground speed over a significant period of time. This helps to determine the calibration parameters empirically. As part of the service-oriented architecture, this is solved by turning the air data computer into a consumer of ground speed data. Simply put, our network node that measures airspeed subscribes to a topic that has data on the ground speed of the aircraft. Thus, it gets access to the necessary context to perform autocalibration.
An experienced information systems architect would say — "That's obvious! There's no need to explain this. We have a service, we have a dependency — connect them, and let's fly". But my unfortunate experience proves that embedded developers do not know about these obvious things. Experienced embedded developers try to solve the same problem in a completely different way. They value the means to achieve the goals, not the goals. In other words: the main question is not what we do, but how we do it. As a result, instead of a service, we get a single topic of the following type:
uint16 differential_pressure_reading
uint16 static_pressure_reading
uint16 outside_air_temperature_reading
Of course, we cannot use this directly. That's why our end device turns into a passive sensor that reports measurements to a central node. Then, the node performs calculations and publishes the results to the network in highly specialized formats, one by one, for the end user. For example, if the gyro suspension and the slat drive need airspeed, then a separate topic of its type will be attached to each of them. Saw it with my own eyes.
As a result, we get all the same spaghetti code with a God object. However, instead of the object, we have a central node, and instead of spaghetti — hundreds of topics without architecture. Obviously, this approach can also increase data delivery time and network load. At the same time, it may reduce fault tolerance to centralize processes.
Don't think that I want to send all embedded developers in a bioreactor. After all, I am one of them. But I'm inclined to think that it's easier to make a qualified embedded developer from a good application programmer rather than to wait for reasonable code from an embedded developer.
Dear colleagues, think it over.
I see how embedded developers hammer a nail with a screwdriver. I guess there's much more of this that I do not know. Last year, my team and I were so desperate that we published a nano-manual explaining how good network service looks like. Here it is — UAVCAN Interface Design Guidelines. Of course, this is a drop in the bucket. One day I'll translate it into Russian to improve professional competence.
Misunderstanding of distributed computing basics makes it difficult to introduce new standards to replace outdated approaches. The target audience doesn't accept our groundworks within the DS-015 standard (created in collaboration with the notorious NXP Semiconductors and Auterion AG) because they find them strange. However, the key principles of our groundworks have been known in IT industry for decades. We must narrow this gap.
If you want to take part in the movement for architectural cleanliness and common sense, you can join the uavcan_ru telegram channel or forum.uavcan.org.