KSystemStats, The New Backend for System Monitoring

By ahiemstra, 8 January, 2021

This is the first in a series of articles where I will talk about some of the technology behind Plasma System Monitor. They will be quite technical.

About two years ago, a project was started to create an alternative to ksysguardd, the process that does the actual statistics collection for KSysGuard. Initially, this was intended to power a new set of system monitor applets for Plasma, but while we were working on it we concluded that it would also be a good idea to build a new system monitor application on top of this. The result of that is Plasma System Monitor, which had a preview release at the start of November.

Image: Plasma System Monitor overview page. Caption: Plasma System Monitor, powered by KSystemStats.

But Why?

Now, the first question anyone is going to ask when someone says they will replace some working piece of code is "Why?". Why replace working code with something untested and new? To answer that, let me first explain how the old ksysguardd worked.

ksysguardd is a binary that gets launched when KSysGuard (or another application that wants system data) gets launched. It implements the actual data collection side of KSysGuard, using a custom protocol over standard input and output to communicate with the application. It has different code paths for different operating systems, with each operating system "backend" exposing a number of sensors that read system data.

Custom Protocols

One of the first reasons to create something new was that ksysguardd needs to be started separately for each process that wants to do something with the statistics. This has much to do with ksysguardd using a custom protocol for communication. While twenty years ago writing a custom protocol was probably about the only way to get something like this to work, these days we tend to make use of a more sophisticated IPC mechanism: D-Bus.

D-Bus allows us to create a process that can run as a stand-alone service exposing a more robust RPC interface to applications that want to use this data. This in turn means we do not need to start a separate instance for each process that wants to do something with this data. It also means that the underlying code can now be changed to deal with proper data structures rather than writing just about anything to a text stream, which is what happens in ksysguardd code.
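To make the difference between a text stream and proper data structures concrete, here is a minimal Python sketch. The line format, the sensor names and the SensorReading type are all invented for illustration; they are neither the real ksysguardd protocol nor the actual KSystemStats D-Bus interface.

```python
from dataclasses import dataclass
from typing import Tuple


# Hypothetical text-stream style: the client receives a raw line and has to
# know by convention how to split it and what type each field has.
def parse_text_reply(line: str) -> Tuple[str, float]:
    name, raw_value = line.strip().split("\t")
    return name, float(raw_value)


# Hypothetical structured style: the value travels together with its
# identifier and unit, so the client does not have to guess.
@dataclass(frozen=True)
class SensorReading:
    sensor_id: str
    value: float
    unit: str


text_name, text_value = parse_text_reply("mem/used\t2048\n")
reading = SensorReading(sensor_id="memory/used", value=2048.0, unit="MiB")
```

In the first style, any change to the line format silently breaks every client. In the second, the structure itself documents what is being sent.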

Image: Memory usage being displayed by both Plasma System Monitor and Plasma Desktop, with KSystemStats providing the same data to both.

Code Reuse

ksysguardd is written completely in C. While this is fine for certain cases, in this case that means it prevents us from reusing code that we already have to implement some of the functionalities. For example, the partition usage sensors need to know which partitions exist and then query those for usage amounts. In ksysguardd this is implemented by reading /etc/mtab. That file, however, contains everything that is mounted, including things like /proc, cgroup and a lot of other things that are really not partitions. So the code has to filter that list, which leads to issues where we either do not list everything a user considers "a partition", or we show too much.
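The filtering problem can be sketched in a few lines of Python. This is not the ksysguardd code (which is C); the sample mount table and the block list of filesystem types are assumptions made up for this example.

```python
# Sample text in /etc/mtab format:
# device, mount point, filesystem type, options, dump, pass.
SAMPLE_MTAB = """\
/dev/sda1 / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
/dev/sdb1 /home btrfs rw,relatime 0 0
"""

# Assumed block list. Any such list is a guess and incomplete by nature.
PSEUDO_FILESYSTEMS = {"proc", "sysfs", "cgroup", "cgroup2", "tmpfs", "devtmpfs"}


def list_partitions(mtab_text):
    """Return (device, mount point) pairs that are not on the block list."""
    partitions = []
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type = fields[:3]
        if fs_type in PSEUDO_FILESYSTEMS:
            continue
        partitions.append((device, mount_point))
    return partitions


partitions = list_partitions(SAMPLE_MTAB)
```

A filesystem type that is missing from the block list shows up as a bogus partition, while a type that is listed but used for real storage disappears. That is exactly the "too little or too much" problem described above.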

The thing is, we already have a solution for this problem. There are plenty of places in Plasma and other KDE software where we need to provide a list of partitions. For example, the device manager applet needs to show these, as well as the places panel in Dolphin. These all make use of the Solid framework to do this, which uses udisks2 on Linux these days. So it would be nice to be able to reuse that code, since it provides a much better way of listing partitions, that is well tested on multiple platforms. However, Solid is C++ code. While it is technically possible to interface with that from C code, it is not going to be the nicest of solutions and would clutter up code that already is not the most readable.

There is an additional reason to move away from C code: Most code written as part of KDE software is C++, if not an even higher level language like QML. We are used to having all the facilities that modern C++ provides us, in addition to all the things that come from Qt. Having to write things in C is cumbersome, to say the least, making it a lot harder to make changes to what kind of data we expose.

All APIs are Equal, But Some are More Equal than Others

All that said, most of those issues are not so much architectural problems. However, there is one fundamental issue in ksysguardd that is very much architectural and which we did not even solve in KSystemStats for a fairly long time. This has much to do with what can be considered the "API" of a separate service like this.

As I mentioned above, in ksysguardd, the system-specific "backend" determines which sensors there are and what kind of things they expose. However, this means that, on different platforms, the set of sensors exposed by the service can be different. When you then have an application that makes use of some sensors to display data, that application suddenly has to deal with the differences between these sensors, which makes things a lot harder for the application. Moreover, this problem is not even limited to different platforms. The same platform can change what sensors are exposed, since there is absolutely nothing that enforces a structure.

In a way, these sensors can be considered the "public API" of the service. And like a good public API, they should not change at the whim of whatever the underlying system decides, but be mostly stable. Therefore, in KSystemStats, we decided to restrict things a lot more. First, everything is part of a subsystem. These are mostly meant for categorisation and include things like "CPU" or "Memory". Each subsystem can contain one or more "sensor objects", which represent more concrete objects in the system, like a CPU core or a GPU. Finally, each sensor object has a number of sensor properties, which are the objects that provide the actual system data. Sensor properties include things like CPU core usage and the amount of memory used.
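As a rough illustration of this three-level structure, here is a minimal Python sketch. The class names, identifiers and the lookup helper are hypothetical and chosen for this example only; the real service is written in C++ and its types may look quite different.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SensorProperty:
    """Provides one piece of actual system data, e.g. the usage of a core."""
    id: str
    value: float = 0.0


@dataclass
class SensorObject:
    """A concrete object in the system, such as a CPU core or a GPU."""
    id: str
    properties: Dict[str, SensorProperty] = field(default_factory=dict)


@dataclass
class Subsystem:
    """A category such as "CPU" or "Memory"."""
    id: str
    objects: Dict[str, SensorObject] = field(default_factory=dict)

    def lookup(self, object_id, property_id):
        return self.objects[object_id].properties[property_id]


cpu = Subsystem("cpu")
cpu.objects["cpu0"] = SensorObject("cpu0", {"usage": SensorProperty("usage", 12.5)})
usage = cpu.lookup("cpu0", "usage")
```

Because every value sits at a fixed place in the hierarchy, a client can ask for the usage of cpu0 without caring which operating system produced it.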

One additional important aspect is that sensor properties represent not only the data value but also several bits of metadata about that value. This includes a name and description, but also its unit and maximum value. Since the sensor properties are mostly static and defined up front, we can actually provide proper translated names for everything. In addition, with the extra metadata the client can make decisions about how to display that information, like how to format the data value or what default range to use for a line chart.
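The following Python sketch shows how a client might use such metadata. The PropertyInfo type, the unit labels "percent" and "bytes", and both helper functions are assumptions for the sake of the example, not the KSystemStats API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyInfo:
    name: str
    description: str
    unit: str                        # hypothetical labels: "percent" or "bytes"
    maximum: Optional[float] = None  # None when no maximum is known


def format_value(info, value):
    """Pick a display format based on the unit in the metadata."""
    if info.unit == "percent":
        return f"{value:.1f}%"
    if info.unit == "bytes":
        return f"{value / 1024 ** 3:.1f} GiB"
    return str(value)


def chart_range(info, fallback_max):
    """Use the advertised maximum as the default upper bound of a chart."""
    upper = info.maximum if info.maximum is not None else fallback_max
    return (0.0, upper)


cpu_usage = PropertyInfo("Usage", "Total CPU usage", "percent", 100.0)
mem_used = PropertyInfo("Used", "Memory in use", "bytes", 16 * 1024 ** 3)
```

The client code contains no knowledge about individual sensors. Everything it needs to render a value comes from the metadata.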

All this should lead to a much more stable "public API" for the sensors, which means that an application that makes use of KSystemStats should behave a lot more like it is intended to, regardless of the underlying system. This in turn means that the application can be polished a lot more to provide a good experience to users.

Image: Plasma displaying several different data sources.

Plugins and Modularity

A change that was made fairly early in the project is that sensors are no longer hardcoded directly in the service itself, but are provided by plugins. This enforces separation both between the service code that exposes things on D-Bus and the code that reads values from the system, and between the different subsystems. It also means that it becomes a lot simpler to add new sensors to the system, since that simply means writing a new plugin. Longer term, I hope this will lead to more things being supported. It already helped a lot when creating the GPU integration.
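The separation between service and plugins can be sketched as follows. This is a simplified Python illustration with invented class names and hard-coded values; the real plugins are C++ and are loaded by the daemon, which this sketch does not attempt to reproduce.

```python
class SensorPlugin:
    """Base class for hypothetical plugins; each one covers one subsystem."""

    subsystem = ""

    def read(self):
        raise NotImplementedError


class MemoryPlugin(SensorPlugin):
    subsystem = "memory"

    def read(self):
        # A real plugin would query the operating system here.
        return {"used": 2048.0, "total": 8192.0}


class Service:
    """Stands in for the daemon: it only knows the plugin interface."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        self._plugins[plugin.subsystem] = plugin

    def query(self, subsystem, property_id):
        return self._plugins[subsystem].read()[property_id]


service = Service()
service.register(MemoryPlugin())
used = service.query("memory", "used")
```

Supporting a new kind of hardware then only requires another SensorPlugin subclass. The Service class does not change.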

In addition, it helped us transition to the new system, as we could create a plugin that uses ksysguardd behind the scenes, but then maps that to the structure as we defined it for KSystemStats. While not entirely painless, this allowed us to build on top of the new infrastructure and later on replace the underlying data collection code. That meant we could ship KSystemStats and the improved Plasma widgets in Plasma 5.19, while the data collection code has been mostly replaced for Plasma 5.21.
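The idea of such a bridging plugin can be sketched as a simple translation table. The legacy sensor names, the new paths and the translate function below are purely illustrative assumptions; the actual mapping used in the transition plugin is not shown in this article.

```python
# Hypothetical mapping from legacy sensor names to the new
# (subsystem, sensor object, sensor property) structure.
LEGACY_TO_NEW = {
    "cpu/system/TotalLoad": ("cpu", "all", "usage"),
    "mem/physical/used": ("memory", "physical", "used"),
}


def translate(legacy_readings):
    """Re-key legacy readings so clients only ever see the new structure."""
    translated = {}
    for legacy_name, value in legacy_readings.items():
        new_path = LEGACY_TO_NEW.get(legacy_name)
        if new_path is None:
            continue  # legacy sensors without a mapping are not exposed
        translated["/".join(new_path)] = value
    return translated


result = translate({"cpu/system/TotalLoad": 7.0, "acpi/unknown": 1.0})
```

Since clients only see the translated names, the legacy source behind a given path can later be swapped for native collection code without any client noticing.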

Closing Notes

This has become a pretty long story, but I hope it highlights some of the reasoning and design ideas behind the KSystemStats service. In the next blog post I will talk about rendering charts and the KQuickCharts framework.

Comments (11)

Nico

2 years 6 months ago

Thank you very much for the insights and showing the advantages of the new system over the old. I'd like to know how your ansatz compares to eg the one of NetData which collects tons of metrics with almost no overhead in the background. Is it similar? Is it something which could have been used here as well?

We did not really look at other solutions; we set out to improve the one we had. I'm afraid I do not really have experience with many other systems, except for a number of server monitoring solutions, which are much too heavy for this task.

That said, from looking at the website of NetData it also seems more geared towards server monitoring. This is quite a different use case from what we're doing here, which is running a user service to provide data for monitoring applications. Usually the server platforms require quite a bit more configuration and setup compared to what we're doing.

Filzmaier Josef

2 years 6 months ago

Just tested the new system monitor. This is exciting and already very polished - thanks for the great work.

One concern I have is with startup time and resource usage. Such tools are often used in situations where the system is low on resources to kill services or applications that do not behave correctly. The system monitor should therefore be as light as possible (however I also enjoy the configurability). In my test the system monitor takes about 2 to 3 seconds longer to load and uses about 3 times as much memory as ksysguard. Is this a limitation of Kirigami applications, or can the application still be further optimized in this regard (or both)?

Thanks

Some work has already been done for Plasma 5.21 to improve certain cases, but there will be more areas that can be improved. We will actually be looking at this during the upcoming Plasma 5.21 beta phase, to see what else we can scrape off.

Chris

2 years 6 months ago

Plasma System Monitor looks just great. The UI/UX is way better than what KSysGuard has.

Nico

2 years 6 months ago

Is it possible to remote monitor several KDE- and Non-KDE Systems (e.g. headless RaspberryPis) with the new Plasma System Monitor over a network?

Not yet, but I do have some ideas for this. The new daemon actually allows us a bit more flexibility there. With KSysGuard you replace the entire backend to monitor a different system, which means it is not possible to monitor multiple systems. With KSystemStats we can instead create a number of plugins that can communicate with different remote monitoring solutions, allow these to be set up and then have multiple systems available.

That's actually why it worked the way it did (communicating over stdin/stdout, e. g., because that works over ssh), and ksysguard used to have a dialog for "connect to remote host".

Martin Sandsmark

2 years 6 months ago

In reply to Martin Sandsmark

A screenshot of the connect dialog in the original ksysguard in KDE 2: https://i.imgur.com/m9PNcyy.png

Also why it was written in C, because it was much easier to compile and run on weird servers.

Martin Sandsmark

2 years 6 months ago

In reply to Martin Sandsmark

Sorry for all the comments replying to myself, but to use it e. g. with raspbian just run apt install ksysguardd on your rpi, and in KSysGuard go to File -> Monitor Remote Machine and type in the IP

Michael

2 years 5 months ago

Thank you for going into detail about this, I was curious about the inner workings when I first heard about the project. Can't wait to use it!