Rethink Telemetry
Remove Current Code
There was some experimental code added by @codesardine to gather information about user systems in bfd94ed0. This functionality was only meant for some personal experiments (and therefore should not have been added to master), was disabled with a second commit (687b7a6b), and was never shipped afaik.
The code was later changed up by @fhdk to send the data when a config option is set (off by default): ae3d5dd6
The question now is: why do we have the telemetry code at all?
- Even if a user does opt-in, we don't use the data at all.
- pacman-mirrors is not the right tool to hook into because it won't be available on future immutable images.
- And to be honest, opt-in telemetry is basically useless as the data is highly skewed.
Therefore the code should be removed.
Alternatives
But I still think some more intelligence in our data gathering would be useful. There are two goals here, though, that need to be distinguished.
Improving current installs counting
Currently we count installs via the requests from NetworkManager to ping.manjaro.org. This can be simulated like this:
curl -A "" -I http://ping.manjaro.org/check_network_status.txt
The problem is that this only gives us a ping + IP on our backend side (we use Matomo) to identify unique installs/machines, which is very inaccurate. All Manjaro computers in the same LAN are counted as only 1. For example, we have 3 laptops and 1 computer with Manjaro here at home, but because they are in the same LAN, they are counted as 1. On the other hand, if I travel with my laptop and go online via my mobile hotspot, it is counted as a new machine.
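To make the two failure modes concrete, here is a small sketch. The machine names and IP addresses are invented for illustration, and it only assumes that the backend treats every distinct IP as one install.

```python
# Hypothetical ping log: (machine, public IP as seen by the backend).
# Names and addresses are made up for illustration only.
pings = [
    ("laptop-1", "203.0.113.7"),    # four machines behind one home router
    ("laptop-2", "203.0.113.7"),
    ("laptop-3", "203.0.113.7"),
    ("desktop", "203.0.113.7"),
    ("laptop-1", "198.51.100.42"),  # the same laptop on a mobile hotspot
]

real_machines = {machine for machine, _ in pings}
counted_by_ip = {ip for _, ip in pings}

print(len(real_machines))  # 4 actual machines
print(len(counted_by_ip))  # 2 "installs" when counting by IP
```

The home network collapses four machines into one entry, while the hotspot adds an entry for a machine that was already counted.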
Processing more data to identify unique machines would improve this a lot. For that, the data itself does not need to be shared. For example, we can build a hash from some unique machine data (MAC address plus CPU name) and send this hash as a payload to distinguish machines.
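A minimal sketch of such a hash could look like the following. The function name `machine_hash`, the choice of SHA-256, and the normalization steps are assumptions for illustration; nothing here is decided yet.

```python
import hashlib


def machine_hash(mac_address: str, cpu_name: str) -> str:
    """Return a hex digest derived from the MAC address and the CPU name.

    Hypothetical helper: SHA-256 and the '|' separator are assumptions.
    """
    # Normalize the inputs so that formatting differences
    # (upper/lower case, stray whitespace) do not change the hash.
    raw = f"{mac_address.strip().lower()}|{cpu_name.strip()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Example values, invented for illustration.
print(machine_hash("AA:BB:CC:DD:EE:FF", "Example CPU 1234"))
```

On Linux the inputs could, for example, be read from /sys/class/net/<interface>/address and the "model name" field of /proc/cpuinfo. Which interface to pick is left open here.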
This does not seem to be possible, though, with NetworkManager's simple ping mechanism. And as argued above, pacman-mirrors is not a suitable replacement either. Is there some other user process that would be suitable for that?
On the backend side, we need to find out whether Matomo can rely on such a directly provided hash for generating its User IDs.
Gather more hardware data
Of course, having more data about the systems that Manjaro is running on would always be helpful for us, so that we know better what we should prioritize in our development efforts. But neither NetworkManager nor pacman-mirrors is a suitable home for that. Maybe that should actually go into a separate process/package. The data must be sent encrypted and only once after boot. It also only makes sense if it's opt-out.
Building a system for this kind of data collection is a more long-term project, and the question is whether it's worth the cost. For example, I don't believe that Matomo is a suitable tool to interpret such data, so we would need to find or create a different system for our servers to run.
Replace install counting via NetworkManager pings
User counting with a hash payload, and definitely gathering more hardware data, requires a different process for the communication than just changing the ping destination of NetworkManager. And in general I would like to stop using the NetworkManager ping mechanism for install counting, as it pings every 5 minutes (I know it needs to ping "something" anyway, but it's a question of principle). It would be enough to send a single request with the hash once after boot. Is there a process we could use for that? We could also just ship a systemd unit doing this once after boot when the network is available, right? This seems like a cleaner solution to me.
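As a rough sketch of what such a one-time request could look like: the endpoint URL, the `id` query parameter, and the function names below are all hypothetical placeholders. No such endpoint exists on our servers yet.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint, used as a placeholder only.
COUNT_URL = "https://example.invalid/count"


def build_count_request(machine_hash: str) -> urllib.request.Request:
    """Build a HEAD request that carries the machine hash as a query parameter."""
    query = urllib.parse.urlencode({"id": machine_hash})
    return urllib.request.Request(f"{COUNT_URL}?{query}", method="HEAD")


def send_once(machine_hash: str, timeout: float = 10.0) -> int:
    """Send the request a single time and return the HTTP status code."""
    request = build_count_request(machine_hash)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A script like this could be started by a systemd oneshot service ordered after network-online.target, so that it runs once per boot instead of every 5 minutes.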