Fedora 39-xfce update fails on R4.2 [CONNECT tunnel failed, response 403]

Still no official word on how to get updates again on R4.1
An obscure Forum thread (this one!) is not doing Qubes any good. Given the magnitude of the SNAFU, I consider that a News article, describing what happened and how to get back on track, would be required.

@marmarek The packages that fix the qubes.UpdatesProxy issue should probably be uploaded to the stable repository manually. It looks like they will be uploaded automatically in a few days, but this seems critical, since until then users can’t update unless they enable the testing repository on their system.

Set out exactly what template you are using for the update
proxy, which templates are unable to update, and what error message you
see.

I’d have expected a statement as well; this is critical and many users are puzzled now.

[irrelevant comment retracted]

Thank you Solene! So I think what you are saying is that it’s the testing updates that were pushed through. In that case, if I hold off for some time, the testing updates should become stable and that should work automatically…?

Yes, it’s supposed to work like that :slight_smile:

Hello, I have just updated Fedora 39 (in the template) without any problem… miro

At the risk of getting more mildly abusive PMs, I don’t follow.

The second post in this thread referred to the GitHub issue, and
referred to workarounds there. So, yes, the process was documented there.
If you follow the link to the issue you see that it was reported on
the 8th, diagnosed and fixed on the 9th, with an updated package pushed to testing.

This arose from an upstream change, which affected Qubes.
Workarounds were provided and linked in this thread.
It was fixed - “f*ck Yes”

Well. Yes. This is the level of support this free project gets. You conveniently omit the fact that the GitHub issue was only talking about R4.2.

Perhaps it’s because 4.2 is the latest stable release lol :zipper_mouth_face:

Ahem… :smirk:

Support for older releases

In accordance with our release support policy, Qubes 4.1 will remain supported for six months after the release of Qubes 4.2, until 2024-06-18. After that, Qubes 4.1 will no longer receive security updates or bug fixes.

I don’t think the level of vitriol in this thread is justified. Let’s step back and look at the situation:

  1. A severe bug was introduced in Fedora.
  2. The bug was properly reported by a user, and a couple of other users immediately commented on the issue confirming it and offering their own workarounds.
  3. Within hours, the issue was marked as a blocker (highest priority) and pinned.
  4. The devs pushed a fix for the issue one day later.
  5. The fix gets applied automatically without the user having to mess with the command line or any user workarounds from forum threads or GitHub (with the significant caveat that the update has to be applied twice, which I’ll get back to below).

A few remarks:

  • Bugs in software are inevitable. If our expectation is never to experience severe bugs, we will be forever disappointed. The right attitude is not to think that we will never run into problems, but rather to expect them and try to handle them as best we can.

  • The issue was reported, triaged, and fixed in about a day. That’s an extremely good response time, especially for such a small project. Multitrillion-dollar corporations routinely do much worse. If you still felt it was too slow, I really think you just need to be more patient.

  • Some people were complaining about the lack of an official news post while the fix was still in testing. I think a news post at that point would have been a mistake for the following reasons:

    1. Many users never read news posts, so we shouldn’t rely on these for critical communications (unless we have no other choice). Even fewer use the forum and mailing lists.
    2. For every news post, some percentage of readers misunderstands, misinterprets, or gets confused. The more complex the content, the more likely this is to occur.
    3. Whenever you tell users to follow a set of instructions, some percentage of them will not follow the instructions correctly, accidentally mess something else up in the process, stop partway through because they got distracted, forget to do it at all, etc.
    4. Whenever possible, it’s much better to make the fix automatic so that users don’t have to perform manual steps.
    5. Most users are probably stable users. News posts about fixes in testing aren’t relevant to them.
    6. When people get too many news posts that aren’t valuable to them, they stop paying attention to news posts.
    7. There are different levels of news posts. People subscribed only to qubes-announce wouldn’t even see this.
    8. The point of having a public issue tracker is so that everyone can immediately see whether the devs are aware of an issue and whether they have fixed it. This information is already embedded in the issue itself. Saying, “We are aware of this issue, and we are working on a fix (or have just fixed it)” would be redundant. Anyone who wants to know that can already see it on the issue.
    9. The standard procedure for every bug is something like: report → diagnose → fix → put fix in testing → migrate fix from testing to stable → done. Users automatically get the fix depending on whether they’re using testing or stable repos. There is no need to announce this standard procedure separately for this bug, as it’s the same for every bug.
    10. A workaround is not as good as a real fix. It’s a questionable use of scarce resources to take time vetting a workaround and officially communicating that workaround to the userbase when an official fix is only hours behind it. Better to just wait a little bit longer for the real fix. Otherwise, some users will get confused later and try to apply the workaround even after the real fix is already available.
    11. Many users do not update every day (but rather every 2-3 days or even longer), so they may not even need to be aware of the problem if it’s already resolved before it has a chance to affect them.
  • I think some users felt that the fix was in testing for too long and should have been moved to stable more quickly (or even right away). Maybe. It’s a judgment call. Sometimes, a fix can break more than it fixes, and testing can catch this before it affects the entire stable userbase. Other times, there may be a high degree of certainty that the fix will be a net benefit. It depends on the nature of the problem, the nature of the fix, and the situation as a whole. It requires expertise, judgement, and experience. In this case, @DVM tagged @marmarek in a comment, suggesting that the fix be manually moved to stable more quickly, which seems to have worked well.

  • Qubes OS 4.1 is, of course, still supported, which is why it has already received a fix for this, alongside the fix for 4.2. In this case, it looks like the level of support was the same for 4.1 and 4.2, as expected.

The main problem I see is that the official fix still requires some manual steps from the user. Quoting @marmarek’s comment:

B) in dom0. This is to unbreak the broken update mechanism of the Template. Does Qubes already have a mechanism for dom0 to apply hot fixes to Templates?

Yes, that’s exactly why we have our update tool instead of calling apt/dnf directly.
One unfortunate thing is the fix will require updating twice:

  1. run template update once - this will apply the fix, but the update itself will fail, since the fix isn’t in sys-net at this point yet
  2. restart sys-net
  3. update again - now it will work
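For anyone who prefers to do this from a dom0 terminal, the three steps above can be sketched roughly as follows. This is only an illustration, not an official procedure: the helper names (`two_pass_update_plan`, `run_plan`) are made up for this post, the template name is just the one from the thread title, and it assumes the R4.2 command-line updater `qubes-vm-update` (R4.1 updates templates through a different mechanism). It also assumes sys-net can actually be shut down, which may be refused while other running qubes use it as their net qube. By default the script only prints the commands.

```python
import shlex
import subprocess


def two_pass_update_plan(template, net_qube="sys-net"):
    """Return the dom0 commands for the update / restart / update sequence.

    Assumption: the R4.2 `qubes-vm-update` tool is used for template updates.
    """
    update = ["qubes-vm-update", "--targets", template]
    return [
        list(update),                          # pass 1: applies the fix, may itself fail
        ["qvm-shutdown", "--wait", net_qube],  # restart the net qube: stop it ...
        ["qvm-start", net_qube],               # ... and start it again
        list(update),                          # pass 2: expected to succeed now
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for step, cmd in enumerate(plan, start=1):
        print(f"[{step}/{len(plan)}] {shlex.join(cmd)}")
        if not dry_run:
            # check=False because the first update pass is expected to fail
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # Dry run: shows what would be executed without touching any qube.
    run_plan(two_pass_update_plan("fedora-39-xfce"))
```

Calling `run_plan(plan, dry_run=False)` in dom0 would actually execute the commands; read the printed plan first and adjust the qube names to your own setup.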

Here’s my comment responding to this:

It’s a real shame that Salt or the Qubes Update tool can’t handle this automatically for users. Can’t it at least display a message to users when they try to update, telling them that they have to perform these steps manually? We could make a separate announcement about it, but of course, we already know that far fewer users will see such an announcement compared to a message inside Qubes OS itself. What do you think?

I suppose that what’s likely to happen (for an affected user who isn’t aware of this) is something like the following:

  1. The user tries to update. The update fails. The user is puzzled.
  2. Some time later, the user restarts sys-net in the course of normal system usage.
  3. Some time later, the user tries to update again. This time it succeeds.

I suppose that’s not so bad. After all, it’s common for updates to fail even in the absence of bugs, especially when updating over Tor (e.g., network or mirror problems). And, when any kind of electronic device malfunctions, a good first step is to “turn it off and turn it back on again” before trying to use it again, which is exactly what will do the trick in this case. Still, this does indicate that the fix is technically incomplete, since it requires user action to take effect, even though those actions are routine and don’t make any special demands from non-technical users (update, restart, update again). Even notifying the user when a completed update requires a system restart wouldn’t be enough in this case, though it would help.

I just really wish there were a way for the update tool to say something like, “Update partially applied. Please restart [sys-net] and run this Updater again.” (Of course, it would be even better if the fix could be applied automatically without having to restart anything, but even mainstream OSes have to be restarted for most updates.) Let’s see what Marek thinks, but I guess if a news post is our only option to communicate the need for these manual steps, it might be warranted.

Thank you very much for your long reply.

I think what happens for a bunch of users is the following:

  1. the user tries to update, the update fails, the user freaks out
  2. they try to figure why this didn’t work or if they got hacked or broke something

A statement pinned at the top of the forum’s “user support” section, maybe, saying that “if you have issues updating your templates, it will resolve itself in a few days”, would have been good enough IMO.

Not only did it break updates, but people were unable to install programs as well, so Qubes OS was temporarily broken for them.

I think that’s good for forum users (and, ideally, one of the active forum users in this thread would have flagged it with a message asking the mods to pin it), but most Qubes users probably aren’t on the forum (and they definitely shouldn’t have to check it), so it’s better than nothing, but still far from an ideal solution, IMHO. I’m still strongly in favor of some kind of notification inside the OS itself.

This would be ideal indeed.

[irrelevant comment retracted]

Perhaps. I’m sure people vary widely in their reactions and internal mental states.

I don’t think that not being able to install programs is synonymous with the system being broken. It depends on what you’re trying to do. You can still do quite a lot with your system when you’re not installing new programs. Also, installing new programs is probably less frequent than updating. But none of this is to diminish the severity of the bug. Again, it was triaged at the highest possible priority. As mentioned above, severe bugs can’t be eliminated. All we can do is try to fix them as quickly as possible when we discover them, which is exactly what happened.

But that was just an unofficial workaround you figured out for yourself, not the official fix, wasn’t it? Are you saying you also tried the official fix (including the “update, restart sys-net, update again” part), and that still didn’t work for you? If so, you should definitely have reported that on the issue (#9025) so that the devs could fix it.

You can see for yourself that packages were pushed both for 4.1 (qubesos-bot comment + qubes-status issue) and 4.2 (qubesos-bot comment + qubes-status issue). So, I can think of a few possibilities:

  1. You don’t have testing repos enabled and haven’t gotten these packages yet. (Not sure if they’re in stable yet, but I don’t see a comment from qubesos-bot saying they are, so I guess not.) Can you confirm you actually have the packages with the correct version numbers that are supposed to contain the fix?
  2. The devs thought they fixed the issue in both 4.1 and 4.2, but for some reason the fix for 4.1 isn’t working. In this case, you need to say that on the issue so that they will be aware! They can’t fix a problem they thought they already fixed! Any time someone leaves a comment on an issue saying that they tried a fix and it didn’t work, we reopen the issue right away (unless we have good reason to believe it’s something else).
  3. Something unique to your system or user error, e.g., if the fix works for other people on 4.1 but not for you.

FWIW, no bias here. I personally still use 4.1 exclusively and strongly believe older supported releases should always have full support right up through their EOL dates. (I’m usually the one arguing against stuff like deleting the older supported release documentation in order to replace it with the newer documentation and arguing that we should instead keep both as long as they’re both supported.)

Okay, I’m reopening issue #9025 right now with a link to your comment!

[irrelevant comment retracted]