I’m new here, but if you’re happy to define what’s needed, I could take some of the load.
I strongly support the idea of code auditing, especially as the number of sophisticated attacks against trustworthy open-source projects is increasing. Apart from the infamous xz attack, which was averted at the last minute, similar attacks have now been reported in the JavaScript ecosystem. While there is no guarantee that such attacks can always be detected in time, code auditing increases the probability of finding harmful injections into the code.
While I am very confident that the Qubes team’s cautious approach to including software in the project significantly reduces the chances of such an attack, any scheme that helps find possibly malicious input will help keep up the current high quality of this system. This is especially important since this very quality may make Qubes a worthwhile target for such attacks.
Sorry, my initial question about backup improvement strayed too far from this topic’s specific goal of auditing. I mixed up the different purposes of the two topics a little.
Any eventual backup tool improvement belongs in the previously mentioned topic. To narrow things down to the goal of this one: could a security audit of the code involved in the native backup be of interest to you? And perhaps a good candidate for a pilot community audit session?
The reason I made the link is this:
In the previously mentioned incremental backup topic, Tasket talks about some potential flaws (security issues?) with the native backup tool (which runs in dom0).
To which Rustybird replies:
There may be other components with a higher priority, but I remain interested in community auditing for any component.
Full disclosure: I go by the name @quantumpacket here and on GitHub (profile). This is just my tongue-in-cheek secondary account. Illuminati confirmed!
That being said, I’m really excited at all the interest thus far. Hopefully that interest only grows and more QubesOS users are ready to dive in and help out. The only way this program will be a success is if we can keep people engaged and make it such that anyone feels comfortable participating, regardless of their level of technical background or expertise. I suspect a lot of people, myself included, will learn a lot about how QubesOS works on the inside as a result of this.
I’m going to do all I can to get this set up with @nokke over the coming weeks, but we’re going to need a lot of feedback from you all to make sure any barriers for participants are addressed and all logistics are sorted out before a pilot launch. So stay tuned!
@nokke Awesome, thanks for stepping up and volunteering! I’m working on a preliminary outline for how the program would be run and will reach out to you via DM to collaborate on it before we post it here for feedback.
@adw Could you create a new forum category called Code Audit that would show up in the left pane, and add the names @quantumpacket and @nokke to it? Let’s make it temporarily private while we get things set up. The plan is to use the new category to organize each component audit and to have a central place for the program’s resources and discussion. Thank you!
Could you link me to the source for the backup code? We will be starting off with the core-* components, as suggested by the QubesOS team. If it’s part of one of those components, then the answer is yes.
I’ll be choosing a smaller component for the pilot launch and ramp up to larger ones as we get more community involvement. I’m afraid to bite off too much from the start and have engagement fizzle out due to people being overwhelmed.
Better to ask @deeplow about setting up forum stuff.
From what I understood, the tools like `qvm-...` (e.g. `qvm-backup`) come from qubes-core-admin-client, see:
But I didn’t read that code, so I’m not sure about it. I tried to understand `qvm-device` once, so it could be another option… I don’t know if it’s in the scope of this audit, but if we want to start with something rather simple, the `qvm-...` tools could be a good start? Like `qvm-tags`?
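For context, here’s a tiny sketch of how (as far as I can tell from the repo layout) each `qvm-*` command maps to a module under `qubesadmin.tools` in qubes-core-admin-client. The module-naming convention here is my assumption, so double-check it against the actual tree:

```python
# Hypothetical mapping: each qvm-foo CLI entry point appears to be a thin
# wrapper around a qubesadmin.tools.qvm_foo module (hyphens -> underscores).
def tool_to_module(tool_name: str) -> str:
    """Guess the qubesadmin module backing a qvm-* command-line tool."""
    return "qubesadmin.tools." + tool_name.replace("-", "_")

print(tool_to_module("qvm-backup"))
```

So if this assumption holds, auditing `qvm-backup` would mean reading the corresponding module in the qubes-core-admin-client repo rather than a standalone script.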
What are the requirements to participate? I mean, I can read some Python or shell code, but I have no real knowledge of how to write secure code. Can I take part in this audit?
Great, it’s a qubes-core-* =D
The biggest file, restore.py, is ~2.1k LOC.
Wow, that’s a big file! Seeing how large some of these files can be, I may have to break things up: depending on the size of the component, we’d choose a selection of files, audit them, note them as checked, and then repeat the process with additional files. I want the amount of code to be digestible by participants, so as not to create burnout and thus disinterest.
Anyone can participate, but we will be laying down some rules of etiquette so as not to clutter up the forum. More info on that will be provided in the outline. We’ll probably set up a Matrix room for basic questions and discussion.
Keep in mind it’s not the place to learn a new language, one should be reading the official documentation for that. We will be providing links in a pinned Resources post where one can read about security for different languages and that would be the best place for you to learn from. Different people with different areas and levels of expertise will have eyes on the code, so don’t feel like you have to know everything. The goal is to put our collective minds together.
I did a quick analysis of LOC for the core-* components and main filetypes, which may help with breaking things up, matching people to components, choosing the right size to get started with, etc. I used file suffixes to count most languages; since shell scripts generally don’t have suffixes, the shell counts come from:

```
find . -type f -exec sh -c "file {} | grep -q \"shell script\" && echo {}" \; | xargs wc -l
```
core-admin
- C:215
- py:52247
- Makefile:459
- xml:323
- shell:996
core-admin-addon-whonix
- py:127
- Makefile:12
core-admin-client
- py:36511
- Makefile:260
- xml:599
- shell:99
core-admin-linux
- py:2796
- Makefile:50
- shell:992
core-agent-linux
- C:1710
- py:2390
- Makefile:734
- xml:484
- shell:5502
core-libvirt
- Makefile:69
- shell:60
core-qrexec
- C:7744
- py:13866
- Makefile:324
- shell:185
core-qubesdb
- C:4590
- py:21
- Makefile:185
- shell:57
core-vchan-xen
- C:1706
- Makefile:39
- shell:31
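For anyone who wants to reproduce or extend these numbers, here’s a minimal Python rendering of the suffix-based counting described above (it deliberately ignores suffix-less files, so shell scripts would still need the `file`-based command shown earlier):

```python
# Minimal sketch: total line count of files under `root` with a given
# suffix, e.g. count_lines("qubes-core-admin-client", "py").
# This only approximates LOC (blank lines and comments are counted too).
import os

def count_lines(root: str, suffix: str) -> int:
    """Sum line counts of all files under `root` ending in `.suffix`."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith("." + suffix):
                with open(os.path.join(dirpath, name), "rb") as f:
                    total += sum(1 for _ in f)
    return total
```

A proper LOC tool (cloc, tokei, etc.) would give cleaner numbers, but something this simple is enough for sizing audit chunks.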
This sounds like an awesome idea! Although I’m not sure how the less tech-savvy could participate very much.
If you or anyone else can think of how, then I’d be more than happy to contribute.
If nothing else, a good explanation of each portion of the OS that’s being looked at can lead to a better overall understanding of Qubes. I’ve certainly enjoyed my deep-dives. I know way more about Linux now than I did running Ubuntu back in the day.
There’s a lot in the documentation that is either quite vague or outdated. Hopefully this project could also lead to deeper documentation updates.
This is interesting but quite unclear.
The OP talks about “code audit” (assuming all code is available, which is obviously not the case for blobs). Follow-ups talk about “security audit”.
Without a clear goal, it kind of makes no sense.
Even “security” is too broad.
@quantumpacket has been working on clear goals for a security audit.
Though I reckon there could also be value in newcomers just looking over and discussing different parts of the code together without clear goals.
There may be social bonding value in this activity, but “just looking over [some code]” is not of any value.
In my experience, there’s often plenty of value beyond the social, which I can take or leave. However, we’re going with fixed goals atm.
The idea is to audit the code of Qubes OS to find any vulnerabilities within the Qubes operating system. This does not include blobs, and it does not include an audit of Debian, Fedora, Whonix, etc. Just the Qubes OS codebase.
It’s not just social bonding. If people are looking over the code they will better understand how the system works and be less likely to misuse it. A lot of successful attacks happen simply because humans do the wrong thing even though the system is technically correct.
I agree with the notion that this activity is not an audit, but I frequently run in to people with attitudes that things like social factors are nice but not relevant because only the technical parts are relevant. This is a forum so I don’t have body language, etc to help me interpret your statement. Maybe you’re not dismissive of those things. But I think it’s likely some portion of people who read that comment will interpret it that way and either nudge them towards or reinforce that world-view in their minds, which I would prefer to avoid.
The idea is to audit the code of Qubes OS to find any vulnerabilities within the Qubes operating system. This does not include blobs, and it does not include an audit of Debian, Fedora, Whonix, etc. Just the Qubes OS codebase.
Vulnerability to what? Data leaks? Data collisions? Anything else? Hardware side-channels? As I said, this needs to be defined; otherwise it is huge.
Also, that code does not exist in a vacuum. It depends on other code (including the compiler or interpreter used, down to CPU microcode), i.e. it is vulnerable in a particular context, and if the context is a factor in the vulnerability, that should be considered too. So it seems quite complex to me, with all the moving targets that make the code functional. I wonder what entity is capable of such a thorough analysis.
I think it would be easy to lose some of the benefits by over-defining early on. With a commercial audit, we’d be on the same page - it is huge. With a community like this, keeping options open is also keeping doors open to people wanting to participate, and OP’s definition seems to me like a good balance to start with. If people want to participate with a narrower scope for themselves, I don’t see any reason they can’t. We could try to narrow down the collective scope by polling upfront who’s available with what experience here, or we could get to know that with more confidence over a pilot period, or we could keep it broad which makes it easier if we don’t always have the same people around.
Unlike a commercial audit, I don’t think we should be too worried about providing false assurance that code has been covered “thoroughly”, or about preventing duplicated effort later on. We can set some guard rails (in progress), go through systematically, and track progress, all of which would make the results more usable and help future auditing decisions. Getting some actionable findings and fixes, plus more experience with the codebases and with working together as a community, looks like a sensible and worthwhile level of ambition here - though this is just my gut feeling after a couple of months on the forum, so do speak up if you’ve got a better view.
I wonder if there is some kind of code documentation briefly explaining which code does what.
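Since most of the core-* components are Python, one lightweight option (just a suggestion, not an official docs source) is to read the docstrings directly with the standard-library `pydoc` module. Here `json` is only a stand-in module; the same call should work for any importable module, e.g. a `qubesadmin` submodule once qubes-core-admin-client is installed:

```python
# Render a module's docstring-based documentation as plain text.
import pydoc

text = pydoc.render_doc("json", renderer=pydoc.plaintext)
print(text.splitlines()[0])  # summary header line for the module
```

It won’t replace proper architecture docs, but it’s a quick way to see what a module says about itself before diving into the source.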