So I had some spare time this week, after migrating my infrastructure to a virtual machine (VM) host, aka reused old desktop hardware. One of the first recurring problems I faced was keeping the 5 Gentoo guests up to date. I could log into each guest over ssh and update it manually, repeating the same steps 5 times. That got cumbersome pretty fast.
I should mention that all the guests mount an NFS share where all the binpkgs are stored. So essentially every time a new package is emerged, the guest stores the generated binary package on the NFS share. Of course that requires that the guests have the same configuration (USE flags, CHOST, virtual CPU etc.).
Combine that with the fact that I have an "internal" VM guest that I use for internal testing purposes (and hence it is hooked up to my desktop over distcc). Because it is meant for various testing purposes, it has a lot of resources allocated to it compared to a guest that simply runs public services such as Apache (this webpage).
So the situation is that I have a fast VM that, by using distcc, can generate binpkgs faster than any of the other VMs. But how do you generate binpkgs for packages that aren't installed locally but on some random VM? And how do you know which packages to generate?
Well, you need to generate some kind of list of packages that need to be built, spanning all the VMs but avoiding duplicates. So let's throw ssh (or rather my old friend paramiko) at the problem and get a list of all new packages for each VM by parsing the emerge -NDpavu1 world output. Sounds easy?
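A minimal sketch of the parsing step, not the code from the actual script. It assumes that the pretend output lists each package on a line of the form `[ebuild ...] category/name-version ...`; the function name `parse_pretend_output` and the sample text are made up for illustration.

```python
import re

# Matches lines such as:
#   [ebuild     U  ] sys-libs/zlib-1.2.8 [1.2.7] USE="..."
# The exact layout is an assumption about emerge's pretend output.
EBUILD_LINE = re.compile(r"^\[ebuild[^\]]*\]\s+(\S+)")


def parse_pretend_output(output):
    """Return the package atoms listed in `emerge -p` style output."""
    atoms = []
    for line in output.splitlines():
        match = EBUILD_LINE.match(line.strip())
        if match:
            # Drop a trailing "::repository" suffix if one is present.
            atoms.append(match.group(1).split("::")[0])
    return atoms


# Hypothetical sample output, only here to exercise the parser.
sample = """\
These are the packages that would be merged, in order:

[ebuild     U  ] sys-libs/zlib-1.2.8 [1.2.7] USE="-static-libs"
[ebuild  N     ] dev-libs/libxml2-2.9.1::gentoo USE="ipv6 readline"

Total: 2 packages (1 upgrade, 1 new), Size of downloads: 0 kB
"""
print(parse_pretend_output(sample))
```

In a setup like the one described here, the `output` string would come from running the emerge command on a guest over ssh; every line that does not start with `[ebuild` is simply ignored.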
Yes, well, kinda. The main problem with relying on the emerge output (which I do heavily in this script) is that whenever the output format of emerge changes, the parsing breaks and I am in a sub-optimal position. On the other hand, I avoid having to put Python code on each VM that interfaces directly with portage (which is also written in Python). Another little hitch (kinda expected) I ran into was that it can take a lot of time to run emerge -NDpu1 world on a VM, add the found packages to a list, and then repeat those two steps for every other VM, one after another.
Since I have done my fair share of reading into the whole GIL "issue" with Python, I knew that threads were not the way to go and that processes would be the correct approach. But just for fun I tried using threads, and it turned out to be much slower than starting one ssh (client) process per VM, which I put down to the GIL. Sharing data across processes can be troublesome (need for locking etc.). But since I do not really care in which order the packages get inserted into the queue, and the queue itself is multi-process safe, it is not a problem here. Furthermore, I do not pop from the queue while it is being populated, so it is all a matter of each process putting packages into the queue in a random order and then removing the duplicates once all processes are done (the main process waits until all ssh processes have stopped).
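The fan-out described above could be sketched roughly like this. It is an illustration under assumptions, not the script's code: `list_updates` is a hypothetical callable that returns the pending package atoms for one host (for example by running emerge over ssh and parsing the output), and `worker`, `collect` and `unique_atoms` are made-up names. One deliberate difference from the description: the sketch drains the queue before joining the workers, because the multiprocessing documentation warns that joining a process which still has unflushed items in a queue can deadlock.

```python
import multiprocessing


def unique_atoms(atoms):
    """Remove duplicates while keeping the first occurrence of each atom."""
    return list(dict.fromkeys(atoms))


def worker(host, list_updates, queue):
    """Runs in a child process: queue every pending package for one host."""
    try:
        for atom in list_updates(host):
            queue.put(atom)
    finally:
        queue.put(None)  # sentinel: this host is done, even after an error


def collect(hosts, list_updates):
    """Query all hosts in parallel and return the de-duplicated atoms.

    `list_updates` is a hypothetical callable; with the "spawn" start
    method it has to be a picklable, module-level function.
    """
    queue = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=worker, args=(host, list_updates, queue))
        for host in hosts
    ]
    for proc in procs:
        proc.start()

    # Drain the queue until every worker has sent its sentinel.
    atoms, finished = [], 0
    while finished < len(procs):
        item = queue.get()
        if item is None:
            finished += 1
        else:
            atoms.append(item)

    for proc in procs:
        proc.join()
    return unique_atoms(atoms)
```

Calling `collect(["vm1", "vm2"], list_updates)` with a real `list_updates` function would return each package once, no matter how many guests reported it. The order of the result depends on which process reaches the queue first, which matches the "random order" mentioned above.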
Now we should have a nice clean list of packages that need to be emerged on the fast VM. But you can't really build packages without building their dependencies, right? Correct, and we want an easy way to remove those dependencies afterwards. Cue the -o (--onlydeps) argument to emerge, aka build all dependencies (those not already on the system) for this package. While figuring out whether there are any dependencies, they (if any) are added to the depcleanList for later use, when we want to remove the leftover cruft.
The next step is to generate the binaries themselves, which is now easily done since we know all the dependencies are in place. Cue the -B (--buildpkgonly) argument to emerge. Its description is as follows:
Creates binary packages for all ebuilds processed without actually merging the packages. This comes with the caveat that all build-time dependencies must already be emerged on the system.
So in other words: generate binaries without installing the ebuilds on the local machine. Pretty awesome, right?
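Put together, the two steps for a single package might look like the sketch below. The helper name `build_commands` is hypothetical, and pinning the exact version with a leading `=` is an assumption; the post does not say how the atoms are passed to emerge.

```python
def build_commands(atom):
    """Return the two emerge invocations for one package, in order."""
    target = "=" + atom  # assumption: pin the exact version the guest wants
    return [
        # -o: install only the dependencies that are missing on the builder
        ["emerge", "--onlydeps", target],
        # -B: build the binary package without merging it locally
        ["emerge", "--buildpkgonly", target],
    ]


for command in build_commands("sys-libs/zlib-1.2.8"):
    print(" ".join(command))
```

Each list can be handed to `subprocess.run(command, check=True)` on the builder. Keeping the commands as plain lists makes it easy to print them first and check what would be run.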
Now all we need to do is emerge all the freshly generated binaries on the other VMs and remove the dependencies on the builder.
There are a few problems with the current code. For one, paramiko doesn't support the new ssh ECDSA public keys, so for now I just auto-accept the public key every time I connect, which opens the door to a man-in-the-middle attack. While it isn't that hard to get paramiko to read the ECDSA keys (which I have running in a local branch), the problem starts when you need to verify the locally stored key against the key the server currently uses (something to look into when I have some more time). Another problem is that not all leftover cruft gets removed: configuration files, init scripts and the like are not removed by portage when you remove a package from the system. So expect a lot of leftover files...
Otherwise the code should work as expected (do not expect it to work!), and do not expect anything but a quick hack. I might clean up the code and make it pretty sometime in the future, but I just needed something simple that did the job. You can find the code in the git repository under gentoo-updater.