The Linux init process gets an error that looks like this (in the stderr output):
INIT: cannot execute "/etc/init.d/rc"
Several things may cause this error but in my case it was happening because the rc script had
#! /bin/sh on the first line, and /bin/sh was missing.
The fix: in /bin, provide a link called sh that points to some other shell which can run sh scripts, e.g. /bin/bash.
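A minimal sketch of that fix, run on the host against the exported root file system tree. The function name fix_sh_link, the rootfs path argument and the default target "bash" are assumptions for illustration; on a BusyBox-based target the link would more likely point at busybox.

```shell
# Sketch only: the rootfs layout and the default target "bash" are assumptions.
# Usage: fix_sh_link <rootfs-dir> [shell-name-inside-bin]
fix_sh_link() {
    rootfs=$1
    target=${2:-bash}
    # Refuse to create a link that would dangle on the target.
    if [ ! -f "$rootfs/bin/$target" ] || [ ! -x "$rootfs/bin/$target" ]; then
        echo "fix_sh_link: $rootfs/bin/$target is not an executable file" >&2
        return 1
    fi
    # Remove whatever is currently called sh (broken link or stray file).
    rm -f "$rootfs/bin/sh"
    # Use a relative link so it resolves the same way on the host and on the target.
    ln -s "$target" "$rootfs/bin/sh"
}
```

For example, `fix_sh_link /srv/nfs/rootfs bash` (the path is hypothetical) would leave /bin/sh pointing at bash inside that tree.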
This was happening on an embedded system booting (for debugging purposes) from an NFS-mounted target file system. The kernel starts up fine, but then the init process gets an error while trying to run one of the init scripts in /etc/init.d, which screws up the entire boot process.
This is what the error looked like (the errors are the two "INIT: cannot execute" lines):
[ 6.063682] PHY: 0:1f - Link is Up - 100/Full
[ 8.083514] Looking up port of RPC 100005/1 on 192.168.97.70
[ 8.096759] VFS: Mounted root (nfs filesystem) readonly.
[ 8.101024] Freeing unused kernel memory: 100k freed
INIT: version 2.86 booting
INIT: cannot execute "/etc/init.d/rcS"
INIT: Entering runlevel: 3
INIT: cannot execute "/etc/init.d/rc"
That did puzzle me quite a bit, as the live system (not using NFS, but basically using the same root directory tree and files) was running these scripts just fine.
Looking at the target’s /etc/init.d dir, all files were ‘-rwxr-xr-x’, but they did not belong to root:root but to <me>:users. This happened because I had copied the target filesystem tree to my host PC and expanded it onto my hard disk as <me> (i.e. as a user, not as root).
Knowing that my UID was unknown to my embedded development box, I thought that the system refused for some reason to run a file whose owner is unknown to the system. This of course was pulled out of thin air, as the system does not care whose file it runs as long as the file has the x bit set. So I chowned the entire /etc to root:root which, naturally, didn’t solve my problem…
Ok, the next thing that came to mind was that for some reason my NFS mounting command did not mount the volume with exec permissions. Thinking back – this also did not make sense, since about 1000 things had already run from that NFS volume before it got to the point where it attempted to run the rcS script, but hey- nothing like a nice little witch-hunt when your system refuses to obey you! 🙂
So I mounted my NFS export from another PC and tried to execute some harmless script from /etc/init.d, e.g. sshd. Works. Hmm… What the…
cat /etc/passwd, looking for root’s entry:
At this point some thoughts (this time – real ones) began to form in my head. My embedded system uses BusyBox, with the ash shell (configured as sh) so why is root’s shell defined as “/bin/bash” in passwd? How does BusyBox deal with this? Probably just provides a link called sh in /bin?
Do we actually have a sh in /bin???
It turned out that when I generated my root FS on the host PC, the sh link in /bin somehow got broken. There was a file in there called sh, but it was not a link to a real shell and it was not an executable file. I wish I had dumped the contents of the file with hexdump at that time, but I was in a hurry (and already quite aggravated by this unexpected delay), so I just said “Aha!”, deleted the offending sh file and created the proper link.
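The check described here can be sketched as a small shell function that classifies whatever sits at bin/sh in a root file system tree and, if it is a non-executable file, dumps its first bytes before anything gets deleted. The name inspect_sh and the rootfs argument are hypothetical. It uses od rather than hexdump only because od is part of POSIX, and it assumes readlink is available, as it is in coreutils and BusyBox.

```shell
# Sketch only: classify <rootfs>/bin/sh. Function and argument names are assumptions.
inspect_sh() {
    sh_path=$1/bin/sh
    if [ -L "$sh_path" ]; then
        # Test for a symlink first: a dangling link fails the -e test below.
        echo "symlink -> $(readlink "$sh_path")"
    elif [ ! -e "$sh_path" ]; then
        echo "missing"
    elif [ -f "$sh_path" ] && [ -x "$sh_path" ]; then
        echo "regular executable file"
    else
        echo "present but not executable"
        # Keep some evidence of what the file contains before replacing it.
        od -c "$sh_path" | head -n 4
    fi
}
```

Note that a symlink with an absolute target (say /bin/busybox) is only meaningful on the target itself; when looked at from the host, it resolves against the host's own /bin.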
After that everything worked with no more trouble… 🙂
Trying to analyze this “series of unfortunate events”, I think that what most likely happened is this: while assembling my target file system (which was then used over NFS by my embedded box), I pulled some directories from another system – a VirtualBox virtual machine. It was most likely during this process that my sh link somehow got broken, which resulted in init being unable to execute the rc and rcS scripts in init.d…
And one more note: as described above, the sshd script ran just fine when the directory was mounted over NFS from another PC. It’s funny, but now that I think about it, this test actually didn’t prove anything: firstly because the sshd script started with #!/bin/bash, not #!/bin/sh, and secondly because on PCs both bash and sh are pretty much always present, so of course the script worked 🙂
Overall this was an interesting exercise in analytical thinking, and it once again taught me the important lesson that assumption is the mother of all screw-ups 🙂
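A quick way to see which interpreter each init script asks for, and whether that interpreter exists inside the target tree, is sketched below. The function name list_interps and its arguments are assumptions for illustration; the check is run on the host against the exported root file system, so symlinks with absolute targets would be resolved against the host and may be reported misleadingly.

```shell
# Sketch only: report the shebang interpreter of every script in a directory
# of a root file system tree. Names and the default directory are assumptions.
# Usage: list_interps <rootfs-dir> [dir-inside-rootfs]
list_interps() {
    rootfs=$1
    dir=${2:-/etc/init.d}
    for script in "$rootfs$dir"/*; do
        [ -f "$script" ] || continue
        # Take the first word after "#!" on line 1, allowing blanks as in "#! /bin/sh".
        interp=$(sed -n '1s/^#![[:space:]]*\([^[:space:]]*\).*/\1/p' "$script")
        if [ -z "$interp" ]; then
            status="no shebang"
        elif [ -f "$rootfs$interp" ] && [ -x "$rootfs$interp" ]; then
            status="ok"
        else
            status="MISSING or not executable"
        fi
        echo "${script#"$rootfs"}: ${interp:--} ($status)"
    done
}
```

Run against a tree like the one in this story, such a listing would show rc asking for /bin/sh and sshd asking for /bin/bash on separate lines, which makes it clearer why one script working says little about the other.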