From: Rob Landley What are the goals of busybox?
@@ -172,6 +177,116 @@ applet is otherwise finished. When polishing and testing a busybox applet,
we ensure we have at least the option of full standards compliance, or else
document where we (intentionally) fall short.
Various things busybox uses that aren't particularly well documented +elsewhere.
+ +Password fields in /etc/passwd and /etc/shadow are in a special format. +If the first character isn't '$', then it's an old DES style password. If +the first character is '$' then the password is actually three fields +separated by '$' characters:
++ $type$salt$encrypted_password ++ +
The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.
+ +The "salt" is a bunch of ramdom characters (generally 8) the encryption +algorithm uses to perturb the password in a known and reproducible way (such +as by appending the random data to the unencrypted password, or combining +them with exclusive or). Salt is randomly generated when setting a password, +and then the same salt value is re-used when checking the password. (Salt is +thus stored unencrypted.)
+ +The advantage of using salt is that the same cleartext password encrypted +with a different salt value produces a different encrypted value. +If each encrypted password uses a different salt value, an attacker is forced +to do the cryptographic math all over again for each password they want to +check. Without salt, they could simply produce a big dictionary of commonly +used passwords ahead of time, and look up each password in a stolen password +file to see if it's a known value. (Even if there are billions of possible +passwords in the dictionary, checking each one is just a binary search against +a file only a few gigabytes long.) With salt they can't even tell if two +different users share the same password without guessing what that password +is and decrypting it. They also can't precompute the attack dictionary for +a specific password until they know what the salt value is.
+ +The third field is the encrypted password (plus the salt). For md5 this +is 22 bytes.
+ +The busybox function to handle all this is pw_encrypt(clear, salt) in +"libbb/pw_encrypt.c". The first argument is the clear text password to be +encrypted, and the second is a string in "$type$salt$password" format, from +which the "type" and "salt" fields will be extracted to produce an encrypted +value. (Only the first two fields are needed, the third $ is equivalent to +the end of the string.) The return value is an encrypted password in +/etc/passwd format, with all three $ separated fields. It's stored in +a static buffer, 128 bytes long.
+ +So when checking an existing password, if pw_encrypt(text, +old_encrypted_password) returns a string that compares identical to +old_encrypted_password, you've got the right password. When setting a new +password, generate a random 8 character salt string, put it in the right +format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the +second argument to pw_encrypt(text,buffer).
+ +On systems that haven't got a Memory Management Unit, fork() is unreasonably +expensive to implement, so a less capable function called vfork() is used +instead.
+ +The reason vfork() exists is that if you haven't got an MMU then you can't +simply set up a second set of page tables and share the physical memory via +copy-on-write, which is what fork() normally does. This means that actually +forking has to copy all the parent's memory (which could easily be tens of +megabytes). And you have to do this even though that memory gets freed again +as soon as the exec happens, so it's probably all a big waste of time.
+ +This is not only slow and a waste of space, it also causes totally +unnecessary memory usage spikes based on how big the _parent_ process is (not +the child), and these spikes are quite likely to trigger an out of memory +condition on small systems (which is where nommu is common anyway). So +although you _can_ emulate a real fork on a nommu system, you really don't +want to.
+ +In theory, vfork() is just a fork() that writeably shares the heap and stack +rather than copying it (so what one process writes the other one sees). In +practice, vfork() has to suspend the parent process until the child does exec, +at which point the parent wakes up and resumes by returning from the call to +vfork(). All modern kernel/libc combinations implement vfork() to put the +parent to sleep until the child does its exec. There's just no other way to +make it work: they're sharing the same stack, so if either one returns from its +function it stomps on the callstack so that when the other process returns, +hilarity ensues. In fact without suspending the parent there's no way to even +store separate copies of the return value (the pid) from the vfork() call +itself: both assignments write into the same memory location.
+ +One way to understand (and in fact implement) vfork() is this: imagine +the parent does a setjmp and then continues on (pretending to be the child) +until the exec() comes around, then the _exec_ does the actual fork, and the +parent does a longjmp back to the original vfork call and continues on from +there. (It thus becomes obvious why the child can't return, or modify +local variables it doesn't want the parent to see changed when it resumes.) + +
Note a common mistake: the need for vfork doesn't mean you can't have two +processes running at the same time. It means you can't have two processes +sharing the same memory without stomping all over each other. As soon as +the child calls exec(), the parent resumes.
+ +(Now in theory, a nommu system could just copy the _stack_ when it forks +(which presumably is much shorter than the heap), and leave the heap shared. +In practice, you've just wound up in a multi-threaded situation and you can't +do a malloc() or free() on your heap without freeing the other process's memory +(and if you don't have the proper locking for being threaded, corrupting the +heap if both of you try to do it at the same time and wind up stomping on +each other while traversing the free memory lists). The thing about vfork is +that it's a big red flag warning "there be dragons here" rather than +something subtle and thus even more dangerous.)
+