On a 32-bit kernel, we need ~26k per applet.
Script:

i=1000; while test $i != 0; do
	echo -n .
	busybox sleep 30 &
	i=$((i - 1))
done
echo
wait

(Data from NOMMU arches are sought. Provide 'size busybox' output too)
(see the rest of the file to get the idea)
This example completely eliminates globals in that module.
Required memory is allocated in unpack_gz_stream() [its main module]
and then passed down to all subroutines which need to access 'globals'
as a parameter.
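The pattern can be sketched in plain C. The names below (gunzip_state_t, process_block, unpack_stream) are hypothetical stand-ins for the real gunzip code; only the structure of the idiom is shown:

```c
#include <stdlib.h>

/* Hypothetical decompressor state: the fields that used to be globals */
typedef struct gunzip_state {
	unsigned crc;
	unsigned bytes_out;
} gunzip_state_t;

/* Subroutines receive the state as a parameter instead of touching globals */
static void process_block(gunzip_state_t *s, unsigned n)
{
	s->bytes_out += n;
	s->crc ^= n;	/* placeholder for a real CRC update */
}

/* The main module allocates the state and passes it down */
unsigned unpack_stream(void)
{
	gunzip_state_t *s = calloc(1, sizeof(*s));
	unsigned total;

	process_block(s, 10);
	process_block(s, 32);
	total = s->bytes_out;
	free(s);
	return total;
}
```

Nothing in this sketch lives in bss or data; the whole state is heap-allocated for exactly as long as the applet runs.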
ptr_to_globals is declared as a constant pointer.
This helps gcc understand that it won't change, resulting in noticeably
smaller code. In order to assign it, use the SET_PTR_TO_GLOBALS macro:

	SET_PTR_TO_GLOBALS(xzalloc(sizeof(G)));
Typically it is done in <applet>_main().
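A simplified, self-contained sketch of the idiom follows. In BusyBox's headers the pointer is declared const and the assignment macro casts that away (and adds a barrier); this sketch keeps the pointer non-const so it stays strictly portable, and uses plain calloc() in place of xzalloc():

```c
#include <stdlib.h>

struct globals {
	int count;
	char buf[16];
};

/* BusyBox declares this 'const' in the headers so gcc can cache it in a
 * register; here it is kept non-const for portability of the sketch */
struct globals *ptr_to_globals;
#define G (*ptr_to_globals)
#define SET_PTR_TO_GLOBALS(x) ((void)(ptr_to_globals = (x)))

int applet_main(void)
{
	/* BusyBox uses xzalloc(); plain calloc() stands in for it here */
	SET_PTR_TO_GLOBALS(calloc(1, sizeof(G)));
	G.count = 7;
	return G.count;
}
```

Note that sizeof(G) works even before the pointer is set: sizeof does not evaluate its operand.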
Since bb_common_bufsiz1 is BUFSIZ + 1 bytes long and BUFSIZ can change
from one libc to another, you have to add a compile-time check for it:
if (sizeof(struct globals) > sizeof(bb_common_bufsiz1))
	BUG_<applet>_globals_too_big();
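BUG_<applet>_globals_too_big() is declared but never defined: when the sizeof comparison is compile-time false, gcc removes the dead call, so the link only fails if the struct really is too big. A portable alternative (not BusyBox's actual mechanism) achieves the same compile-time check with a negative array size; the names below are hypothetical:

```c
#include <stdio.h>	/* for BUFSIZ */

static char common_buf[BUFSIZ + 1];	/* stand-in for bb_common_bufsiz1 */

struct globals {
	int count;
	char name[64];
};

/* The array size collapses to -1, a compile error, if struct globals
 * ever outgrows the buffer.  BusyBox instead calls a never-defined
 * BUG_<applet>_globals_too_big(), whose surviving call breaks the
 * link only when the condition is true. */
typedef char globals_fit[sizeof(struct globals) <= sizeof(common_buf) ? 1 : -1];

int globals_fit_ok(void)
{
	return sizeof(globals_fit);	/* 1: the check above compiled */
}
```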
gcc doesn't seem to have options for altering this behaviour.
gcc 3.4.3 and 4.1.1 tested:
char c = 1;
// gcc aligns to 32 bytes if sizeof(struct) >= 32
struct {
	int a,b,c,d;
	int i1,i2,i3;
} s28 = { 1 }; // struct will be aligned to 4 bytes
struct {
	int a,b,c,d;
	int i1,i2,i3,i4;
} s32 = { 1 }; // struct will be aligned to 32 bytes
// same for arrays
char vc31[31] = { 1 }; // unaligned
char vc32[32] = { 1 }; // aligned to 32 bytes

-fpack-struct=1 reduces the alignment of s28 to 1 (but will probably
break the layout of many libc structs); s32 and vc32 are still
aligned to 32 bytes.

I will try to cook up a patch to add a gcc option for disabling it.
Meanwhile, this is where it can be disabled in gcc source:

gcc/config/i386/i386.c
int
ix86_data_alignment (tree type, int align)
{
#if 0
  if (AGGREGATE_TYPE_P (type)
      && TYPE_SIZE (type)
      && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
      && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 256
          || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 256)
    return 256;
#endif

Result (non-static busybox built against glibc):

# size busybox busybox_noalign
   text    data     bss     dec     hex filename
 634416    2736   23856  661008   a1610 busybox
 632580    2672   22944  658196   a0b14 busybox_noalign



	Keeping code small

Set CONFIG_EXTRA_CFLAGS="-fno-inline-functions-called-once",
run "make bloatcheck", and note the biggest auto-inlined functions.
Now set CONFIG_EXTRA_CFLAGS back to "", but add NOINLINE
to some of these functions. In the 1.16.x timeframe, the results were
(annotated "make bloatcheck" output):

function                                 old     new   delta
expand_vars_to_list                        -    1712   +1712  win
lzo1x_optimize                             -    1429   +1429  win
arith_apply                                -    1326   +1326  win
read_interfaces                            -    1163   +1163  loss, leave w/o NOINLINE
logdir_open                                -    1148   +1148  win
check_deps                                 -    1148   +1148  loss
rewrite                                    -    1039   +1039  win
run_pipe                                 358    1396   +1038  win
write_status_file                          -    1029   +1029  almost the same, leave w/o NOINLINE
dump_identity                              -     987    +987  win
mainQSort3                                 -     921    +921  win
parse_one_line                             -     916    +916  loss
summarize                                  -     897    +897  almost the same
do_shm                                     -     884    +884  win
cpio_o                                     -     863    +863  win
subCommand                                 -     841    +841  loss
receive                                    -     834    +834  loss

855 bytes saved in total.

scripts/mkdiff_obj_bloat may be useful to automate this process: run
"scripts/mkdiff_obj_bloat NORMALLY_BUILT_TREE FORCED_NOINLINE_TREE"
and select modules which shrank.
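The NOINLINE annotation used above boils down to a gcc function attribute; a simplified stand-in (the real definition lives in BusyBox's headers, and the function names here are hypothetical):

```c
/* Simplified stand-in for BusyBox's NOINLINE macro */
#define NOINLINE __attribute__((__noinline__))

/* A function gcc would happily inline into its only caller;
 * NOINLINE keeps it out of line so the caller stays small */
static NOINLINE int parse_digits(const char *s)
{
	int n = 0;
	while (*s >= '0' && *s <= '9')
		n = n * 10 + (*s++ - '0');
	return n;
}

int demo(void)
{
	return parse_digits("855");
}
```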