X-Git-Url: https://git.librecmc.org/?a=blobdiff_plain;f=docs%2Fkeep_data_small.txt;h=218d4f2eeb6bbae9019713dfc667f8240bc42c3d;hb=c804d4ec5cb222c842644bb99d9b077f5c6576f2;hp=55f4fc95a6816a7c728be54042d004b6c4165027;hpb=17a1526f9e2cb0a383fe0765ce803833be28773c;p=oweals%2Fbusybox.git

diff --git a/docs/keep_data_small.txt b/docs/keep_data_small.txt
index 55f4fc95a..218d4f2ee 100644
--- a/docs/keep_data_small.txt
+++ b/docs/keep_data_small.txt
@@ -43,13 +43,23 @@ takes 55k of memory on 64-bit x86 kernel.
 
 On 32-bit kernel we need ~26k per applet.
 
+Script:
+
+i=1000; while test $i != 0; do
+	echo -n .
+	busybox sleep 30 &
+	i=$((i - 1))
+done
+echo
+wait
+
 (Data from NOMMU arches are sought. Provide 'size busybox' output too)
 
 
 		Example 1
 
 One example how to reduce global data usage is in
-archival/libunarchive/decompress_unzip.c:
+archival/libarchive/decompress_gunzip.c:
 
 /* This is somewhat complex-looking arrangement, but it allows
  * to place decompressor state either in bss or in
@@ -65,7 +75,7 @@ archival/libunarchive/decompress_unzip.c:
 (see the rest of the file to get the idea)
 
 This example completely eliminates globals in that module.
-Required memory is allocated in inflate_gunzip() [its main module]
+Required memory is allocated in unpack_gz_stream() [its main module]
 and then passed down to all subroutines which need to access 'globals'
 as a parameter.
 
@@ -77,7 +87,7 @@ take a look at archival/gzip.c. Here all global data is replaced by
 single global pointer (ptr_to_globals) to allocated storage.
 
 In order to not duplicate ptr_to_globals in every applet, you can
-reuse single common one. It is defined in libbb/messages.c
+reuse single common one. It is defined in libbb/ptr_to_globals.c
 as struct globals *const ptr_to_globals, but the struct globals is
 NOT defined in libbb.h. You first define your own struct:
 
@@ -89,11 +99,19 @@ and then declare that ptr_to_globals is a pointer to it:
 
 ptr_to_globals is declared as constant pointer.
 This helps gcc understand that it won't change, resulting in noticeably
-smaller code. In order to assign it, use PTR_TO_GLOBALS macro:
+smaller code. In order to assign it, use SET_PTR_TO_GLOBALS macro:
 
-	PTR_TO_GLOBALS = xzalloc(sizeof(G));
+	SET_PTR_TO_GLOBALS(xzalloc(sizeof(G)));
 
-Typically it is done in <applet>_main().
+Typically it is done in <applet>_main(). Another variation is
+to use stack:
+
+int <applet>_main(...)
+{
+#undef G
+	struct globals G;
+	memset(&G, 0, sizeof(G));
+	SET_PTR_TO_GLOBALS(&G);
 
 Now you can reference "globals" by G.a, G.buf and so on, in any function.
 
@@ -128,11 +146,9 @@ less readable, use #defines:
 #define sector (G.sector)
 
 
-		Word of caution
+		Finding non-shared duplicated strings
 
-If applet doesn't use much of global data, converting it to use
-one of above methods is not worth the resulting code obfuscation.
-If you have less than ~300 bytes of global data - don't bother.
+strings busybox | sort | uniq -c | sort -nr
 
 
 		gcc's data alignment problem
@@ -204,3 +220,46 @@ Result (non-static busybox built against glibc):
     text    data     bss     dec     hex filename
   634416    2736   23856  661008   a1610 busybox
   632580    2672   22944  658196   a0b14 busybox_noalign
+
+
+
+		Keeping code small
+
+Use scripts/bloat-o-meter to check whether introduced changes
+didn't generate unnecessary bloat. This script needs unstripped binaries
+to generate a detailed report. To automate this, just use
+"make bloatcheck". It requires busybox_old binary to be present,
+use "make baseline" to generate it from unmodified source, or
+copy busybox_unstripped to busybox_old before modifying sources
+and rebuilding.
+
+Set CONFIG_EXTRA_CFLAGS="-fno-inline-functions-called-once",
+produce "make bloatcheck", see the biggest auto-inlined functions.
+Now, set CONFIG_EXTRA_CFLAGS back to "", but add NOINLINE
+to some of these functions. In 1.16.x timeframe, the results were
+(annotated "make bloatcheck" output):
+
+function                 old     new   delta
+expand_vars_to_list        -    1712   +1712 win
+lzo1x_optimize             -    1429   +1429 win
+arith_apply                -    1326   +1326 win
+read_interfaces            -    1163   +1163 loss, leave w/o NOINLINE
+logdir_open                -    1148   +1148 win
+check_deps                 -    1148   +1148 loss
+rewrite                    -    1039   +1039 win
+run_pipe                 358    1396   +1038 win
+write_status_file          -    1029   +1029 almost the same, leave w/o NOINLINE
+dump_identity              -     987    +987 win
+mainQSort3                 -     921    +921 win
+parse_one_line             -     916    +916 loss
+summarize                  -     897    +897 almost the same
+do_shm                     -     884    +884 win
+cpio_o                     -     863    +863 win
+subCommand                 -     841    +841 loss
+receive                    -     834    +834 loss
+
+855 bytes saved in total.
+
+scripts/mkdiff_obj_bloat may be useful to automate this process: run
+"scripts/mkdiff_obj_bloat NORMALLY_BUILT_TREE FORCED_NOINLINE_TREE"
+and select modules which shrank.
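
Note on the "Finding non-shared duplicated strings" one-liner added by this patch: its output lists every string in the binary, and most of them occur only once. A minimal sketch of a filter that keeps only the repeated ones, assuming a POSIX shell with sort, uniq and awk available. The name dup_strings is hypothetical and is not part of busybox or of this patch.

```shell
# Hypothetical helper (not part of busybox): read one string per line on
# stdin, count identical lines, and print only those seen more than once,
# most frequent first. Output keeps the "count string" layout of uniq -c.
dup_strings() {
	sort | uniq -c | sort -nr | awk '$1 > 1'
}
```

It would be fed the same way as the pipeline in the patch, for example "strings busybox | dup_strings". Each line it prints is a candidate for a string that was emitted more than once instead of being shared.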