Advanced disk usage
adu creates a database containing disk usage statistics of a given
directory. This database can be queried to quickly retrieve, for
example, the number and the size of all files in a subdirectory owned
by a given user.
Four different output modes are available: global list, global summary,
user list and user summary. The format of the output may be customized
via format strings.
There’s an interactive mode which allows you to quickly run many queries
on the same database using different modes and different output files.
By default, adu uses the user-summary output format which looks like
this:
User summary
root 0 605 12K 267m
mysql 103 8 144 81m
postgres 113 19 506 31m
man 6 37 87 2m
syslog 101 1 54 1m
...
The user-list mode prints the largest directories of one or more users:
uid 0 (root):
55m 8 /var/cache/apt/apt-file/
43m 35 /var/lib/apt/lists/
27m 6K /var/lib/dpkg/info/
25m 4 /var/cache/apt/
20m 118 /var/lib/gconf/defaults/
Only the source code is available for download. Use git to clone the
adu repository by executing
git clone git://git.tuebingen.mpg.de/adu
or grab the tarball of the current tree. If you prefer to download the
tarball of the latest release, select the corresponding snapshot link
on the adu gitweb page.
As adu is based on libosl, the object storage layer, you first have
to install libosl.
Adu’s command line parser and the interactive help are generated by
GNU gengetopt. Hence the gengetopt package must be installed to compile
adu from source.
To generate the man page, help2man must be installed.
adu is open source software, licensed under the GNU General Public
License, Version 2.
Email: André Noll, maan@tuebingen.mpg.de,
Homepage: http://people.tuebingen.mpg.de/maan/
Comments and bug reports are welcome. Please provide
enough info such as the version of adu/libosl you are using
and relevant parts of the logs. Including the string [adu]
in the subject line is also a good idea.
NAME
adu - advanced disk usage
SYNOPSIS
adu
[OPTIONS]...
DESCRIPTION
adu-1.0.0
adu creates a database containing disk usage statistics of a given
directory. It allows you to query that database to quickly retrieve
usage patterns of subdirectories and/or files owned by a given user id.
- -h, --help
-
Print help and exit
- --detailed-help
-
Print help, including all details and hidden
options, and exit
- -V, --version
-
Print version and exit
General options:
- -l, --loglevel=level
-
Set loglevel (0-6) (default=`4')
Log messages are always written to stderr while normal output
goes to stdout. Lower values mean more verbose logging.
Group: database
There are two ways to specify a database directory. You can either
specify a full path using the database-dir option or a root path
using the database-root option. In the latter case, a directory
structure matching that of the base-dir argument is created
below the given root path.
The advantage of using database-root is that the base-dir is
used to find the relevant database both in create and select mode,
so you do not have to set the database-dir explicitly.
- -d, --database-dir=path
-
directory containing the osl tables
Full path to the directory containing the osl tables. This
directory is created if it does not exist. It must be writable for the
user running adu in --create mode and readable in --select mode.
- -r, --database-root=path
-
directory containing directories containing the
osl tables (default=`/var/lib/adu')
Base path to the directory containing the osl tables. The real
database-dir is generated by appending base-dir. This
directory is created if it does not exist. When used in select
mode you have to specify the base-dir as well.
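The relationship between database-root and base-dir can be sketched in shell. This is only an illustration of "appending base-dir": the helper name adu_db_dir is hypothetical, and the way slashes are normalized is an assumption, not something adu documents.

```shell
# Hypothetical helper: derive the database directory from a database root
# and a base directory by appending the latter to the former.
# Assumption: a trailing slash on the root and the leading slash on the
# base dir are collapsed into a single separator.
adu_db_dir() {
    root=${1%/}      # drop one trailing slash from the root, if any
    base=${2#/}      # drop the leading slash from the base dir, if any
    printf '%s/%s\n' "$root" "$base"
}

# With the default database root, a database for /var would live here:
adu_db_dir /var/lib/adu /var
```

Under these assumptions the same --base-dir value selects the same database in both create and select mode.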
Modes:
Group: mode
adu may be started in one of three possible modes, each of
which corresponds to a different command line option. Exactly
one of these options must be given.
- -C, --create
-
Create a new database
Traverse the given directory and track disk usage on a
per-user basis. Results are stored in N + 1 osl tables where
N is the number of uids that own at least one regular file
in that directory.
- -I, --interactive
-
activate interactive mode
In this mode, adu reads commands from stdin. This makes it
possible to run different select queries without opening the
underlying osl database for each query (which is expensive).
In interactive mode, several subcommands are available, see
the end of this document.
- -S, --select
-
query a database previously created with
--create
This option prints statistics about matching subdirectories
to stdout, to an output file or pipes the output to a given
command, depending on the --output option. The output format
can be customized by specifying select options, see below.
Options for --create:
- -b, --base-dir=path
-
directory to traverse
The base directory to be traversed recursively. A warning
message is printed for each subdirectory that could not be
read because of insufficient permissions. These directories
will be ignored when computing statistics.
- -x, --one-file-system
-
do not dive into other file systems
(default=off)
Skip directories that are on different file systems from the
one that the argument being processed is on.
- --hash-table-bits=num
-
specify the size of the uid hash table
(default=`10')
Use a hash table of size 2^num for the uid entries. If more than
2^num different uids own at least one regular file under base-dir,
the command fails. Increase this value if you have more than 1024
users. Decreasing the value causes adu to use slightly less memory.
- -B, --bloom-filter-order=order
-
use bloom filters for hard link detection
(default=`23')
Allocate bloom filters of size 2^order bits. Each regular
file with hard link count greater than one is added to these
filters, which allows adu to detect hard links on a per-user basis.
Greater values reduce the probability of false positives but
require more memory.
Values less than 10 deactivate this feature so that no hard
links are being detected.
- -N, --num-bloom-filter-hash-functions=num
-
number of hash functions for the bloom filters
(default=`10')
Cause each entry which is added to the bloom filter to set
"num" bits of the bloom filter.
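The sizing rules for --hash-table-bits and --bloom-filter-order come down to powers of two. The sketch below only does that arithmetic; the function names are hypothetical, and the byte count is for a single filter of 2^order bits (how many filters adu allocates is not stated here).

```shell
# Smallest "num" such that a hash table of size 2^num can hold the given
# number of distinct uids (the command fails only if there are MORE than
# 2^num uids).
bits_for_users() {
    users=$1
    bits=0
    while [ $((1 << bits)) -lt "$users" ]; do
        bits=$((bits + 1))
    done
    echo "$bits"
}

# Memory needed for one bloom filter of 2^order bits, in bytes.
bloom_filter_bytes() {
    echo $(( (1 << $1) / 8 ))
}

bits_for_users 1024      # the default of 10 bits is just enough
bits_for_users 1025      # one more user requires 11 bits
bloom_filter_bytes 23    # default order: 1 MiB per filter
```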
Options for --select:
- -s, --select-options=<options>
-
Options for select mode
This option takes a string whose content is another set of
options as described below. Select options may be specified
either directly in select mode, in which case you have to use
quotes to prevent the select options from being interpreted
as adu options, or via the "set" command in interactive mode.
Select options:
- -h, --help
-
Print help and exit
- --detailed-help
-
Print help, including all details and hidden
options, and exit
- -V, --version
-
Print version and exit
- -u, --user=user_name
-
users to take into account
This option may be given multiple times in which case all given
user names are considered admissible. See also --uid below.
- -U, --uid=uid_spec
-
user id(s) to take into account
A uid specifier may be a single uid, a range of uids,
or a comma-separated list of single uids or ranges.
Example:
Only consider uid 42:
--uid 42
Only consider uids greater than or equal to 42:
--uid 42-
Only consider uids between 23 and 42, inclusively:
--uid 23-42
Consider uids 23-42, 666-777 and 88:
--uid 23-42,666-777,88
If neither a --user option nor a --uid option is given
(the default), all users are taken into account.
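To make the uid specifier syntax concrete, here is a minimal shell sketch that decides whether a uid is admitted by a specifier. The function uid_matches is hypothetical and is not part of adu; it only handles the forms shown above (single uid, closed range, range open at the upper end, and comma-separated lists of these).

```shell
# Return 0 if uid $1 is admitted by uid_spec $2, and 1 otherwise.
uid_matches() {
    uid=$1
    spec=$2
    old_ifs=$IFS
    IFS=,
    set -- $spec             # split the specifier at commas
    IFS=$old_ifs
    for part in "$@"; do
        case $part in
            *-*)
                lo=${part%%-*}               # text before the dash
                hi=${part#*-}                # text after the dash, may be empty
                [ -n "$lo" ] || continue     # forms like "-42" are not handled
                if [ "$uid" -ge "$lo" ] &&
                   { [ -z "$hi" ] || [ "$uid" -le "$hi" ]; }; then
                    return 0
                fi
                ;;
            *)
                if [ "$uid" -eq "$part" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

if uid_matches 700 23-42,666-777,88; then
    echo "uid 700 is admitted"
fi
```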
- -l, --limit=num
-
Limit output (default=`-1')
Only print num lines of output. If negative (the default),
print all lines. This option is honored in all select modes
except global_summary (which outputs only one single line).
- -p, --pattern=regex
-
only consider matching directories
Regular expression that must match the directory name for
the directory to be considered for the output of the query.
See regex(7) for details.
Depending on whether --print-base-dir is given, the absolute
directory name or only the part of the directory name below
the base directory is matched against "regex".
If this option is not given (the default) all directories
are taken into account.
If "regex" starts with '!', directories are matched against
the remaining part of "regex" and the sense of matching is
reversed.
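The effect of a leading '!' can be imitated with grep. This is a sketch only: the function dir_matches is hypothetical, and using extended regular expressions (grep -E) is an assumption, since the text only refers to regex(7).

```shell
# Return 0 if directory name $2 is selected by pattern $1.
# A leading '!' reverses the sense of matching, as described above.
dir_matches() {
    regex=$1
    dir=$2
    case $regex in
        '!'*)
            # Match against the remainder of the pattern and invert.
            if printf '%s\n' "$dir" | grep -Eq -- "${regex#?}"; then
                return 1
            else
                return 0
            fi
            ;;
        *)
            printf '%s\n' "$dir" | grep -Eq -- "$regex"
            ;;
    esac
}

# Directory names are relative to the base dir unless --print-base-dir
# is given, so for base dir /var the name of /var/cache/apt/ is cache/apt/.
for d in cache/apt/ lib/dpkg/info/; do
    if dir_matches '!^cache/' "$d"; then
        echo "kept:    $d"
    else
        echo "skipped: $d"
    fi
done
```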
- -H, --header=string
-
use a customized header for listings/summaries
This option can be used to print any string instead of the
default header line (which depends on the selected mode).
In user_list mode the header is a format string, so the uid
and the user name can be included in the header. See the
--format option for more details.
It is possible to set this to the empty string to suppress
the header completely. This is mostly useful to feed the
output to scripts.
- -T, --trailer=string
-
use a customized trailer for listings/summaries
(default=`')
This option can be used to print any string at the end of
the query output.
In user_list mode the trailer is a format string with the
same semantics as the header string.
- -m, --select-mode=<key>
-
How to print the results of the query
(possible values="user_summary",
"user_list", "global_summary",
"global_list" default=`user_summary')
user_summary: Print totals for each admissible uid.
user_list: Print a list for each admissible uid.
global_summary: Only print totals.
global_list: List of directories, regardless of owner.
- -s, --list-sort=<key>
-
how to sort the user list or the global list
(possible values="size", "file_count"
default=`size')
This option is ignored if select-mode is neither "user_list", nor
"global_list".
- -o, --output=path
-
file to write output to (default=`-')
This option is only useful in interactive mode. If stdin is redirected
from a script, and the script contains several queries one can use
this option to let each query write its output to a different file.
If the option is not given, or its argument is either "-" or the
empty string, stdout is assumed. The following conventions cause the
output to be written in a different way:
"path" may be prefixed with '>', which instructs adu to truncate
the output file to length zero. If "path" does not start with
'>' and "path" already exists, the query is aborted. Otherwise,
the file is created and truncated. The output file name ">" is
considered invalid.
If the first two characters of "path" are ">>", the output file
(given by removing the leading ">>" from "path") is opened in
append mode. It is no error if the output file does not exist. However,
as above the output file name ">>" is considered invalid.
If the first character of "path" is '|', a pipe is created and the
rest of "path" is executed with stdin redirected to the reading
end of the pipe while the query output is written to the writing end
of the pipe. Again, specifying only "|" is considered invalid and
causes an error.
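The conventions for the --output argument can be summarized as a small decision function. The sketch below is a reading of the rules above, not adu's implementation; the name classify_output and the result labels are made up for illustration.

```shell
# Classify an --output argument according to the conventions above.
classify_output() {
    case $1 in
        ''|-)          echo "stdout" ;;
        '>'|'>>'|'|')  echo "invalid" ;;              # prefix without a name
        '>>'*)         echo "append:${1#??}" ;;       # strip leading ">>"
        '>'*)          echo "truncate:${1#?}" ;;      # strip leading ">"
        '|'*)          echo "pipe:${1#?}" ;;          # rest is the command
        *)
            # Plain path: the query is aborted if the file already exists.
            if [ -e "$1" ]; then
                echo "abort"
            else
                echo "create:$1"
            fi
            ;;
    esac
}

classify_output '>>file-list.root'
classify_output '|sort -n'
classify_output '>'
```

Note that ">>" has to be tested before ">", because every path that starts with ">>" also starts with ">".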
See the manual page for examples.
- --user-summary-sort=col_spec
-
how to sort the user-summary (possible
values="name", "uid", "dir_count",
"file_count", "size" default=`size')
It is enough to specify the first letter of the column specifier,
e.g. "--user-summary-sort f" sorts by file count.
- --print-base-dir
-
whether to include the base-dir in the output
(default=off)
If this flag is given, all directories printed are prefixed
with the base directory. The default is to print paths relative
to the base dir.
- -f, --format=<format_string>
-
How to format the output
A string that specifies how the output of the select query is
going to be formatted. Depending on the chosen select-mode,
several conversion specifiers are available and a different
default value for this option applies.
adu knows four different types of directives: string, id,
count and size. These are explained in more detail below.
The general syntax for string and id directives is %(name:a:w)
where "name" is the name of the directive, "a" specifies
the alignment and "w" is the width specifier, which sets
the field width.
The alignment specifier is a single character: Either "l",
"r", or "c" may be given to specify left, right and
centered alignment respectively. The width specifier is a
positive integer. Both "a" and "w" are optional.
One string directive supported by adu is "dirname" which is
substituted by the name of the directory. It is available
if either user_list or global_list mode was selected via
--select-mode.
Examples:
Print dirname without any padding:
"%(dirname)"
Center dirname in a 20 chars wide field:
"%(dirname:c:20)"
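The effect of the alignment and width specifiers can be imitated with printf. This is a rough sketch, not adu's formatting code: the function align is hypothetical, and both the handling of odd padding when centering (the extra space goes to the right) and the choice not to truncate text longer than the field are assumptions.

```shell
# Pad text $1 to width $3 using alignment $2 ("l", "r" or "c").
align() {
    text=$1
    a=$2
    w=$3
    len=${#text}
    if [ "$len" -ge "$w" ]; then
        printf '%s\n' "$text"        # text fills the field: no padding
        return 0
    fi
    case $a in
        l) printf "%-${w}s\n" "$text" ;;
        r) printf "%${w}s\n" "$text" ;;
        c)
            left=$(( (w - len) / 2 ))
            # Right-align into a field of width left+len, then pad the
            # result on the right up to the full width.
            printf "%-${w}s\n" "$(printf "%$((left + len))s" "$text")"
            ;;
    esac
}

# Roughly what "%(dirname:c:20)" would do for the value /var/cache/apt/:
printf '[%s]\n' "$(align /var/cache/apt/ c 20)"
```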
The count and size directives are used for non-negative
numbers. The syntax for these is %(name:a:w:u). The "a" and
the "w" specifiers have the same meaning as for the string
and id directives. The additional "u" specifier selects a
unit in which the number that corresponds to the directive
should be formatted. All three specifiers are optional.
Possible units are the characters of the set " bkmgtBKMGT"
specifying bytes, kilobytes, megabytes, gigabytes and
terabytes respectively. The difference between the lower and
the upper case variants is that the lower case specifiers
select 1024-based units while the upper case specifiers use
1000 as the basis.
The whitespace character is like "b", but a space character
is printed instead of a unit.
Two more characters "h" and "H" (human-readable) are also
available that choose an appropriate unit depending on the
size of the number being printed.
An asterisk prepended to the unit specifier prevents the
unit from being printed. This is useful mainly for feeding
the output of adu to scripts that do not expect units.
In order to print a percent sign, use "%%". Moreover, adu
understands "\n" and "\t" and outputs a newline and a
tab character for these combinations respectively.
Examples:
Print size in gigabytes right-aligned:
"%(size:r::G)"
As before, but use 5 char wide field:
"%(size:r:5:G)"
As before, but suppress trailing "G":
"%(size:r:5:*G)"
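The difference between the lower case and upper case unit specifiers, and the effect of a leading asterisk, can be sketched with integer arithmetic. The function format_size is hypothetical; that the input is a byte count and that the result is truncated rather than rounded are assumptions. The "h" and "H" specifiers are not covered.

```shell
# Convert a byte count $1 according to unit specifier $2.
format_size() {
    bytes=$1
    spec=$2
    show_unit=1
    case $spec in
        '*'?) show_unit=0; spec=${spec#?} ;;   # leading '*' hides the unit
    esac
    case $spec in
        b|B|' ') div=1 ;;
        k) div=1024 ;;
        m) div=$((1024 * 1024)) ;;
        g) div=$((1024 * 1024 * 1024)) ;;
        t) div=$((1024 * 1024 * 1024 * 1024)) ;;
        K) div=1000 ;;
        M) div=$((1000 * 1000)) ;;
        G) div=$((1000 * 1000 * 1000)) ;;
        T) div=$((1000 * 1000 * 1000 * 1000)) ;;
        *) echo "unknown unit: $spec" >&2; return 1 ;;
    esac
    value=$((bytes / div))                     # integer division truncates
    if [ "$show_unit" -eq 1 ]; then
        printf '%s%s\n' "$value" "$spec"
    else
        printf '%s\n' "$value"
    fi
}

format_size 5368709120 g     # 1024-based gigabytes
format_size 5368709120 G     # 1000-based gigabytes
format_size 5368709120 '*G'  # same number, unit suppressed
```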
The following list contains all directives known to adu,
together with their types, and for which modes each of
them may be used.
pw_name (string): user name. Available for user_list,
user_summary and for the header in user_list mode.
uid (id): user id. Available for user_list,
user_summary and for the header in user_list mode.
files (count): number of files. Available for all
modes.
dirname (string): name of the directory. Available
for user_list and global_list.
size (size): total size/directory size. Available
for all modes.
dirs (count): number of directories. Available
for user_summary and global_summary.
Interactive commands:
- set
-
change the current configuration
- reset
-
reset configuration to defaults
- help
-
show list of commands and one-line descriptions
- run
-
start the query according to the current configuration
- source
-
read and execute interactive commands from a file
EXAMPLES
The following example creates a database containing the disk usage
patterns of the /var directory:
$ adu --create --database-dir /root/adu-var --base-dir /var
Here's a simple query that uses the newly created database to print
the user-summary:
$ adu --select --database-dir /root/adu-var
To print the one-line global summary instead, use
$ adu --select --database-dir /root/adu-var --select-options '--select-mode global_summary'
To sort the user summary by file count rather than by file size, run
$ adu --select --database-dir /root/adu-var --select-options '--user-summary-sort=file_count'
The command below prints the five largest directories of the users root and
mysql:
$ adu --select --database-dir /root/adu-var --select-options '--select-mode user_list --user root --user mysql --limit 5'
The same, using short options:
$ adu -Sd /root/adu-var -s '-m user_list -u root -u mysql -l 5'
Again the same, but omitting /var/cache:
$ adu -Sd /root/adu-var -s '-m user_list -u root -u mysql -l 5 -p !^cache/'
A simple script for interactive mode:
set -m user_list
set -u root
set -o file-list.root
run
reset
set -m user_list
set -u mysql
set -o file-list.mysql
run
Run adu in interactive mode with the above script (adu-script.txt):
$ adu -Id /root/adu-var < adu-script.txt
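For more than a handful of users, a script like the one above can be generated rather than written by hand. The following sketch prints such a script for any list of user names; the function gen_adu_script and the output file naming scheme file-list.<user> are illustrative choices taken from the example, not something adu provides.

```shell
# Print an adu interactive-mode script that writes one user_list query
# per user to a separate output file.
gen_adu_script() {
    first=1
    for user in "$@"; do
        # Reset the configuration between queries, as in the example.
        [ "$first" -eq 1 ] || echo "reset"
        first=0
        echo "set -m user_list"
        echo "set -u $user"
        echo "set -o file-list.$user"
        echo "run"
    done
}

gen_adu_script root mysql postgres
```

The generated text could be saved to a file such as adu-script.txt and fed to adu on stdin as shown above.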
SEE ALSO
du(1)