The KDirStat Handbook Stefan Hundhammer
sh@suse.de
1999-2005 Stefan Hundhammer &FDLNotice; 02/02/2002 0.1.01 &kdirstat; is a graphical disk usage utility, very much like the Unix "du" command, plus some cleanup facilities to reclaim disk space. KDE tdeutils utilities file system disk usage cleanup
Overview &kapp; is a graphical disk usage utility. It shows you where all your disk space has gone and tries to help clean it up. Screen Shot The &kapp; main window Main window screenshot Features Display Features Graphical and numeric display of used disk space Treemap display of used disk space Files kept apart from directories in separate <Files> items to prevent cluttering the display All numbers displayed in human-readable form - e.g., 34.4 MB instead of 36116381 Bytes Reasonable handling of sparse files - only blocks that are actually allocated are added up to the total sums. Reasonable handling of (regular) files with multiple hard links - their size is divided by their number of hard links, thus evenly distributing their size over the directories they are linked to -- and, more importantly, not adding the same file up several times. Different colors in the directory tree display to keep the different tree levels visually apart Display of latest change time within an entire directory tree - you can easily see what object was changed last and when. Directory Reading Stays on one file system by default - reads mounted file systems only on request. You don't care about a mounted /usr file system if the root file system is full and you need to find out why in a hurry, nor do you want to scan everybody's home directory on the NFS server when your local disk is full. Network transparency: Scan FTP or Samba directories - or whatever other protocols KDE supports. PacMan animation while directories are being read. OK, this is not exactly essential, but it's fun. Cleaning up Predefined cleanup actions: Easily delete a file or a directory tree, move it to the KDE trash bin, compress it to a .tar.bz2 archive or simply open a shell or a Konqueror window there. User-defined cleanup actions: Add your own cleanup commands or edit the existing ones. "Send mail to owner" report facility: Send a mail requesting the owner of a large directory tree to please clean up unused files.
Misc Feedback mail facility: Rate the program and tell the authors your opinion about it. More Screen Shots Configuring cleanup actions Configuring cleanup actions Configure cleanup actions window screenshot Configuring tree colors Configuring tree colors Configure tree colors window screenshot Feedback mail Feedback mail Feedback mail window screenshot Basic Usage Invoke &kdirstat; Start &kdirstat; from the TDE menu, right-click a directory in a Konqueror window or type kdirstat or kdirstat <directory-name> in a shell window or at KDE's Run command prompt (Alt-F2). Select a Directory &kdirstat; will prompt for a directory if you didn't specify one when starting it. You can specify local directories as well as URLs of remote locations - kdirstat /usr/lib works as well as kdirstat ftp:/ftp.myserver.org/pub. In any case, &kdirstat; will start reading that directory. That might take a while, but you can work with the program during all that time. Find out what Uses up all the Disk Space Look at the "Subtree Total" column or wait until a subtree is finished reading and look at the graphical percentage bar display to find out which directory subtree takes up how much disk space. Use the open / close icons (plus and minus signs or small arrows, depending on how you set up your KDE) or double-click an item to open or close it. Notice how files at any directory level are kept apart from subdirectories - there is a separate <Files> entry for them. This way, you can easily tell how much disk space the files are using up in relation to the subdirectories and their respective files. Do Something about it Once you have found out where all your disk space goes, do something about it - this is probably why you are using this program in the first place. You have several options: Go to a computer hardware store and buy a new hard disk. This is probably not what you want. ;-) Tell the owner of that file or directory to please clean up.
You can use &kdirstat; for that: Mark that file or directory (i.e., left-click on it) and select Send Mail to Owner from the context menu (right-click), from the tool bar (the envelope icon) or from the Report menu. A precomposed message will open in your favourite mail client. You can edit that text before actually sending it. The recipient of the mail is the user who owns the file or directory you marked - but of course you can edit that, too, in the mail client. The mail will contain those items that are currently displayed open from the marked item on. If you want to include more items, open the respective directories; if you want fewer, close them. Of course you can always delete lines in the mail client if you find them irrelevant - there is no use complaining about some 367 byte files along with others that take several megabytes. Invoke a "cleanup" action. There are several predefined ones, and you can define your own. Use the context menu, the tool bar or the Clean Up menu to find out which are available. For some cleanup actions you will have to wait until the directory tree is completely read before you can activate them. If a cleanup action doesn't get enabled even then, the type of item you selected is inappropriate for that kind of action: Some actions can only be performed on directories, while others can only be performed on files. Only very few actions work for <Files> pseudo entries since they don't have a real counterpart in the file system. Treemaps Quick Introduction to Treemaps What is it? The shaded rectangles you can see in the lower half of the &kdirstat; main window are called a "treemap". This is just another way of displaying items in a tree that each have a numerical value, such as a file size. Each rectangle corresponds to a file or directory on your hard disk. The larger the rectangle (or, rather, the larger its area), the larger the file. How to Use Treemaps Look at the largest rectangles.
Click on one, and it is selected - both in the treemap view and above in the tree view (the list above). You can now see what file or directory that is - both in the tree view above and in the status line below. Find the largest rectangles, identify them, and decide what to do with them: Keep them, delete them, whatever. Use the cleanup actions like in the tree view. The right mouse button opens a context menu that contains cleanup actions. The shading gives hints about which files belong together in directories. The bright spots indicate roughly where the center of each parent directory is. Pros and Cons of Treemaps Treemaps are good for finding single large files, possibly very deeply nested in the directory hierarchy. They don't help very much if lots of small files clutter up a directory - use the tree view (the list) above the treemap for that. The treemap view by itself doesn't give away very much information other than relative file sizes. It can tell you where large files are, even if they are very deeply hidden in subdirectories. You always see all files at once, not only the relative sizes of subdirectories against each other like in the tree view. Click on a file to see more details in the tree view above. Bottom line: Both the treemap and the tree view have their strengths and weaknesses. Use a combination of both to make best use of either's benefits. How to Get Rid of it If you need the screen space for the tree view (the list) or if you find it takes too long to update the tree view each time you delete a file, you can drag the splitter between the tree view and the treemap all the way down. The treemap doesn't get rebuilt below a certain minimum size, thus it doesn't eat performance any more. Alternatively, uncheck "show treemap" in the "treemap" menu or simply hit F9. Treemap Related Actions Mouse Actions in the Treemap A single click with the left mouse button selects the clicked file or directory both in the treemap and in the tree view.
A single click with the middle mouse button selects the parent of the clicked file or directory. A single click with the right mouse button opens the context menu. A double click with the left mouse button zooms the treemap in at the clicked file or directory: The treemap is redisplayed with the near-topmost ancestor of the clicked file or directory as the root. A double click with the middle mouse button zooms out after zooming in. If the treemap is not zoomed in at all, it is simply rebuilt to fit into the available screen space without the need for scrollbars. This is mainly useful if automatic treemap resizing (the default) is switched off. Dragging the splitter above the treemap not only resizes the treemap subwindow, it also rebuilds the treemap and makes it fit into the available space. You can drag the splitter all the way down to deactivate the treemap altogether. Below a minimum size the treemap will not be updated any more, so it doesn't cost any performance. Treemap Menu Actions Most treemap mouse actions have counterparts in the "treemap" menu. In addition to that, "show treemap" in the "treemap" menu toggles display of the treemap subwindow. If disabled, the treemap is really inactive and doesn't cost any performance. More Information about Treemaps How a Simple Treemap is Constructed In its most basic form, construction of a treemap is very easy: First, you need a tree where each node has an associated value. Directory trees with their accumulated file sizes are a very natural example. However, the tree needs to be complete with all accumulated values before anything can be done - that's why &kdirstat; doesn't display a treemap while directories are being read. Decide upon a direction in which to split the available area initially. Since normally the treemap subwindow is wider than it is high, we first split horizontally.
Split the area so each toplevel directory gets an area proportional to its accumulated size (i.e., its own size plus the size of all its children, grandchildren etc.). For each rectangle thus constructed, repeat the process for each directory level, but change direction for each level. For example, the second level will be split vertically, the third again horizontally etc. This basic algorithm as well as the idea of treemaps itself was introduced by Ben Shneiderman quite some years ago. Squarified Treemaps One major drawback of the simple treemap algorithm is that it usually results in lots of very thin, elongated rectangles that are hard to point at with the mouse and hard to compare visually against each other. This is why &kdirstat; uses "squarified" treemaps as described by Mark Bruls, Kees Huizing, and Jarke J. van Wijk of the Technical University of Eindhoven in the Netherlands. The basic idea is to improve the aspect ratio of the resulting rectangles, thus to make them more "square-like". Even though this doesn't always work out perfectly, it usually improves things a lot: There are normally very few (if any) thin, elongated rectangles in such a squarified treemap. The Shading: Cushioned Treemaps Squarifying a treemap comes at a cost: It makes the structure of the underlying tree even less obvious for the user. Where simple treemaps change direction for each level of subdivision, squarified treemaps change direction within each level. The result is clusters of more or less square-like rectangles. The only hint about the tree structure that is given is that larger rectangles are near the left and at the top of each level. Thus, &kdirstat; uses a technique described by Jarke J. van Wijk and Huub van de Wetering of the TU Eindhoven, NL: "Cushioned" treemaps. This is the 3D-like shading you can see in &kdirstat;'s treemaps: It gives each rectangle within the treemap (each "tile") a cushion-like impression.
This is not just for pretty looks; its main purpose is to group files together visually. &kdirstat;'s own Treemap Improvements The squarification algorithm requires items to be sorted by size. A Linux/Unix directory tree, however, usually has lots of items; a full-blown Linux installation can easily consist of 150,000+ (!) files and directories. The best sort algorithms (heap sort, quick sort) still have a cost in the order of n*ln(n), i.e. they are proportional to the product of the number of items times their logarithm. Likewise, the cushion shading algorithm requires relatively expensive floating-point arithmetic for each individual pixel of each treemap tile (even though, by the way, it is very efficient for a 3D-shading algorithm - no expensive sine/cosine etc. calculations required). On the other hand, most items in large directory trees are so tiny they cannot be seen at all. &kdirstat; simply omits everything that will result in treemap tiles less than a predefined (3*3 pixels) size - they are pretty useless for the purposes of &kdirstat;'s users anyway. Those tiny thingies may end up in some featureless grey space in the treemap display. So don't be surprised if you click on some grey pixels and &kdirstat; insists they belong to a rather high-level directory: &kdirstat; simply means to tell you that those pixels correspond to some small stuff in that directory. Use the tree view (the list) above the treemap for more detailed information. Credits and Further Reading about Treemaps SequoiaView gave the inspiration for treemaps within &kdirstat;. SequoiaView is a MS Windows (that's the bad part) program created at the TU Eindhoven, NL. It introduced cushion treemaps and later squarified cushion treemaps. Its purpose is very close to &kdirstat;'s. If you are looking for a &kdirstat;-like program on that "other" ;-) platform, go for SequoiaView: http://www.win.tue.nl/sequoiaview .
Needless to say, &kdirstat; users should easily be able to simply mount their MS Windows partitions and use &kdirstat; to clean up those as well. The only acceptable excuse ;-) for not doing this might be NTFS partitions (no reliable write access from Linux to those yet) or single-OS MS Windows machines. Ben Shneiderman invented treemaps - a truly intuitive way of visualizing numerical contents of a tree. For more information, see http://www.cs.umd.edu/hcil/treemaps/ . Jarke J. van Wijk and Huub van de Wetering from the TU Eindhoven, NL wrote a paper called "Cushion Treemaps: Visualization of Hierarchical Information". It is available in PDF format at http://www.win.tue.nl/~vanwijk/ . Mark Bruls, Kees Huizing and Jarke J. van Wijk from the TU Eindhoven wrote a paper called "Squarified Treemaps". It is also available in PDF format at http://www.win.tue.nl/~vanwijk/ . Alexander Rawass had written a previous implementation of treemaps for &kdirstat;. Even though that part has been completely replaced for various reasons (performance, integration into the &kdirstat; main application, memory consumption, stability, user interface conformance, lack of maintenance), it had proven that treemaps are a useful addition for a program like &kdirstat;. Frederic Vernier and Laurence Nigay from the University of Grenoble, France wrote a paper called "Modifiable Treemaps Containing Variable-Shaped Units" (URL unknown, sorry). They also wrote a MS Windows program called "parent" that uses a mixture of treemaps and file lists within individual treemap tiles. Personally, I don't like that approach very much - I find that display very cluttered and confusing (that is why I didn't adopt anything like that for &kdirstat;). But this is just my personal opinion that others may or may not share. Predefined Cleanup Actions &kdirstat; comes with a number of predefined cleanup actions. You can configure them all to your personal preference, and you can add your own.
Here is what the predefined cleanup actions do: Open in Konqueror This opens the selected item in a Konqueror window. You can use Konqueror to delete it (but then, you can also do that more easily from within &kdirstat;), you can move it to another place or you can examine it more closely. If the selected item is a known MIME type, this will open the appropriate application. For example, if you invoke Open in Konqueror on a PNG image, Konqueror will immediately start an image viewer and display that image. This is the Swiss army knife of cleanup actions: You can do a lot of different things with it. Thus, &kdirstat; cannot know if and when it makes sense to re-read that directory - you will have to do that manually: Select Refresh selected from the context menu (right-click) or from the File menu. Open in Terminal This opens a terminal window in the directory of the selected item. Use this to issue a few shell commands in that directory, then simply close that shell window. You can easily open a new one in a different directory if you need one, so there is no need to bother repeatedly typing cd with lengthy paths - simply close that shell and open a new one at the new location (type Ctrl-T). As with the Open in Konqueror cleanup action described above, you must manually re-read the directory's contents if you make changes. Otherwise they will not be reflected in &kdirstat;'s display. Compress This compresses the selected directory to a .tar.bz2 archive. For example, a subdirectory /work/home/kilroy/loveletters will become a compressed archive /work/home/kilroy/loveletters.tar.bz2. The directory is removed once the compressed archive is successfully created (but of course not if that failed). Any existing archive of the same name will silently be overwritten. Remember that Konqueror and related utilities can use that kind of archive transparently; there is no need to unpack it if you want to read a file in that archive. Simply click into the archive in Konqueror.
If you prefer .tgz archives to .tar.bz2, change the command line in the cleanup settings. With .tar.bz2, this is cd ..; tar cjvf %n.tar.bz2 %n && rm -rf %n For .tgz archives, change this to cd ..; tar czvf %n.tgz %n && rm -rf %n But you might think twice before doing that: "bzip2" (for .tar.bz2) compresses much more efficiently than ordinary "gzip" (for .tgz or .tar.gz), and most systems support it just as well. It's just a matter of getting used to typing tar cjvf rather than tar czvf for creating an archive or tar xjvf rather than tar xzvf for unpacking it. make clean This issues a make clean command in the selected directory. This is useful if you build software from source frequently. After successfully installing the software (make install), there is no need to keep the built binaries around in the source directory any longer. On the other hand, people frequently forget to clean up those directories, so you can do that from within &kdirstat; with a few mouse clicks. Delete Trash Files This deletes files that are usually superfluous, such as editor backup files or core dumps, in the selected directory and in any of its subdirectories. By default, the following types of files are deleted: Object files left over from compiling software: *.o Editor backup files: *~ *.bak *.auto Core dump files: core Of course, you can configure this to suit your personal preferences. Delete (to Trash Bin) This invokes the KDE standard "delete" operation, i.e. the selected file or directory is moved to the KDE trash bin. Even though this doesn't help to reclaim disk space right away, it is a safe method of deleting. Use this action for anything you want to get rid of and then review your actions by looking into the KDE trash bin. If you are really sure, select Empty Trash Bin there. Until that point, you can always move those items back to where they came from.
You might consider not using this cleanup action if you are cleaning up a directory on a different file system than your home directory: In that case that "moving to trash" involves copying the items (and then deleting them at the original location) which might take a while. Notice that when moving an item to trash is not successful, &kdirstat; will still falsely display that item as deleted even though it's still there. Use Refresh selected from the context menu to update the display manually. Read here why. Delete (no way to undelete!) This is a real delete, not simply moving something to the trash bin. It's quicker, and disk space is reclaimed immediately, but there is no way to recover if you made a mistake. You will be prompted for confirmation when you invoke this. You can change the configuration to not prompt for confirmation, but don't blame me if anything goes wrong after you did that. As with Delete (to Trash Bin), you will need to manually re-read a directory if this went wrong (usually due to insufficient permissions) - &kdirstat;'s display is out of sync with the hard disk if that happens. Read here why. Configuring Cleanup Actions Select Configure &kdirstat;... from the Settings menu and switch to the Cleanups page: Configuring cleanup actions Configure cleanup actions window screenshot Reference Select the cleanup action you want to configure from the list at the left side. You might need to check Enabled before you can make any changes. Enter a title in the Title field. You should mark one of the characters in the title with an ampersand ('&') to provide a keyboard shortcut in the menus. Enter a shell command in the Command line field. The command will be invoked with /bin/sh, so you can use everything the default shell provides - including pipelines, logical 'and' or 'or' ('&&' or '||', respectively) or multiple commands separated by semicolons. 
Use '%p' for the full path (or URL) of the currently selected file or directory or '%n' for the name without path. '%t' will be expanded to the full path name of the KDE trash directory (usually ${HOME}/Desktop/Trash, but since this tends to change between different KDE versions it is safer to use '%t'). &kdirstat; will always change directory to the selected item, so there is no need to manually add a cd command to the command line. Commands are started in the background if possible, so don't add an extra ampersand '&'. Check Recurse into subdirectories if the command should be called for each subdirectory of the selected directory. Whether or not this is useful depends on the kind of command you entered: A make clean command usually takes care of that internally, while it's a lot easier to use rm -f *.bak and let &kdirstat; handle subdirectories rather than using a more complex find ... | xargs ... command. Check Ask for confirmation if you want to prompt the user for confirmation every time he invokes that cleanup action (but not for each recursive subdirectory!). But beware: Having to confirm a lot of such prompts tends to make users inattentive. They begin to blindly confirm everything out of habit. Thus, use confirmations only when really necessary. Check the category of objects this cleanup action works for. Not all commands make sense for both files and directories. <Files> pseudo entries are a very special case: They don't have a real counterpart on the hard disk. You can safely check the <Files> category for actions that require changing directory to somewhere and then executing a command there, but there is no use trying to delete such a <Files> entry. Choose between On local machine only ('file:/' protocol) and Network transparent (ftp, smb, tar, ...). Most commands run locally only. There are only a few exceptions: For example, you can open a remote location in many TDE applications, e.g., Konqueror.
Select a Refresh Policy to tell &kdirstat; how to update its display after the cleanup action has been invoked: No refresh: Don't refresh the display. Either the cleanup action doesn't change the directory tree anyway or it is unknown when or how - or you don't care. Cleanup actions with this refresh policy are the only ones that can be invoked while the respective directory subtree is being read. All others can only be started once reading is finished. Refresh this entry: Refresh the directory branch the cleanup action was selected with. This is the most useful refresh policy for cleanup actions that delete a number of items from a directory tree, for example Delete Trash Files. If this refresh policy is selected, the command is not started in the background: &kdirstat; has to wait for it to finish so the directory display can be refreshed at the proper time. Refresh this entry's parent: Similar to Refresh this entry, but one level higher up. Useful for cleanup actions that delete the selected item but create a new one on the same level, for example the Compress standard cleanup: The original directory is deleted, but a .tar.bz2 file is created instead. If this refresh policy is selected, the command is not started in the background: &kdirstat; has to wait for it to finish so the directory display can be refreshed at the proper time. Assume entry has been deleted: Don't really re-read anything from disk, but assume the cleanup action deletes the selected item, thus simply remove that item from the directory tree's representation in &kdirstat;'s internal data structures. This is much faster than any real refresh, but it might cause the internal data structures to get out of sync with the hard disk if the cleanup action fails and doesn't really delete the selected item. In this case, the user will have to manually re-read that directory branch.
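The placeholder expansion described in this reference can be pictured with a small Python sketch. This is only an illustration, not &kdirstat;'s actual code: the function name expand_cleanup_command is hypothetical, and quoting the expanded values for the shell is an assumption made here for safety, not something the handbook states about &kdirstat; itself.

```python
import os
import re
import shlex


def expand_cleanup_command(template, path, trash_dir):
    """Expand %p, %n and %t in a cleanup command line (hypothetical helper)."""
    # %n is the item name without its path; fall back to the path itself for "/".
    name = os.path.basename(path.rstrip("/")) or path
    values = {
        "%p": shlex.quote(path),       # full path (or URL) of the selected item
        "%n": shlex.quote(name),       # name without path
        "%t": shlex.quote(trash_dir),  # trash directory (assumed to be passed in)
    }
    # Single pass, so text inserted for one placeholder is never expanded again.
    return re.sub(r"%[pnt]", lambda match: values[match.group(0)], template)


if __name__ == "__main__":
    compress = "cd ..; tar cjvf %n.tar.bz2 %n && rm -rf %n"
    print(expand_cleanup_command(compress, "/usr/lib/something", "/tmp/Trash"))
```

The resulting string is what would then be handed to /bin/sh. With the quoting assumed here, names containing spaces stay a single shell word.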
Examples Open in Emacs This is a trivial example that shows you how to add a new cleanup action that opens a file in Emacs (or any other editor of your choice). Select one of the unused user-defined cleanup actions from the list. Make sure Enabled is checked. Enter Open in &Emacs in the Title field. Notice the '&': This marks the letter 'E' as this cleanup action's keyboard shortcut. Enter emacs %p in the Command line field. Leave both Recurse into subdirectories and Ask for confirmation unchecked. Make sure only Files is checked in the Works for... section and Directories and <Files> pseudo entries are unchecked. If you like Emacs' "dired" mode very much, you can also leave Directories checked, but it definitely doesn't make any sense trying to open an editor with a <Files> entry. Leave On local machine only selected. If you feel like experimenting a lot, you can try setting up Emacs so it fetches files from remote locations, but even then most likely only the 'ftp' protocol will work. Leave the Refresh Policy at No refresh. This ensures Emacs is started in the background and you can continue working with &kdirstat; while Emacs runs. It wouldn't make too much sense to change the command line to emacs %p & and change the refresh policy to, say, Refresh this entry: The refresh would take place immediately after Emacs starts, and this is probably not what you want. Compress This example explains the predefined Compress cleanup action in more detail. Remember, this cleanup action makes a compressed .tar.bz2 archive from a directory. The Command line for this cleanup action is: cd ..; tar cjvf %n.tar.bz2 %n && rm -rf %n cd .. changes directory one level up. We don't want to do something in the selected directory, but one level higher. The semicolon ; tells the shell to execute one more command - unconditionally, no matter if the previous command succeeded or failed. tar cjvf %n.tar.bz2 %n && is where the compressed .tar.bz2 archive is created. 
"c" is the tar command for "create", "j" means "use bzip2 compression", "v" is "verbose" (even though this is strictly spoken unnecessary here), "f" means use the next argument as the target file name rather than some default tape device (which nowadays nobody uses any more anyway). "%n.tar.bz2" will be expanded to the name of the selected directory without path plus "tar.bz2", "%n" will be expanded to the name without anything else. For a directory /usr/lib/something all this will result in a command tar cjvf something.tar.bz2 something && makes the shell execute the rest only if the previous command (the tar command) is executed successfully. This is used here to make sure the directory only is deleted if we really have a .tar.bz2 archive with the same contents so we can easily restore it when necessary. This is crucial in case there insufficient disk space to create the archive or should we have insufficient permissions to create the archive. rm -rf %n recursively deletes the directory without asking or complaining. Works for... is enabled for directories only. Note it wouldn't be a good idea to enable it for <Files> entries, too: The user would rightfully expect the .tar.bz2 archive to contain the contents of the <Files> entry only, i.e. only the files on that directory level. The command would, however, pack the entire directory tree from the parent level on into the .tar.bz2 file. The Refresh Policy is set to Refresh this entry's parent since not only the selected item is changed, but its parent also: It loses one child (the directory) but gets another one (the .tar.bz2 archive). Please note that Recurse into subdirectories is not checked here: the tar command and rm -rf take care of any subdirectories. Feedback Mail Description Send Feedback Mail... 
from the Help menu opens this dialog: Feedback mail Feedback mail window screenshot You answer the questions (at least those marked as required) and add your personal comments (in English, if you can, or in the special case of &kdirstat; alternatively in German). Upon clicking on the Mail this... button, your mail client opens with a precomposed mail. You can review that mail to make sure it doesn't contain anything you don't like. When you are convinced the mail is okay, send it. With your opinion and your personal comments, you can make a contribution to the Open Source movement - even if you are not a developer, even if you don't have a clue how to improve or change the software. Your opinion is important, even if you decide you don't like the program and send the mail off with "this program is crap" checked. Open Source software lives and breathes with user feedback. If you miss a feature, tell us about it. If you consider an existing feature confusing, tell us about it. If you find an application overloaded with features so you can't find the ones you really need, tell us about it. On the other hand, if you like the program the way it is and you wouldn't like to see it changed in any way, tell us about that, too. If you simply want to thank those who go through the trouble of writing all that software, do it. Your input is appreciated, be it positive or negative. There is nothing more frustrating for an Open Source software author than that lingering uncertainty whether there is anybody out there who actually uses his program. He may not get any response from users - does that mean nobody uses the software, or does it mean it simply runs so well nobody has reason to complain? You can contribute by telling him he is doing all right and he should keep up the good work. In the opposite case, why not tell the author of a particularly annoying program just how annoying it is?
This may shake him sober and make him reconsider his work. Open Source is one of the world's greatest democracies. Make your vote! Privacy Your mail sent with Send Feedback Mail... is sent to the authors of this program, to nobody else. No company or government institution will get your mail address or any personal data. You might have noticed no personal data are requested in the feedback form. In particular, you will never receive spam e-mail of any kind because of sending feedback mail. We, the authors of this program, loathe spam probably even more than the average KDE user since we get so much of it - spam robots tend to extract e-mail addresses from source code and from web pages, so you can be sure we do our best to make life as hard as we can on the spammers and certainly not help them in any way. The purpose of all this feedback mail is to gather information about average user satisfaction, about average opinions about an application's feature set, an application's stability and learning curve. It's all about averages, thus no specific user's data will ever be made available to the public - only statistical averages over a large number of users, if at all. Feedback Mail Example A typical feedback mail looks about like that: [kde-feedback] KDirStat-2.4.4 user feedback <comment> This is where the personal comments go. You may enter virtually any number of lines. </comment> general_opinion="5/8_nice_try" features_liked="stay_on_one_filesys" features_liked="feedback" features_liked="pacman" stability="5/5_keeps_crashing" learning_curve="5/5_still_no_clue" recommend="yes" Notice it's all plain ASCII. There is no attachment, no hidden header fields, no information about your machine or yourself - only what you would send to anybody else when you send a mail. By the way, this is also why we kept the format that simple. 
Many developers today prefer XML for all kinds of data, but the end user (you) should be able to read and understand what you send - just so you can make sure you don't send any information you'd rather keep to yourself.

Developer's Guide to KDirStat

Most of what you can see of &kdirstat; is one separate KDE widget that can be used in other applications, too. Those parts of &kdirstat; are licensed under the LGPL, so you are even allowed to use them in commercial applications. The &kdirstat; sources are extensively documented. Read the documentation in the header files for more details or use "kdoc" to generate HTML documentation from them.

Questions and Answers

&reporting.bugs;

Can I use &kdirstat; to sum up a directory on an FTP server?

Yes. Simply specify the URL at the command line or in &kdirstat;'s directory selection box: kdirstat ftp:/ftp.myserver.org/pub (command line) or ftp:/ftp.myserver.org/pub (directory selection box). &kdirstat; supports all protocols that KDE supports. You can even use the "tar" protocol (whether it makes any sense to do that is for you to decide). The only restriction is that the protocol needs to support the "list directory" service - which not all protocols do. If you are unsure about the syntax to use, try it in Konqueror first and look at Konqueror's URL line. For example, to figure out how to specify a "tar" URL, click into a "tar" archive in Konqueror and look at the resulting URL to get an idea of what it looks like.

How do I get the exact byte size of an entry rather than Megabytes or Kilobytes?

Right-click the number in the list.

Why does the "du" command sometimes report different sizes than &kdirstat;?

There are different kinds of sizes reported by different kinds of system calls or KDE services: the byte size and the block size. The byte size is the exact number of bytes of a file or directory. This is what &kdirstat; uses. The block size is the number of disk blocks allocated by a file or directory.
Most "du" commands use that. Depending on the type of file system, parts of the last block of a file or directory may be unused, yet reserved for it anyway. If such a file system uses 1024 byte blocks, a file will need at least those 1024 bytes, no matter whether it is 1024, 200 or even just one byte large. That depends on the file system type and sometimes even on how it is set up - i.e., this is highly system specific. &kdirstat; uses the byte size since this is the only size that is reliably returned by all kinds of system calls and KDE services alike. It only really makes a difference in very pathological situations anyway, for example if you have subdirectories with a large number of tiny files.

What does this display mean: 6.3 MB (allocated: 1.3 MB)

This is a so-called "sparse file" (also known as a "file with holes"). This means that the file really is 6.3 MB large, but only 1.3 MB of that are actually allocated - the rest are just zeroes. This is typical for core dumps (memory images of crashed programs written to a file named core or core.*) or binary database files: The kernel writes those files in such a way that only real data content is allocated on disk and not the large amount of zeroes.

Technically, a sparse file is created by opening the file for writing with the regular open() system call, then using lseek() to move beyond the current end of the file and writing at least one byte there. The area between the old and the new file size becomes a "hole" in the file - it is not actually allocated on the disk. Upon reading this area, a value of zero is returned for each byte read. When bytes are written to that area, file system blocks are actually allocated, possibly creating two smaller holes before and after the area written to.

Please note that most file utilities do not deal gracefully with sparse files. Those that support them at all normally need special command line arguments.
Otherwise they tend to simply read all bytes (including all the zeroes from the holes) and write them to a new location - which of course means that the resulting file is no longer sparse, but really occupies all the space its size indicates. This may mean that you can blow up the above 6.3 MB core dump file from 1.3 MB disk usage (and 5 MB zeroes in holes) to really 6.3 MB disk usage. File utilities like GNU tar and rsync at least support command line options to prevent that. GNU cp is a notable exception - it has a heuristic that seems to work very well. GUI driven file managers on the other hand tend to simply ignore this - even the most modern and cool looking ones.

If in doubt, check your favourite file tools. Produce a core dump - core dumps are normally sparse files. The more memory a program uses, the more likely it is to have large sections of zeroes in its memory image. Try this (in a shell):

Enable core dumps - they are disabled by default in most Linux distributions:

    ulimit -c 128000

This sets the limit of core dump files to 128000 blocks (512 bytes each), i.e., to roughly 64 MB. This should be sufficient.

Start a program with considerable memory consumption - in the background:

    xmms &

Make the program dump core:

    kill -ABRT %xmms

This sends the ABORT signal to this process, terminating it with a core dump.

Look at the core dump:

    kdirstat .

or, for a neutral third-party program (from the coreutils package):

    /usr/bin/stat core*

(You need to multiply the "blocks" output by 512 to find out the allocated disk space.)

Copy that core dump (e.g. to another directory) and look at it again. You will be surprised how "heavy" all those zeroes have suddenly become. Try that with several copy utilities (/bin/cp, file managers of your choice). Remember to always use the sparse original, not any blown-up copies!
Moving files should always be safe (unless a file manager is really, really stupid), but copying can easily blow up sparse files to huge assemblies of meaningless zeroes.

Agreed, sparse files are rather uncommon these days, so this is usually not a problem. Just remember &kdirstat; knows how to deal with them. ;-)

Please note that this special handling is only in effect if &kdirstat;'s optimized read methods for local files are used (you can turn this on and off in the Settings -> General dialog) - KDE's KIO methods do not return this kind of information.

What does this display mean: 878.5 KB / 21 Links

This means that this file has a number of hard links. &kdirstat; uses only the respective portion of its size for its statistics - in the above case, 878.5 KB / 21 = 41.8 KB. When another link to this file is processed, the next 878.5/21 KB are added to the total - and so on. The rationale is that it makes no sense to count such a file 21 times with its full size - this would greatly distort the statistics.

For example, look at /usr/lib/locale on a (SuSE) Linux system - many files there have multiple hard links to save disk space. The total sum of that directory on a SuSE Linux 9.2-i386 system is 40.5 MB -- as opposed to the 205.6 MB that the added-up output of /bin/ls -lR would suggest (or &kdirstat; with "use optimized local read methods" turned off in the Settings -> General dialog) - sometimes, as in this example, this really makes a difference!

Technical background: In Unix/Linux file systems, files primarily have a numeric ID, their "i-number", the index of the corresponding "i-node", which holds the file system's administrative information about the file. Each directory entry of a file really is no more than a link to that i-node. You can have the very same file under several distinct names this way - even in different directories. The only limitation is that this is restricted to one file system (i.e. to one disk partition) because those i-numbers are unique only per file system.
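The i-number mechanism and the "size divided by number of links" rule described above can be tried out with a few lines of Python. This is only a minimal sketch and not &kdirstat;'s actual code; the helper per_link_share() and the file names data.bin and alias.bin are made up for illustration, and a Unix-type file system that supports hard links is assumed.

```python
import os
import tempfile


def per_link_share(path):
    """Byte size of a file divided by its hard link count (hypothetical helper)."""
    st = os.lstat(path)
    return st.st_size / st.st_nlink


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "data.bin")   # made-up file name
    alias = os.path.join(tmp, "alias.bin")     # made-up file name

    with open(original, "wb") as f:
        f.write(b"x" * 4200)

    # Create a second directory entry that points to the same i-node.
    os.link(original, alias)

    st_a, st_b = os.stat(original), os.stat(alias)

    # Both names refer to the same i-number on the same file system.
    same_inode = (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
    link_count = st_a.st_nlink

    # Each link contributes only its share, so the file is counted once in total.
    shares = [per_link_share(original), per_link_share(alias)]
    total = sum(shares)

print(same_inode, link_count, shares, total)
```

Adding up the full size for every name would count the 4200 bytes twice; adding up the shares counts them once, no matter which directories the links live in.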
Hard links can also introduce a whole new dimension of problems with applications that create backup copies of working files - they usually just rename the original file to a backup name and write their content to a new file. Editors usually work that way. This however means that any additional hard links to that file now point to the outdated backup copy - which is normally not what is desired. Only very few applications handle this reasonably. So the bottom line is: Use hard links only if you know very well what you are doing.

All this is probably why symbolic links have become so much more popular in recent years: They can also point to different file systems, even (via NFS) to different hosts in the network. On the downside, symlinks can also be stale - pointing into nothingness. This cannot happen with hard links: A file is only really deleted when the last of its links is deleted (this includes open i-nodes in memory - i.e., processes still having an open file handle to that i-node).

Directories rely completely on hard links (this is also why &kdirstat; does not attempt anything smart with multiple-hard-link directories - it would make no sense): The ".." entry in each directory is nothing but another hard link to its parent, and "." is nothing but a hard link to the directory itself. This is also why even a completely empty directory has a link count of 2 - one for "." in its own directory, one for its name in its parent directory.

Like sparse files above, regular files with multiple hard links are pretty uncommon these days - but they are still used, and sometimes they can make a difference, and this is why &kdirstat; has special handling for them.

Please note that this special handling is only in effect if &kdirstat;'s optimized read methods for local files are used (you can turn this on and off in the Settings -> General dialog) - KDE's KIO methods do not return this kind of information.
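For the sparse file case mentioned above, the difference between byte size and allocated size can be observed in the same way. The following Python sketch is an illustration under assumptions, not &kdirstat;'s implementation: it creates a file with a hole by seeking past the end and writing one byte, then compares the byte size with the allocated size. The name sparse.bin and the 5 MB hole size are made up, st_blocks is assumed to count 512 byte units (as on Linux), and whether a hole is really left unallocated depends on the file system.

```python
import os
import tempfile

HOLE = 5 * 1024 * 1024  # made-up hole size: 5 MB of zeroes that are never written


def sizes(path, block_unit=512):
    """Return (byte size, allocated size) of a file; 512 byte units are an assumption."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * block_unit


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sparse.bin")  # made-up file name

    with open(path, "wb") as f:
        f.seek(HOLE)     # move beyond the current end of the file ...
        f.write(b"\0")   # ... and write a single byte there

    byte_size, allocated = sizes(path)

# The file counts as sparse if fewer bytes are allocated than its size indicates.
is_sparse = allocated < byte_size

# Count only what is really allocated for sparse files, the byte size otherwise.
counted = allocated if is_sparse else byte_size

print(byte_size, allocated, is_sparse, counted)
```

On a file system that supports holes, the allocated size is typically just a few blocks while the byte size is a little over 5 MB; on one that does not, both values are about the same and the file is simply not treated as sparse.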
I don't want to use KMail every time I send a mail with &kdirstat;. How do I tell &kdirstat; to use a different mail client?

Start kcontrol or select Preferences in the TDE menu, then select Network -> Email and enter your favourite mail client in the Preferred Email client field.

How do I get rid of all those percentage bar colors? I want them all displayed in the same color.

Select Configure &kdirstat;... from the Settings menu, switch to the Tree colors page and drag the slider all the way up.

Credits and License

&kapp;

Program copyright 1999-2002 Stefan Hundhammer sh@suse.de

Contributors:
Alexander Rawass alexannika@users.sourceforge.net - Initial treemaps (those that currently don't work any more)

Documentation copyright 2002 Stefan Hundhammer sh@suse.de

&underFDL; &underGPL;

Installation

How to obtain KDirStat

&kdirstat; is part of the KDE project http://www.kde.org. &kdirstat; can be found on the &kdirstat; home page at http://kdirstat.sourceforge.net/ or at the mirror site at http://www.suse.de/~sh/kdirstat/ .

Requirements

Linux or any other Unix-type operating system. As stupid as this sounds: Quite a few people have complained that they couldn't get &kdirstat; installed on their Win9x system. Many people seem to believe that if it has windows, it has to run on MS Windows...

KDE 3.x

All required libraries as well as &kdirstat; itself can be found on the &kdirstat; home page.

Compilation and Installation

See file "build-howto.html" in the distribution tarball.

&documentation.index;