tools/filetop: Add directory filter #5300

Merged
yonghong-song merged 1 commit into iovisor:master from srivathsa729:filetop-dir-filter
Jun 27, 2025
@srivathsa729
Contributor

srivathsa729 commented May 7, 2025

Add support to filetop to filter by directory.

Signed-off-by: Srivathsa Dara srivathsa.d.dara@oracle.com

srivathsa729 force-pushed the filetop-dir-filter branch from 6c8f442 to ac91de6 on May 7, 2025 14:35
    bpf_text = bpf_text.replace('TYPE_FILTER', '!S_ISREG(mode)')
if args.directory:
    try:
        directory_inode = os.lstat(args.directory)[stat.ST_INO]
Collaborator

If the target directory is a symbolic link, the directory_inode might differ from the target's inode. Does this behave as intended?

Contributor Author

Hi, I used os.lstat, which doesn't follow symlinks, so currently if a symlink is provided as an argument, it doesn't report any activity. Switching to os.stat should fix this by properly following the symlink to its target. I'll update the PR accordingly.
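The difference between the two calls can be checked in user space. The following is a minimal sketch, assuming a Linux system where the process may create symlinks; the helper names `inode_of` and `demo` are hypothetical and not part of filetop.

```python
import os
import stat
import tempfile


def inode_of(path, follow_symlinks=True):
    """Return the inode number of path, optionally following symlinks."""
    st = os.stat(path) if follow_symlinks else os.lstat(path)
    return st[stat.ST_INO]


def demo():
    """Create a directory and a symlink to it, then collect inode numbers."""
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "target")
        link = os.path.join(base, "link")
        os.mkdir(target)
        os.symlink(target, link)
        return {
            "target": inode_of(target),
            # os.stat resolves the link, so this matches the target
            "link_followed": inode_of(link),
            # os.lstat reports the inode of the link itself
            "link_itself": inode_of(link, follow_symlinks=False),
        }


if __name__ == "__main__":
    for name, ino in demo().items():
        print(f"{name}: {ino}")
```

With `os.lstat`, the inode handed to the BPF program belongs to the symlink, so no file's parent directory ever matches it. With `os.stat`, the inode is that of the directory the link points to.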

    ./filetop                 # file I/O top, 1 second refresh
    ./filetop -C              # don't clear the screen
    ./filetop -p 181          # PID 181 only
    ./filetop -d /home/user   # trace files in /home/user directory only
Collaborator

The current functionality is good, but adding support for including subdirectories would make it even more powerful.
Thank you.

Contributor Author

Sure, that makes sense. I'll work on adding support for subdirectories.

Thanks for the input.
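As a rough user-space illustration of what subdirectory matching involves, the sketch below walks the parent directories of a file and compares each one against the target directory. It is not the BPF implementation: the function name `is_within` and the `max_depth` bound are hypothetical. It compares `(st_dev, st_ino)` pairs because inode numbers are only unique within a single filesystem.

```python
import os


def is_within(path, directory, max_depth=50):
    """Return True if path lies in directory or one of its subdirectories.

    Walks upward from the parent of path, comparing (st_dev, st_ino)
    pairs, and gives up after max_depth steps.
    """
    want = os.stat(directory)
    want_id = (want.st_dev, want.st_ino)
    current = os.path.dirname(os.path.realpath(path))
    for _ in range(max_depth):
        st = os.stat(current)
        if (st.st_dev, st.st_ino) == want_id:
            return True
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return False
        current = parent
    return False
```

A bounded loop is used here on purpose, since a BPF program would also need a fixed upper limit on how many ancestors it inspects.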

@srivathsa729
Contributor Author

Along with the directory filter, I felt it would be more helpful if we could also print
the complete file path. That way, we can clearly see where the accessed
files are located. I'm trying to construct the full path by walking up
the directory tree, but the BPF verifier is rejecting the pointer
arithmetic I’m using, even though the checks to prevent overflow
are in place.

Any guidance or suggestions on how to get the full path working within
verifier limits would be appreciated.

Thanks,
Srivathsa

The patch reflecting the changes I’ve implemented:

+++ ../filetop-sandbox.py       2025-06-06 11:56:01.623956563 +0000
@@ -18,6 +18,8 @@
 from bcc import BPF
 from time import sleep, strftime
 import argparse
+import os
+import stat
 from subprocess import call
 
 # arguments
@@ -25,6 +27,7 @@
     ./filetop                 # file I/O top, 1 second refresh
     ./filetop -C              # don't clear the screen
     ./filetop -p 181          # PID 181 only
+    ./filetop -d /home/user   # trace files in /home/user directory only
     ./filetop 5               # 5 second summaries
     ./filetop 5 10            # 5 second summaries, 10 times only
     ./filetop 5 --read-only   # 5 second summaries, only read operations traced
@@ -55,6 +58,9 @@
     help="number of outputs")
 parser.add_argument("--ebpf", action="store_true",
     help=argparse.SUPPRESS)
+parser.add_argument("-d", "--directory", type=str,
+    help="trace this directory only")
+
 args = parser.parse_args()
 interval = int(args.interval)
 countdown = int(args.count)
@@ -79,7 +85,7 @@
     u32 name_len;
     char comm[TASK_COMM_LEN];
     // de->d_name.name may point to de->d_iname so limit len accordingly
-    char name[DNAME_INLINE_LEN];
+    char path[256];
     char type;
 };
 
@@ -91,6 +97,12 @@
     u64 wbytes;
 };
 
+struct path_buf_t {
+    char buff[256];
+};
+
+BPF_PERCPU_ARRAY(tmp_path, struct path_buf_t, 1);
+
 BPF_HASH(counts, struct info_t, struct val_t);
 
 static int do_entry(struct pt_regs *ctx, struct file *file,
@@ -108,17 +120,61 @@
     struct qstr d_name = de->d_name;
     if (d_name.len == 0 || TYPE_FILTER)
         return 0;
-
-    // store counts and sizes by pid & file
+   
+    u32 key = 0;
+    struct path_buf_t *tmp = tmp_path.lookup(&key);
+    if (!tmp)
+        return 0;
+   
+    if (d_name.len >= 256)
+        return 0;
+   
+    char* full_path = tmp->buff + 255 - d_name.len;
+    bpf_probe_read_kernel(full_path, d_name.len, d_name.name);
+    int path_len = d_name.len;
+   
+    int found = 0;
+   
     struct info_t info = {
         .pid = pid,
         .inode = file->f_inode->i_ino,
         .dev = file->f_inode->i_sb->s_dev,
         .rdev = file->f_inode->i_rdev,
+        .name_len = d_name.len,
     };
+    if (DIRECTORY_INODE) {
+        struct dentry *itr = de;
+        for (int i=0;i<50;i++) {
+            if (!itr->d_parent || itr==itr->d_parent) {
+                break;
+            }
+            itr = itr->d_parent;
+            u32 name_len = itr->d_name.len;
+            if(full_path - name_len - 1 < tmp->buff)
+                break;
+           
+            full_path--;
+            *full_path = '/';
+            full_path -= name_len;
+            bpf_probe_read_kernel(full_path, name_len, itr->d_name.name);
+            path_len += (name_len + 1);
+
+            if (itr->d_parent->d_inode->i_ino == DIRECTORY_INODE) {
+                found = 1;
+                break;
+            }
+        }
+        if (!found){
+            return 0;
+        } else {
+            bpf_probe_read_kernel(&info.path, path_len, full_path);
+        }
+    } else {
+        bpf_probe_read_kernel(&info.path, path_len, d_name.name);
+    }
+
     bpf_get_current_comm(&info.comm, sizeof(info.comm));
-    info.name_len = d_name.len;
-    bpf_probe_read_kernel(&info.name, sizeof(info.name), d_name.name);
+   
     if (S_ISREG(mode)) {
         info.type = 'R';
     } else if (S_ISSOCK(mode)) {
@@ -163,6 +219,16 @@
     bpf_text = bpf_text.replace('TYPE_FILTER', '0')
 else:
     bpf_text = bpf_text.replace('TYPE_FILTER', '!S_ISREG(mode)')
+if args.directory:
+    try:
+        directory_inode = os.stat(args.directory)[stat.ST_INO]
+        print(f'Tracing directory: {args.directory} (Inode: {directory_inode})')
+        bpf_text = bpf_text.replace('DIRECTORY_INODE',  str(directory_inode))
+    except (FileNotFoundError, PermissionError) as e:
+        print(f'Error accessing directory {args.directory}: {e}')
+        exit(1)
+else:
+    bpf_text = bpf_text.replace('DIRECTORY_INODE', '0')
 
 if debug or args.ebpf:
     print(bpf_text)
@@ -211,8 +277,9 @@
         print()
     with open(loadavg) as stats:
         print("%-8s loadavg: %s" % (strftime("%H:%M:%S"), stats.read()))
+   
     print("%-7s %-16s %-6s %-6s %-7s %-7s %1s %s" % ("TID", "COMM",
-        "READS", "WRITES", "R_Kb", "W_Kb", "T", "FILE"))
+        "READS", "WRITES", "R_Kb", "W_Kb", "T", "PATH"))
 
     # by-TID output
     counts = b.get_table("counts")
@@ -220,15 +287,12 @@
     for k, v in reversed(sorted(counts.items_lookup_and_delete_batch()
                                 if htab_batch_ops else counts.items(),
                                 key=sort_fn)):
-        name = k.name.decode('utf-8', 'replace')
-        if k.name_len > DNAME_INLINE_LEN:
-            name = name[:-3] + "..."
-
-        # print line
+        path = k.path.decode('utf-8', 'replace')
+       
         print("%-7d %-16s %-6d %-6d %-7d %-7d %1s %s" % (k.pid,
             k.comm.decode('utf-8', 'replace'), v.reads, v.writes,
             v.rbytes / 1024, v.wbytes / 1024,
-            k.type.decode('utf-8', 'replace'), name))
+            k.type.decode('utf-8', 'replace'), path))
 
         line += 1
         if line >= maxrows:
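The buffer bookkeeping in the patch above (place the file name at the end of a fixed buffer, then prepend `/` and each parent name while moving toward the front) can be modeled in plain Python. This is a simplified sketch for reasoning about the offsets only; it does not address the verifier rejection. `build_path` and its parameters are hypothetical names, and unlike the patch it does not reserve a trailing NUL byte.

```python
BUF_SIZE = 256   # mirrors the 256-byte buffer in the patch
MAX_DEPTH = 50   # mirrors the loop bound in the patch


def build_path(components, buf_size=BUF_SIZE, max_depth=MAX_DEPTH):
    """Assemble a path right to left in a fixed-size buffer.

    components lists names from the file upward (leaf first), e.g.
    [b"log.txt", b"app", b"var"]. Returns the joined bytes, or None
    when the leaf name alone does not fit.
    """
    buf = bytearray(buf_size)
    leaf = components[0]
    if len(leaf) >= buf_size:
        return None
    pos = buf_size - len(leaf)
    buf[pos:] = leaf
    for name in components[1:1 + max_depth]:
        # Stop before writing if the separator plus name would underflow.
        if pos - len(name) - 1 < 0:
            break
        pos -= 1
        buf[pos] = ord("/")
        pos -= len(name)
        buf[pos:pos + len(name)] = name
    return bytes(buf[pos:])


if __name__ == "__main__":
    print(build_path([b"log.txt", b"app", b"var"]))
```

In this model a parent that does not fit is dropped whole, so the result is always a valid suffix of the real path rather than a name cut in the middle.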

@ekyooo
Collaborator

ekyooo commented Jun 11, 2025

> Along with directory filter, I felt it would be more helpful if we could also print the complete file path.

You can refer to the following PR that adds the full-path feature to libbpf-tools/opensnoop.
#5323
You can also refer to the code for the same feature in tools/opensnoop.py. The implementation is similar but slightly different.

Thank you.

@ekyooo
Collaborator

ekyooo commented Jun 27, 2025

@srivathsa729
I think implementing support for subdirectories could be quite complex. Since the current functionality is already good, I believe it’s fine to add the subdirectory feature later in a separate PR.

Thank you.

@srivathsa729
Contributor Author

Sure, thanks for the review, @ekyooo.

yonghong-song merged commit 137bd5f into iovisor:master on Jun 27, 2025
1 of 12 checks passed
@ekyooo
Collaborator

ekyooo commented Jul 6, 2025

@srivathsa729
Please refer to the PR below:
#5345

Thank you.

@srivathsa729
Contributor Author

Hi, thanks for the info. I'm working on it and will raise a PR once it's done.

ekyooo added a commit to ekyooo/bcc that referenced this pull request Jan 23, 2026
  * Support for kernel up to 6.18

  * New Tools
    tools/softirqslower: New tool to trace slow software interrupt handlers (iovisor#5356)

  * Enhanced Functionality
    libbpf-tools/opensnoop: Added full-path support with `-F` option (iovisor#5323, iovisor#5333)
    libbpf-tools/filelife: Added full-path support (iovisor#5347, ab8e061)
    libbpf-tools: Introduced path helpers (ab8e061)
    libbpf-tools/trace_helpers: Added str_loadavg() and str_timestamp() common functions (694de9f)
    libbpf-tools/filetop: Added directory filter capability (iovisor#5300)
    libbpf-tools/runqslower: Added `-c` option to filter by process name prefix (673911c)
    libbpf-tools/runqlat: Dynamically size pid/pidns histogram map (iovisor#5342)
    libbpf-tools/fsdist, fsslower: Added support for fuse filesystem (9691c56)
    libbpf-tools/tcptop: Major refactoring using fentry/fexit for better performance (75bb73a, e2c7917, d786eaa, da3a474)
    tools/opensnoop: Added full-path support with `-F` option (iovisor#5334, iovisor#5339)
    tools/kvmexit: Added AMD processor support and parallel post-processing (13a4e5a, c2af2ee)
    tools/offwaketime: Added raw tracepoint support to reduce overhead (380ee01)
    Python uprobe API: Added functionality to detach all uprobes for a binary (iovisor#5325)
    Python API: Added support for executing a program and tracing it (iovisor#5362)

  * Bug Fixes
    libbpf-tools/filelife: Fixed wrong full-path handling (iovisor#5347)
    libbpf-tools/filelife: Fixed problem when using perf-buffer (ec8415b)
    libbpf-tools/funclatency: Delete the element from the `starts` map after it has been used (06ce134)
    libbpf-tools/offcputime: Fixed min/max_block_ns unit conversion error (iovisor#5327, d507a53)
    libbpf-tools/syncsnoop: Added support for sync_file_range2 and arm_sync_file_range() (4287921)
    libbpf-tools/ksnoop: Fixed two invalid access to map value (iovisor#5361)
    libbpf-tools/klockstat: Allows kprobe fallback to work with lock debugging (iovisor#5359)
    libbpf-tools/biotop: Fixed segmentation fault with musl libc build (52d2d09)
    libbpf-tools/syscall_helpers, Python BCC: Updated syscall list (add file_getattr/file_setattr) (b63d7e3, a9c6650)
    tools/tcpaccept: Fixed on recent kernels (c208d0e)
    tools/tcpconnect: Fixed iov field for DNS with Linux>=6.4 (iovisor#5382)
    tools/javaobjnew: Use MIN macro instead of min function (fb8910a)
    tools/biolatency, biosnoop, biotop: Use TRACEPOINT_PROBE() for tracepoints (iovisor#5366)
    Various tools: Don't use the old bpf_probe_read() helper (1cc15c3)
    CC: Support versioned SONAME in shared library resolution (beb1fe4, c351210)
    Python TCP: Added state2str() and applied to tools (bfa05d2)
    s390 architecture: Prevent invalid mem access when reading PAGE_OFFSET (d8595ee)

  * Build & Test Fixes
    Fixed build failure with clang21 (iovisor#5369)
    Fixed build for LLVM 23 by avoiding deprecated TargetRegistry overloads (iovisor#5401)
    ci: Make version.cmake handle shallow clone (2232b7e)
    ci: Various test fixes for proper CI operation (blk probes, rss_stat, kmalloc, btrfs/f2fs) (a499181, c338547, 6b7dd5d, ea5cf83)
    tests: Added coverage for versioned SONAME resolution (c351210)
    Removed luajit options to ensure no errors (26eaf13)

  * Doc update, other bug fixes and tools improvement
ekyooo added a commit that referenced this pull request Jan 26, 2026