When working with Linux servers and applications, silent failures and hard‑to‑trace faults are incredibly frustrating. Standard logs often don’t reveal the root cause of permission issues, network connection failures, port conflicts, or file access problems. That’s where strace and lsof come in — two powerful command‑line tools that let you inspect system calls, file usage, network sockets, and processes in real time.
In this article, we’ll cover:
What strace and lsof are
How they work and key use cases
Installation and basic usage
Examples with commands
Real‑world troubleshooting scenarios
Practical tips and best practices
What Are strace & lsof?
strace is a Linux diagnostic tool that traces the system calls a process makes, such as file opens, reads, writes, and network connects. It can reveal why an application fails even when its logs are quiet.
lsof stands for list open files. Since “everything is a file” in Linux, lsof shows you open files, directories, network sockets, and pipes used by running processes, which helps identify file locks, port conflicts, and resource issues.
Why Use strace?
Detecting missing files or wrong paths
Identifying permission issues (e.g., “Permission denied”)
Debugging failed network connections
Seeing exactly where a process is blocked
Why Use lsof?
Finding which process is using a TCP/UDP port
Detecting processes locking files or directories
Listing deleted files still held open by processes
Understanding resource usage of applications
Together, strace and lsof help you solve deep‑seated issues faster without guesswork.
Many Linux distributions ship these tools by default, but minimal and container images often leave them out. If you don’t have them installed:
Debian / Ubuntu:
sudo apt update
sudo apt install strace lsof -y
Fedora / RHEL / CentOS:
sudo dnf install strace lsof -y
Arch / Manjaro:
sudo pacman -S strace lsof
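Before installing anything, it can be worth checking what is already on your PATH. The snippet below is a minimal sketch of that check; the function name missing_tools is made up for this example and is not part of either package.

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools strace lsof)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so several names print on one line.
    echo "Not installed:" $missing
else
    echo "strace and lsof are both available"
fi
```

If either name is reported, install it with the command for your distribution above.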
strace: Trace System Calls
Attach to a running process:
strace -p <PID>
Run a command and trace its system calls:
strace ls -l /var/log
Filter for specific calls (e.g., file opens):
strace -e openat myservice
Example output:
openat(AT_FDCWD, "/etc/myservice/config.yaml", O_RDONLY) = -1 ENOENT (No such file or directory)
This tells you the service failed to find its config file. (back2cloud.com)
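On a busy service, a line like that is easy to miss among hundreds of successful calls. As a rough sketch, the filter below pulls out only the paths that failed with ENOENT. The name missing_paths is hypothetical, and the pattern assumes strace's default output layout, as in the example line above.

```shell
# Hypothetical helper: read strace output on stdin and print the path of every
# openat() call that failed with ENOENT (file or directory not found).
missing_paths() {
    sed -n 's/.*openat([^"]*"\([^"]*\)".*= -1 ENOENT.*/\1/p'
}

# Example usage (myservice is the placeholder command used in this article).
# strace writes its trace to stderr, hence the 2>&1 before the pipe:
#   strace -f -e trace=openat myservice 2>&1 | missing_paths | sort -u
```

Many ENOENT results are harmless, since programs routinely probe several candidate locations, so treat the list as a set of leads rather than a list of errors.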
strace Tips
-f: Follow child processes
-o output.log: Save output for offline review
-e trace=network: Focus on networking calls
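These three options combine naturally. Here is one possible wrapper, offered as a sketch rather than a recommended standard; trace_network is an invented name.

```shell
# Hypothetical helper: run a command under strace, following child processes
# and writing only network-related system calls to a log file.
# Usage: trace_network LOGFILE COMMAND [ARGS...]
trace_network() {
    log=$1
    shift
    strace -f -e trace=network -o "$log" "$@"
}

# Example usage:
#   trace_network output.log curl -s https://example.org
#   grep connect output.log
```

Because -o sends the trace to a file, the traced command's own output stays readable in the terminal.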
lsof: See Open Files & Sockets
List open files by process ID:
lsof -p <PID>
Find who’s listening on port 8080:
lsof -i :8080
Find deleted files that a process still holds open (their disk space is not released until the process closes them):
lsof | grep deleted
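A plain grep also matches any path that merely contains the word "deleted". The sketch below is slightly stricter, assuming lsof marks unlinked files with a trailing "(deleted)" in its NAME column, as it does on Linux. The function name deleted_holders is invented for this example.

```shell
# Hypothetical helper: read lsof output on stdin and print "PID COMMAND" once
# for every process that holds at least one deleted file open.
deleted_holders() {
    awk '/\(deleted\)$/ { print $2, $1 }' | sort -u
}

# Example usage (run with sudo to include other users' processes):
#   lsof -nP 2>/dev/null | deleted_holders
```

Restarting or reloading a listed process makes it close the file, which is what finally frees the space.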
Issue: Web app becomes unresponsive when writing logs.
Find open files and confirm the log is opened:
lsof -p $(pidof webapp)
You see:
webapp 9876 user 5r REG /var/log/webapp.log
Check access errors with strace:
strace -e openat -p $(pidof webapp)
It shows:
openat(..., "/var/log/webapp.log", O_WRONLY|O_APPEND) = -1 EACCES (Permission denied)
Fix: Correct file permissions:
sudo chmod 644 /var/log/webapp.log
sudo chown webuser:webgroup /var/log/webapp.log
Problem solved without rebooting. (back2cloud.com)
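EACCES does not always come from the file itself: the kernel also returns it when the process lacks search (execute) permission on any directory along the path. As a minimal sketch, the helper below prints the permissions of a path and every parent directory. The name path_permissions is made up here, and namei -l from util-linux gives a similar view where it is installed.

```shell
# Hypothetical helper: print owner, group, and mode for a path and for each of
# its parent directories, walking up until / (or . for a relative path).
path_permissions() {
    p=$1
    while :; do
        ls -ld -- "$p"
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}

# Example usage:
#   path_permissions /var/log/webapp.log
```

Check that the service's user has x on every directory in the listing before changing the file's own mode.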
Issue: Web service fails to bind to port 9090.
Check what is already listening on that port:
lsof -i :9090
Shows:
nginx 4321 root 6u TCP *:9090 (LISTEN)
Solution: stop the conflicting service (shown here), or reconfigure one of the two services to use a different port:
sudo systemctl stop nginx
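In scripts it is often handier to get just the PID of the listener. The following sketch uses lsof's terse mode for that; listener_pids is a hypothetical name, and it covers TCP only.

```shell
# Hypothetical helper: print only the PIDs of processes listening on a TCP port.
#   -nP                         skip DNS and service-name lookups
#   -iTCP:PORT -sTCP:LISTEN     restrict to listening TCP sockets on that port
#   -t                          terse output: one PID per line, no header
listener_pids() {
    lsof -nP -iTCP:"$1" -sTCP:LISTEN -t
}

# Example usage: see who owns port 9090 before deciding what to stop.
#   for pid in $(listener_pids 9090); do ps -o pid=,user=,args= -p "$pid"; done
```

Without root, lsof generally sees only your own processes, so run it under sudo if the result comes back empty while the port is clearly taken.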
Issue: Microservice fails to connect to its database.
Use lsof to inspect open sockets:
lsof -p $(pidof microservice)
Then trace connection attempts:
strace -e connect -p $(pidof microservice)
You see:
connect(..., sin_port=htons(3306), ...) = -1 ECONNREFUSED (Connection refused)
This means nothing is accepting connections on that host and port: the database may be down, or the service may be pointed at the wrong host or port.
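When a service talks to several backends, it helps to list every endpoint that refused a connection. The sketch below does that for IPv4 and assumes strace's usual rendering of connect() arguments (sin_port=htons(...), sin_addr=inet_addr("...")). The name refused_targets is invented. Clients that use non-blocking sockets typically see EINPROGRESS from connect() instead, and this filter will not catch those.

```shell
# Hypothetical helper: read strace output on stdin and print each distinct
# IPv4 address:port whose connect() call failed with ECONNREFUSED.
refused_targets() {
    sed -n 's/.*sin_port=htons(\([0-9]*\)), sin_addr=inet_addr("\([^"]*\)").*= -1 ECONNREFUSED.*/\2:\1/p' | sort -u
}

# Example usage (strace writes to stderr, hence 2>&1):
#   strace -f -e trace=connect -p "$(pidof -s microservice)" 2>&1 | refused_targets
```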
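Follow child processes when the target forks or spawns workers: strace -f …
Write the trace to a file for complex debugging: strace -o trace.log …
Use grep/awk to filter noise; strace writes to stderr, so add 2>&1 before the pipe or filter a file saved with -o
Use lsof to check for deleted but still open files, a common cause of unexplained disk usage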
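Create reusable command aliases. Use single quotes so that $(pidof webapp) runs when the alias is used rather than when it is defined, and redirect stderr because strace writes its trace there, e.g.:
alias debugweb='strace -f -e trace=openat -p "$(pidof -s webapp)" 2>&1 | grep /var/log'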
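An alias is fixed to one process name and one pattern. If you want the same idea with parameters, a shell function is one option. This is a sketch only, and trace_opens is a made-up name.

```shell
# Hypothetical helper: attach to a named process, trace its file opens, and
# keep only the lines that match a pattern.
# Usage: trace_opens PROCESS_NAME GREP_PATTERN
trace_opens() {
    # pidof -s returns a single PID even if several instances are running.
    pid=$(pidof -s "$1") || { echo "no running process named $1" >&2; return 1; }
    # strace writes its trace to stderr, so merge it into stdout for grep.
    strace -f -e trace=openat -p "$pid" 2>&1 | grep -- "$2"
}

# Example usage:
#   trace_opens webapp /var/log
```

The PID lookup happens each time the function runs, so it keeps working after the service restarts.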
strace and lsof are essential troubleshooting tools for Linux administrators and DevOps engineers. They help diagnose elusive system issues, from permission errors to network and file resource problems, with precision and minimal guesswork. Reach for them before making destructive changes or guessing from incomplete logs.
By mastering these utilities, you’ll spend less time debugging and more time improving system reliability.
Don’t run strace on a production process blindly: tracing slows the target, sometimes severely, so attach briefly and narrow the trace with -e.
Use saved logs for post‑issue analysis.