
Linux is a powerful and reliable operating system, but even seasoned users encounter unexpected problems. Whether it’s a deleted file, a forgotten root password, or a sluggish system, knowing how to troubleshoot efficiently is key to becoming a true Linux expert.
This guide walks through real-world Linux problems that system administrators, developers, and everyday users commonly run into, with step-by-step solutions for each.
Scenario 1: You Accidentally Deleted an Important File
You accidentally deleted an important file using the rm command, and now you need to recover it. Unlike Windows and macOS, Linux does not have a built-in “Recycle Bin” for files deleted from the terminal.
Your recovery options depend on the filesystem in use.
For EXT3/EXT4 Filesystems
Use extundelete, which is an open-source utility designed to recover deleted files from ext3 and ext4 filesystems in Linux.
sudo apt install extundelete   # Debian-based
sudo yum install extundelete   # RHEL-based
Before attempting recovery, unmount the partition to prevent further writes that could overwrite deleted data:
sudo umount /dev/sdX
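Next, run the following command to recover the deleted files, replacing /dev/sdX with the actual partition the file was deleted from. Recovered files are written to a RECOVERED_FILES directory under the current working directory, so run the command from a different partition than the one being recovered.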
sudo extundelete /dev/sdX --restore-all
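Because recovery depends on the partition being unmounted, it can help to confirm that before running extundelete. The following is a minimal sketch, not part of extundelete itself: the helper names is_mounted and check_before_recovery and the MOUNTS_FILE variable are assumptions made for illustration. It only matches the exact device path listed in /proc/mounts, so a device mounted under another name (for example a /dev/mapper path) would not be detected.

```shell
#!/usr/bin/env bash
# Sketch: refuse to start recovery while the partition is still mounted.
# is_mounted / check_before_recovery are hypothetical helpers.
# MOUNTS_FILE can be overridden, which is mainly useful for testing.
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

is_mounted() {
    # Succeeds if the first column of the mounts table equals the device path.
    local dev="$1"
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

check_before_recovery() {
    local dev="$1"
    if is_mounted "$dev"; then
        echo "$dev is still mounted; unmount it before running extundelete" >&2
        return 1
    fi
    echo "$dev is not mounted; safe to proceed"
}
```

For example, check_before_recovery /dev/sdX && sudo extundelete /dev/sdX --restore-all would only start the recovery when the check passes.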
For XFS, Btrfs, or NTFS Filesystems
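If your system uses XFS, Btrfs, or NTFS, the testdisk package is a better option. TestDisk's own undelete feature works on NTFS; for XFS and Btrfs, the photorec tool that ships in the same package, which recovers files by scanning for known file signatures, is usually the more realistic choice.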
sudo apt install testdisk   # Debian-based
sudo yum install testdisk   # RHEL-based
Run testdisk and follow the interactive prompts to restore lost files.
sudo testdisk
Prevention Tips:
- Use trash-cli: Instead of rm, use trash-cli to send files to a recoverable trash bin.
sudo apt install trash-cli
trash-put myfile.txt
- Enable regular backups: Set up rsync or Timeshift to automatically back up important files.
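On a machine where trash-cli cannot be installed, the same idea can be approximated with a small shell function. This is only a sketch under stated assumptions: to_trash and TRASH_DIR are made-up names, the trash location is arbitrary, and unlike trash-cli it keeps no metadata for restoring a file to its original path.

```shell
#!/usr/bin/env bash
# Sketch: move files into a trash directory instead of deleting them.
# TRASH_DIR and to_trash are illustrative names, not part of trash-cli.
TRASH_DIR="${TRASH_DIR:-$HOME/.local/share/simple-trash}"

to_trash() {
    mkdir -p "$TRASH_DIR" || return 1
    local f stamp
    for f in "$@"; do
        if [ ! -e "$f" ] && [ ! -L "$f" ]; then
            echo "to_trash: $f: no such file" >&2
            continue
        fi
        # Prefix with a timestamp and PID so same-named files rarely collide.
        stamp="$(date +%Y%m%d-%H%M%S)-$$"
        mv -- "$f" "$TRASH_DIR/${stamp}-$(basename -- "$f")"
    done
}
```

Running to_trash myfile.txt moves the file into the trash directory, where it stays until you remove it by hand.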
Scenario 2: Recovering a Forgotten Root Password
You forgot your root password and can’t perform administrative tasks, which means you can’t install software, change system settings, or access critical files.
You can reset the root password by booting into recovery mode or modifying the GRUB bootloader.
Using Recovery Mode (Ubuntu/Debian)
First, reboot your system and hold Shift during startup to access the GRUB menu, then select “Advanced options” → “Recovery mode” and choose “Drop to root shell prompt”.
Here, remount the root filesystem as writable and reset the root password.
mount -o remount,rw /
passwd root
Reboot the system.
reboot
Using rd.break (RHEL/CentOS/Fedora)
First, reboot your system and press e at the GRUB menu. Find the line starting with linux, add rd.break at the end, and press Ctrl + X to boot with the modified entry.
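Next, remount the root filesystem as writable, chroot into it, and reset the root password. If SELinux is enforcing, also schedule a relabel so the updated /etc/shadow gets a valid security context on the next boot.
mount -o remount,rw /sysroot
chroot /sysroot
passwd root
touch /.autorelabel   # only needed when SELinux is enforcing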
Finally, exit and reboot.
exit     # leave the chroot
reboot
Prevention Tips:
- Create a passwordless sudo user to avoid being locked out of root access.
- Use SSH keys instead of passwords for authentication.
Scenario 3: You Installed a Package, but It’s Not Working
You installed a package, but you get “command not found” when you try to run it. This usually happens when the binary isn’t in your system’s PATH, the package isn’t installed correctly, or a dependency is missing.
First, verify that the package is actually installed.
dpkg -l | grep package-name   # Debian-based
rpm -qa | grep package-name   # RHEL-based
If it’s missing, reinstall it:
sudo apt install package-name   # Debian-based
sudo yum install package-name   # RHEL-based
Next, check whether the command is in your system PATH. Keep in mind that the command name can differ from the package name.
which package-name
echo $PATH
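If the binary is in a non-standard location, add that directory to PATH. The change lasts only for the current shell session unless you also add the line to a startup file such as ~/.bashrc: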
export PATH=$PATH:/usr/local/bin
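Running the export line repeatedly, for instance from a startup file that gets sourced more than once, appends the same directory again each time. A minimal sketch of an idempotent variant follows; path_append and locate_cmd are hypothetical helper names, and the directory in the usage example is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: add a directory to PATH only if it is not already there.
# path_append and locate_cmd are hypothetical helper names.
path_append() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;                  # already present, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
    export PATH
}

# Print where a command resolves, or report that it does not.
locate_cmd() {
    command -v "$1" || { echo "$1: not found in PATH" >&2; return 1; }
}
```

For example, path_append /usr/local/bin followed by locate_cmd package-name shows whether the shell can now find the binary.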
Prevention Tips:
- Restart the terminal or run hash -r after installing new packages.
- Use package managers like Snap or Flatpak, which bundle an application's dependencies with it.
Scenario 4: Your System is Running Out of Disk Space
Your system displays a “No space left on device” error, preventing software updates, logging, and normal operations.
Here’s how to reclaim disk space and keep your system running smoothly.
Step 1: Check Disk Usage
First, check how much space is used on each partition with the df command.
df -h
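On a system with many mount points, it can be easier to print only the filesystems that are nearly full. Here is a minimal sketch, assuming the POSIX output format of df -P; full_filesystems is a hypothetical helper name, the 90% default is arbitrary, and mount points containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Sketch: print filesystems whose usage is at or above a threshold.
# full_filesystems is a hypothetical helper; it reads `df -P` output on stdin.
full_filesystems() {
    local threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {
            use = $5
            sub(/%$/, "", use)          # strip the trailing percent sign
            if (use + 0 >= t + 0) print $6, use "%"
        }'
}

# Typical use: df -P | full_filesystems 90
```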
Step 2: Find and Delete Large Files
Next, find what is consuming the most space with the du command below, which scans the whole filesystem and lists the 10 largest files or directories. Delete unnecessary files using rm or move them to an external drive.
du -ah / | sort -rh | head -10
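The du pipeline mixes files and directories, so the top entries are often just parent directories of one another. If you only want individual files, something along these lines may be more useful. It is a sketch that relies on GNU find's -printf option; largest_files is a hypothetical helper name, and sizes are printed in bytes.

```shell
#!/usr/bin/env bash
# Sketch: list the N largest regular files under a directory,
# without crossing into other filesystems (-xdev).
# largest_files is a hypothetical helper; it needs GNU find for -printf.
largest_files() {
    local dir="${1:-.}" count="${2:-10}"
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        head -n "$count"
}
```

For example, sudo bash -c with this function defined, or simply largest_files /var 10 as root, lists the ten biggest files under /var.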
Step 3: Remove Unnecessary Logs
Logs are essential for troubleshooting and monitoring system activity, but they can grow rapidly and consume a significant amount of disk space.
Over time, old logs may no longer be needed, making them prime candidates for cleanup.
sudo journalctl --vacuum-time=2d   # Deletes journal entries older than 2 days
sudo apt autoclean                 # Removes outdated package files
Step 4: Remove Old Kernels (Ubuntu/Debian)
When you update your system, especially on Ubuntu or Debian-based distributions, new versions of the Linux kernel are often installed.
Old kernels are not always removed automatically, so over time they can accumulate and take up a significant amount of disk space.
Removing them with apt autoremove is generally safe, since it leaves the currently running kernel in place.
sudo apt autoremove --purge
Prevention Tips:
- Set Up Log Rotation: Use logrotate to automatically manage log file sizes and retention periods.
- Monitor Disk Usage: Install tools like ncdu to track disk usage and identify space hogs.
- Regular Cleanups: Schedule periodic cleanups to remove temporary files, caches, and unused packages.
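As one way to approach the regular cleanup tip, the sketch below deletes files older than a given number of days from a single directory. clean_old_files and the DRY_RUN variable are hypothetical names; the 7-day default is arbitrary, and the function should only be pointed at directories whose contents are truly disposable.

```shell
#!/usr/bin/env bash
# Sketch: delete regular files older than N days under one directory.
# clean_old_files is a hypothetical helper; set DRY_RUN=1 to only list candidates.
clean_old_files() {
    local dir="$1" days="${2:-7}"
    [ -d "$dir" ] || { echo "clean_old_files: $dir is not a directory" >&2; return 1; }
    if [ "${DRY_RUN:-0}" = "1" ]; then
        find "$dir" -xdev -type f -mtime +"$days" -print
    else
        find "$dir" -xdev -type f -mtime +"$days" -delete
    fi
}
```

Running it once with DRY_RUN=1 shows what would be removed. A cron entry or systemd timer could then call it on a schedule.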
Scenario 5: Your Server is Suddenly Unresponsive
You are managing a Linux server and it suddenly stops responding. You try connecting via SSH, but the connection times out or is refused. The server may still be powered on, yet it doesn’t react to any commands.
This situation can be caused by various issues, including:
- High CPU or memory usage due to runaway processes.
- Disk I/O bottlenecks, where the system is overloaded with read/write operations.
- Kernel panics or system crashes.
- Network failures, preventing remote access.
To restore control, follow these troubleshooting steps.
Step 1: Access the Server Locally or via TTY
If SSH isn’t working, try accessing the server directly or through a TTY session:
- On a physical machine, use the local console.
- On a virtual machine, use the hypervisor’s console.
- For Linux systems, switch to another TTY session using Ctrl + Alt + F2 (or F3, F4, etc.).
Step 2: Check System Load
Once logged in, check the system’s load with uptime, which shows the load averages over the last 1, 5, and 15 minutes. A load value that stays higher than the number of CPU cores indicates high demand.
uptime
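The comparison against the core count can also be scripted. The following is a rough sketch, assuming a Linux system with /proc/loadavg and the nproc command; load_exceeds_cores and report_load are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: compare the 1-minute load average with the CPU core count.
# load_exceeds_cores and report_load are hypothetical helpers.
load_exceeds_cores() {
    local load="$1" cores="$2"
    # awk does the floating-point comparison that plain [ ] cannot.
    awk -v l="$load" -v c="$cores" 'BEGIN { exit !(l + 0 > c + 0) }'
}

report_load() {
    local load cores
    load="$(cut -d' ' -f1 /proc/loadavg)"   # first field is the 1-minute average
    cores="$(nproc)"
    if load_exceeds_cores "$load" "$cores"; then
        echo "high load: $load on $cores cores"
    else
        echo "load ok: $load on $cores cores"
    fi
}
```

A single spike above the core count is not necessarily a problem; the 5- and 15-minute averages from uptime show whether the load is sustained.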
Next, use top or htop to monitor processes in real time:
top
htop   # alternative, if installed
Look for processes consuming excessive CPU or memory.
Step 3: Identify and Kill Runaway Processes
To identify the most resource-intensive processes, run:
ps aux --sort=-%cpu | head
This lists the top CPU-consuming processes. Once you have found the problematic process, try a plain kill PID first so it can shut down cleanly. If it ignores that, force it to terminate using:
kill -9 PID
Replace PID with the process ID of the problematic application.
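Since kill -9 gives a process no chance to clean up, a gentler pattern is to send SIGTERM first and escalate only if the process is still around after a short wait. The following is a minimal sketch of that pattern; stop_process is a hypothetical helper name and the 5-second default timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch: ask a process to exit with SIGTERM, and fall back to SIGKILL
# only if it is still alive after the timeout (in seconds).
# stop_process is a hypothetical helper name.
stop_process() {
    local pid="$1" timeout="${2:-5}" waited=0
    kill -TERM "$pid" 2>/dev/null || return 1     # no such process, or not permitted
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "PID $pid ignored SIGTERM; sending SIGKILL" >&2
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

For example, stop_process PID 10 waits up to ten seconds before forcing the process to stop.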
Step 4: Check System Logs
If the system is still responsive, check logs for errors:
sudo tail -f /var/log/syslog   # /var/log/messages on RHEL-based systems
sudo dmesg | tail
These commands display recent system messages and kernel logs, which can help identify hardware or software issues.
Step 5: Reboot Safely Using SysRq
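If the system is frozen but you still have a root shell, you can force a reboot through the kernel's SysRq interface. Writing b on its own reboots immediately without syncing or unmounting anything, so sync the disks and remount the filesystems read-only first:
echo s > /proc/sysrq-trigger; echo u > /proc/sysrq-trigger   # sync disks, then remount filesystems read-only
echo b > /proc/sysrq-trigger                                  # reboot immediately
If no shell responds at all and SysRq is enabled, the same kind of sequence is available from the keyboard: hold Alt + SysRq and press R, E, I, S, U, B in turn, pausing briefly between keys.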
Conclusion
Troubleshooting is an essential skill for every Linux user. Whether it’s recovering deleted files, resetting passwords, or fixing system errors, knowing the right commands can save time and frustration.
Do you have your own troubleshooting tips? Share them in the comments! Let’s build a helpful Linux community together.