Why I say no to tmux
As a developer, I've watched countless colleagues spin up tmux sessions, detach from screen windows, and create elaborate terminal multiplexer setups to manage their long-running processes. While these tools have their place, I've come to strongly prefer a different approach: proper process management through systemctl and well-designed signal handling.
The Problem with Terminal Multiplexers for Process Management
Don't get me wrong—tmux and screen are excellent tools. They shine when you need multiple terminal sessions, want to preserve your work environment, or need to share sessions with teammates. But when it comes to running production services or even development processes that need to persist beyond your terminal session, they introduce unnecessary complexity and fragility.
The Fragility Factor
Terminal multiplexers create a house of cards. Your process depends on the multiplexer session, which depends on your terminal connection, which depends on your SSH session or local terminal. Each layer adds a potential point of failure. I've seen too many critical processes accidentally killed because someone ran tmux kill-server or because a network hiccup terminated the wrong session.
The PID Hierarchy Problem
Let me illustrate this with a real example. When you run a process in tmux, you get a process tree that looks like this:
systemd(1)
 └─ sshd(1234)
     └─ sshd(1235)
         └─ bash(1236)
             └─ tmux: server(1237)
                 └─ bash(1238)
                     └─ your_app(1239)
Your application (PID 1239) is now six levels deep in the process hierarchy. If any parent process dies unexpectedly, your application gets orphaned or receives a SIGHUP that it might not handle correctly. I've witnessed this exact scenario during a routine SSH daemon restart, where a critical data processing job was killed because it was buried in a tmux session attached to an SSH connection.
Real-World Process Management Nightmares
Case 1: The Phantom Process
A colleague once spent three hours debugging why their web server wasn't responding to requests. The process was still running according to ps aux, but wasn't listening on its port. It turned out the tmux session had died, but the application continued running as an orphaned process with broken stdin/stdout. The process was technically alive but couldn't log errors or accept new connections properly.
Case 2: The Resource Leak
In another incident, a machine learning training job was consuming 100% CPU but producing no output. The tmux session showed it was running, but the process had actually crashed hours earlier. However, the tmux session kept the zombie process around, and the parent tmux server was stuck in an uninterruptible state trying to clean up file descriptors. We had to kill the entire tmux server, losing several other important sessions in the process.
Case 3: The Signal Cascade Disaster
During a deployment, an operations engineer accidentally sent a SIGTERM to what they thought was a single tmux session. Instead of gracefully shutting down one process, it triggered a cascade where the tmux server killed all sessions, terminating six different microservices simultaneously. The complex process hierarchy made it impossible to target just one service cleanly.
Resource Overhead and Process Hierarchy
Every tmux or screen session creates additional processes that consume system resources. More importantly, they create an unnecessary process hierarchy. Let's examine the actual resource cost:
# Process overhead for tmux session
$ ps -o pid,ppid,rss,comm --forest
PID PPID RSS COMMAND
1237 1 3420 tmux: server
1238 1237 2180 \_ bash
1239 1238 45230 \_ your_app
That's an extra 5.6 MB of RAM just for the multiplexer overhead, plus additional file descriptors, pseudo-terminals, and kernel process slots. When you're running dozens of services, this adds up quickly. More critically, the process hierarchy complicates resource tracking and limit enforcement: cgroups and resource controllers have to work through multiple process layers.
Debugging and Monitoring Complexity
The deep process hierarchy creates serious debugging challenges. When a process misbehaves, tools like htop, ps, and system monitors show confusing parent-child relationships. Finding the actual resource consumer becomes a detective game:
# Which process is eating CPU? Good luck finding it quickly.
$ ps aux | grep -E "(tmux|your_app)"
ubuntu 1237 0.1 0.3 15234 3420 ? Ss 10:15 0:01 tmux: server
ubuntu 1238 0.0 0.2 20144 2180 pts/0 Ss 10:15 0:00 bash
ubuntu 1239 99.5 4.4 451234 45230 pts/0 S+ 10:15 45:32 your_app
Compare this to a systemd-managed process where the relationship is direct and clear:
$ systemctl status your_app
● your_app.service - Your Application
Active: active (running) since Mon 2019-03-04 10:15:23 UTC; 45min ago
Main PID: 1239 (your_app)
CGroup: /system.slice/your_app.service
└─1239 /opt/your_app/bin/your_app
Signal Delivery Problems
Signal handling becomes unpredictable in tmux environments. I've encountered situations where sending SIGTERM to a tmux session resulted in inconsistent behavior:
- Sometimes the signal reaches the target process
- Sometimes it kills the entire tmux session
- Sometimes it gets lost in the process hierarchy
- Sometimes it triggers a cascade killing unrelated processes
Here's a real example that bit us in production:
# Intended to gracefully restart one service
$ tmux send-keys -t myapp-session C-c
# What actually happened:
# 1. C-c generates SIGINT on the pane's pseudo-terminal
# 2. The kernel delivers it to every process in the foreground process group
# 3. Multiple processes receive the signal simultaneously
# 4. Unpredictable shutdown order causes data corruption
The Lost Process Syndrome
One of the most frustrating tmux problems is "lost" processes. These occur when:
- The tmux session dies but the child process continues running
- The process becomes orphaned and reparented to init (PID 1)
- You lose the ability to interact with the process normally
- The process often continues consuming resources but becomes unmanageable
I've seen this pattern repeatedly in production environments:
# Process is running but "lost"
$ ps aux | grep your_app
ubuntu 1239 5.2 4.4 451234 45230 ? S 10:15 45:32 your_app
# No tmux session exists anymore
$ tmux list-sessions
no server running on /tmp/tmux-1000/default
# Process is orphaned and uncontrollable
$ ps -o pid,ppid,comm 1239
PID PPID COMMAND
1239 1 your_app
The only way to manage such processes is through kill signals, which often means ungraceful termination and potential data loss.
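If you're stuck with such an orphan, the best you can do is signal it directly and hope it cleans up after itself. A rough sketch, using the hypothetical PID from above:
# Ask the orphan to shut down cleanly, then escalate if it ignores us
$ kill -TERM 1239
$ sleep 10
# Still alive? Force it, accepting whatever state it leaves behind
$ kill -0 1239 && kill -KILL 1239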
Process Cleanup and Zombie Issues
Terminal multiplexers can create zombie processes that are difficult to clean up. When a tmux session terminates abnormally, child processes might not receive proper termination signals. I've encountered servers with dozens of zombie processes, all stemming from crashed tmux sessions:
$ ps aux | grep -E "(defunct|Z)"
ubuntu 1240 0.0 0.0 0 0 ? Z 10:15 0:00 [your_app] <defunct>
ubuntu 1241 0.0 0.0 0 0 ? Z 10:16 0:00 [worker] <defunct>
ubuntu 1242 0.0 0.0 0 0 ? Z 10:17 0:00 [processor] <defunct>
These zombie processes consume process IDs and can eventually lead to PID exhaustion on busy systems. If the parent tmux server is wedged and never calls wait() on its children, there's no clean way to reap these zombies short of killing the multiplexer server itself.
File Descriptor Leaks
Complex process hierarchies in tmux often lead to file descriptor inheritance issues. Child processes inherit file descriptors from the tmux server, which might include:
- Network sockets from other tmux sessions
- Log files from unrelated processes
- Pseudo-terminal devices
- IPC pipes and named sockets
I've debugged applications that mysteriously held file locks or network ports, only to discover they inherited these resources through the tmux process tree. This inheritance makes resource tracking nearly impossible and can lead to subtle bugs that only manifest under specific conditions.
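A quick way to see what a process has actually inherited is to list its open descriptors. Using the hypothetical PID from earlier:
# Every entry here was either opened by the app or inherited from an ancestor
$ ls -l /proc/1239/fd
# Look specifically for sockets and terminal devices it shouldn't own
$ ls -l /proc/1239/fd | grep -E "(socket|pts)"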
The Multi-User Chaos
In team environments, tmux sessions become a source of chaos. Multiple developers create sessions with overlapping names, processes get started in sessions owned by colleagues who have left the company, and there's no clear ownership or lifecycle management. This is especially problematic on cloud servers where there's typically only one user account like ubuntu, making it impossible to determine who started which process:
# Whose process is this? Who can manage it?
$ tmux list-sessions
alice-dev: 3 windows (created Mon Mar 4 09:15:23 2019)
bob-training: 1 window (created Thu Feb 28 14:22:11 2019)
legacy-import: 5 windows (created Sun Feb 10 11:45:33 2019)
When Bob leaves the company, his training job keeps running indefinitely. When the legacy import process needs to be restarted, nobody knows what it does or how to safely stop it. On cloud instances where everyone uses the same ubuntu user, there's no way to determine process ownership or responsibility.
The Monitoring Blind Spot
Traditional monitoring tools struggle with tmux-managed processes. Process monitoring systems expect direct parent-child relationships with init or systemd. When your application is buried in a tmux hierarchy, monitoring agents often can't accurately track:
- Resource usage attribution
- Process lifecycle events
- Restart counts and failure patterns
- Log correlation and aggregation
Here's an example of how monitoring gets confused:
# Monitoring agent sees this process tree
$ pstree -p 1237
tmux: server(1237)───bash(1238)───your_app(1239)───worker(1240)
# But the monitoring config expects this
your_app(1239)───worker(1240)
# Result: Failed health checks and false alerts
The monitoring system might track the tmux server's resource usage instead of your application's, leading to meaningless metrics and missed performance issues.
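One concrete way to see the attribution problem is to check which cgroup a process is accounted to. On a cgroup v2 system the paths look roughly like this (illustrative, not exact):
# Started inside a tmux session: accounted to the interactive user session
$ cat /proc/1239/cgroup
0::/user.slice/user-1000.slice/session-3.scope
# Started as a systemd service: accounted to its own service cgroup
$ cat /proc/1239/cgroup
0::/system.slice/your_app.service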
The systemctl Alternative: Proper Service Management
Instead of fighting against the operating system's process management, I embrace it. Modern Linux distributions provide robust, battle-tested service management through systemd, and it's designed specifically for this use case.
Creating a Proper Service
Here's how I approach long-running processes. Instead of launching my application in a tmux session, I create a systemd service file:
[Unit]
Description=My Application
After=network.target
Wants=network.target
[Service]
Type=simple
User=myapp
Group=myapp
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/bin/myapp --config /etc/myapp/config.toml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
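Assuming that file is saved as /etc/systemd/system/myapp.service (the name and paths are examples, not a prescription), getting it running takes a handful of commands:
$ sudo systemctl daemon-reload        # pick up the new unit file
$ sudo systemctl enable --now myapp   # start it now and at every boot
$ systemctl status myapp              # confirm it's active and see its main PID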
This approach immediately provides several advantages that no terminal multiplexer can match.
Clean Process Hierarchy
With systemctl, your process hierarchy is clean and predictable:
$ pstree -p 1
systemd(1)─┬─systemd-journald(123)
├─systemd-networkd(124)
├─your_app(1239)───worker_thread(1240)
└─other_service(1241)
# Direct relationship, no intermediate processes
$ ps -o pid,ppid,comm 1239
PID PPID COMMAND
1239 1 your_app
This direct relationship with systemd (PID 1) eliminates the complex process hierarchy problems. There are no intermediate processes that can die and orphan your application. Signal delivery is direct and predictable. Resource tracking is accurate and straightforward.
Reliable Signal Handling
One of the most compelling reasons to use systemctl is proper signal handling. Instead of hoping your process survives terminal disconnections or multiplexer crashes, you design your application to respond gracefully to system signals:
- SIGTERM: Graceful shutdown with cleanup
- SIGHUP: Configuration reload without restart
- SIGINT: Immediate but clean termination
- SIGUSR1/SIGUSR2: Custom application behaviors
When your application properly handles these signals, you gain powerful operational capabilities. Need to reload configuration? systemctl reload myapp. Need to restart gracefully? systemctl restart myapp. The system handles the signal delivery reliably, and your application responds predictably.
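Custom signals are just as easy to deliver through systemctl, with no guessing about process groups. As a rough sketch, using the example unit name from above:
# Send SIGUSR1 to the service's main process only, not its children
$ sudo systemctl kill --signal=SIGUSR1 --kill-who=main myapp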
Built-in Monitoring and Logging
Systemctl provides comprehensive process monitoring out of the box. Process crashes are automatically detected and logged. Restart policies can be configured with delays and rate limits. Resource limits can be enforced. All of this functionality would require custom scripting and monitoring in a tmux-based approach.
The integration with systemd's journal means your application logs are automatically indexed, timestamped, and accessible through standard tools like journalctl. No more hunting through tmux session histories or forgotten log files.
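In practice that makes log access a one-liner (again using the example unit name):
# Follow the service's output live
$ journalctl -u myapp -f
# Or review everything it logged since the last boot
$ journalctl -u myapp -b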
Proper User and Permission Management
Services running through systemctl can be configured to run as specific users with minimal privileges. This follows the principle of least privilege and provides better security isolation. Compare this to tmux sessions, which typically run with the permissions of whoever created them, often overprivileged administrative users.
Resource Control and Limits
Systemd provides sophisticated resource management through cgroups. You can limit CPU usage, memory consumption, and I/O bandwidth for your services. This level of resource control is impossible with terminal multiplexers and provides crucial protection against runaway processes.
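As a rough sketch, limits can even be applied at runtime without editing the unit file. Exact directive names depend on your systemd version and cgroup setup, so treat the values below as illustrative:
# Cap the example service at half a CPU core and 512 MB of memory
$ sudo systemctl set-property myapp.service CPUQuota=50% MemoryMax=512M
# Watch per-service resource usage live, grouped by cgroup
$ systemd-cgtop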
Signal Handling: The Developer's Responsibility
The key to making this approach work is building applications that properly handle signals. This isn't just about systemctl—it's about creating robust, well-behaved processes that integrate cleanly with the operating system.
Graceful Shutdown with SIGTERM
Every long-running application should handle SIGTERM gracefully:
import signal
import sys
import time

def signal_handler(signum, frame):
    print(f"Received signal {signum}, shutting down gracefully...")
    # Close database connections
    # Finish processing current requests
    # Clean up temporary files
    sys.exit(0)

signal.signal(signal.SIGTERM, signal_handler)
signal.signal(signal.SIGINT, signal_handler)

# Main work loop; blocks until a signal triggers the handler
while True:
    time.sleep(1)
This ensures that when systemctl stops your service, it shuts down cleanly rather than being forcefully killed.
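With the handler in place, a graceful stop is easy to exercise either through systemctl or by hand. A quick sanity check might look like this (myapp.py is a stand-in for your entry point):
# Through the service manager: sends SIGTERM, escalates to SIGKILL after a timeout
$ sudo systemctl stop myapp
# Or against a test run of the script
$ python3 myapp.py &
$ kill -TERM $!
# Expect: "Received signal 15, shutting down gracefully..."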
Configuration Reloading with SIGHUP
SIGHUP handling allows for configuration changes without service interruption:
def reload_config(signum, frame):
print("Reloading configuration...")
# Re-read configuration files
# Update runtime settings
# Log the reload event
signal.signal(signal.SIGHUP, reload_config)
This enables systemctl reload myapp to update your application's configuration without dropping connections or losing state.
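Because the unit file above maps ExecReload to kill -HUP $MAINPID, the whole reload path can be verified end to end:
$ sudo systemctl reload myapp
$ journalctl -u myapp -n 5    # expect to see "Reloading configuration..."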
Operational Benefits
The operational advantages of this approach become clear in production environments:
Consistency Across Environments
The same service definition works in development, staging, and production. No need to remember different tmux session names or screen commands across environments.
Integration with Monitoring Systems
Monitoring tools like Nagios, Prometheus, or Datadog can easily check service status through systemctl. Alert systems can automatically restart failed services or escalate issues based on service state.
Automatic Startup and Recovery
Services configured with systemctl start automatically on boot and restart after crashes. This reliability is crucial for production systems and eliminates the manual intervention required with terminal multiplexer approaches.
Centralized Management
All services are managed through the same interface. System administrators don't need to learn different tools or remember various session management commands. Everything goes through systemctl.
When Terminal Multiplexers Still Make Sense
I'm not advocating for the complete elimination of tmux or screen. They excel in specific scenarios:
- Interactive Development: When you need multiple terminal panes for coding, testing, and monitoring simultaneously
- Temporary Tasks: Short-lived processes that benefit from terminal interaction
- Remote Pair Programming: Shared terminal sessions for collaborative work
- Emergency Debugging: Quick process inspection during incident response
The key is using the right tool for the right job. Long-running services deserve proper service management, not ad-hoc terminal solutions.
Making the Transition
If you're currently managing processes through terminal multiplexers, the transition to systemctl doesn't have to be abrupt:
- Start with new services: Create proper service files for any new long-running processes
- Identify critical processes: Move your most important services to systemctl first
- Document service files: Create a standard template for your organization
- Train your team: Ensure everyone understands basic systemctl commands
- Gradually migrate: Move existing processes during maintenance windows
Conclusion
Terminal multiplexers are powerful tools, but they're not the right solution for every problem. When it comes to long-running processes, proper service management through systemctl provides superior reliability, monitoring, resource control, and operational consistency.
By embracing signal handling and designing applications that integrate cleanly with the operating system, we create more robust, maintainable, and scalable systems. The few minutes spent creating a service file and implementing proper signal handling pay dividends in operational simplicity and system reliability.
Your production services deserve better than living in forgotten tmux sessions. Give them the proper process management they need to thrive.