Proxying to a subcommand with Go

There are a number of Unix commands that manipulate something about the environment or arguments in some way before starting a subcommand. For example:

  • xargs reads arguments from standard input and appends them to the end of the command.

  • chpst changes the process state before calling the resulting command.

  • envdir manipulates the process environment, loading variables from a directory, before starting a subcommand.

Go is a popular language for writing tools that shell out to a subprocess, for example aws-vault or chamber, both of which have exec options for initializing environment variables and then calling your subcommand.

Unfortunately there are a number of ways that you can make mistakes when proxying to a subcommand, especially one where the end user is specifying the subcommand to run, and you can't determine anything about it ahead of time. I would like to list some of them here.

First, create the subcommand.

cmd := exec.Command("mybinary", "arg1", "-flag")
// All of our setup work will go here
if err := cmd.Run(); err != nil {
    log.Fatal(err)
}

Standard Output/Input/Error

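You probably want to proxy the subcommand's standard output and standard error through to the parent terminal. If you leave cmd.Stdout and cmd.Stderr unset, the child's output is discarded, so assign them to the parent's os.Stdout and os.Stderr. You can point them at a file or buffer instead if you want to redirect the output.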

cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
// If the subcommand needs to read from standard input
cmd.Stdin = os.Stdin
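
As an illustration of redirecting, here is a minimal sketch that still passes the child's standard output through to the terminal but also keeps a copy in memory, using io.MultiWriter. The helper name runCapturing is hypothetical, and the echo command in main is only a stand-in.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// runCapturing (hypothetical helper) runs a command, writes its standard
// output to the parent's stdout, and also returns a copy of that output.
func runCapturing(name string, args ...string) (string, error) {
	var buf bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdin = os.Stdin
	// Every write from the child goes to both the terminal and the buffer.
	cmd.Stdout = io.MultiWriter(os.Stdout, &buf)
	cmd.Stderr = os.Stderr
	err := cmd.Run()
	return buf.String(), err
}

func main() {
	out, err := runCapturing("echo", "hello")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("captured %d bytes\n", len(out))
}
```

One thing to keep in mind: when cmd.Stdout is not an *os.File, os/exec copies the output through a pipe, so the child no longer sees a terminal on that descriptor. Some programs change their buffering or drop color output when that happens.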

Environment Variables

By default the subcommand gets all of the environment variables that belong to the Go process. But if you reassign cmd.Env it will only get the variables you provide, so you have to be careful with your implementation.

cmd.Env = []string{"TZ=UTC", "MYVAR=myval"}
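
If you want to add or override a few variables while keeping everything else, one approach is to start from os.Environ() and append to it. This is a minimal sketch; the helper name mergedEnv is hypothetical. It relies on the documented os/exec behavior that when cmd.Env contains duplicate keys, the last value for each key wins, so appended entries override inherited ones.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// mergedEnv (hypothetical helper) returns a new slice containing base
// followed by overrides. It copies base so the caller's slice is not modified.
func mergedEnv(base []string, overrides ...string) []string {
	env := make([]string, 0, len(base)+len(overrides))
	env = append(env, base...)
	return append(env, overrides...)
}

func main() {
	// "env" prints the environment it receives; it is only a stand-in here.
	cmd := exec.Command("env")
	cmd.Env = mergedEnv(os.Environ(), "TZ=UTC", "MYVAR=myval")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```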

Signals

If someone sends a signal to the parent process, you probably want to forward it through to the child. This gets tricky because you can receive multiple signals, all of which should be sent to the child and handled appropriately.

You also can't send signals until the process has started, so call cmd.Start instead of cmd.Run, wait for the command in a goroutine, and forward signals from the main loop. I've copied this code from aws-vault, with some modifications.

if err := cmd.Start(); err != nil {
    log.Fatal(err) // Command not found on PATH, not executable, &c.
}
// wait for the command to finish
waitCh := make(chan error, 1)
go func() {
    waitCh <- cmd.Wait()
    close(waitCh)
}()
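// Buffered, because signal.Notify does not block when delivering to the channel
sigChan := make(chan os.Signal, 1)
// With no signals listed, Notify relays every incoming signal
signal.Notify(sigChan)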

// You need a for loop to handle multiple signals
for {
    select {
    case sig := <-sigChan:
        if err := cmd.Process.Signal(sig); err != nil {
            // Not clear how we can hit this, but probably not
            // worth terminating the child.
            log.Printf("error sending signal %v: %v", sig, err)
        }
    case err := <-waitCh:
        // Subprocess exited. Get the return code, if we can
        var waitStatus syscall.WaitStatus
        if exitError, ok := err.(*exec.ExitError); ok {
            waitStatus = exitError.Sys().(syscall.WaitStatus)
            os.Exit(waitStatus.ExitStatus())
        }
        if err != nil {
            log.Fatal(err)
        }
        return
    }
}
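
Here is a variation on the loop above as a self-contained sketch, not aws-vault's code. It makes a few assumptions of its own: it forwards only a fixed list of signals (calling signal.Notify with no list also relays SIGCHLD when the child exits and, as far as I know, the SIGURG signals that recent Go runtimes use internally), it uses ExitError.ExitCode, and it follows the shell convention of returning 128+N when the child was killed by signal N and 127 when the command could not be started. The names proxy and exitCode are hypothetical, and the code assumes a Unix system.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// exitCode (hypothetical helper) maps the error from cmd.Wait to an exit code.
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// Assumption: mirror the shell convention of 128+N for signal N.
		if ws, ok := exitErr.Sys().(syscall.WaitStatus); ok && ws.Signaled() {
			return 128 + int(ws.Signal())
		}
		return exitErr.ExitCode()
	}
	return 1 // I/O or other failure while waiting
}

// proxy starts the command, forwards selected signals to it, and returns
// the exit code the parent should use.
func proxy(name string, args ...string) int {
	cmd := exec.Command(name, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Start(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 127 // assumption: shell convention for "could not run"
	}

	// A larger buffer makes it less likely that a burst of signals is dropped.
	sigCh := make(chan os.Signal, 16)
	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM, syscall.SIGHUP, syscall.SIGQUIT)
	defer signal.Stop(sigCh)

	waitCh := make(chan error, 1)
	go func() { waitCh <- cmd.Wait() }()

	for {
		select {
		case sig := <-sigCh:
			if err := cmd.Process.Signal(sig); err != nil {
				fmt.Fprintf(os.Stderr, "error sending signal %v: %v\n", sig, err)
			}
		case err := <-waitCh:
			return exitCode(err)
		}
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: proxy command [args...]")
		return
	}
	os.Exit(proxy(os.Args[1], os.Args[2:]...))
}
```

Which signals to forward is a judgment call; the list here is only a starting point and depends on what the subcommand expects.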

Just Use Exec

The above code block is quite complicated. If all you want to do is start a subcommand and then exit the process, you can avoid all of the above logic by calling exec. Unix has a notion of parent and child processes: when you start a process with cmd.Run, it becomes a child of the Go process. exec is a system call that replaces the program running in the current process with a new one, keeping the same process ID, so the subcommand takes the place of the Go program instead of running beneath it. Consider the following program:

cmd := exec.Command("sleep", "3600")
if err := cmd.Run(); err != nil {
    log.Fatal(err)
}
fmt.Println("terminated")

If I run it, the process tree will look something like this:

 | |   \-+= 06167 kevin -zsh
 | |     \-+= 09471 kevin go-sleep-command-proxy
 | |       \--- 09472 kevin sleep 3600

If I call exec instead (syscall.Exec in Go), sleep replaces the Go program in the same process, and the process tree looks something like this:

 | |   \-+= 06167 kevin -zsh
 | |     \--= 09862 kevin sleep 3600
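
A minimal sketch of a Go program that execs its arguments might look like the following. The helper name execArgs is hypothetical. Two details are easy to miss: syscall.Exec does not search PATH, so the binary is resolved first with exec.LookPath, and the argument slice has to include the program name as its first element.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"syscall"
)

// execArgs (hypothetical helper) resolves name on PATH and builds the argv
// slice for syscall.Exec, with the program name as argv[0].
func execArgs(name string, args ...string) (string, []string, error) {
	binary, err := exec.LookPath(name)
	if err != nil {
		return "", nil, err
	}
	argv := append([]string{name}, args...)
	return binary, argv, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: execproxy command [args...]")
		return
	}
	binary, argv, err := execArgs(os.Args[1], os.Args[2:]...)
	if err != nil {
		log.Fatal(err)
	}
	// Any environment setup would happen here, before the exec.
	// On success Exec never returns: the Go program is gone at that point.
	if err := syscall.Exec(binary, argv, os.Environ()); err != nil {
		log.Fatal(err)
	}
}
```

Invoked with sleep 3600 as its arguments, this should produce a tree like the one above. Note that deferred functions and anything else after the Exec call will not run once it succeeds.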

This is really useful for us, because:

  • We don't need to proxy through signals or stdout/stderr anymore! Those are automatically handled for us.

  • The return code of the exec'd process becomes the return code of the program.

  • We don't have to use memory running the Go program.

Another reason to use exec: even if you get the proxying code perfectly right, proxying commands (as we did above) can still go wrong. The Go process can't catch SIGKILL, so there's a chance that someone will send SIGKILL to the Go process and the subprocess will be "orphaned." The subprocess keeps running and is reparented to PID 1, which makes it tougher to track down and clean up.

I should note: syscalls vary from operating system to operating system, so check that the arguments you pass to syscall.Exec make sense for the operating system you're using. Rob Pike and Ian Lance Taylor also think it's a bad idea.
