Building argparsh - a shell agnostic argument parser
Argument parsing in the shell doesn’t have to suck, so I made it not suck.
Writing a robust argument parser and good help/usage text is like eating healthy. There are a few really disciplined people out there who do it every day, but most of us don’t really pay attention to it until our habits catch up to us, and we find ourselves facing the consequences of our past actions. This usually comes to me in the form of picking up a script that I wrote weeks, months, or even years ago and not remembering what the arguments were. I then have to read the code and work out how the arguments are consumed and what each argument represents. It takes what should be an easy task and turns it into an annoying chore. Just like with healthy food habits, I don’t think it’s fair to just blame people. Often, a bad diet is indicative of regional food supply issues more than personal responsibility, and in this analogy those issues are poor devtools.
So, I decided to build argparsh. It uses Python’s argparse as the underlying implementation, which not only allows me to avoid reinventing the wheel, but also provides an interface that many people are already familiar with. I’ve been using argparsh in some other projects and I’m very happy with what it can do.
Why not use getopts?
There’s already a tool that claims to do what I want: getopts. getopts is a longstanding utility for handling arguments, but in my opinion, it has not aged gracefully. My main gripes are that options are not self-documenting, positional arguments aren’t easily described, and it’s still incumbent upon the user to provide their own usage/help text and to handle incorrect invocations robustly.
Designing a new stateful argument parser generator
I believe that the approach of getopts, with a DSL as an argument to the parsing program, is flawed. Here’s some of my reasoning:
- There’s little room for users to get feedback on whether or not their invocation is correct with the DSL being embedded as a script.
- It’s hard to edit/compose existing parsers/fragments of parsers.
This led me to explore other existing methods of argument parsing. What I found was two main approaches:
- Function Annotations
- Parser Objects
Function Annotations describes libraries like click, which decorate functions so that command line arguments are turned into the arguments of a function call. However, these kinds of libraries are deeply integrated with the language they’re built upon and don’t easily lend themselves to being a parser that’s invoked as a subprocess.
import click

@click.command()
# Option names must start with a dash; a bare "value" is rejected by click.
@click.option("--value", help="some value")
def app(value):
    print(f"{value=}")

if __name__ == "__main__":
    app()
Parser Objects describes libraries where a stateful object that represents the argument parser is constructed over a series of method calls before being used to parse input arguments (e.g. argparse). This construction lends itself well to composition, as the method calls can be composed.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("value", help="some value")

if __name__ == "__main__":
    args = parser.parse_args()
    print(f"{args.value=}")
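To illustrate the composition claim, here is a minimal sketch in which parser fragments are plain functions applied to the same parser object. The function names (add_common_args, add_value_arg, build_parser) and the -v/--verbose option are hypothetical and only serve the example.

```python
import argparse


def add_common_args(parser: argparse.ArgumentParser) -> None:
    """Hypothetical reusable fragment: an option shared by several tools."""
    parser.add_argument("-v", "--verbose", action="store_true", help="print more output")


def add_value_arg(parser: argparse.ArgumentParser) -> None:
    """Hypothetical reusable fragment: the positional argument used above."""
    parser.add_argument("value", help="some value")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="app")
    # Each fragment is an ordinary function call, so fragments can be
    # reordered, reused, or dropped without touching the others.
    for fragment in (add_common_args, add_value_arg):
        fragment(parser)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-v", "hello"])
    print(args.verbose, args.value)
```

Because every step is just a method call on shared state, the same idea carries over to a sequence of separate commands, provided the state can be passed between them.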
To me, this seemed like the correct interface for building an argument parser in a language agnostic way. The only problem was figuring out how to model the state.
My initial thought was to have an interface like the following:
parser=$(argparsh new $0)
parser=$(argparsh add_arg $parser value --help="some value")
...
eval $(argparsh parse "$parser" "$@")
In this approach, the idea was that each command would produce part of a Python script:
~$ argparsh new foo
import argparse
parser = argparse.ArgumentParser("foo")
~$ argparsh add_arg "$parser" value --help="some value"
import argparse
parser = argparse.ArgumentParser("foo")
parser.add_argument("value", help="some value")
and then parse would simply append parser.parse_args(CLI ARGS) and then convert the resulting value into something that could be evaluated by the shell. However, this seemed cumbersome, and I realized that it was possible to get this to work without having to take in the “state” object for every call:
~$ {
    argparsh new foo
    argparsh add_arg value --help="some value"
}
import argparse
parser = argparse.ArgumentParser("foo")
parser.add_argument("value", help="some value")
This approach limits the error checking that can be done in each call, but that seemed acceptable to me. The next issue was that, because each command outputs a piece of a Python script, the variable holding the state will contain whitespace. This means that omitting quotes when using the variable will cause weird error states for users. To work around this, I instead serialized some values for each command with pickle, then URL-encoded those bytes to get a string with no spaces. Then I had each command print & followed by the URL-encoded data and end without printing a newline, so that the concatenated state variable could be a single string with no whitespace.
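A minimal sketch of that encoding, assuming each command records a method name plus its arguments. The helper names (emit, replay) and the exact tuple layout are hypothetical and are not taken from the argparsh source; only the pickle, URL-encode, and & prefix steps come from the description above.

```python
import argparse
import pickle
from urllib.parse import quote_from_bytes, unquote_to_bytes


def emit(method: str, *args, **kwargs) -> str:
    """Serialize one parser-building step as '&' + URL-encoded pickle bytes."""
    payload = pickle.dumps((method, args, kwargs))
    # safe="" percent-encodes everything except letters, digits and "_.-~",
    # so a chunk can contain neither whitespace nor a raw '&'.
    return "&" + quote_from_bytes(payload, safe="")


def replay(state: str) -> argparse.ArgumentParser:
    """Rebuild a parser by replaying every serialized step in order."""
    parser = None
    for chunk in filter(None, state.split("&")):
        # Unpickling is only acceptable here because the state is produced
        # by the same script; never unpickle untrusted input.
        method, args, kwargs = pickle.loads(unquote_to_bytes(chunk))
        if method == "new":
            parser = argparse.ArgumentParser(*args, **kwargs)
        elif parser is None:
            raise ValueError("state must begin with a 'new' step")
        else:
            getattr(parser, method)(*args, **kwargs)
    if parser is None:
        raise ValueError("empty state")
    return parser


if __name__ == "__main__":
    # Concatenating the chunks mimics running the commands inside { ... }.
    state = emit("new", prog="foo") + emit("add_argument", "value", help="some value")
    print(replay(state).parse_args(["hello"]))
```

Since every chunk starts with & and contains no whitespace, the concatenated output survives unquoted use in the shell and can be split back into individual steps.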
In this process, I found myself wishing that shells had some kind of standard interface for an object with methods to mutate internal state.
The final version of the tool looks like this:
# Create a parser program
parser=$({
    argparsh new $0 -d "argparsh example" -e "bye!"
    argparsh add_arg "a" -- \
        --choices "['a', 'b', 'c']" \
        --help "single letter arg"
    argparsh add_arg -i --interval -- --type int --default 10
    argparsh add_arg -f -- --action store_true
    argparsh subparser_init --required true
    argparsh subparser_add foo
    argparsh subparser_add bar
    argparsh add_arg --subparser foo qux
    argparsh set_defaults --subparser foo --myarg foo
    argparsh add_arg --subparser bar baz
    argparsh set_defaults --subparser bar --myarg bar
})

# Parse the command line and load the results as shell variables
eval $(argparsh parse $parser --prefix arg_ -- "$@")

echo "Parsed args as shell variables:"
echo "[arg]: a="$arg_a
echo "[arg]: interval="$arg_interval
echo "[arg]: f="$arg_f
if [ "$arg_myarg" == "foo" ]; then
    echo "[arg]: qux="$arg_qux
else
    if [ "$arg_myarg" == "bar" ]; then
        echo "[arg]: baz="$arg_baz
    fi
fi
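The eval line in the script above depends on parse printing something the shell can evaluate. Here is a minimal sketch of one way to turn an argparse namespace into prefixed shell assignments. The to_shell helper is hypothetical, and the way booleans and None are rendered is an assumption; the real argparsh output may differ.

```python
import argparse
import shlex


def to_shell(namespace: argparse.Namespace, prefix: str = "") -> str:
    """Render parsed arguments as shell variable assignments, one per line."""
    lines = []
    for name, value in vars(namespace).items():
        # Assumption: None becomes an empty string and everything else is
        # rendered with str(), so booleans appear as True/False.
        text = "" if value is None else str(value)
        # shlex.quote keeps values with spaces or quotes safe to eval.
        lines.append(f"{prefix}{name}={shlex.quote(text)}")
    return "\n".join(lines)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("a", choices=["a", "b", "c"])
    parser.add_argument("-i", "--interval", type=int, default=10)
    parser.add_argument("-f", action="store_true")
    print(to_shell(parser.parse_args(["b", "-f"]), prefix="arg_"))
```

The prefix plays the same role as --prefix arg_ in the script: it keeps the generated variables from clobbering names the script already uses.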
Conclusions
There are other details that were involved in making this project not only work, but work robustly enough for me to feel comfortable including it in other projects I maintain. To see more about that, I recommend taking a look at the code here. In the next post, I will write a bit about optimizing argparsh, and why that became necessary.
Here are some key takeaways I had while working on and using the initial prototype:
- Portable tools for building argument parsers are actually a very good idea
- Shells could benefit from objects
- Python as a means of augmenting the shell