Bashing bash
I recently realized that my bash scripts tend to rot with every single line I add. The snippet below illustrates it well:
#!/usr/bin/env bash
# this script is to be run every 30 minutes
for f in $(find -name "*.json"); do
    lockfile="$(basename $f).lock"
    if [ -f "$lockfile" ]; then
        continue
    fi
    touch "$lockfile"
    # exposes $USERNAME and $PASSWORD
    ./json_to_envvars.py "$f"
    # backgrounding so we can do many actions in parallel
    ./do_action.sh "$USERNAME:$PASSWORD" &
    rm -f "$lockfile"
done
There are multiple pain points here.
- The for loop will bork on filenames with whitespace.
  - There are workarounds using null bytes and such, but who remembers what flags to use? Is it -z or -0? For find or for xargs??? Nobody but man(1) knows. (See the sketch after this list.)
- The command substitution doesn't have proper quoting; this is the correct way:
  lockfile="$(basename "$f").lock" # yes, I'm vomiting
- Parsing JSON in bash is painful, so I tend to shell out to either a helper script (so now we depend on python being installed, great) or jq (which makes me question my sanity after a while)
- What happens if any of these commands fail? Should we abort? Design team? Hello?
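For completeness, here's roughly what the defensive version of that loop looks like. A sketch, not gospel: set -euo pipefail and the -print0/read -d '' pairing are the standard incantations, and the jq line assumes the JSON has top-level username/password keys.

#!/usr/bin/env bash
# die on errors, on unset variables, and on failures anywhere in a pipeline
set -euo pipefail

# null bytes as delimiters: -print0 for find, -0 if you pipe to xargs, -d '' for read
find . -name "*.json" -print0 | while IFS= read -r -d '' f; do
    # jq instead of the python helper; assumes top-level username/password keys
    creds="$(jq -r '"\(.username):\(.password)"' "$f")"
    ./do_action.sh "$creds" &
done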
The single biggest mistake made here, however, is using the file system as the state. I see this tendency all the time in my scripts. Imagine trying to debug this mess when the whole file system is your state machine. I understand why containers are so hot; they restrict persistent state to volumes.
I thought I was clever when I got my college degree. Yet here I am, writing scripts like these for a living. $lockfile might not even be writable by the current process. Shit.
Also, if the parameter to ./do_action.sh is to be kept confidential, you're screwed by lurking ps aux cowboys. Such a silly thing, imagine using command line parameters for passing secrets. When people say C has footguns, they forget to mention that those footguns are still present, in other shapes, in bash.
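One way out, assuming do_action.sh could be taught to read its credentials from stdin (it can't today, so the flag below is made up):

# stdin rides a private pipe; argv is world-readable via ps and /proc
# --creds-on-stdin is a hypothetical flag for our own script
printf '%s:%s\n' "$USERNAME" "$PASSWORD" | ./do_action.sh --creds-on-stdin &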
What's the remedy? Just move to a real language. That way, you avoid
- for-loop shenanigans
- worrying about quoting every single instance of every variable in existence
- shelling out to a million different utility scripts and nullifying every co-worker's will to live after they try to follow the program flow
- a whole class of bugs related to writing state to the file system
- leaking confidential information, just by existing
The same logic, implemented in Python:
#!/usr/bin/env python3
import glob
import json

with open('locks.json') as f:
    locks = json.load(f)

def lock(file):
    locks[file] = True

def unlock(file):
    locks[file] = False

for file in glob.glob("**/*.json", recursive=True):
    if file == 'locks.json':  # don't treat our own state file as input
        continue
    if locks.get(file):
        continue
    lock(file)
    with open(file) as f:
        envvars = json.load(f)
    arg = f'{envvars["username"]}:{envvars["password"]}'
    do_action(arg, background=True)  # do_action: assumed in-process helper, sketched below
    unlock(file)

with open('locks.json', 'w') as f:
    json.dump(locks, f)
Now that I look at it, it might actually be the case that I am not a great system designer quite yet, because the Python code above instead has race conditions if multiple instances of the main script are invoked. BUT! My talking points still stand, solid as a rock:
- for-loop shenanigans ✅ no more
- worrying about quoting every single variable 🙅‍♂️ never again
- shelling out to a million different utility scripts 🐖 how about I smell ya later instead
- a whole class of bugs related to writing state to the file system 📁 ok but this was a design quirk, I just learned
- leaking confidential information just by existing 🤫 python function calls are very quiet by default
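That last point comes with a caveat, though. do_action has been hypothetical this whole time; here is a minimal in-process sketch of it, using a thread for background=True. The moment it subprocesses out to ./do_action.sh with the secret on argv, the ps aux cowboys are back:

import threading

def _work(credentials):
    # the actual action goes here, in-process; the secret never touches argv
    username, _password = credentials.split(":", 1)
    print(f"doing the thing as {username}")

def do_action(credentials, background=False):
    # hypothetical helper: a thread instead of a backgrounded shell job
    if background:
        threading.Thread(target=_work, args=(credentials,)).start()
    else:
        _work(credentials)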
That leads me to thinking: how do you design a system to be resilient to deadlocks, whilst allowing parallel processing? The file system is inherently parallelizable (just write to another .lock file) and has a single namespace.
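In fairness to the file system, the kernel already hands out advisory locks that are released when the process holding them dies, so stale lock files can't happen. A sketch using fcntl (Unix-only, and locking the JSON files themselves, which is my own twist):

import fcntl

def try_lock(path):
    # take an exclusive, non-blocking advisory lock on the file itself
    f = open(path)
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f  # hold on to the handle; closing it (or crashing) releases the lock
    except BlockingIOError:
        f.close()
        return None  # another process is already on it

If the process crashes mid-action, the lock evaporates with it. No rm -f, no stale state.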
Unless we daemonize the Python script?
#!/usr/bin/env python3
from datetime import datetime, timedelta
import json

import pause  # third-party: pip install pause

with open('locks.json') as f:
    locks = json.load(f)

while True:
    next_run = datetime.now() + timedelta(minutes=30)
    try:
        do_the_for_loop(locks)  # the loop from above, extracted into a function
    except Exception:
        save_locks(locks)  # persist state even when we crash
        raise
    save_locks(locks)
    pause.until(next_run)
That's hot. I really like this architecture more. The state is written to disk on each run, successful or not, and the script can be service-ified by systemd so that it is restarted when it dies.
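For reference, the unit file is about ten lines; the paths and names below are made up, but Restart= is the part doing the work:

[Unit]
Description=JSON action runner

[Service]
ExecStart=/usr/local/bin/json_runner.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Restart=always brings the daemon back after a crash, and since the locks are saved before re-raising, the next incarnation picks up where the last one left off.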
Is this how a design session usually goes? Cause I need more of those at $MY_COMPANY then. Also, remember that I've only worked with corporate software since September last year; I'm still learning to become a code monkey.