For this round of Miscellaneous Findings, we have a bunch of ways to mess with text. They all use tools that come with Unix, so they should work without having to install extra junk, if you are working on a Unix-based OS. The last finding, using regexes in sed, I found particularly useful for converting hundreds of JS files that used CommonJS (e.g. var something = require('something')) to use ES Modules (import something from 'something').
This is a roundup of miscellaneous things that I’ve found out about (or have rediscovered). I take notes on findings regularly, and I put the findings that translate well to speech on my podcast, Small Findings. The rest (which are often technical findings), I put here. They’re not always written up for maximum comprehension as a blog post, so if anything is hard to understand, please email me for clarification.
Replacing all with sed
To replace all instances of a string in a directory tree with another string, do a find for the file types you want to target, then pipe that to xargs to run sed on the files it finds.
Example:
find . -type f \( -name '*.md' -o -name '*.js' -o -name '*.json' -o -name 'Makefile' \) | xargs sed -i "s/small-findings/smallfindings/g"
Where:
- md, js, json, and Makefile are the kinds of files in which the replacement should be made.
- xargs is telling sed to run with:
  - A regular expression that replaces small-findings with smallfindings
  - A list of files that is whatever the find command found.
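One thing the one-liner above can trip on is filenames that contain spaces, because xargs splits its input on whitespace by default. Here is a minimal sketch of a more defensive variant, assuming your find and xargs support -print0 and -0 (the GNU and BSD versions do). The sandbox directory and the file names in it are made up for the demo, and -i.bak is used instead of a bare -i because that form works with both GNU and BSD sed.

```shell
# Hypothetical sandbox so the sketch does not touch real files.
demo_dir=$(mktemp -d)
printf 'see small-findings\n' > "$demo_dir/my notes.md"
printf 'small-findings again\n' > "$demo_dir/index.js"
printf 'small-findings untouched\n' > "$demo_dir/skip.txt"

# -print0 and -0 separate filenames with NUL bytes, so the space in
# "my notes.md" does not split it into two arguments.
# -i.bak edits in place and keeps a backup copy next to each file.
find "$demo_dir" -type f \( -name '*.md' -o -name '*.js' \) -print0 |
  xargs -0 sed -i.bak 's/small-findings/smallfindings/g'

# The .md and .js files are changed; skip.txt is not, because find never listed it.
cat "$demo_dir/my notes.md" "$demo_dir/index.js" "$demo_dir/skip.txt"
```

The .bak copies are handy while you are still checking the result. Once you trust it, you can delete them.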
#unix #bash #programming #text
xargs
(I’ve seen xargs a lot in shell scripts I’ve used and had a hazy idea about what it did but only just now did I actually look it up.)
xargs is a command that:
- Runs another command for you
- Converts stuff piped to it via stdin into command-line arguments for that other command
It’s a glue tool that’s necessary because:
- A lot of Unix commands communicate via stdin/stdout pipes
- Some do not, and only take their input as command-line arguments
In that way, it’s like apply in JavaScript, which converts arrays into function arguments.
As an example, you can use it to pass the results (a bunch of filenames) of get-entries-in-date-range to cat to mash up the results into a single file:
./tools/get-entries-in-date-range.sh 2020-03-28 | xargs cat > episode-2-script.md
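Since get-entries-in-date-range.sh is my own script, here is a small self-contained sketch of the same idea that you can run anywhere. The file names (one.md, two.md, script.md) are stand-ins invented for the demo, and it assumes the temp directory path has no spaces in it.

```shell
# xargs turns the three lines on stdin into three arguments for echo,
# so this runs: echo a b c
joined=$(printf 'a\nb\nc\n' | xargs echo)
echo "$joined"

# Same idea with cat: the filenames arrive on stdin,
# and cat receives them as command-line arguments.
demo_dir=$(mktemp -d)
printf 'first entry\n' > "$demo_dir/one.md"
printf 'second entry\n' > "$demo_dir/two.md"
printf '%s\n' "$demo_dir/one.md" "$demo_dir/two.md" | xargs cat > "$demo_dir/script.md"
cat "$demo_dir/script.md"
```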
#unix #bash
tr for replacing text
There is a Unix command called tr. You pipe in input text and give it two arguments:
- The set of characters to replace
- The set of replacement characters
Then, it writes the result out to stdout.
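A quick sketch of that, with made-up input strings. The thing to notice is that tr works character by character: each character in the first set is swapped for the character at the same position in the second set.

```shell
# e becomes i, l becomes p.
swapped=$(printf 'hello\n' | tr 'el' 'ip')
echo "$swapped"   # prints: hippo

# tr treats a newline as just another character, so it can join lines.
joined=$(printf 'one\ntwo\nthree\n' | tr '\n' ' ')
echo "$joined"
```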
The nice thing is that it works on multiline text, unlike sed, which handles its input one line at a time by default.
So, you can use it in combination with sed to work around sed’s single-line limitations.
e.g.:
cat in.json | tr \\n @ | sed -e 's/\]@\[/,/g' | tr @ \\n > out.json
That line:
- Pipes in the contents of in.json.
- Replaces line breaks with @. (Thereby making it a single line.)
- Runs sed to replace instances of ]@[ (which were originally ]\n[) with just a comma.
- Reverses the first replacement by replacing @ with line breaks.
- Writes the result to out.json.
So, if in.json happened to be a bunch of concatenated JSON arrays and looked like this:
[
"a",
"b"
]
[
"c",
"d"
]
[
"e",
"f"
]
(Which is not valid JSON.)
The above line would put this into out.json:
[
"a",
"b"
,
"c",
"d"
,
"e",
"f"
]
And that is valid JSON.
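One caveat with this trick is that it assumes the file does not already contain an @. If it does, the last tr will turn those into line breaks too. Here is a sketch of the same pipeline run in a sandbox, with a guard for that case. The guard is my addition for the demo and not part of the one-liner above, and the sample data is a shortened version of the example.

```shell
demo_dir=$(mktemp -d)
printf '[\n"a",\n"b"\n]\n[\n"c",\n"d"\n]\n' > "$demo_dir/in.json"

# Guard (an addition for this sketch): refuse to run if the placeholder
# character already appears in the data.
if grep -q '@' "$demo_dir/in.json"; then
  echo "in.json already contains @, pick another placeholder" >&2
  exit 1
fi

# Same three steps as above: newlines to @, join the arrays, @ back to newlines.
tr '\n' '@' < "$demo_dir/in.json" | sed -e 's/\]@\[/,/g' | tr '@' '\n' > "$demo_dir/out.json"
cat "$demo_dir/out.json"
```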
#tr #shell #unix #sed #text #bash
Regex in sed
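If you use sed without the -r switch, it uses basic regular expressions, in which capture groups have to be written with escaped parentheses, \( and \). With -r (or -E, the more portable spelling, which both GNU and BSD sed accept), you get extended regular expressions and can write capture groups with plain parentheses. That lets you do something like this to replace all instances of var something = require('somepackage') with import something from 'somepackage' in a file: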
sed -r "s/var (\w+) = require\('(.*)'\)/import \1 from '\2'/g" -i myfile.js
(You use \1, \2, et al. to point to capture groups in the replacement clause instead of $1, $2.)
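To see how the parentheses behave in the two regex flavors, here is a small sketch that runs the same substitution both ways on one made-up line. I have swapped \w for the POSIX class [[:alnum:]_] and .* for [^'] in this sketch, on the assumption that those are more likely to behave the same across sed versions. Nothing is edited in place.

```shell
line="var fs = require('fs')"

# Extended regex (-E): plain ( ) capture, and \( \) match literal parentheses.
ere_result=$(printf '%s\n' "$line" |
  sed -E "s/var ([[:alnum:]_]+) = require\('([^']+)'\)/import \1 from '\2'/")

# Basic regex (no switch): \( \) capture, and plain ( ) match literal parentheses.
bre_result=$(printf '%s\n' "$line" |
  sed "s/var \([[:alnum:]_]*\) = require('\([^']*\)')/import \1 from '\2'/")

# Both lines should read: import fs from 'fs'
printf '%s\n' "$ere_result" "$bre_result"
```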
If you want to run that on every JS file in a directory tree, you can pipe the output of a find command that looks for all .js files into xargs, which will run the sed command you give it and append the filenames from find to that command as arguments. It’s sort of like currying.
find . -type f \( -name '*.js' \) | xargs sed -r "s/var (\w+) = require\('(.*)'\)/import \1 from '\2'/g" -i
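Before letting that loose on hundreds of files, it can help to preview what would change. Here is a sketch of one way to do that, using sed's -n switch together with the p flag so that only the lines that would be rewritten get printed. The sandbox files are invented for the demo, the pattern uses [[:alnum:]_] and [^'] in place of \w and .* as a portability assumption, and -i.bak is used for the real run so there is a backup to go back to.

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/lib"
printf "var fs = require('fs')\n" > "$demo_dir/index.js"
printf "var path = require('path')\nvar x = 1\n" > "$demo_dir/lib/util.js"

# The substitution, kept in a variable so both runs use the same expression.
subst="s/var ([[:alnum:]_]+) = require\('([^']+)'\)/import \1 from '\2'/g"

# Dry run: -n silences normal output and the p flag prints only changed lines.
preview=$(find "$demo_dir" -type f -name '*.js' | sort | xargs sed -n -E "${subst}p")
echo "$preview"

# Real run: -i.bak edits in place and leaves a .bak copy of each original.
find "$demo_dir" -type f -name '*.js' | xargs sed -E -i.bak "$subst"
cat "$demo_dir/lib/util.js"
```

Lines that do not match, like var x = 1, never show up in the preview and are left alone by the real run.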
#command #shell #regex #unix