Unix command line conventions over time | Hacker News


		Unix command line conventions over time (liw.fi)
		418 points by pabs3 4 days ago | 215 comments

bfrankline 3 days ago | [–]

I was at Bell during the options “debate.” I think something that this otherwise wonderful article misses is that some believed that commands were never intended to be the only way to use the system, as the shell was intended to be just one of the many user interfaces that Research Unix would provide. From that perspective, it was entirely reasonable to believe that if a command was so complex that it needed options, then it was likely more appropriate for one of the other user interfaces. I believe this is unfortunately now misunderstood by some as “it shouldn’t exist.” However, because of the seemingly instantaneous popularity of the shell and pipelining text, it became synonymous with Unix. It’s a shame because McIlroy had a lot of interesting ideas around pipelining audio and graphics that, as far as I know, never materialized.

MontyCarloHall 3 days ago | | [–]

>From that perspective, it was entirely reasonable to believe that if a command was so complex that it needed options, then it was likely more appropriate for one of the other user interfaces

What would the non-shell interface to commands for text processing pipelining (e.g. sort, cut, grep, etc., all of which absolutely need options to function) have looked like? Some people to this day believe that any text processing more complicated than a simple grep or cut should be done in a stand-alone script written in a domain-specific language (e.g. Perl or awk), rather than piping shell commands to each other.

Personally, I’m glad we got the best of both worlds: having command line tools with a dizzying array of options doesn’t preclude being able to accomplish the same tasks with a more verbose stand-alone script. It is often far faster to write shell pipelines to accomplish fairly involved tasks than to write a stand-alone script. The classic example is the (in)famous McIlroy vs. Knuth: six lines of McIlroy’s shell pipeline accomplished the same thing as dozens of lines of Knuth’s “literate program,” while being equally understandable [0].

>It’s a shame because McIlroy had a lot of interesting ideas around pipelining audio and graphics that, as far as I know, never materialized.

I would love to hear more about this. The UNIX shell is amazing for pipelining (most) things that are easily represented as text but really falls flat for pipelining everything else.

[0] https://leancrew.com/all-this/2011/12/more-shell-less-egg/

svat 3 days ago | | | [–]

> six lines of McIlroy’s shell pipeline accomplished the same thing as

Common misconception. See https://buttondown.email/hillelwayne/archive/donald-knuth-wa... ("Donald Knuth Was Framed") and discussion at https://news.ycombinator.com/item?id=22406070 (etc).

MontyCarloHall 3 days ago | | | [–]

How is it a misconception? My overall point was that shell oneliners are often much faster to quickly bang out for a one-off use case than writing a full program from the ground up to accomplish the same thing. This is demonstrated to a very exaggerated degree in the Knuth vs. McIlroy example, but it also holds true for non-exaggerated real-world use cases. (I had a coworker who was totally shell illiterate and would write a Python script every time they had to do a simple task like count the number of unique words in a file. This took at least 10 times longer than someone proficient at the shell, which one could argue is itself a highly optimized domain-specific language for text processing.)

If your point is that the shell script isn't really the same thing as Knuth's program: sure, the approaches weren't algorithmically identical (assuming constant time insertions on average, Knuth's custom trie yields an O(N) solution, which is faster than McIlroy's O(N*log N) sort, though this point is moot if you use awk's hashtables to tally words rather than `sort | uniq -c`), but both approaches accomplish the exact same end result, and both fail to handle the exact same edge cases (e.g. accent marks).
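To make the hash-table alternative concrete, here is a minimal Python sketch, with `collections.Counter` standing in for awk's associative arrays. The function name `top_words` and the sample sentence are made up for illustration; like the approaches discussed above, it only recognizes ASCII letters, so accented words get split.

```python
import re
from collections import Counter

def top_words(text, k):
    """Return the k most frequent words, tallying them in a hash table
    instead of sorting every word occurrence first."""
    # Lowercase, then keep runs of ASCII letters only (accents are not handled).
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(k)

print(top_words("the cat and the hat and the bat", 2))
# → [('the', 3), ('and', 2)]
```

The tally itself is a single pass over the input; only the distinct words are ranked at the end.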

svat 3 days ago | | | [–]

The task Knuth was given was to illustrate his literate programming system (WEB) on the task given to him by Bentley, which meant writing a Pascal program "from the ground up", and (ideally) the program containing something of interest to read.

If instead of writing a full program as asked, he had given some cop-out like “actually, instead of writing this in WEB as you asked, I propose you just go to Bell Labs or some place where Unix is available, where it so happens that other people have written some programs like 'tr' and 'sort', then you can combine them in the following way”, that would have been an inappropriate reply, hardly worth publishing in the CACM column. (McIlroy, as reviewer, had the freedom to spend a section of his review advertising Unix and his invention of Unix pipelines, then not yet as well-known to the general CACM reader.)

So while of course shell one-liners are faster to bang out for a one-off use-case, they obviously cannot accomplish the task that was given (of demonstrating WEB). (BTW, I don't want to too much repeat the earlier discussion, but see https://codegolf.stackexchange.com/questions/188133/bentleys... — on that input, the best trie-based approach is 8x faster than awk and about 200x faster than the tr-sort script.)

aleksiy123 3 days ago | | | | [–]

Seems like you should just go with what you know best.

Taking 10x longer doesn't seem like a language problem. If you don't know bash well you're going to take even longer to do it in bash than in python.

In any case the task you described is pretty much the same in python as in bash. At worst the python is going to be more verbose.

   python -c "print(len(set(w for l in list(open('test.txt')) for w in l.split())))"

vs

   tr ' ' '\n' < file_name | sort | uniq -c | wc -l

t43562 2 days ago | | | [–]

The shell's advantage is that the pipeline components don't need to suck the whole file in, so it can potentially operate on much larger files without running out of memory. I think only "sort" is problematic and at least it's a merge sort.

In Python you could use a generator but it would get a little more complicated and you'd still have to add all the words to set() but hopefully the number of different words is not that great.

The trie approach is quite memory efficient and that can matter.

aleksiy123 2 days ago | | | [–]

I'm fairly sure `open` is a generator and doesn't load the whole file into memory. So you wouldn't hit a memory error unless like you said the amount of unique words is high enough.

t43562 2 days ago | | | [–]

I think you're right but I believe that wrapping it in `list(...)` is what would force the whole file into memory.

aleksiy123 1 day ago | | | [–]

Yeah, you're right, that's my mistake.

I think you can just omit it but yeah...
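A minimal sketch of the variant discussed above, with the `list(...)` wrapper dropped so the input is consumed one line at a time. It assumes whitespace-separated words, as in the one-liner; the function name `count_unique_words` and the sample text are hypothetical. The set of distinct words still has to fit in memory.

```python
import io

def count_unique_words(lines):
    """Count distinct whitespace-separated words, consuming `lines` lazily."""
    seen = set()
    for line in lines:              # a file object yields one line at a time
        seen.update(line.split())   # only the distinct words are retained
    return len(seen)

# Illustrative in-memory input; in practice pass an open file object instead.
sample = io.StringIO("the quick brown fox\njumps over the lazy dog\n")
print(count_unique_words(sample))
# → 8
```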

IshKebab 3 days ago | | | | [–]

> one-off use case

And fortunately nobody is foolish enough to think that shell scripting is robust enough to use for more than one-off uses! /s

MontyCarloHall 3 days ago | | | [–]

Hah, I have a friend who spent a large chunk of an undergraduate summer internship at Google porting a >50k line bash script (that was used in production!) to Python. It was not their most favorite summer, to say the least.

andi999 3 days ago | | | [–]

How does one do that? I mean, you can only just type 50k lines in 2.5 months if you type 1000 lines per day, which sounds like a lot to me.

thyrsus 2 days ago | | | [–]

It's only possible if you can identify large portions of the 50k original lines as having been previously implemented by other components (python modules, microservices, etc.), or that large portions are dealing with cases that are guaranteed to no longer arise (so you either produce different results or error out if you detect them).

jlarocco 3 days ago | | | | [–]

> What would the non-shell interface to commands for text processing pipelining (e.g. sort, cut, grep, etc., all of which absolutely need options to function) have looked like? Some people to this day believe that any text processing more complicated than a simple grep or cut should be done in a stand-alone script written in a domain-specific language (e.g. Perl or awk), rather than piping shell commands to each other.

I have no idea what the original intention was, but I could see the interface being Emacs or Vi(m).

A workflow I use a lot in Emacs with eshell is to pipe command output to a buffer, manipulate the buffer using Emacs editing commands (including find/replace, macros, indent-region, etc.), and then run another shell command on it, write it to a file, or copy/paste it somewhere else.

It's not for every situation but it's a lot faster than coming up with a complicated shell "one-liner".

msla 3 days ago | | | [–]

The problem there is that you have to rethink your solution if you decide you want to turn your buffer manipulation into a reusable command. I like Emacs, but the easy transition from pipeline to shell script is a big point in pipelines' favor.

jlarocco 3 days ago | | | [–]

It's not really a problem, though. 9 times out of 10 shell one-liners are single use, and when they're not, I want something more readable than a one liner, anyway.

layer8 3 days ago | | | | [–]

A general tool for that workflow is vipe: https://github.com/juliangruber/vipe

mek6800d2 3 days ago | | | | [–]

I haven't looked at this in years, but IIRC Knuth's program could be built and run on almost any OS that had a Pascal (?) compiler, whereas McIlroy's solution obviously required a Unix-like shell, piping, and the necessary commands/tools.

Tijdreiziger 3 days ago | | | [–]

Interesting, any chance you could expand on these 'other user interfaces'? I'm not really familiar with Unix itself, but I've always considered Linux a shell-first OS (as opposed to Windows (NT), which I consider a GUI-first OS).

bfrankline 3 days ago | | | [–]

The other environment that is still popular today is the “statistical environment” that Rick Becker, Allan Wilks, and John Chambers created. It eventually became “S” and John would essentially recreate it as “R.” It’s a very nice environment for performing statistics and graphics together.

imwillofficial 3 days ago | | | | [–]

A shell is just how you interact with the underlying system. The gui is also a shell. Confusing I know!

edgyquant 3 days ago | | | [–]

E.g. the Windows GUI is the “Windows Shell,” not to be confused with the Windows Command Line, which is one (well, many) application within the shell.

daptaq 3 days ago | | | | [–]

I like to see it this way: A shell /wraps/ the kernel. You cannot issue system calls directly, but a program that handles user input generically (and ideally dynamically), can do this for you. A desktop environment, Emacs, and to an increasing degree web browsers are all different "shells".

tenebrisalietum 3 days ago | | | [–]

A shell is a program dedicated to allowing an operator to launch other programs. It can be as simple as a menu or as complex as a COM-interfaceable GUI with graphical desktop metaphor. It's often configured to but not strictly required to automatically launch on user login.

Any user-space executable can issue system calls, the shell isn't special there.

imwillofficial 3 days ago | | | | [–]

Like valence shells on atoms. I dig it

lousyd 3 days ago | | | | [–]

Maybe something like Jupyter? Text documents with sections of code or commands?

nmz 3 days ago | | | [–]

FWIW, you can pipe graphics (sort of) with arcan[1]. So it's not like the ideas have been abandoned.

1. https://arcan-fe.com/

jancsika 3 days ago | | | [–]

> It’s a shame because McIlroy had a lot of interesting ideas around pipelining audio and graphics that, as far as I know, never materialized.

Do you know if those ideas are documented anywhere?

bfrankline 3 days ago | | | [–]

The most popular today outside of the shell environment is the statistical environment “S.” John Chambers would recreate it as “R” and I understand that it’s very popular and does a nice job of performing statistics and graphics together.

tambourine_man 3 days ago | | | [–]

Very interesting. We’d love to read more on other media pipelining and alternatives to the shell.

bfrankline 3 days ago | | | [–]

It was very primitive. It was essentially a mechanism for composing operations on vectors. I don’t know for certain but I would guess that it was inspired by IBM and their work on APL.

harry8 3 days ago | | | | [–]

Yeah. I wonder if it looks anything like gstreamer...

AlphaSite 3 days ago | | | [–]

I imagine powershell could do this quite well with its typed output.

(Not that you can’t just pass raw bytes via unix shells but it’s awfully error prone).

pseudostem 4 days ago | | [–]

Unix history is always fascinating.

>None of this explains dd.

This one really made me think. I have never thought about dd before. Although I have wondered about other commands. Perhaps the author should also have added some command options not requiring a minus prefix, e.g. tar, ps.

The manpage [0] reads:

In the first (legacy) form, all option flags except for -C and -I must be contained within the first argument to tar and must not be prefixed by a hyphen (`-'). Option arguments, if any, are processed as subsequent arguments to tar and are processed in the order in which their corresponding option flags have been presented on the command line. In the second and preferred form, option flags may be given in any order and are immediately followed by their corresponding option argument values.

[0]: http://man.openbsd.org/tar

dredmorbius 4 days ago | | [–]

dd is a mainframe command, and its syntax follows the JCL conventions used there.

https://www.ibm.com/docs/en/zos-basic-skills?topic=concepts-...

smcameron 4 days ago | | | [–]

Huh. I'd always kind of guessed that dd used if=input-file of=output-file as a way to sort of prevent the use of shell globbing and to not rely on the order of arguments (as cp does) since dd as it is often used can be a bit dangerous (I often found myself using it with disk device files) and you want to be extra careful in specifying the input and especially the output files.

Edit: wikipedia agrees with your IBM origin story https://en.wikipedia.org/wiki/Dd_(Unix)#History

dredmorbius 4 days ago | | | [–]

The syntax is common to most JCL ("job control language") commands. I'd say "all", though there are probably exceptions.

For some reason, this seems to be vanishingly scarce knowledge in Linux / Unix circles.

There was a time in my life I knew how to spell JCL....

reaperducer 3 days ago | | | [–]

> For some reason, this seems to be vanishingly scarce knowledge in Linux / Unix circles.

There was someone on HN a couple of weeks ago who was astounded to learn that "modem" means "modulator-demodulator." Like it was some kind of forbidden magic.

Things that you and I consider entry-level knowledge are like scrolls and rune stones to tech people these days.

onetom 3 days ago | | | [–]

I had a similar experience, when ppl didn't know that codec comes from coder-decoder... It happened when I called a Clojure namespace x.y.codec, with encode and decode functions in it. I've also noticed that some of my colleagues haven't realized that the Rust serde library is short for serialize-deserialize.

dharmab 3 days ago | | | | [–]

I'm old enough to have used modems and never heard the long name before.

Stratoscope 3 days ago | | | [–]

You don't have to be a certain age to have used modems. You used one today to send that comment! Likely more than one.

Radio, cable, and fiber don't carry bits. They carry waves - analog signals - just like an old fashioned copper phone line. They all need a modulator-demodulator to convert between bits and waves.

imwillofficial 3 days ago | | | | [–]

I remember when the internet used to scream at me. I remember the proper name of a modem from my ancient CompTIA A+ classes I took as a teenager.

leokennis 3 days ago | | | | [–]

I started my IT career by accident on a mainframe system: JCL, COBOL, CICS, DB2.

When I made the switch to working on a Java application running on Linux I had a hard time accepting how easy a lot of stuff was.

mixmastamyk 3 days ago | | | | [–]

More in this area:

http://www.catb.org/esr/faqs/things-every-hacker-once-knew/

My70thaccount 3 days ago [flagged] [dead] | | | [–]

Please do not link to neo-Nazis.

thfuran 3 days ago | | | | [–]

For safety, selecting arguments distinguished only by choice of two letters adjacent on the most common keyboard layout seems unwise.

elvis70 3 days ago | | | | [–]

The parameter names don't seem to match those of dd.

If you type man dd, the function of the tool will be described as "copy and convert". I can't find the video but there is an interview in which Kernighan, I think, explains that the tool should have been named cc but it was not possible because that name was already taken by the C compiler.

dredmorbius 3 days ago | | | [–]

The IBM webpage only covers a few of the JCL DD parameters.

There are more at pp. 486ff in Doug Lowe, MVS JCL (1994) here:

http://library.lol/main/6784CBC9EE9BBA5AD26993F497514661

It's still a rather rough fit, though I'm still pretty sure the connection exists.

Keep in mind that Unix dd dates to the early 1970s, and JCL itself may have evolved.

butlerm 3 days ago | | | | [–]

The name of dd(1) may have some tenuous connection to the JCL DD statement, but otherwise any similarity seems virtually non-existent. They don't even remotely do the same thing, none of the options or parameters are the same or have similar meanings, or anything like that.

https://man7.org/linux/man-pages/man1/dd.1.html

vincent-manis 3 days ago | | | [–]

My understanding is that the name dd was essentially a joke. A DD statement in JCL defined a logical name (seen by the program) to refer to a file (`dataset'). It had many strange operands, and often its behavior was completely unintuitive. For example, here is a job that deletes the file MYFILE.

  //MYJOB  JOB
  //FOOBAR DD DSNAME=MYFILE,DISP=(,DELETE)
  //STEP1  EXEC PGM=IEFBR14

IEFBR14 is a program that does nothing at all!

Although the Unix dd command wasn't patterned on the JCL command, I suspect that the multiplicity of possible options led its designers to choose the key=value option syntax that looked vaguely OS/360ish.

By the way, the - flag for options first appeared in MIT's CTSS, which was the direct ancestor (at least at the user level) of Multics.

dredmorbius 3 days ago | | | | [–]

I'd ... thought that more of the Unix arguments (block, unblock, conv, etc.) were supported on JCL, though my IBM link doesn't support that.

I'll see if I can find a more canonical / complete reference.

GuB-42 3 days ago | | | [–]

dd is fine, at least fine as a disk destroyer can be. It is not standard for UNIX but it is understandable.

tar is a bit messy but the worst is definitely ps. It has dash and non-dash options, and they don't have the same meaning and interact in a weird way.

Edit: the command is "ps", I somehow managed to mess up the most important word...

JadeNB 3 days ago | | | [–]

> tar is a bit messy but the worst is definitely the worst. It has dash and non-dash options, and they don't have the same meaning and interact in a weird way.

Sorry if it should be obvious, but "the worst is definitely the worst" … what is the worst? (Presumably not tar, unless I'm reading incorrectly.)

StringyBob 3 days ago | | | [–]

ps. By far the worst. Because it supports multiple syntax types I can never remember the ps command options. Even if you google for examples you get different syntax and end up with a mix of the same option letters doing different things in one command. It seems to lead to all sorts of subtly unexpected behaviors

mixmastamyk 3 days ago | | | [–]

    ps -ef | grep foo

Is what I use 98% of the time.

kQq9oHeAz6wLLS 3 days ago | | | [–]

ps aux then pipe to grep and/or cut. Or sort. Or...

qsdf38100 3 days ago | | | | [–]

ps afx

I like the forest (tree) mode.

lou1306 3 days ago | | | [–]

The best explanation for leaving dd in its original form is that it's not really wise to alter the behaviour of a tool that has been nicknamed "destroyer of disks", regardless of that hot new CLI convention.

teddyh 4 days ago | | [–]

> I believe the oldest program that uses subcommand is the version control system SCCS, from 1972

IIUC, other non-Unix operating systems commonly used the subcommand style for normal interactive use; it was the style used by the normal command shell, and all commands in those operating systems therefore followed this style. The operating system (or its shell) was responsible for enabling easy use of this style by adding completion support (commonly using the ESC key, IIRC). This older style can still be seen today in Unix tools like ftp and telnet, which use this style for their internal command line interfaces. Another tool using this style, which many might be familiar with, is the multi-platform Kermit.

Unix therefore has been rather late in adopting this style, first in gradually adopting tools which use the subcommand style (ip, git, etc.), and later with support in shells for programmable completion, where a program can optionally register completion hooks for the user's shell to use, making it finally come up to the level of those older operating systems.

aap_ 4 days ago | | [–]

The chronology is wrong. dash-options came before pipes. The first edition already has them while pipes came in the third edition.

eesmith 3 days ago | | [–]

As supporting evidence, the Unix Manual from 1971, http://www.bitsavers.org/pdf/bellLabs/unix/UNIX_ProgrammersM... contains synopses like:

  du [-s] [-a] [name ...]
  ld [-usaol] name ]
  ls [-ltasd] name ...
  pr [-lcm] name

Maursault 3 days ago | | | [–]

Wow, that's the actual manual, first edition man pages.

kzrdude 4 days ago | | [–]

The GNU conventions are terribly convenient. If we were redesigning it from scratch, I'd like to keep all of them except for removing duplicated functionality. One of --long x and --long=x has to go, and that means we can only keep --long=x. And so on.

mort96 4 days ago | | [–]

I disagree with getting rid of `--long x`. It's really common to build command argvs programmatically, where you have a string representing some option (such as a file path), and you need to pass that as a long option. With `--long x`, you can do like:

    char *argv[] = {"my-program", "--config-file", config_file, NULL};
    run_command(argv);

With only `--config-file=x`, you would have to allocate memory for the option:

    char *config_file_opt;
    asprintf(&config_file_opt, "--config-file=%s", config_file);
    char *argv[] = {"my-program", config_file_opt, NULL};
    run_command(argv);
    free(config_file_opt);

And it becomes even more hairy if you A) want to avoid heap allocation in most cases or B) want to use standard functions rather than asprintf. You'd have to do something like this:

    char config_file_opt_buf[1024];
    char *config_file_opt;
    size_t config_file_opt_len = snprintf(config_file_opt_buf, sizeof(config_file_opt_buf), "--config-file=%s", config_file);
    if (config_file_opt_len >= sizeof(config_file_opt_buf)) {
        /* snprintf's return value excludes the terminating NUL, so allocate one extra byte */
        config_file_opt = malloc(config_file_opt_len + 1);
        snprintf(config_file_opt, config_file_opt_len + 1, "--config-file=%s", config_file);
    } else {
        config_file_opt = config_file_opt_buf;
    }

    char *argv[] = {"my-program", config_file_opt, NULL};
    run_command(argv);
    if (config_file_opt != config_file_opt_buf) {
        free(config_file_opt);
    }

And even that ignores any kind of error checking of snprintf or malloc which may fail. I also wouldn't even be comfortable assuming the code I just wrote is correct, without thinking long and hard about it and reading standards/documentation thoroughly. Whereas the first example is so simple it's obviously correct.

I think we should keep `--long x`. If anything, `--long=x` is the one which should go. There will always be ambiguity; the meaning of something like `-ab` also depends on whether `-a` takes an argument or not.

Really though, I think there are enough situations where `--long=x` is useful that I think keeping both versions makes sense.

mike_hock 3 days ago | | | [–]

You're about to spawn a new process, you can't honestly be worried about a string allocation.

The fact that this is cumbersome in C is fixed by fixing C, not by forcing a pattern onto the ecosystem that only makes sense to work around C's deficiencies.

mort96 3 days ago | | | [–]

I don't think C makes this especially cumbersome, most languages make it somewhat complicated to do format-printing (or string concatenation) into a stack-allocated buffer if it's big enough or fall back to heap-allocating a buffer if the stack buffer is too small.

The argument that you should just heap-allocate because it usually doesn't matter in contexts where you'd spawn a process anyways is a much better argument though. Still, I find the `{"whatever", "--config-file", config_file}` approach much more elegant.

woodruffw 3 days ago | | | | [–]

I'll add another minor reason: `--foo="bar"` (with quotes) relies on the quoting/splitting behavior of the shell, which is easy to forget.

For example, someone might write:

    args = [cmd, f'--foo="{foo}"']

and expect the surrounding quotes to be stripped off, but that'll only happen if the spawn goes through a shell first. Some argument parsers may also try to strip quotes off, but I don't believe there's any consistent behavior guaranteed around that.
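A minimal sketch of that pitfall, assuming Python's `subprocess` with an argument list (so no shell is involved). The child is just a Python one-liner that echoes its first argument; the variable names are illustrative.

```python
import subprocess
import sys

# A tiny child program that prints its first argument back unchanged.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

foo = "bar"
# No shell sits between parent and child, so nothing strips the inner quotes.
quoted = subprocess.run(child + [f'--foo="{foo}"'], capture_output=True, text=True)
plain = subprocess.run(child + [f"--foo={foo}"], capture_output=True, text=True)

print(quoted.stdout.strip())  # the child sees the literal double quotes
print(plain.stdout.strip())   # the child sees just the value
```

The first call hands the child `--foo="bar"` with the quote characters included; the second hands it `--foo=bar`.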

jbverschoor 3 days ago | | | | [–]

If all of this matters so much, don’t use an (interactive) shell.

mort96 3 days ago | | | [–]

We're not talking about the shell, we're talking about the options parser. Nothing in my examples invokes any shell.

jbverschoor 3 days ago | | | [–]

Sure, but who calls your main()? Usually when you're in a shell, which in turn calls exec and copies over some data.

If you're that worried about performance about a few string allocations, you shouldn't be passing around strings anyway, shell or not.. And simply call functions from the same process and use for example the file descriptors you already have.

You could also just simply pass a binary blob (messagepack) as one of the arguments, if that's your thing.

sph 3 days ago | | | [–]

I am annoyed go's flags package did not respect the GNU convention for options, which I agree is the best.

daptaq 3 days ago | | | [–]

I am not too surprised that the confluence point between Plan 9 and Google has little interest in following the GNU way.

0des 3 days ago | | | [–]

Plan 9 did nothing wrong.

euroderf 3 days ago | | | | [–]

Try github.com/spf13/pflag, it handles double-dash arguments OK, and is drop-in.

conaclos 4 days ago | | | [–]

> that means we only can keep --long=x

Just wondering: why `--long=x` over `--long x`?

grose 4 days ago | | | [–]

`--long x` is potentially ambiguous. `x` could be referring to a file if `--long` doesn't take a value, so parsing it correctly depends on this knowledge. The equals sign removes this ambiguity.

twic 3 days ago | | | [–]

If we were starting from scratch, we could require some marker on flags which take values. Perhaps it would have to be `--long: x`. `--verbose x` would be read as a flag with no argument, followed by a positional argument.

Karellen 3 days ago | | | [–]

Isn’t that just replacing “=“ with “:”?

twic 3 days ago | | | [–]

No, there's a space in there too!

jolmg 3 days ago | | | [–]

This works to allow values to be optional while not requiring concatenation of strings in languages where that's troublesome like C.

It only lacks in being able to work with brace expansion to facilitate specifying multiple values.

kzrdude 4 days ago | | | | [–]

--long=x supports the simultaneous --color and --color=auto usecase, i.e the value is optional. Without = it's ambiguous.
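A minimal sketch of the optional-value case, using Python's `argparse` as a stand-in parser (GNU `getopt_long` behaves differently: it only accepts an optional argument when it is attached with `=`). The `--color` option, its `always`/`never` defaults, and the file name `notes.txt` are assumptions made for the illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# nargs="?" makes the value optional: a bare --color means "always".
parser.add_argument("--color", nargs="?", const="always", default="never")
parser.add_argument("files", nargs="*")

bare = parser.parse_args(["--color"])
attached = parser.parse_args(["--color=auto", "notes.txt"])
detached = parser.parse_args(["--color", "notes.txt"])

print(bare.color, bare.files)          # always []
print(attached.color, attached.files)  # auto ['notes.txt']
print(detached.color, detached.files)  # notes.txt []
```

In the last call the file name is swallowed as the option's value, which is the ambiguity that the `=` form avoids.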

jolmg 4 days ago | | | | [–]

It also has the advantage that it works with brace expansion to specify multiple values.

--long={foo,bar}

becomes

--long=foo --long=bar

conaclos 4 days ago | | | [–]

With bash/zsh, auto-completion of the value seems to be missing after `--key=` for unknown options, whereas `--key file` supports auto-completion since file is just another parameter.

jolmg 4 days ago | | | [–]

Yup. Both styles have their advantages, which is why they're both common.

For the problem you mention, I normally have them separate while I use completions, then add the = sign after.

kzrdude 3 days ago | | | | [–]

that's a tool issue that should be fixed

kzrdude 4 days ago | | | | [–]

Thanks for teaching something useful!

conaclos 1 day ago | | | [–]

Actually you can expand any pattern:

  --option{X,Y}   ->  --optionX --optionY
  --option:{X,Y}  ->  --option:X --option:Y
  ...

layer8 3 days ago | | | [–]

I’m curious why they introduced the ‘=‘ variant at all. Maybe they thought it would be clearer for human readers?

megous 3 days ago | | | [–]

Sometimes I'm glad for public interface ossification. :)

inopinatus 4 days ago | | | [–]

Still waiting for su --with-wheel-group

euroderf 3 days ago | | | [–]

I come not to bury wheel, but to praise it.

Maursault 3 days ago | | | [–]

An industry big wheel, a wheel man with nice wheels, put shoulder to the wheel to put wheels in motion spinning wheels to reinvent the wheel before the wheels came off the wagon and squeaky wheel gets the grease. My name is Jonas. I'm carrying the wheel.

euroderf 2 days ago | | | [–]

I confess that back in the day, whenever I tried to grok a man page or manual page about wheel, it was basically incomprehensible.

nohackeratall 4 days ago | | [–]

And then you have ImageMagick where the order of flags and options is a science by itself.

sph 3 days ago | | [–]

Also ps. Everyone uses their own incantation and never strays from it. I use `ps aux` and used to use `ps -def` on Solaris. Some options require a dash, some don’t, it’s such a confusing CLI I never bothered to learn more.

EDIT: also why is `ps` default output so useless? With no arguments it only prints a couple processes and that's it. I have no idea what it's supposed to show me. Talk about bad UX.

floren 3 days ago | | | [–]

Regarding default ps output: remember how people were using Unix in the old days. Logged in via a text terminal on a multiuser system, 99% the only processes you care about are those associated with the current terminal, and that's what ps shows by default. You might have half a dozen backgrounded programs running in that terminal, but that's about it.

Now, my X sessions always have a dozen xterms running and plain ps isn't very useful, but it might break scripts if they changed it...

veltas 3 days ago | | | [–]

Yeah this is the problem, the 'terminal' used to be the box/machine in-front of you. Now a 'terminal' is pretty much just a window or tab most of the time.

spijdar 3 days ago | | | | [–]

In case you're curious, the differences basically stem from BSD and Solaris/POSIX having different command syntaxes, and GNU/Linux's ps command trying to implement both. `ps -ef` is the Solaris syntax, while `ps aux` is from BSD.

kps 3 days ago | | | [–]

Almost. `ps ax` was the original Unix form (BSD did add `u`, and later accepted `-` on the flags). `ps -ef` was AT&T System III and descendants.

kitschyred 3 days ago | | | | [–]

> Talk about bad UX.

I think you could say this for CLIs in general. While we can all agree that the CLI is great for tasks that require repetition; it's just a PITA for any one-time task. When you just need to do something once and as soon as possible, the manual-help-type-run loop gets tiresome quickly.

Passing options to a program so it can do what most users would want it to do in the first place is nonsensical, and yet that describes the behaviour of many if not all of *nix's tools. It should be the other way around: do what is expected by default, and provide options for different, more uncommon use-cases. Though I suppose this is also a form of baggage from the past that is very difficult to get rid of because of backwards compatibility.

codethief 3 days ago | | | | [–]

Have you tried `ps faux`? :)

wchar_t 4 days ago | | | [–]

Indeed, I passionately hate programs where the order of the options and files/arguments matter just as much as the options themselves.

zvr 3 days ago | | | [–]

Given that ImageMagick can describe a sequence of operations (much like commands piped together), the order is obviously significant.

The exact same holds for ffmpeg command-line.

loudmax 3 days ago | | | | [–]

The `find` command is extremely useful, but user-friendly, it is not.

  $ find -type f Data/
  find: paths must precede expression: `Data/'
  find: possible unquoted pattern after predicate `-type'?

sigh

chengiz 3 days ago | | | [–]

I think Imagemagick and maybe git are where I require a web search most often, along with hope that someone has had the same need before.

jbverschoor 4 days ago | | | [–]

Well magick is more a single live scripting language with multiple paths. While it works, I think it’s the wrong tool for the job

nocman 3 days ago | | | [–]

So you are saying Imagemagick is the wrong tool for any job then? (I ask this because the OP didn't reference any particular task). I would disagree with that. Like many media-related command line tools, it does a lot and therefore has a fairly involved command line interface, but I have found it very useful over the years for processing multiple images (and sometimes even just individual ones).

professorsnep 3 days ago | | | [–]

ffmpeg too! Been doing a lot with it lately and I barely even feel like I have scratched the surface of the ffmpeg argument science.

protomikron 3 days ago | | [–]

Interesting is also find. It took me some time to accept that the directory must come first.

E.g.

  find /path/to/dir -type f

works, while

  find -type f /path/to/dir

does not.

mort96 3 days ago | | [–]

Probably helps to think of find as its own weird pipeline rather than a command with options. The first element of the pipeline is the directory(/directories), then `-type f` filters out all the non-files, then `-name '*.txt'` filters out everything which doesn't end in ".txt", etc. That also helps to remember why `-maxdepth` has to be right after the directory; it affects which files even become part of the pipeline, it doesn't make sense as a pipeline element.

The clearest proof that it's a pipeline is probably how it interacts with `-exec`: `find . -type d -exec echo {} \;` will print only directories, while `find . -exec echo {} \; -type d` will print both files and directories.
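A quick way to check this yourself; just a sketch in a scratch directory, where `sub` and `file.txt` are made-up names and `find` is assumed to be any POSIX-ish implementation:

```shell
# Scratch tree: one directory containing one regular file.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/sub/file.txt"

# -type d is evaluated first, so -exec only ever sees directories.
dirs_only=$(cd "$tmp" && find . -type d -exec echo {} \; | LC_ALL=C sort)

# -exec runs first on every entry; the trailing -type d no longer filters what is echoed.
everything=$(cd "$tmp" && find . -exec echo {} \; -type d | LC_ALL=C sort)

printf 'filter first:\n%s\n\nexec first:\n%s\n' "$dirs_only" "$everything"
rm -rf "$tmp"
```

The first list should contain only `.` and `./sub`; the second also includes `./sub/file.txt`.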

Regardless though, I sure wish they would let you specify the directories at the end.

yourad_io 3 days ago | | | [–]

Very insightful, I never realized this. Thank you

chubot 3 days ago | | | [–]

I wrote a post about how to understand find syntax:

find and test: How To Read And Write Them

https://www.oilshell.org/blog/2021/04/find-test.html

tl;dr The args AFTER the root dir args form a boolean expression language -- I call it the "I'm too lazy to write a lexer" pattern

    find /tmp -type -f -a -name 'foo*'

test is the same way, e.g.

    test -f /tmp -a -f /bin
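To make the "boolean expression" reading concrete, here is a small sketch in a scratch directory (all file names are made up for illustration). `-a` is AND and `!` negates in both tools:

```shell
tmp=$(mktemp -d)
touch "$tmp/foo1" "$tmp/bar1"
mkdir "$tmp/foodir"

# find: (name starts with foo) AND (is a regular file)
matches=$(find "$tmp" -name 'foo*' -a -type f)

# test: same operators, each primary takes its own operand
verdict="no match"
if test -d "$tmp/foodir" -a ! -f "$tmp/foodir"; then
    verdict="foodir is a directory and not a regular file"
fi

echo "$matches"
echo "$verdict"
rm -rf "$tmp"
```

Only `foo1` should survive the find expression, since `foodir` matches the name but fails the type check.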

michaelcampbell 3 days ago | | | [–]

    find /tmp -type f -a -name 'foo*'  # no "-" for the -type

this has bitten me more often than I'd care to admit.

chubot 3 days ago | | | [–]

Oops, good point! Gah

I honestly wonder where this wonky syntax came from ... i.e. was it a Bell Labs thing or later

mmphosis 3 days ago | | | [–]

I just installed tesseract-ocr and it has a similar convention:

  tesseract imagename outputbase [options...] [configfile...]

tomxor 3 days ago | | | [–]

Yup, always seems to take a bit more conscious effort to remember that detail for find.

teddyh 4 days ago | | [–]

> GNU also added standard options: almost every GNU program supports the options --help, --version

GNU keeps a list of them, so that anyone implementing a similar functionality should use the same option if it already exists:

https://www.gnu.org/prep/standards/standards.html#Option-Tab...

somat 3 days ago | | [–]

I have an opinion on this, it is stupid but the more I think about it the more I am convinced I am right.

getopt is lipstick on a pig.

dashed arguments are ugly and hard to read.

The worst offenders are programs that implement a language in the args. If you do this, just build a proper parser and leave off the dashes.

iptables find megacli

megacli, remember that? it was the control application for lsi megaraid cards, it was the program that turned me off getopt style args. when I figured out that the megacli parser let you drop the dashes (and the obnoxious capitalization) it was like a breath of fresh air in a fart filled room.

So a better way to do args? use dd style args. Yes dd, dd had it right all this time, I find "key=value" style args to be superior to "--key value" and don't even get me started on the absurdity of "--key=value"

codeulike 3 days ago | | [–]

I thought PowerShell syntax was needlessly confusing until I started writing bash scripts, now I see that PowerShell wasn't so bad.

But I accept why the Linux command line is the way it is - it's a shark, not a dinosaur

jiehong 4 days ago | | [–]

And then there is also the case of CLIs that aren’t written in C: they don’t get the benefits of a unified getopts lib.

One oddity not explained is tar allowing multi options crammed together without any dash, that nobody can ever remember (eg. tar xvf archive.tar)

LeoPanthera 4 days ago | | [–]

> One oddity not explained is tar allowing multi options crammed together without any dash, that nobody can ever remember (eg. tar xvf archive.tar)

This is a BSDism, and it's deprecated even there. bsdtar allows it for compatibility with old scripts.

GNUtar has long-switches for all options, which makes it considerably easier to use and understand, for example:

gtar --create --file etc.tar --verbose /etc

nsajko 4 days ago | | | [–]

> This is a BSDism

No, it was like that when Tar was first introduced, as part of Seventh Edition Unix:

https://man.cat-v.org/unix_7th/1/tar

https://en.wikipedia.org/wiki/Tar_(computing)

eesmith 3 days ago | | | [–]

See also "tap" from v1, to "manipulate DECtape", described at http://www.bitsavers.org/pdf/bellLabs/unix/UNIX_ProgrammersM... , p94 of the pdf:

> tap [key] [name ...]

> The function portion of the key is specified by one of the following letters ...

For example, "x" to extract.

> The following characters may be used in addition to the letter which selects the function desired

For example, "v" for verbose.

elteto 3 days ago | | | | [–]

> that nobody can ever remember

These have gone a long way for me:

  tar xzf <in>       eXtract Zipped File
  tar czf <out> <in> Create Zipped File

Anything more complicated and I have to google.

Macha 3 days ago | | | [–]

The `-z` on extract is not needed on basically most modern tar implementations (OpenBSD's is I believe the one outlier). tar -xf foo.tar.gz (or .xz, .bz2, .zst, etc.) will work, auto detect the archive type and extract.

1vuio0pswjnm7 3 days ago | | | [–]

OpenBSD, which is a fork of NetBSD, is not the only "outlier".^1

For many years, NetBSD tar has autodetected bzip2 compression.

   tar xzf 1.tar.bz2

will work on gzip as well as bzip2. Whereas GNU tar still requires "j" instead^2

   tar xjf 1.tar.bz2

1. For example, FreeBSD or MacOS tar is BSD tar. It will autodetect bzip2 compression.

2. The GNU tar included with VoidLinux still requires z or j.

The pax(1) utility is the POSIX solution to these incompatibilities.

Macha 3 days ago | | | [–]

Void Linux appears to include GNU tar 1.34

The following should work, it worked on my Arch Linux install (but note that auto detection is not a "new" feature to GNU tar, I've been using it for at least 5 years)

    echo "Hello" > foo.txt
    tar cjf foo.tar.bz2 foo.txt
    rm foo.txt
    tar xf foo.tar.bz2
    cat foo.txt

and produce "Hello"

To be clear, I'm not saying the following should work

    tar xzf foo.tar.bz2

What should work is:

    tar xf foo.tar.bz2

And for proof this is GNU tar

    $ tar --version
    tar (GNU tar) 1.34
    Copyright (C) 2021 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    Written by John Gilmore and Jay Fenlason.

1vuio0pswjnm7 3 days ago | | | [–]

If I have a script I want to work on both BSD and Linux, and I don't want to rely on uname, I usually do something like

   bzip2 -dc 1.tar.bz2|tar xf -

yourad_io 3 days ago | | | | [–]

Oh this is great, I remember z for gzip but always have to look up J vs j for bzip2/xz

cb321 3 days ago | | | [–]

With https://github.com/c-blake/nio/blob/main/utils/catz.nim you can get similar format agnostic decoding/decompression not just in tar but in any pipeline context based on magic numbers, not filename extensions and even doing the copy loop needed for unseekable inputs to replace the early read -- e.g. cat foo.gz|catz|less works..

aendruk 3 days ago | | | | [–]

No need for the mnemonic device when you can literally type “--extract --gzip --file”, which reads as a sentence and has no trouble adapting to --xz or --zstd.
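For example, a round trip spelled out entirely with long options. This is only a sketch: it assumes a tar that accepts these long names (GNU tar and bsdtar do, as far as I know; minimal implementations may not), and `foo.txt` is a made-up file:

```shell
tmp=$(mktemp -d)
echo "Hello" > "$tmp/foo.txt"

# Create a gzip-compressed archive of foo.txt, relative to the scratch directory.
tar --create --gzip --file "$tmp/foo.tar.gz" --directory "$tmp" foo.txt

# Extract it again into a separate directory.
mkdir "$tmp/out"
tar --extract --gzip --file "$tmp/foo.tar.gz" --directory "$tmp/out"

restored=$(cat "$tmp/out/foo.txt")
echo "$restored"
rm -rf "$tmp"
```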

kergonath 3 days ago | | | | [–]

> gtar --create --file etc.tar --verbose /etc

That’s considerably more cumbersome than tar -cvf.

teddyh 4 days ago | | | [–]

Or ps, with its BSD and SysV variants. On Linux, the most widely used version of ps uses BSD semantics when using options without a dash, and SysV semantics for dashed options. Thus, both BSD “ps aux” and SysV “ps -fel” work.

1vuio0pswjnm7 3 days ago | | | [–]

"... that nobody can ever remember."

Perhaps it comes down to what programs one uses routinely. Typing "tar xzf" multiple times per week I am unlikely to ever forget it.

teddyh 4 days ago | | [–]

> Initially, GNU used the plus (+) to indicate a long option, but quickly changed to a double dash (--).

IIRC, they liked the plus, but changed to use the double dash to be POSIX compatible (which presumably requires options to be preceded by a dash).

jolmg 4 days ago | | [–]

I like the use of + to mean the opposite of single dash short options, like with shell options, e.g. `set -e` vs `set +e`.

For long options, there's the convention of `--no-foo` meaning the opposite of `--foo`, but `+e` hasn't really caught on for short options besides the shell ones.

Being able to negate options, letting the last one win, is useful to be able to define default options in aliases. For example, it might be useful to be able to do this:

  $ alias less='less -X'
  $ less +X foo.txt

or

  $ alias grep='grep -n'
  $ grep +nH -r foo *
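Neither less nor grep actually treats `+X` or `+n` this way, so to illustrate the "last one wins" idea here is a toy parser. `parse_flags` and its `-n`/`+n` flag are hypothetical, not taken from any real tool:

```shell
# Hypothetical parser: "-n" turns line numbers on, "+n" turns them off.
# Arguments are scanned left to right, so the last occurrence wins.
parse_flags() {
    numbers=off
    for arg in "$@"; do
        case $arg in
            -n) numbers=on ;;
            +n) numbers=off ;;
        esac
    done
    echo "$numbers"
}

parse_flags -n        # what an alias default alone would give
parse_flags -n +n     # alias default followed by the user's override
```

The first call prints `on`, the second `off`.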

mort96 3 days ago | | | [–]

I find it really confusing when `+x` is used to mean the opposite of `-x`. Usually, the `-x` option means "enable something", which is fine when you think of the "-" as a dash, but introducing `+x` encourages you to think of them as minus and plus; suddenly, in your command's language, "minus x" means "enable x" and "plus x" means "disable x", which makes very little sense.

The set command is a great example, "minus e" means "enable error checking" and "plus e" means "disable error checking". That doesn't make sense, that's not what minus and plus means.

jolmg 3 days ago | | | [–]

Yeah, but beyond that, I can't see any disadvantage to it. It's practical, relatively intuitive syntax. + isn't conventionally used for anything else, being partly conventional for this already, and it works to combine multiple options into one.

The fact that they seem backwards seems like something that's easy to get used to over time. It's like electrons being negatively charged.

jolmg 2 days ago | | | [–]

> Usually, the `-x` option means "enable something", which is fine when you think of the "-" as a dash, but introducing `+x` encourages you to think of them as minus and plus;

To get used to it, one could think of it as a crossed-out dash instead of a plus. Crossed-out means that the dash is disabled.

euroderf 3 days ago | | | | [–]

Also "nice".

yourad_io 3 days ago | | | [–]

Consider the semantics: niceness (relinquishes resources) as the opposite of meanness (demands resource) in a multi user system

> This is why the nice number is usually called niceness: a job with a high niceness is very kind to the users of your system (i.e., it runs at low priority), while a job with little niceness hogs the CPU. The term "niceness" is awkward, like the priority system itself. Unfortunately, it's the only term that is both accurate (nice numbers are used to compute the priorities but are not the priorities themselves) and avoids horrible circumlocutions ("increasing the priority means lowering the priority...").

Jerry Peek et al. in Unix Power Tools (O'Reilly, 2007, p. 507)

via https://stackoverflow.com/questions/14067128/why-are-nicenes...

onetom 3 days ago | | [–]

It's interesting, that no one questioned the use of double dash itself.

It looks especially silly, when long options are used to encode some mini programming language (DSL), like in imagemagick and ffmpeg, or when they are used to represent some associative data, like in case of the AWS CLI.

If -- would be a : AFTER the option name, it would look like Objective C named arguments or set-words in REBOL and Red.

If -- would be a : BEFORE the option name, it would look like keywords in Lisps.

As a side-note, shells got quite evolved, actually, but the momentum of the mediocre bash-line prevailed, sadly.

I urge people to learn about es, the Extensible Shell: https://wryun.github.io/es-shell/

Its ancestor, the Plan 9 rc (which I think stands for Run Command), used "; " as a prompt, so you could combine multiple commands more easily by copy-pasting whole lines, instead of fiddling with excluding the varying-length prompt.

Both of these shells have a lot better string quoting rules than bash, so they play nicer with filenames containing special characters.

They also follow the unix philosophy closer, by not necessarily providing line editing capabilities, which can be added by running them thru eg. rlwrap or not really needed, when running them from Emacs or Plan 9's acme environment.

I would also highlight that es is just 163KB, while bash is 1.3MB, yet it provides a dynamic, garbage-collected, functional programming language that is orders of magnitude more powerful!

theamk 3 days ago | | [–]

The double-dash is selected for pure practicality.

Any option character excludes a set of inputs, and we want this set to be as small as possible. It is already hard to look at file named "-f" -- do you want to also make accessing files named ":f" (or even worse, "f:") harder? Thus the desire to keep the first "-".
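For reference, the usual workarounds for such a file look roughly like this (a sketch in a scratch directory; the `--` end-of-options marker is assumed to be honoured, which it is for POSIX-style utilities like cat):

```shell
tmp=$(mktemp -d)
printf 'data\n' > "$tmp/-f"

# "--" ends option parsing, so "-f" is taken as a file name.
via_dashdash=$(cd "$tmp" && cat -- -f)

# A leading "./" means the argument no longer starts with a dash at all.
via_dotslash=$(cd "$tmp" && cat ./-f)

echo "$via_dashdash $via_dotslash"
rm -rf "$tmp"
```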

Now, they could have used "-:" or "-=" for long options.. but I think "--" won because it is so easy to type -- no need to move your fingers.

Using long options to encode DSL, like imagemagick and ffmpeg, can be pretty silly.. But those programs are exceptions, and most of the option usage is of much simpler variety. Like "grep --label=input --color=always".

The comments about es sound weird... You can set ";" as prompt in sh/bash too. I would not call its quoting "nicer" -- yes, using '' for a single quote is nice, but lack of " support sounds like a bad omission. Yes, bash is pretty fat by 1990's standards.. but if you want a real programming language with no line editing capabilities, no need to reach for obscure shells -- grab perl, python, tcl, lua or any of the other scripting languages.

onetom 3 days ago | | | [–]

I understand, that the double-dash was a logical step from a single-dash. My problem is, what we ended up with, is really not great. It's a typical case of the slowly boiled frog...

rc and es are a good example of what improvements can be achieved if we challenge historical decisions and get rid of the not so great ones.

The "; " prompt was just a historical side-note, because I saw someone asking about it the other day on twitter: https://twitter.com/thingskatedid/status/1316081075043463170...

By the quoting rules, I meant, that if you have a variable (eg v='a b') with a value containing whitespace, when you reference it ($v), then you don't have to worry about quoting it in any way, because the space in its value won't cause any problems.
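For contrast, this is what happens in a POSIX-style shell such as sh or bash with the default IFS (a small sketch; `count_args` is just a helper made up for the demo):

```shell
v='a b'

# Prints how many arguments it received.
count_args() { echo $#; }

unquoted=$(count_args $v)     # word splitting turns the value into two arguments
quoted=$(count_args "$v")     # quoting keeps it as a single argument

echo "unquoted: $unquoted, quoted: $quoted"
```

In sh/bash this prints `unquoted: 2, quoted: 1`, which is exactly the trap that rc and es avoid.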

onetom 3 days ago | | | | [–]

btw, Rebol solved the mixing of options and filenames by introducing a syntax for file references; u prefix them with a percent sign, eg.:

%/c/plug-in/video.r

%//sound/goldfinger.mp3

%"/c/program files/qualcomm/eudora mail/out.mbx"

%c:/docs/file.txt

%\some\cool\movie.mpg <= understood backslashes too, so it worked in DOS/Windows too!

%cool%20movie%20clip.mpg

Source: http://www.rebol.com/docs/core23/rebolcore-12.html#section-2...

then you could mix them with `:get-words` and `set-words:`. Rebol also had an interesting take on options - with and without arguments -; it called them refinements and you would just stack them with slashes onto the "command". the refinement arguments were added to the end of the parameter list though, so it was not exactly obvious which argument belonged to which refinement, but in practice it worked quite nice, surprisingly:

str: "test"

insert/dup/part str "this one" 4 5

print str

=> this this this this test

and reversing /dup and /part:

str: "test"

insert/part/dup str "this one" 4 5

print str

=> thisthisthisthisthistest

Source: http://www.rebol.com/docs/core23/rebolcore-9.html#section-2....

ofrzeta 3 days ago | | [–]

It's super annoying to have a command that was called with "-h" output "please type --help for help". And there are lots of them.

xphos 3 days ago | | [–]

Not sure if anyone has brought this up, but I think processing commands in some sort of flow chart, similar to GNU Radio but for the command line, would be really cool.

There is a problem I face really often: I semantically know all the small operations I want to run on the command line, but some of them have a branching element in them. Often I just want to reconverge my data and run more commands. As one can imagine, you're always writing in a functional way, acting on raw data, but it's hard for other people to see a shell command or a bash script and really internalize the underlying operation.

It would be really fancy if you could take in a command line opt and generate a corresponding flow diagram too.

kevin_thibedeau 3 days ago | | [–]

GCC doesn't actually follow GNU conventions since it has both single hyphen and double hyphen long options.

mort96 3 days ago | | [–]

I suspect GCC thinks of its options as mainly single-hyphen short options and double-hyphen long options. It's reasonable to interpret things like `-funsafe-math-optimizations` to mean option `-f` with argument `unsafe-math-optimizations`, just like how `-Dfoo` means option `-D` with argument `foo`.

Unless you were thinking about something else.

kevin_thibedeau 3 days ago | | | [–]

Just go through the docs. They have a ton of single hyphen long opts that aren't a letter with argument.

mort96 3 days ago | | | [–]

Huh, you're right. I had never noticed options like `-gen-decls` or `-nostdinc` before. I also should've thought of `-static`. Yeah, that's weird.

Also, I just noticed: they have `-nostdinc` for C, but the C++ equivalent is `-nostdinc++`, which looks like "no stdin C++".

theden 4 days ago | | [–]

There are alternatives to dd like ddrescue (https://wiki.archlinux.org/title/disk_cloning#Block-level_cl...), but given it's standard on all *nix machines it's hard to avoid, just need to be extra careful before executing. There are also wrappers for dd, like ddi (https://github.com/tralph3/ddi) for extra safety

debdut 4 days ago | | [–]

> --mail=ADDR

> The --email bit is a joke.

what's the context of the joke?

kwijybo 4 days ago | | [–]

Zawinski's Law of Software Envelopment, also known as Zawinski's Law, states: Every program attempts to expand until it can read mail.

nonrandomstring 4 days ago | | | [–]

Caught me out!

I assumed it meant to send the output of the command as mail. That's vaguely useful instead of piping to a cmdline MUA. I couldn't believe I'd been missing this for 30 years and so I tried it on few things like

ps aux --mail=me@domain

error: unknown gnu long option

Looked in my inbox. Nothing.

Then I got to the bottom of the page for the punchline.

grrrr

sph 3 days ago | | | | [–]

Also the idea that GNU tools are bloated.

jdnordy 3 days ago | | [–]

I thoroughly enjoyed this post. The history of software is interesting... the process is so iterative. Often, I take for granted the current state of software I use. The refinements came through many years, many engineers, and many decisions.

I wonder where I can find more posts / articles like this. short, punchy, well written histories of software I use.

sudhirkhanger 3 days ago | | [–]

I wish everybody would follow single and double dash convention. It keeps things simple but unfortunately not everybody does.

thecosmicfrog 3 days ago | | [–]

See: Go.

blacklion 3 days ago | | [–]

Example in "Early 1970s" is notoriously bad.

Text says: "it would be given some number of filenames as command line arguments, and it would read those ... Options didn’t exist".

Example shows: unneeded usage of `cat`, as if `wc` were a non-standard utility that cannot process file names by itself, and an option for `wc` is used. It contradicts the text in both ways!
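For comparison, the two spellings side by side (a sketch with a made-up three-line scratch file; the `tr` is only there because some `wc` implementations pad their output with spaces):

```shell
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

# Extra process: cat feeds the file into the pipe.
with_cat=$(cat "$tmp" | wc -l | tr -d ' ')

# No cat: the shell redirects the file straight to wc's stdin.
without_cat=$(wc -l < "$tmp" | tr -d ' ')

echo "with cat: $with_cat, without cat: $without_cat"
rm -f "$tmp"
```

Both count 3 lines; the only difference is one fewer process.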

archduck 3 days ago | | [–]

Is there any actual, useful, worthwhile reason for avoiding useless `cat` though?

ElevenLathe 3 days ago | | | [–]

I can imagine if you are running 70s vintage DEC hardware, there might be appreciable performance benefits to not spawning that extra process.

gumby 3 days ago | | [–]

The double dash was indeed from the GNU project and specifically RMS (though I think it was at the recommendation of someone else whose name I am blanking on). This followed an earlier GNU approach, also from RMS, of using + to signal long options.

The idea, though not the syntax, came from the installation of a TOPS-20 machine and some VMS machines on a lower floor. Our own homegrown OS (ITS) used control characters as commands and had DDT as a shell.

> In the beginning, in the first year or so of Unix, an ideal was formed for what a Unix program would be like…

Note that the approach you’re talking about came straight from the Multics of that era, though Multics already supported some short options at that time.

> None of this explains dd.

Because just as Unix in many ways copied Multics, the dd command copied IBM.

cb321 3 days ago | | [–]

Since `ps` is featuring prominently here, folks might be interested in https://github.com/c-blake/procs which is a color ps (Linux-only right now). It has a more canonical CLI since it is based upon the Nim CLI generator https://github.com/c-blake/cligen. It also uses cligen's subcommand feature (display, find, scrollsys) and so it can replace pgrep, pkill, ps, top, vmstat, etc., etc. Uses the same color scheme as https://github.com/c-blake/lc which re-organizes the mishmash ls CLI quite a bit.

ggm 4 days ago | | [–]

I don't think this is entirely wrong, but it's drawing a bit of a long bow. The getopt wars on Usenet were interesting, shar files flew around before we settled on gnu's take on things.

I think distinct shell commands are good. MH is my go to instance of doing this right (it's still under active development)

conaclos 4 days ago | | [–]

I quite like the consistency of the esbuild CLI [1]. It has three kinds of options:

1. no-valued option: --example

2. single-valued option: --key=value

3. multi-valued option: --elements:1st --elements:2nd

Short options are ruled out, although it accepts the short option -h for discoverability.

I particularly like the repeated option pattern for multi-valued options. This avoids ambiguities such as:

  cli --elements 1st 2nd arg

Yes, the ambiguity may be removed by using doubled dashes. However, this is error-prone.

I often wonder whether a more familiar syntax for the second and third kinds could be adopted. For instance:

  --key value
  --elements 1st --elements 2nd
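A rough sketch of why the repeated form parses unambiguously: each `--elements` consumes exactly one value, so everything else is positional. The `parse` function and the `--elements` option here are hypothetical, not esbuild's actual implementation:

```shell
# Hypothetical CLI: every --elements takes exactly one value.
parse() {
    elements=
    positional=
    while [ $# -gt 0 ]; do
        case $1 in
            --elements)
                [ $# -ge 2 ] || { echo "missing value for --elements" >&2; return 1; }
                elements="$elements $2"
                shift 2 ;;
            *)
                positional="$positional $1"
                shift ;;
        esac
    done
    echo "elements:$elements | positional:$positional"
}

parse --elements 1st --elements 2nd arg
```

This prints `elements: 1st 2nd | positional: arg`, with no guessing about where the value list ends.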

[1] https://github.com/evanw/esbuild

bbkane 3 days ago | | [–]

This is largely what the Azure CLI does. It simplifies even further by eliminating the no-value option. Instead, you pass "true" or "false" as the value ( --example true ). It's a little more verbose but very easy to parse/write/generate. I like this convention so much I stole it for my homemade Golang CLI parsing library https://github.com/bbkane/warg/ .

teddyh 4 days ago | | [–]

> Initially, GNU used the plus (+) to indicate a long option

Some tools still use +options, like dig(1).

lifthrasiir 4 days ago | | [–]

Also in some cases both -options and +options exist, where +options do the inverse of -options (e.g. dash; I think the old Perl 6 also did the same thing for sub MAIN, though it's now -/options) or a subtle but useful variation of -options (e.g. ImageMagick).

MontyCarloHall 3 days ago | | | [–]

The worst offender in this regard is bash’s `set` command.

For example, `set -e` enables the `e` option (exit script immediately upon seeing a nonzero exit code). Guess how to disable it? Yup, `set +e`
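You can watch the flag flip by inspecting `$-`, which lists the currently enabled single-letter shell options (a minimal sketch for sh/bash):

```shell
set -e   # "minus e": turn errexit ON
case $- in *e*) with_minus=enabled ;; *) with_minus=disabled ;; esac

set +e   # "plus e": turn errexit OFF again
case $- in *e*) with_plus=enabled ;; *) with_plus=disabled ;; esac

echo "after set -e: errexit $with_minus; after set +e: errexit $with_plus"
```

This prints `after set -e: errexit enabled; after set +e: errexit disabled`.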

nsajko 3 days ago | | | | [–]

Dash just implements the behavior specified by POSIX/SUS.

lifthrasiir 3 days ago | | | [–]

In hindsight, indeed! They just go straight into `set`.

eesmith 3 days ago | | | | [–]

Or "tail", where "tail -12" is the last 12 lines and "tail +12" is everything from line 12 onward.
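With the modern `-n` spelling (the bare `+N` form is not accepted everywhere these days), a small sketch on five lines of made-up input shows the two meanings:

```shell
lines=$(printf '%s\n' 1 2 3 4 5)

# "-n 2": the last two lines.
last_two=$(printf '%s\n' "$lines" | tail -n 2)

# "-n +2": start output AT line 2, i.e. skip only the first line.
from_second=$(printf '%s\n' "$lines" | tail -n +2)

printf 'tail -n 2:\n%s\n\ntail -n +2:\n%s\n' "$last_two" "$from_second"
```

The first gives 4 and 5; the second gives 2 through 5.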

conaclos 4 days ago | | | [–]

And others use option -long such as ffmpeg.

brabel 4 days ago | | | [–]

And Java with the infamous `-version` (only after JDK11 I think it started recognizing also `--version`) and `-cp` meaning `--classpath`, not `-c -p`!

tambourine_man 3 days ago | | [–]

If Apple had made the transition to OS X from a position of strength, like the one it enjoys today, I think it might have imposed a pattern and rewritten tar, ps, dd, etc.

However, the way history has unfolded, I think we’re stuck with this mess for good.

tambourine_man 3 days ago | | [–]

If there ever was a company that could've tried “Unix done right™”, it was Apple. Maybe Sun. But I guess it wasn't meant for this world.

Perhaps Apple's biggest contribution to the Unix/Linux landscape was launchd, which inspired systemd. I don't know if we would've endeavored in rewriting essential and mostly working decades-old parts if there wasn't a proven path already laid out.

Maybe LLVM as well, a modular compiler toolchain, but that's a stretch.

pjmlp 3 days ago | | | [–]

AIX and Solaris already had init replacements before launchd.

Sun's only good idea for desktop UNIX was NeWS and they killed it. Their desktop execution for Swing proves how much they understood (not really), desktop development.

As for Apple and UNIX, it was more the inverted takeover from NeXT than anything else, given A/UX execution.

Also Steve Jobs wasn't a UNIX fan; it was more a case of embracing POSIX due to the competition with Sun, and extending it with NeXTSTEP Objective-C frameworks, than anything else.

Get hold of his USENIX talk about "UNIX should be invisible".

tambourine_man 2 days ago | | | [–]

I wasn't trying to say that Apple (or Steve) is particularly fond of Unix historically. Nor should they be; Unix should be a means to an end, not one in itself. Unix should really be invisible to most users if present at all.

What I'm saying is that Apple is a company usually willing to rewrite things the “right way”, vide WebKit, LLVM.

pjmlp 2 days ago | | | [–]

LLVM wasn't started by Apple, but rather at the University of Illinois, and they only adopted it after GPL 3 happened.

WebKit was born as KHTML by the KDE project.

tambourine_man 22 hours ago | | | [–]

I'm aware :)

I don't know how that proves that it's not a company willing to rewrite foundational/taken for granted stacks.

pjmlp 3 days ago | | | [–]

In that case, there wouldn't be any desktop UNIX from Apple, and Copland would have managed to eventually make it.

tambourine_man 3 days ago | | | [–]

I don't know about that. I mean, look at Longhorn. And Microsoft was unbeatable back then.

Being huge and powerful is no guarantee of success when it comes to shipping software. Size can often be a hindrance.

pjmlp 3 days ago | | | [–]

Exactly because Microsoft was unbeatable, Longhorn UI ideas survived as WPF, while the OS components were rescued into C++ and COM, clunky as Windows Vista, and refined into Windows 7, including a kernel rewrite via the MinWin project.

They are on a similar position nowadays, trying to fix the UWP disaster, maybe by a future Windows 12 we get what Windows 8 should have been all along.

Likewise, starting from the same premise that Apple wouldn't be bleeding money, they would have been able to keep up with their roadmap with System 8 and integration of Copland technologies.

tambourine_man 2 days ago | | | [–]

Copland UI ideas and code survived and shipped in Mac OS 8. Multithreaded Finder, Platinum Theme (multiple themes in fact, though killed at the last moment), windows as tabs at the bottom of the screen.

However the main challenge, memory protection, didn't. Much like Longhorn's file system.

pjmlp 2 days ago | | | [–]

So now only if 100% of the features got migrated, does it count.

In any case this is playing guessing games, what would have happened if Apple wasn't begging for money.

So who knows how System 9, 10… would have turned out.

As for the Longhorn example, part of it did ship, as SQL Server features and some ideas on ReFS.

tambourine_man 22 hours ago | | | [–]

Sure, the whole point of the thread was a guessing game. I'm speculating that Apple, had it not been worried with going bankrupt, could have tried to fix code that had grown organically and incoherently over decades.

And that this was a missed opportunity for the whole industry as it's probably not going to happen anytime soon.

saurik 3 days ago | | | [–]

I mean, you really think its "position of strength" didn't come from it embracing Unix compatibility (which let it welcome the flood of web developers)?

tambourine_man 3 days ago | | | [–]

In a way, yes, but mostly: iPhone

ducktective 4 days ago | | [–]

Speaking of CLI option/argument handling for shell scripts, I use getopts for simple scripts which have few switches and manual loop when I want to support long options.

1- short: http://mywiki.wooledge.org/BashFAQ/035#getopts

2- long: http://mywiki.wooledge.org/BashFAQ/035#Manual_loop

I don't support `--option=` mode. I don't support giving options after arguments. I prepend info and errors with braces like : `[TAG] message` and print them to STDERR.
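Roughly what the short-option half looks like; a minimal sketch only, where `main`, `log` and the `-v`/`-o` flags are made-up names for illustration and not taken from the FAQ:

```shell
# "[TAG] message" on stderr.
log() { printf '[%s] %s\n' "$1" "$2" >&2; }

main() {
    verbose=0
    output=
    OPTIND=1                      # reset so main can be called more than once
    while getopts 'vo:' opt; do
        case $opt in
            v) verbose=1 ;;
            o) output=$OPTARG ;;
            *) log ERROR "usage: main [-v] [-o file] args..."; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))         # drop the parsed options, keep the arguments
    echo "verbose=$verbose output=$output args=$*"
}

main -v -o out.txt a b
```

This prints `verbose=1 output=out.txt args=a b`. Since getopts stops at the first non-option, options given after arguments are simply treated as arguments.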

Opinions?

bluenose69 4 days ago | | [–]

> we could introduce a completely new syntax that is systematic, easy to remember

Is it really difficult to learn that e.g. "rm" stands for "remove"? I've known very few people who found it difficult to learn the key things very quickly.

And not having to type (and spell) full words is a relief. (I know a shell can take care of spelling, but not all unix commands are entered interactively.)

Still, the article is interesting, and worth reading.

frederikvs 4 days ago | | [–]

I don't think remembering rm is the issue. It's remembering all the options in all their variations. Take the ps command. I tend to use it as "ps aux" - but I have no idea what the a, the u, or the x stand for individually. I know other people who use "ps -ef". (Also note the difference: the aux doesn't require a dash, and if you accidentally put it in, it doesn't do what you want it to do.)

As another example: the -v option. In most tools this means "verbose". In some tools, but not all, you can pass it multiple times to increase the level of verbosity. But then there are a few tools, e.g. top, where -v seems to mean "version" [0]. Or take the -h option. On many tools it stands for "help". But on tools like ls or sort, it stands for "human-readable".

To me personally this is not an issue - I know by heart a few combinations of options that I use frequently, and I know where to find out the details of those, as well as other options I use less frequently. But I can see how it would be easier for people to pick it up if those things were just a little more standardised.

[0] https://linux.die.net/man/1/top

smcameron 4 days ago | | | [–]

The "aux" vs "ef" thing is a Berkeley vs. System V unix thing. SunOS wanted "aux", while Solaris wanted "-ef", iirc. Linux ps will do either. From the man page:

       This version of ps accepts several kinds of options:

       1   UNIX options, which may be grouped and must be preceded by a dash.

       2   BSD options, which may be grouped and must not be used with a dash.

       3   GNU long options, which are preceded by two dashes.

And:

Note that "ps -aux" is distinct from "ps aux". The POSIX and UNIX standards require that "ps -aux" print all processes owned by a user named "x", as well as printing all processes that would be selected by the -a option. If the user named "x" does not exist, this ps may interpret the command as "ps aux" instead and print a warning.

Edit: On my system, "ps -aux" does the same thing as "ps aux" and no warning is printed, despite what the man page says. I'm not going to try to create a user named 'x' though.

mort96 3 days ago | | | | [–]

My favorite example: `cp -r` is recursive. `scp -r` is recursive. `chmod -r` removes read permissions; `chmod -R` is recursive.

sph 3 days ago | | | [–]

I have done chmod -r plenty of times by mistake and never knew it removed read permissions. Oh God.

yencabulator 13 hours ago | | | | [–]

I never use chmod without specifying at least one of ugoa. Helps with that mistake.
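
To see the difference concretely, here is a small sketch. It assumes a POSIX-like chmod, ls and mktemp and runs under umask 022; mode_of and demo are hypothetical helper names. It works on a throwaway temp file:

```shell
# Hypothetical helper: print only the permission string of a file.
mode_of() { ls -ld "$1" | cut -c1-10; }

demo() (
    umask 022
    f=$(mktemp) || exit 1
    chmod 644 "$f";  mode_of "$f"   # -rw-r--r--
    chmod -r "$f";   mode_of "$f"   # --w-------  ("-r" is parsed as a mode, not an option)
    chmod 644 "$f"
    chmod go-r "$f"; mode_of "$f"   # -rw-------  (explicit "who": only group and other lose read)
    rm -f "$f"
)
```

With an explicit who (u, g, o or a) the mode argument never starts with a dash, so it can't be confused with -R.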

Macha 3 days ago | | | | [–]

`ssh -p` specifies a port number. `scp -p`/`sftp -p` copies the mtime. You need to use -P for the latter two.

trasz 3 days ago | | | | [–]

Why would anyone remember those options? That’s what man pages are for. You only need to remember the ones that you use most often.

morelish 4 days ago | | | [–]

Tbh I've found the PowerShell approach better, though also more verbose. PowerShell's commands are all named in the same way, with built-in support to make finding commands straightforward, though it takes a while to get used to. Explorability is better IMO with an approach like PowerShell's. The downside is that even though you can set up aliases, PowerShell scripts still seem longer to type than bash and sometimes not as powerful.

bzxcvbn 4 days ago | | | [–]

PowerShell defines tons of aliases to mimic bash, e.g. rm is an alias for Remove-Item, ls for Get-ChildItem, cat for Get-Content, etc. What I like about PowerShell is that arguments get aliases too; for example, -cf is the standard alias for -Confirm.

justsomehnguy 3 days ago | | | [–]

The best thing about parameters in PS is that you don't need to write the parameter name in full, and this is supported everywhere, while only some *nix tools support it (most notably 'ip').

ElectricalUnion 3 days ago | | | | [–]

The cool part about those aliases is that invoking Get-Help on them (say, Get-Help ls) will show you the relevant documentation of the non-aliased full commandlet name.

bbkane 3 days ago | | | | [–]

I really find it difficult to find the specific command I want. The verb-noun convention doesn't really lend itself to nested subcommands. At least with the Azure CLI I can generally navigate to the command and action I want with tab completion

juki 3 days ago | | | [–]

The Verb-Noun convention isn't that bad once you learn the approved verbs well enough to know which one the command you're looking for would be using. After you have the right verb, tab-completion works fine. Also, `Get-Command` (built-in alias `gcm`) is very helpful. E.g.

  PS> gcm *build*

  CommandType     Name                                               Version    Source
  -----------     ----                                               -------    ------
  Alias           Build-Checkpoint                                   5.8.6      InvokeBuild
  Alias           Build-Parallel                                     5.8.6      InvokeBuild
  Alias           Invoke-Build                                       5.8.6      InvokeBuild
  Application     mcbuilder.exe                                      10.0.2200… C:\Windows\system32\mcbuilder.exe

Or find everything from a module:

  PS> gcm -m InvokeBuild

  CommandType     Name                                               Version    Source
  -----------     ----                                               -------    ------
  Alias           Build-Checkpoint                                   5.8.6      InvokeBuild
  Alias           Build-Parallel                                     5.8.6      InvokeBuild
  Alias           Invoke-Build                                       5.8.6      InvokeBuild

And at least if you use `MenuComplete` for tab (`Set-PSReadLineKeyHandler -Key Tab -Function MenuComplete` in your profile), you can also just write `*<noun>` and hit tab. It will offer completions regardless of the verb.

oaiey 4 days ago | | [–]

"Nothing of that explains did"... Made my day

kickingvegas 3 days ago | | [–]

Given this thread, I wonder what specs/guidelines (if any) exist for common option behavior.

For example,

-v, --version

-o <file>, --output=<file>; if <file> is - then use stdout

-i <file>, --input=<file>; if <file> is - then use stdin

-h, --help

Or is it just anarchy out here?
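
As an illustration of the "- means stdin/stdout" convention, a minimal POSIX sh sketch. The upcase filter and its variable names are hypothetical, and only the short options are handled because getopts does not parse long ones:

```shell
# Hypothetical filter: uppercases its input.
# Runs in a subshell "( ... )" so the redirections and variables stay local.
upcase() (
    src=- dst=-
    OPTIND=1
    while getopts i:o: opt; do
        case $opt in
            i) src=$OPTARG ;;
            o) dst=$OPTARG ;;
            *) echo "usage: upcase [-i file] [-o file]" >&2; exit 2 ;;
        esac
    done
    # "-" (also the default) leaves the inherited stdin/stdout untouched.
    [ "$src" = - ] || exec <"$src" || exit 1
    [ "$dst" = - ] || exec >"$dst" || exit 1
    tr '[:lower:]' '[:upper:]'
)
```

So `upcase -i notes.txt -o -` reads a file and writes to stdout, while `upcase -i - -o out.txt` does the reverse.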

conaclos 1 day ago | | [–]

I found guidelines [1] that list several common short options:

  -a, --all
  -d, --debug
  -f, --force
  -h, --help
  -o, --output
  -p, --port
  -q, --quiet
  -u, --user

[1] https://clig.dev/#arguments-and-flags

conaclos 3 days ago | | | [–]

-v is also used for --verbose. In this case -V is often used for --version.

Another common option I am thinking about is -r for --recursive.

I would advocate using only commonly accepted short options.
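
A rough sketch of the -v/-V split with POSIX getopts, including the "repeat -v for more verbosity" behavior mentioned elsewhere in the thread. The tool name mytool and its version string are made up:

```shell
# Hypothetical tool: counts -v occurrences, prints a version for -V.
mytool() (
    verbosity=0
    OPTIND=1
    while getopts vVh opt; do
        case $opt in
            v) verbosity=$((verbosity + 1)) ;;   # -v, -vv and -v -v all add up
            V) echo "mytool 0.0.0"; exit 0 ;;
            h) echo "usage: mytool [-v]... [-V] [-h] [arg...]"; exit 0 ;;
            *) exit 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    echo "verbosity=$verbosity args=$#"
)
```

Here `mytool -vv -v a b` reports a verbosity of 3 and two remaining arguments.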

amtamt 4 days ago | | [–]

What is the getopt equivalent for subcommand handling? Or can getopt parse subcommands too?

teddyh 4 days ago | | [–]

You probably want argp_parse(): https://www.gnu.org/software/libc/manual/html_mono/libc.html...

See especially getsubopt(): https://www.gnu.org/software/libc/manual/html_mono/libc.html...
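
For shell scripts (as opposed to the C functions above), one common pattern is to run getopts twice: once for global options, then again after the subcommand name. A minimal sketch, assuming POSIX sh; the repo tool and its add/list subcommands are hypothetical:

```shell
# Hypothetical tool with global option -q and subcommands "add" and "list".
repo() (
    quiet=0
    OPTIND=1
    while getopts q opt; do           # global options come first
        case $opt in
            q) quiet=1 ;;
            *) exit 2 ;;
        esac
    done
    shift $((OPTIND - 1))             # getopts stopped at the subcommand name

    [ $# -gt 0 ] || { echo "usage: repo [-q] <add|list> [options]" >&2; exit 2; }
    cmd=$1; shift
    OPTIND=1                          # restart getopts for the subcommand's options
    case $cmd in
        add)
            force=0
            while getopts f opt; do
                case $opt in
                    f) force=1 ;;
                    *) exit 2 ;;
                esac
            done
            shift $((OPTIND - 1))
            echo "add quiet=$quiet force=$force files=$*"
            ;;
        list)
            echo "list quiet=$quiet"
            ;;
        *)
            echo "repo: unknown subcommand: $cmd" >&2
            exit 2
            ;;
    esac
)
```

This relies on getopts stopping at the first non-option argument, so `repo -q add -f x y` treats -q as global and -f as belonging to add.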

shcheklein 3 days ago | | [–]

`None of this explains dd.` :) (it's at the very end of the post)

liendolucas 3 days ago | | [–]

I try to use long options as much as possible in commands where they are available. They tend to be more descriptive and sometimes save an unnecessary trip to the man pages.

Purplish5583 4 days ago | [–]

Interesting. When did redirections get added?

gpvos 3 days ago | [–]

Very early, before the pipes.
