It's internal to the shell; however, so are the individual pipeline parts to some extent.

The long answer is that all commands in the pipeline are treated as either a murex shell script or a builtin function. Any external command is actually run by the builtin function called "exec", which takes the external command as its parameters. This means I could effectively write my own pipeline with richer APIs beyond just reading []bytes of data. This doesn't come without significant drawbacks as well (which I can also talk about if you're interested) but I'm working to mitigate them.

So the following two lines of code are equivalent:

    open foo.json | grep foobar
    open foo.json | exec grep foobar
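
To make that dispatch concrete, here's a rough Python sketch of the idea. It is not murex's actual code: `BUILTINS`, `run_stage`, `builtin_exec`, `builtin_upper` and the type labels are all made-up names for illustration, and the only assumption carried over from above is that anything which isn't a builtin gets handed to an "exec" builtin.

```python
import subprocess
import sys


def builtin_exec(args, stdin):
    """Hypothetical 'exec' builtin: run an external program on raw bytes."""
    result = subprocess.run(args, input=stdin, stdout=subprocess.PIPE, check=False)
    # An external program can only hand back bytes, so the type label
    # ("bytes" is a made-up placeholder) carries no richer information.
    return "bytes", result.stdout


def builtin_upper(args, stdin):
    """Hypothetical builtin standing in for any function inside the shell."""
    return "str", stdin.upper()


# Hypothetical builtin table: command name -> function(args, stdin).
BUILTINS = {"exec": builtin_exec, "upper": builtin_upper}


def run_stage(argv, stdin=b""):
    """Run one pipeline stage and return (type_label, output_bytes)."""
    name, *rest = argv
    if name in BUILTINS:
        return BUILTINS[name](rest, stdin)
    # Not a builtin: behave as if the user had typed `exec <argv>`.
    return BUILTINS["exec"](argv, stdin)
```

With this shape, `run_stage(["grep", "foobar"], data)` and `run_stage(["exec", "grep", "foobar"], data)` take the same path, which is the equivalence shown in the two lines above.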

An external command only hands back bytes, so if you then were to pipe its output into another builtin you'd need to cast it again. eg

    open foo.json | grep foobar | cast json | ...do something

This is just illustrative though. Technically this would break the JSON file because "grep" just works on dumb lines and JSON isn't a line-delimited format. This is why murex comes with such an extensive number of builtins. For example, if you wanted to do a grep-like regex match against elements in a JSON array and output that as valid JSON, you could use the following pipeline:

    open foo.json | regexp m/foobar/
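
The difference between the two approaches can be sketched in Python. This is only an illustration of line-wise versus element-wise matching, not how the `regexp` builtin is implemented; `grep_lines` and `match_elements` are hypothetical helpers, and the sketch assumes the JSON document is a flat array.

```python
import json
import re


def grep_lines(text, pattern):
    """Line-oriented match, like grep: keeps matching lines, ignores structure."""
    return "\n".join(line for line in text.splitlines() if re.search(pattern, line))


def match_elements(text, pattern):
    """Element-wise match: parse the array, filter it, re-serialise as JSON."""
    items = json.loads(text)
    return json.dumps([item for item in items if re.search(pattern, str(item))])


if __name__ == "__main__":
    doc = '[\n  "foobar",\n  "baz",\n  "xfoobar"\n]'
    print(grep_lines(doc, "foobar"))      # the enclosing [ and ] lines are dropped
    print(match_elements(doc, "foobar"))  # still a valid JSON array
```

The line-wise version throws away the lines holding the brackets, so what comes out can no longer be parsed as JSON, whereas the element-wise version always emits a well-formed array.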

However, going back to your original question: if someone wanted to write a shell script or any other utility that supports murex data-types but wasn't part of the murex shell binary, then there's no way murex could pass that type information to it.

This is a problem I've been mulling over for a few years now and I've considered a few options in that time - none of them are without their problems. At one point I even laid down some POC code but that has since been deleted because pragmatically it seems a premature problem to solve while I'm the only person developing and using the shell.

edit:

It's also worth noting that there is a default data-type that is assumed even when no data-type is specified. That type is a loosely tabulated "catch all" format for your standard unix tools. That means output from commands like `ls -l`, `ps`, and other classic utilities can be parsed and fields selected intelligently. eg if you were to type the following partial command line then hit tab:

    ps -fe | [ <tab>

You'd see the headings from `ps` in your autocompletion suggestions:

     UID     PID     PPID    C       STIME   TTY     TIME    CMD

So you could select one or more of them:

    ps -fe | [ UID PID CMD ]

...and murex would filter the output of ps to only those 3 columns.
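
For a feel of what that kind of column selection involves, here's a naive Python sketch. It is a guess at the general technique rather than murex's actual parser; `select_columns` is a hypothetical helper, and it assumes the first line holds the headings and that only the last column may contain spaces.

```python
def select_columns(text, wanted):
    """Pick named columns out of whitespace-separated tabular text."""
    lines = text.splitlines()
    headings = lines[0].split()
    indexes = [headings.index(name) for name in wanted]
    rows = [list(wanted)]
    for line in lines[1:]:
        # Split at most len(headings) - 1 times so the final column
        # (CMD in the ps example) keeps any spaces it contains.
        fields = line.split(None, len(headings) - 1)
        rows.append([fields[i] if i < len(fields) else "" for i in indexes])
    return rows


if __name__ == "__main__":
    sample = (
        "UID   PID  PPID  C STIME TTY   TIME     CMD\n"
        "root    1     0  0 09:00 ?     00:00:01 /sbin/init splash\n"
        "alice 420     1  0 09:05 pts/0 00:00:00 bash\n"
    )
    for row in select_columns(sample, ["UID", "PID", "CMD"]):
        print("\t".join(row))
```

The sample `ps` output is made up for the example. A column with embedded spaces anywhere other than the end would throw this sketch off, which is the sort of ambiguity any parser for free-form terminal output has to cope with.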

The default type isn't perfect by any means - since there are a thousand different ways developers can output data to the terminal - but it does seem to be an asset more than a hindrance for my particular use cases. Of course YMMV and I'd welcome any feedback if that were the case :)