
Free: https://archive.is/hfbkz

I tried to catch an Uber from Heathrow once, and it was a total nightmare: I didn't know where to go, came out of the wrong exit, and the driver ended up in a car park. Really, airports benefit from allowing them too, as they reduce traffic congestion and the number of confused people milling around looking for their ride...


I remember when ThinkGeek had a wide variety of genuinely interesting stuff, not just the latest pop-culture 'geek' merchandise (sure, they've always had some of that, but years back it wasn't the overriding theme like it is today). This is a sad moment, but probably to be expected.


Sometimes people don't comment because they don't have anything interesting/useful to add, but still found the article interesting and want it to be more visible, so they upvote it and move on.


>'Cyber'

Literally laughed out loud. It's just become too much of a joke now.


People don't want it because it's binary, not because you can't grep it.

* you need to use a new proprietary tool to interact with them

* all scripts relating to logs are now broken

* binary logs are easy to corrupt, e.g. if they didn't get closed properly.

>You can have a binary index and text logs too! / You can. But what's the point?

The point is having human-readable logs without having to use a proprietary piece of crap to read them. A binary index would actually be a perfect solution - if you're worried about the extra space readable logs take, just .gz/.bz2 them; on decent hardware, the performance penalty for reading is almost nonexistent.
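To illustrate, a minimal sketch (the path and search term are made up) - Python's gzip module streams the compressed file line by line, so no special viewer is needed:

    import gzip

    # Hypothetical rotated log and search term; adjust to taste.
    PATH = "/var/log/app.log.1.gz"
    NEEDLE = "ERROR"

    # Decompression happens on the fly; nothing hits disk uncompressed.
    with gzip.open(PATH, "rt", errors="replace") as f:
        for line in f:
            if NEEDLE in line:
                print(line, end="")

From the shell, zgrep does the same job against .gz files directly.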

If you generate 100GB/day, you should be feeding them into Logstash and using Elasticsearch to go through them (or use Splunk if $money > $sense), not keeping them as files. Grepping logs can't do all the stuff the author wants anyway, but existing tools that are compatible with rsyslog can, meaning there is no need for the monstrosity that is systemd.
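For the kind of query grep can't easily do, here's a rough sketch with the elasticsearch Python client - the endpoint, index, and field names are all made up, and newer client versions take query= directly:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # hypothetical endpoint

    # Recent 5xx responses from one host; index and fields are assumptions.
    resp = es.search(index="logs-*", query={
        "bool": {
            "filter": [
                {"term": {"host": "web01"}},
                {"range": {"status": {"gte": 500}}},
                {"range": {"@timestamp": {"gte": "now-1h"}}},
            ]
        }
    })
    for hit in resp["hits"]["hits"]:
        print(hit["_source"])

Try expressing "5xx in the last hour, from one host only" as a one-liner over flat files and you'll see the appeal.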


What's wrong with Splunk? Honest question.


Price, mostly. It's good, but there are alternatives that aren't as ridiculously expensive.


Any alternative that you'd recommend? Thanks.


ELK stack - Elasticsearch, Logstash, and Kibana. The whole stack is open source :)


Interesting, but it's not a SaaS. It doesn't look like a direct rival to Splunk.


It is a direct rival to Splunk :) They do very similar things, though IMHO Splunk is the better solution right now. There are LaaS companies that use ELK if you need a cloud solution - Loggly is the first one that springs to mind, and I think another is LogSene.


It's expensive.


* Why would you need a proprietary tool?

* What if they get broken? I don't want to look at them raw anyway.

* Text logs are easy to corrupt as well. Oh, append only? Well, you can do that with binary storage too.

And again, there is no need for proprietary tools at all. Everything I want to do is achievable with free software - so much so that I use only such software in all my systems.

As for compressing - yeah, no. Please try compressing 100GB of data and tell me the performance cost is nonexistent.
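If anyone wants numbers rather than assertions, a quick sketch to time it yourself (file names are hypothetical) - read the same data plain and gzipped and compare:

    import gzip
    import time

    # Hypothetical file names: the same log stored plain and gzipped.
    def count_lines(open_fn, path):
        start = time.monotonic()
        with open_fn(path, "rt", errors="replace") as f:
            n = sum(1 for _ in f)
        return n, time.monotonic() - start

    print("plain:", count_lines(open, "app.log"))
    print("gzip :", count_lines(gzip.open, "app.log.gz"))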

As for Logstash & ES: guess what, their storage is binary.

Also note that my article explicitly said that the Journal is unfit for my use cases.


Why does it have to be proprietary?


It doesn't have to be - but let's look at reality here. NIH syndrome is everywhere, we have millions of competing protocols and formats, everyone thinks they can build a better solution than someone else, etc.

I suppose that if there were a large push to log things in binary universally, sanity might prevail and we'd get one format everyone agreed upon, but I see no reason to expect that when, historically, it basically never happens.

So, in my prediction of a future where binary logging is the norm, we'd have a half dozen or so competing primary formats, plus random edge cases where people have rolled their own, all needing different tools to parse them.

Or we could stick with good ol' regular text files, and if you want to make it binary, or throw it in ELK/Splunk or log4j, or pipe it over netcat across an ISDN line to a server you hid with a solar panel and a satellite phone in Angkor Wat, upon which you apply ROT13 and copy it to a 5.25" floppy, you can do it on your own and inflict whatever madness you want while leaving me out of it.


It's not like text formats are universal. Thankfully we have settled on UTF-8. The same could happen with a slightly more structured universal binary format that would be more suitable for many applications (like logging) and would have an established toolset, just like 'text' does now.


That doesn't make sense. There can be no such thing as a universal binary format, unless you reinvent text/plain. Think a little about why we refer to some formats as text-based and you'll see why.


It doesn't, but nothing is universal like `grep`. If you find a machine that's logging stuff which doesn't have `grep`, you're already having a bad day.

You just can't say that about binary log formats. Text is a lowest common denominator; and yes, that cuts both ways, but the advantages of universality can't be trivially thrown away.


The machines I'm administering will all log the same way, therefore, within that context, they are universal. We have well documented tools and workflows, so anyone new to the system can catch up and start working with the logs within minutes.

We don't unexpectedly find machines that don't conform to our policies. We control the machines, we know where and how to find the logs. If we found any where we had to grep, we'd be having a very bad day.

Our lowest common denominator is not text, because we control the environment, and we can raise the bar. Being able to do that is - I believe - important for any admin.


Right, but this isn't an argument about log formats. You're making a bigger argument about workflows, and you're saying that yours is unconditionally better. In your environment, it might be most appropriate to put in the up-front investment to totally control all your log formats. Within your context, you get to define a lowest common denominator which isn't text, and it sounds like that makes sense. With the services you run, you might be able to dictate that the log formats are restrictive enough that writing a parser for each one isn't a problematic overhead.

To get the benefits you're claiming, the storage format of your logs is actually irrelevant. If you're going to have an environment where you have to exert that much control over the output of your applications, when you parse the logs doesn't matter. You could do your parsing with grep and awk as the very last step before the user sees results, and you'd see the same benefits. Parsing up-front, assuming you know what data you can safely throw away, might appear to some as a premature optimisation.

> We have well documented tools and workflows, so anyone new to the system can catch up and start working with the logs within minutes.

It sounds like this is something which could be usefully open-sourced, to show how it's done.

> Our lowest common denominator is not text, because we control the environment, and we can raise the bar. Being able to do that is - I believe - important for any admin.

It's a question of what you choose to optimise for. Pre-parsed binary logs in a locked-down environment might be as flexible as freeform text, but I'd need to see a running system to properly judge.


> you're saying that yours is unconditionally better

I don't think I'm saying that. The article presents two setups and a few related use cases, where I believe binary log storage is superior.

> With the services you run, you might be able to dictate that the log formats are restrictive enough that writing a parser for each one isn't a problematic overhead.

I don't need to dictate all log formats. If I can't parse one, I'll just store it as-is, with some meta-data (timestamp, origin host, and so on). My processed logs do not need to be completely uniform. As long as they have a few common keys, I can work with them.

For some apps or groups of apps, I can create special parsers, but I don't necessarily need that from day one. If I'm ok with only new logs being parsed according to the new rules (and most often, I am), I can add new rules anytime.
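Roughly like this, as a sketch - every name and key here is made up for illustration, not our actual schema:

    import json
    import socket
    import time

    # Hypothetical per-app parser; real rules can be added any time.
    def parse_nginx(line):
        # Would extract status, path, etc.; elided here for brevity.
        return {"app": "nginx", "MESSAGE": line}

    PARSERS = {"nginx": parse_nginx}

    def to_record(app, line):
        # Common keys every record gets, parsed or not.
        base = {"timestamp": time.time(), "origin_host": socket.gethostname()}
        parser = PARSERS.get(app)
        # No parser? Store the raw line as-is under MESSAGE.
        payload = parser(line) if parser else {"app": app, "MESSAGE": line}
        return {**base, **payload}

    print(json.dumps(to_record("nginx", "GET / 200")))
    print(json.dumps(to_record("mystery-daemon", "something odd happened")))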

> Parsing up-front, assuming you know what data you can safely throw away, might appear to some as a premature optimisation.

>> We have well documented tools and workflows, so anyone new to the system can catch up and start working with the logs within minutes.

> It sounds like this is something which could be usefully open-sourced, to show how it's done.

Logstash is a reasonable starting point. Our solution has a lot in common with it, at least on the idea level.

> Pre-parsed binary logs in a locked-down environment might be as flexible as freeform text, but I'd need to see a running system to properly judge.

Only our storage is binary. That is all the article is talking about. Within that binary blob, there are many traces of freeform text, mostly in the MESSAGE keys of application logs which we care less about (and thus, parse no further than basic syslog parsing). You still have the flexibility of freeform text, even if you store it in a binary storage format.


I had the same feeling - plus the broadband there was terrible. I would never live in a suburb, even if I were paid to. The middle of nowhere, though, would be even worse than a suburb: the same problems, but more so, and without the (relative) proximity to work.


Not the first time it's been done.

Of course, it's only worked some of the time, I guess.


Mildly misleading title there; the dishes are all in the southern hemisphere IIRC.

That said, Jodrell Bank is a very interesting place. Good article: http://www.theregister.co.uk/2013/03/25/geeks_guide_jodrell_...


It should be 'headquartered'. The UK isn't physically big enough to host such an array, of course.


Not to mention weather...


How much effect does weather have on radio telescopes?

We don't have many very strong winds (at least, I don't think so), and the rest has, I think, little impact. Am I missing something?


Radio interference from man-made sources is the biggest issue for radio telescopes in the traditional radio band. So it's not really the weather that decides, but the crappy radio skies of the UK.


Rain clouds and radio don't mix well. That's why weather stations use radar (radio) to detect clouds.


Hasn't the US done this too, even if not recently then at least historically?

The Royal Navy still does this today though.


Are people only just realising this?

I would only ever use agent forwarding to a trusted host, exactly because of what it does - it puts a socket on that host that will sign authentication challenges with your SSH key... Anyone on that host with root can use your key (even if they can't read it).
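To make it concrete, a sketch of what root on such a host could do with a forwarded agent - the socket path is made up, and paramiko is a third-party library:

    import os
    import paramiko

    # Hypothetical forwarded-agent socket left behind by a logged-in user.
    # Root can connect to any user's socket under /tmp.
    os.environ["SSH_AUTH_SOCK"] = "/tmp/ssh-XXXXXXXX/agent.12345"

    agent = paramiko.Agent()
    for key in agent.get_keys():
        # Each AgentKey will happily sign SSH auth challenges as the victim.
        print(key.get_name(), key.get_fingerprint().hex())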

