Hacker News
Fun With HTTP Headers (nextthing.org)
144 points by parenthesis on Feb 11, 2009 | 22 comments


The ultimate header is produced by a newspaper. Try running this line and see. The header changes at random between requests, so you might want to try it a few times :-)

   curl -i -s --head http://www.vg.no/ | grep N



they even randomized it. that's awesome


The "Cneonction: close" thing is a quirk of Netscaler load balancers. It's done to nullify any "Connection: close" headers the webserver spits out, as the Netscaler wants to manage the connection itself. It's scrambled instead of removed so that it doesn't have to regenerate packets (the length stays the same), and it's scrambled semi-randomly so that people don't just assume it's a misspelling and add compatibility for it.


Interesting, I wonder why they didn't go with something more self explanatory:

  Connection: -> X-Ignore-X:


X-Ignore-X is longer than close, which I suppose would mess up the packet length. Or maybe having an unrecognized value for the Connection key would still default to a close? Just guessing here.


TCP checksums are fairly simple: a TCP stack basically just adds up the 16-bit words in a segment (using ones'-complement addition) and stores the complement of the sum in the checksum field. Because addition doesn't care about order, this will not detect 16-bit words being swapped around.

My guess is that the load balancer tried to invalidate the header while preserving the TCP checksum.
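
To make the checksum argument concrete, here is a minimal sketch of the ones'-complement sum in Python. The function name `internet_checksum` is made up for illustration, and a real TCP checksum also covers a pseudo-header and the rest of the segment. Those parts are identical before and after the rewrite, so comparing the sums of just the changed bytes is enough to show the idea.

```python
def internet_checksum(data: bytes) -> int:
    """Ones'-complement checksum over 16-bit big-endian words (illustrative helper)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"Connection: close"
scrambled = b"Cneonction: close"   # same bytes, shuffled within even/odd positions
renamed = b"X-Ignore-X: close"     # same length, but different bytes

same_sum = internet_checksum(original) == internet_checksum(scrambled)
renamed_differs = internet_checksum(original) != internet_checksum(renamed)
print(same_sum, renamed_differs)
```

The scramble only moves bytes between positions of the same parity, so the sum comes out the same no matter how the header happens to line up with the 16-bit word boundaries. A same-length rename such as X-Ignore-X changes the bytes themselves, so the checksum would have to be patched.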


Ah yes, forgot about the checksum field :) thx


I meant:

  Connection: close

replaced with

  X-Ignore-X: close


X-No: I will not give you a job for poking at my headers. But nice try.


The simplest explanation for "OCR is watching you" is an old web attack called "HTTP Response Splitting"; it happens when a server generates headers (like a Referer) based on user input, but doesn't escape out newlines.
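
A rough sketch of how response splitting works, assuming a server that copies a user-supplied value straight into a header. The function names, the Location header, and the attack string are all hypothetical; this is not taken from any of the sites in the article.

```python
def build_response_unsafe(location: str) -> bytes:
    """Build a redirect by pasting user input into a header, line breaks and all."""
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    ).encode("latin-1")


def sanitize_header_value(value: str) -> str:
    """Reject values containing CR or LF so they cannot start a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("header value contains a line break")
    return value


def build_response_safe(location: str) -> bytes:
    return build_response_unsafe(sanitize_header_value(location))


# Hypothetical attacker input: the CRLF ends the Location header early
# and smuggles in a header line of the attacker's choosing.
attack = "/home\r\nX-Injected: OCR is watching you"
print(build_response_unsafe(attack).decode("latin-1"))
```

Calling `build_response_safe(attack)` raises `ValueError` instead of emitting the extra header line.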


Certainly easier to run this in a terminal. The Firefox way (a bit more tedious, but nice if you're already on the website):

Open Firebug, enable net monitoring, look at the very first GET request, check out the headers, scan for sense of humor.

---

stick this in your controller:

  headers["We-Are-Uh-Meh-Zing"] = "true on sunny days"


I have a web scraping script written in Python that I want to make multi-threaded. It scrapes web pages from a list, and enters results into a DB. Can someone (the author maybe) show me a simple example of how to make a multi-threaded Python script?


Check out the docs on the "multiprocessing" or "threading" module. In particular multiprocessing.Pool is handy for controlling the number of parallel things you have going on at once.

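  # Assume we have functions GetUrls() that retrieves a list
  # of the urls we want to get, and Download(url) which
  # downloads the content of a url and sticks it in the
  # database.

  import multiprocessing

  if __name__ == "__main__":
      # The guard stops worker processes from re-running this block on import.
      pool = multiprocessing.Pool(processes=100)
      urls = GetUrls()
      pool.map(Download, urls)
      pool.close()  # no more work will be submitted
      pool.join()   # wait for the workers to finish

See: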

http://docs.python.org/library/multiprocessing.html

http://docs.python.org/library/threading.html

Also, if you're entering results into a database, the easiest way may be just to spawn multiple python processes from the command line.
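
Since the question asked for threads specifically, here is a minimal sketch using the "threading" module with a shared work queue. `scrape_all` and its `fetch` argument are hypothetical names; `fetch` stands in for whatever downloads one page (and writes it to the DB), and the sketch does not handle exceptions raised inside it.

```python
import queue
import threading


def scrape_all(urls, fetch, num_threads=4):
    """Run fetch(url) for every url on a few worker threads; return {url: result}."""
    tasks = queue.Queue()
    for url in urls:
        tasks.put(url)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                url = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            value = fetch(url)
            with lock:  # guard the shared dict
                results[url] = value

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `scrape_all(urls, Download, num_threads=10)` would run the Download function assumed above on ten threads. Threads suit this kind of job because the workers spend most of their time waiting on the network.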


Go to python.org and read the documentation on threading and concurrency. The mailing list under community is very good, but they expect you to have read the docs and googled first.


That's really a job for google.


If you're using curl to poke around headers, as the author suggests, use the -I flag instead of the -i flag. -I gives you the headers only.


That works; however, some sites return different headers depending on whether they get a HEAD or a GET request. -I sends a HEAD request, while -i sends a GET request.
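
One way to check whether a given site does this is to send both request types and compare the results. Below is a sketch using Python's standard `http.client`; the helper names `fetch_headers` and `header_differences` are invented for this example, and it assumes plain HTTP on the given port.

```python
import http.client


def fetch_headers(host, method, path="/", port=80, timeout=10):
    """Send one request and return the response headers as (name, value) pairs."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        response.read()  # drain the body (empty for HEAD)
        return response.getheaders()
    finally:
        conn.close()


def header_differences(host, path="/", port=80):
    """Return {name: (head_value, get_value)} for headers that differ."""
    head = dict(fetch_headers(host, "HEAD", path, port))
    get = dict(fetch_headers(host, "GET", path, port))
    names = sorted(set(head) | set(get))
    return {n: (head.get(n), get.get(n)) for n in names if head.get(n) != get.get(n)}
```

Something like `header_differences("www.example.com")` (a placeholder host) lists the headers that differ. Expect some noise: a header like Date can differ simply because the two requests were sent at different times.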


Speaking of P2P technologies, I was interested to run across a KaZaA server:

  HTTP/1.0 404 Not Found
  X-Kazaa-Username: anonymous_user
  X-Kazaa-Network: KaZaA
  X-Kazaa-IP: xx.xx.xx.xx:1348
  X-Kazaa-SupernodeIP: xx.xx.xx.x:3699

It looked like it was running on someone’s DVR. Anyone have any pointers as to what software does that?

Uh, Kazaa does that. Not a DVR.


IIRC, it was full of recently-recorded TV shows. I hypothesized someone was running some DVR software that also shared the shows on the Kazaa network.


I see you watch your referrer log. Created an account here just to reply? Welcome to the site.


Yeah, he said that. He hypothesized that KaZaA was running on someone's publicly accessible DVR box. Some people use Linux boxes as DVRs.



