Re: [blogite] Pingback misc


From: Simon Willison (
Date: Sat Sep 07 2002 - 17:03:12 BST

At 16:38 07/09/2002 +0000, Jim Dabell wrote:
> > My biggest objection to the HTTP header method is this: A very, very
> > large proportion of the sites linked to will not have PingBack of any
> > kind. This means that for most sites you are linking to you will have to
> > perform a HEAD request, see that they don't have a PingBack header, then
> > perform a second GET request and see that they don't have a <link>
> > element either. That's two requests, which is actually a greater overhead
> > than just sending a GET!
>Surely the overhead for HEAD in this context is negligible?

The actual HEAD request may be a very small amount of data, but what
concerns me is the overhead of opening another connection to the server.
Thinking about it, though, you could do a normal GET request but cut it
short the moment you hit an X-PINGBACK header, so it is definitely worth
considering. In fact, I see no reason not to include "You may optionally
send an HTTP x-pingback header, but the <link> element is required" in the

> > Also remember that you can read data from a socket line by line (or
> > character by character if needs be). This means that you can close the
> > socket connection from the GET request the moment you receive a <link
> > rel="pingback" element OR you hit the </head> tag (as the link element
> > will not appear past that point). This means the overhead isn't actually
> > that bad.
>Think about all the screwed up html without </head> or <body>, coupled with
>thousands of keywords etc. Sure, you can drop the connection after 32k or
>whatever, but that's a big difference to a few lines of HTTP headers. You
>could also require that the <link> is the first element of <head>, but I
>think adding parsing requirements on top of standard html is a mistake.

At the end of the day how this is handled is up to the people implementing
their own clients. My client will work like this, but it should not be a
requirement that /all/ clients work like this:

1. Send the GET request
2. Read the headers line-by-line - if an X-PingBack header (or whatever we
decide to call it) is found then stop receiving from the socket
3. Start reading the HTML
4. If a <link rel="pingback"> element is found, stop receiving.
5. If a </head> tag is found, stop receiving
6. If a <body> tag is found, stop receiving
7. If we've received 5 KB with no sign of a <link> tag, stop receiving

That way I receive a maximum of 5 KB and I can be 99% certain I will spot
the PingBack information, if it exists. Again, this is how I plan to
implement my client but it is not the required (or even necessarily the
recommended) way of doing things.
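
The numbered steps above can be sketched roughly as follows. This is only
an illustration in Python, not the actual client: the function name
discover_pingback, the regular expressions and the MAX_BODY_BYTES constant
are all made up for the example, and the header is assumed to be called
X-Pingback. The function takes any iterable of byte chunks (for instance
the results of repeated sock.recv() calls), so the caller can close the
socket as soon as it returns.

```python
import re
from typing import Iterable, Optional

MAX_BODY_BYTES = 5 * 1024  # step 7: give up after 5 KB of HTML (assumed limit)

# Hypothetical patterns; real-world HTML may need something more forgiving.
LINK_RE = re.compile(rb'<link\b[^>]*\brel=["\']?pingback["\']?[^>]*>', re.I)
HREF_RE = re.compile(rb'\bhref=["\']([^"\']+)["\']', re.I)
STOP_RE = re.compile(rb'</head\s*>|<body\b', re.I)  # steps 5 and 6


def discover_pingback(chunks: Iterable[bytes]) -> Optional[str]:
    """Scan a raw HTTP response chunk by chunk; return the pingback URL or None."""
    buf = b""
    body_start = None
    for chunk in chunks:
        buf += chunk
        if body_start is None:
            end = buf.find(b"\r\n\r\n")
            lines = (buf if end < 0 else buf[:end]).split(b"\r\n")
            if end < 0:
                lines = lines[:-1]  # last line may still be incomplete
            # Step 2: look for the header, skipping the status line.
            for line in lines[1:]:
                name, sep, value = line.partition(b":")
                if sep and name.strip().lower() == b"x-pingback":
                    return value.strip().decode("ascii", "replace")
            if end < 0:
                continue  # headers not finished yet, keep reading
            body_start = end + 4
        # Steps 3-7: scan the HTML received so far.
        body = buf[body_start:]
        stop = STOP_RE.search(body)
        link = LINK_RE.search(body[:MAX_BODY_BYTES])
        if link and (stop is None or link.start() < stop.start()):
            href = HREF_RE.search(link.group(0))
            return href.group(1).decode("ascii", "replace") if href else None
        if stop or len(body) >= MAX_BODY_BYTES:
            return None
    return None
```

Returning from the function is the "stop receiving" part: the caller simply
stops pulling chunks and closes the connection.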

> > This doesn't even have to be done via XML-RPC (although that should be
> > the favoured method).
>The "correct" approach imho would seem to be <link>ing (or HEADing) to a
>description of the available pingback interfaces instead of directly to the
>service. You could list several:
> <interface priority="1" type="xml-rpc">
> </interface>
> <interface priority="1" type="http-get">
> </interface>
> <interface priority="2" type="email">
> </interface>
> <interface priority="10" type="manual-web-form">
> </interface>
> <interface priority="10" type="manual-email">
> </interface>

This has been touched on before, and I think it worth some serious
consideration. Pointing to an XML file describing a site's PingBack support
is definitely a more "logical" use of the <link> tag than pointing to the
XML-RPC server directly (especially considering the XML-RPC server can't be
used by a normal browser anyway). It also means that sites can have all of
their PingBack information in one file - this allows for PingBack clients
to cache the file and also lets site authors update the PingBack
information for all of the pages on their (possibly static) site at once -
like having a central style sheet.

There's a bit more overhead involved in grabbing the XML file but as it can
be cached anyway I don't see this as a huge problem.
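
As a rough illustration of how a client might use such a file, here is a
small Python sketch. Nothing in it is agreed: the <pingback> root element,
the idea that each <interface> element's text holds the endpoint, the rule
that a lower priority number is preferred, and the names choose_interface
and cached_description are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SUPPORTED = {"xml-rpc", "http-get"}  # interface types this client can speak

_cache = {}  # description URL -> XML text


def cached_description(url, fetch):
    """Fetch the description at most once per URL; fetch is any callable url -> str."""
    if url not in _cache:
        _cache[url] = fetch(url)
    return _cache[url]


def choose_interface(xml_text, supported=SUPPORTED):
    """Return (type, target) for the preferred supported interface, or None."""
    root = ET.fromstring(xml_text)
    candidates = []
    for el in root.iter("interface"):
        kind = el.get("type")
        if kind in supported:
            priority = int(el.get("priority", "0"))
            candidates.append((priority, kind, (el.text or "").strip()))
    if not candidates:
        return None
    # Assumption: lower number wins; ties keep document order.
    _, kind, target = min(candidates, key=lambda c: c[0])
    return kind, target
```

Because the description is cached per URL, a client linking to many pages
on the same site would only pay for the extra request once.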

In fact, the more I think about it, the more I agree that auto discovery is
a nicer approach than any kind of central server. Auto discovery gets rid of
the need for those URL patterns I suggested earlier (which were a bit
strange). It's also a nice simple concept - an HTML page containing data
about who you should inform if you link to the page.



Web Developer,

Message sent over the Blogite mailing list.


This archive was generated by hypermail 2.1.5 : Sat Sep 07 2002 - 18:05:00 BST