Re: [blogite] Pingback misc

From: Aquarion (nicholas@aquarionics.com)
Date: Sat Sep 07 2002 - 18:32:34 BST


On Sat, Sep 07, 2002 at 04:38:04PM +0000, Jim Dabell wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On Friday 06 September 2002 6:45 pm, Simon Willison wrote:
> > At 17:48 06/09/2002 +0000, Jim Dabell wrote:
> [snip]
> > >There is no reason to not support both - if you can't implement the http
> > >header method for some reason, you can include it in a <link> element -
> > > if a pingback implementation doesn't find the http header, it can
> > > always try the other method.
> >
> > My biggest objection to the HTTP header method is this: A very, very
> > large proportion of the sites linked to will not have PingBack of any
> > kind. This means that for most sites you are linking to you will have to
> > perform a HEAD request, see that they don't have a PingBack header, then
> > perform a second GET request and see that they don't have a <link>
> > element either. That's two requests, which is actually a greater overhead
> > than just sending a GET!
>
> Surely the overhead for HEAD in this context is negligible?

Our greatest overhead, since we are on webservers with presumably decent
connections, is looking up the domain. Doing that twice is silly.

As for the static argument, MT - for example - turns archives into
static HTML files so it doesn't have to work at /all/ to render them,
which is a monumentally cool idea. Putting things in the headers of
these is difficult without playing with Apache directives.

>
> > Also remember that you can read data from a socket line by line (or
> > character by character if needs be). This means that you can close the
> > socket connection from the GET request the moment you receive a <link
> > rel="pingback" element OR you hit the </head> tag (as the link element
> > will not appear past that point). This means the overhead isn't actually
> > that bad.
>
> Think about all the screwed up html without </head> or <body>, coupled with
> thousands of keywords etc. Sure, you can drop the connection after 32k or
> whatever, but that's a big difference to a few lines of HTTP headers. You
> could also require that the <link> is the first element of <head>, but I
> think adding parsing requirements on top of standard html is a mistake.

Er, no.

Epistula's implementation of pingback (In the pinging of servers) does
this:

Open socket, or die screaming.
Start reading by line. If we have a <link>, record it, and drop out of
loop.
Otherwise, read next line and try again until:
        a) We have a <link>
           (/\<link rel\=\"pingback\" href\=\"http:\/\/(.*)\"(.*)(\/?)>/i)
        b) We have a <body> (Not yet implemented, but planned)
        c) We have an EOF
        d) We have read 50 lines.
Close connection, Do whatever I want to with loop.
(PHP Code at http://www.aquarionics.com/src/admin/addentry.phps,
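
For illustration only, the loop described above could be sketched roughly
like this in Python. This is not the Epistula PHP code: the function name
find_pingback_url, the MAX_LINES constant and the tightened regex are all
assumptions made for the sketch. Unlike the pattern quoted above, this one
captures the whole URL including "http://", and it already includes the
planned <body> check.

```python
import io
import re

# Hypothetical pattern, modelled on the one quoted in the message but with
# non-greedy character classes so one long line cannot over-match.
PINGBACK_LINK = re.compile(
    r'<link rel="pingback" href="(http://[^"]*)"[^>]*>', re.IGNORECASE
)
BODY_TAG = re.compile(r"<body\b", re.IGNORECASE)
MAX_LINES = 50  # the line limit from step (d)


def find_pingback_url(stream, max_lines=MAX_LINES):
    """Scan a line-oriented text stream and return the pingback URL, or None."""
    for line_number, line in enumerate(stream, start=1):
        match = PINGBACK_LINK.search(line)
        if match:
            return match.group(1)  # (a) found the <link>, stop reading
        if BODY_TAG.search(line):
            return None            # (b) reached <body>, no <link> can follow
        if line_number >= max_lines:
            return None            # (d) gave up after max_lines lines
    return None                    # (c) hit EOF without finding anything


# Usage with an in-memory page standing in for the socket:
page = io.StringIO(
    "<html>\n<head>\n"
    '<link rel="pingback" href="http://example.com/xmlrpc" />\n'
    "</head>\n<body>\n"
)
print(find_pingback_url(page))
```

Any object that yields text lines would do as the stream, so a file object
wrapped around the socket should work the same way; the caller closes the
connection as soon as the function returns.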

Message sent over the Blogite mailing list.
Archives: http://www.aquarionics.com/misc/archives/blogite/
Instructions: http://www.aquarionics.com/misc/blogite/


This archive was generated by hypermail 2.1.5 : Sun Sep 08 2002 - 19:05:00 BST