Re: [blogite] Pingback misc


From: Simon Willison (simon@incutio.com)
Date: Fri Sep 06 2002 - 19:45:40 BST


At 17:48 06/09/2002 +0000, Jim Dabell wrote:
>Hi, hope this is a public mailing list, and it's okay to post and not just
>lurk - I came across this list by accident, so I'm not sure :)
>
>I'm currently writing my own blog engine (to grok it all, not for any
>particular need) and am looking for ideas for features etc.
>
>Going back to a comment on Simon's blog, I think that it would be a good
>idea to move the pingback pointer to http headers rather than in a link
>element - it's got a few advantages:
>
>...<snip>...
>- offers a fallback
>
>There is no reason to not support both - if you can't implement the http
>header method for some reason, you can include it in a <link> element - if
>a pingback implementation doesn't find the http header, it can always try
>the other method.

My biggest objection to the HTTP header method is this: A very, very large
proportion of the sites linked to will not have PingBack of any kind. This
means that for most sites you are linking to you will have to perform a
HEAD request, see that they don't have a PingBack header, then perform a
second GET request and see that they don't have a <link> element either.
That's two requests, which is actually a greater overhead than just sending
a GET!

Also remember that you can read data from a socket line by line (or
character by character if need be). This means that you can close the
socket connection from the GET request the moment you receive a <link
rel="pingback"> element OR you hit the </head> tag (as the link element
will not appear past that point). So the overhead isn't actually that bad.
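
A minimal sketch of that early exit, in Python. The function name and the
regular expressions are hypothetical, chosen only for illustration, and the
sketch assumes the <link> element sits on a single line. It takes any
iterable of text lines, so it could be fed a file object wrapped around the
socket (for example via socket.makefile()).

```python
import re

# Illustrative patterns only; a real implementation would want a proper
# HTML parser and would have to cope with elements split across lines.
LINK_RE = re.compile(r'<link\b[^>]*\brel=["\']pingback["\'][^>]*>', re.IGNORECASE)
HREF_RE = re.compile(r'\bhref=["\']([^"\']+)["\']', re.IGNORECASE)
HEAD_END_RE = re.compile(r'</head\s*>', re.IGNORECASE)


def find_pingback_link(lines):
    """Scan lines until a pingback <link> or </head> is seen.

    Returns (server_url_or_None, number_of_lines_read).
    """
    read = 0
    for line in lines:
        read += 1
        head_end = HEAD_END_RE.search(line)
        link = LINK_RE.search(line)
        # Only accept a link that appears before </head> on this line.
        if link and (head_end is None or link.start() < head_end.start()):
            href = HREF_RE.search(link.group(0))
            if href:
                return href.group(1), read
        if head_end:
            break  # nothing useful past this point, stop reading
    return None, read
```

The second value in the result shows how little of the page was read before
the caller could close the connection.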

>- pingback checks to make sure the permalink url is actually in the page
>
>It seems to me that this should be an optional part of the spec anyway -
>some implementations won't want the overhead.

While I agree with Stuart that it is a very, very good idea for a PingBack
implementation to check for a link on the page, I agree with you that this
should be optional. I think the PingBack specification should pretty much
boil down to what amounts to a single line of text:

Blog A tells blog B: "I have linked to your page X from my page Y"

This doesn't even have to be done via XML-RPC (although that should be the
favoured method). Someone mentioned on my blog the other day that they
implemented TrackBack with ease because it allows them to send a ping using
just an HTTP GET request (sending the variables in a query string). Since
the information needed for a PingBack is pageLinkedFrom and pageLinkedTo
there is no reason a pingback enabled blog couldn't accept data from a
query string as well as from an XML-RPC request. In fact, you could even
have a form on your web site with two fields and the text "please use this
form to tell my blog you have linked to it".
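
As a rough illustration of the query string variant, here is a Python
sketch. The parameter names pageLinkedFrom and pageLinkedTo are taken from
the paragraph above; the function names are hypothetical and no particular
server framework is assumed.

```python
from urllib.parse import parse_qs, urlencode


def build_ping_query(page_linked_from, page_linked_to):
    """Build the query string a pinging blog would append to a GET request."""
    return urlencode({
        "pageLinkedFrom": page_linked_from,
        "pageLinkedTo": page_linked_to,
    })


def parse_ping(query_string):
    """Extract (pageLinkedFrom, pageLinkedTo) from a query string.

    Returns None if either value is missing or empty.
    """
    params = parse_qs(query_string)  # blank values are dropped by default
    try:
        return params["pageLinkedFrom"][0], params["pageLinkedTo"][0]
    except KeyError:
        return None
```

The same pair of URLs could then be handed to whatever code already handles
pings arriving over XML-RPC, so the two entry points share one code path.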

Getting back to the overhead problem, I have a working prototype of a
centralised system that could greatly reduce overhead by helping blogs
maintain an internal list of which URLs are PingBack-able (and which server
they should ping). This eliminates the need for a blogging client to
retrieve a page and look for a <link> element entirely, although I think
the <link> element should stay a part of the spec as it offers an
alternative decentralised method of keeping track of which blogs are
pingbackable.
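
To make the idea of an internal list concrete, here is a small Python
sketch. It is not the prototype mentioned above; the class and method names
are hypothetical, and it shows only the local side: a lookup table from page
URL to ping server, with the <link> scan as an optional fallback.

```python
class PingbackRegistry:
    """Local list of which URLs are PingBack-able and which server to ping."""

    def __init__(self):
        self._servers = {}

    def record(self, page_url, server_url):
        """Remember that page_url accepts pings at server_url."""
        self._servers[page_url] = server_url

    def lookup(self, page_url, discover=None):
        """Return the ping server for page_url, or None if unknown.

        discover is an optional callable (for example one that fetches the
        page and looks for a <link> element); it is only called on a miss.
        """
        if page_url in self._servers:
            return self._servers[page_url]
        if discover is None:
            return None
        server = discover(page_url)
        if server is not None:
            self._servers[page_url] = server
        return server
```

Entries could be filled in from a central source or from earlier <link>
lookups; either way a known URL costs no extra request.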

>What are the semantics for changing pages? Suppose you get a pingback, and
>the referring page deletes the reference to you? Or the site goes offline,
>never to come back?

I don't think there's much we can do about this, other than remind
implementors that they are free to have their implementation "check up" on
their pingbacks once a month (or whatever) to check the links still exist.
PingBack has a certain element of trust built in to it in that we hope
people will only send PingBacks from URLs that are likely to stay permanent.
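
A periodic "check up" could be as simple as the following Python sketch. It
assumes the referring page has already been fetched and is passed in as a
string, and it compares href values by exact match, which is a
simplification (relative URLs and trailing slashes are not normalised). The
names are hypothetical.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.add(value)


def still_links_to(source_html, target_url):
    """Return True if source_html contains an <a href> equal to target_url."""
    collector = _LinkCollector()
    collector.feed(source_html)
    collector.close()
    return target_url in collector.hrefs
```

An implementation could run this over its stored pingbacks on whatever
schedule it likes and drop or flag the ones that no longer pass.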

Oh, and welcome to the list :)

Simon

-- 
Web Developer, www.incutio.com
Weblog: http://www.bath.ac.uk/~cs1spw/blog/
Message sent over the Blogite mailing list.
Archives:     http://www.aquarionics.com/misc/archives/blogite/
Instructions: http://www.aquarionics.com/misc/blogite/


This archive was generated by hypermail 2.1.5 : Sat Sep 07 2002 - 17:05:01 BST