[blogite] A decentralised "my blog has been updated" service

From: Simon Willison (simon@incutio.com)
Date: Sat Sep 07 2002 - 12:06:45 BST


We've been talking about this at work and as it poses an interesting
challenge I thought I'd share it with the list. This is all completely
hypothetical stuff, but it's a fun problem nonetheless.

Most (probably all) of us send a ping to weblogs.com when our blog is
updated. These pings are then propagated once an hour to sites like blo.gs,
with the end result being that some of our sites can display lists of blogs
on our blogrolls in order of their last update. There are other services
linked in to weblogs.com that can alert people when their favourite blogs
are updated, and all in all it's a pretty important service.

The current system has two disadvantages:

1. It has a central point of failure (weblogs.com).

2. It can have up to a one hour delay.

The challenge then is to come up with a decentralised system to allow blogs
to alert people (and bots and other sites / blogs) when the blog is updated.

I've been thinking about a system where blogs can "subscribe" to other
blogs. Subscribing involves sending a request to the blog in question
(almost certainly using XML-RPC) saying "please message my XML-RPC server
at URL when you update". Blogs have to keep track of who has subscribed
and, when they are updated, send out a bunch of update messages.
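
To make that concrete, here is a minimal sketch in Python of the
subscribe / notify bookkeeping. Everything in it is hypothetical: the
Blog class, the method names and the example URLs are made up for
illustration, and the send callable stands in for whatever XML-RPC call
would really go over the wire.

```python
from typing import Callable, List


class Blog:
    """Hypothetical in-memory model of a blog that accepts subscriptions."""

    def __init__(self, url: str, send: Callable[[str, str], None]) -> None:
        self.url = url
        self.subscribers: List[str] = []
        # send(target_url, updated_blog_url) is a stand-in for the real
        # XML-RPC request; it is injected so the sketch runs offline.
        self._send = send

    def subscribe(self, callback_url: str) -> bool:
        """Record 'please message my XML-RPC server at URL when you update'."""
        if callback_url not in self.subscribers:
            self.subscribers.append(callback_url)
        return True

    def publish_update(self) -> int:
        """Send one update message per subscriber; return how many were sent."""
        for target in self.subscribers:
            self._send(target, self.url)
        return len(self.subscribers)


sent = []
blog = Blog("http://example.org/blog/",
            lambda target, source: sent.append((target, source)))
blog.subscribe("http://a.example/RPC2")
blog.subscribe("http://b.example/RPC2")
blog.subscribe("http://a.example/RPC2")  # duplicate, ignored
delivered = blog.publish_update()
```

Note that publish_update does one send per subscriber, which is exactly
the cost that becomes a problem for a popular blog.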

The problem here comes when you have a really popular blog. How many people
would subscribe to Scripting News for example? It could be hundreds, which
is far too many for an individual blog to ping on its own.

Here's a solution: Say a blog has to ping 20 other blogs. Instead of
pinging all 20, it pings the first blog on the list (or a randomly chosen
blog from the list) and includes with its message a list of the other 19
blogs that need pinging. The originating blog can then consider its duties
complete. The blog that received the ping now has to ping the next blog in the list and
send it the remaining list, and so on until the list has run out. If a ping
fails for some reason the blog that made the ping should remove that blog
from the list and forward the remainder of the list to the next listed blog.
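
Here is a rough sketch of that forwarding rule in Python, assuming a
deliver callable that returns False when a ping fails. The function
names and the little simulation are my own invention for illustration;
a real version would make an XML-RPC call where deliver is.

```python
from typing import Callable, List, Optional, Set

Deliver = Callable[[str, str, List[str]], bool]


def forward_ping(source: str, remaining: List[str],
                 deliver: Deliver) -> Optional[str]:
    """Ping the first reachable blog in `remaining`, handing it the rest.

    Returns the blog that accepted the ping, or None if every ping failed
    or the list was empty.
    """
    pending = list(remaining)
    while pending:
        target, rest = pending[0], pending[1:]
        if deliver(target, source, rest):
            return target
        pending = rest  # ping failed: drop that blog and try the next one
    return None


def simulate(source: str, subscribers: List[str], dead: Set[str]) -> List[str]:
    """Run the chain in memory; blogs in `dead` fail every ping."""
    received: List[str] = []

    def deliver(target: str, src: str, rest: List[str]) -> bool:
        if target in dead:
            return False
        received.append(target)
        forward_ping(src, rest, deliver)  # the receiver passes the list on
        return True

    forward_ping(source, subscribers, deliver)
    return received


order = simulate("blog-0", ["a", "b", "c", "d"], dead={"b"})
```

In this run "b" is unreachable, so "a" skips it and hands the remainder
of the list to "c".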

There are a few problems with this approach, and unfortunately some of them
are show stoppers. Firstly, a list of 100 blogs to ping is a pretty big
list to send in one go. A solution would be to limit the number of blogs in
a list to 20 - if you need to ping more than 20 you split the list up into
blocks of 20 or fewer and send each one off separately.
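
The splitting step is simple enough to sketch directly. The function
name and the example URLs below are placeholders, and 20 is just the
limit suggested above.

```python
from typing import List


def split_into_blocks(urls: List[str], block_size: int = 20) -> List[List[str]]:
    """Split a subscriber list into consecutive blocks of at most block_size."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    return [urls[i:i + block_size] for i in range(0, len(urls), block_size)]


subscribers = [f"http://blog{i}.example/RPC2" for i in range(45)]
blocks = split_into_blocks(subscribers)
sizes = [len(block) for block in blocks]
```

Each block would then be sent off as its own list, so the originating
blog makes one ping per block rather than one per subscriber.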

Secondly, how do you know you can trust the blogs on your list? What if one
of them fails to pass on the list (maliciously or for technical reasons) -
should that happen none of the remaining blogs on the list would receive
your ping. One idea we had to counter this was to send a reversed copy of
the list to the last blog on it - that way the two lists would "meet in the
middle" and the blog that received both could stop sending the second copy
on. This would require lists to have a unique identifier (probably based on
the source blog and timestamp) and would also require blogs in the middle
to keep track of lists they had recently passed on. It wouldn't completely
solve the trust problem either - two untrustworthy nodes could still kill
off the whole message.
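
A small simulation of the "meet in the middle" idea, under some loud
assumptions: the Node class, the ID format and the drops flag are all
hypothetical, and the two copies are sent one after the other here
rather than travelling at the same time as they would on a real network.

```python
from typing import Dict, List, Set


def make_list_id(source: str, timestamp: str) -> str:
    """Hypothetical unique identifier built from source blog and timestamp."""
    return f"{source}#{timestamp}"


class Node:
    def __init__(self, name: str, drops: bool = False) -> None:
        self.name = name
        self.drops = drops          # True: accepts a ping but never forwards it
        self.seen: Set[str] = set() # IDs of lists recently handled
        self.notified = 0

    def receive(self, list_id: str, remaining: List[str],
                network: Dict[str, "Node"]) -> None:
        if list_id in self.seen:
            return  # the two copies have met here, so stop forwarding
        self.seen.add(list_id)
        self.notified += 1
        if self.drops or not remaining:
            return
        network[remaining[0]].receive(list_id, remaining[1:], network)


def send_both_ways(source: str, timestamp: str, names: List[str],
                   network: Dict[str, Node]) -> None:
    """Send the list to the first blog and a reversed copy to the last."""
    list_id = make_list_id(source, timestamp)
    network[names[0]].receive(list_id, names[1:], network)
    backwards = names[::-1]
    network[backwards[0]].receive(list_id, backwards[1:], network)


names = list("abcde")

one_bad = {n: Node(n, drops=(n == "c")) for n in names}
send_both_ways("blog-0", "2002-09-07T12:06:45", names, one_bad)
one_bad_counts = [one_bad[n].notified for n in names]

two_bad = {n: Node(n, drops=(n in "bd")) for n in names}
send_both_ways("blog-0", "2002-09-07T12:06:45", names, two_bad)
two_bad_counts = [two_bad[n].notified for n in names]
```

With one bad node ("c") every blog still hears about the update exactly
once. With two bad nodes ("b" and "d") the blog between them, "c", never
hears anything, which is the remaining trust problem.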

Thirdly (and this is the show stopper) the system described above could be
used for distributed denial of service attacks! Say someone has a grudge
against kryogenix.org (sorry Stu :P) - they could send out a list to a huge
number of blogs with kryogenix.org's XML-RPC receiver listed as the /second/
item on the list. One list hop later and kryogenix is being battered with
hundreds of XML-RPC requests all at once. Not good.
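
The arithmetic is easy to check with a toy count. This sketch only
tallies who would be pinged after the first hop, assuming every blog
that receives a list dutifully pings the next entry; the function name
and the example.* URLs are placeholders.

```python
from collections import Counter
from typing import List


def first_hop_hits(lists: List[List[str]]) -> Counter:
    """Count pings per target after one hop.

    Each list is [first_receiver, next_entry, ...]. The first receiver
    forwards to next_entry, so that is the blog that gets hit.
    """
    hits: Counter = Counter()
    for entries in lists:
        if len(entries) >= 2:
            hits[entries[1]] += 1
    return hits


victim = "http://victim.example/RPC2"
forged = [[f"http://blog{i}.example/RPC2", victim, "http://other.example/RPC2"]
          for i in range(300)]
hits = first_hop_hits(forged)
```

Three hundred forged lists turn into three hundred near-simultaneous
requests against one server, none of them sent by the attacker.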

That's as far as we got before we got stuck. Can anyone else think of a
reliable way of operating a peer2peer decentralised "blog A has been
updated" system without running into any of the problems listed above?

Cheers,

Simon

-- 
Web Developer, www.incutio.com
Weblog: http://www.bath.ac.uk/~cs1spw/blog/
Message sent over the Blogite mailing list.
Archives:     http://www.aquarionics.com/misc/archives/blogite/
Instructions: http://www.aquarionics.com/misc/blogite/

This archive was generated by hypermail 2.1.5 : Mon Sep 09 2002 - 05:05:00 BST