Re: [blogite] SmartReferer 0.1.2


From: Ian Hickson (ian@hixie.ch)
Date: Sun Dec 29 2002 - 04:41:43 GMT


On Sat, 28 Dec 2002, lenz wrote:
>
> 1. There should not really be a lot of 404's.
> A (single) 404 error is generated at most when a non-SR sender site links
> to a SR receiver site, no matter how many users traverse that link. Though
> not required, the best SR implementations will avoid repeatedly querying a
> site that is a known non-SR sender host.

Granted, the problem is not as serious as the favicon.ico or /P3P issue.
However, experience has shown that solutions that assume a fixed URI are
met with ridicule from the Web community. I strongly recommend avoiding
such schemes.

> 2. The autodiscovery process based on a single URL is very poor design.
> The delegation system implicit in SR ports makes it easy to delegate SR
> ports on any system. The SR base port need not describe any resource in
> the given site, but tells you where to find a correct SR port for the given
> resource.

Even once this specification is widely deployed, there will still be close
to zero chance that a site like geocities.com will be setting up an XML
file to redirect to all their subdirectories. While the specification is
in its infancy, there is no chance.

In my opinion, that is a problem that the specification must solve.

> I am thinking about some sort of standard GET parameter to append to an
> HTTP GET request, but I'm not sure yet.

What is wrong with the Pingback mechanism? (Fetch the resource URI, and
examine the result for either an HTTP header or a particular <link>
element.)
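
To make the two-step check concrete, here is a minimal sketch of that
discovery step in Python. It assumes the response headers and body have
already been fetched, and it uses the X-Pingback header and the
<link rel="pingback"> element that Pingback defines. The function name
discover_endpoint and the simplified pattern are my own for illustration;
a referrer-mapping scheme would presumably pick its own header and rel
value.

```python
import html
import re
from typing import Mapping, Optional

# Simplified pattern for illustration: it assumes rel comes before href
# and that both attributes are double-quoted.
LINK_RE = re.compile(
    r'<link\s+rel="pingback"\s+href="([^"]+)"\s*/?>',
    re.IGNORECASE,
)


def discover_endpoint(headers: Mapping[str, str], body: str) -> Optional[str]:
    """Return the advertised endpoint URI, or None if there is none.

    The HTTP header is checked first; the <link> element is the fallback
    for sites that cannot configure response headers.
    """
    for name, value in headers.items():
        # Header names are case-insensitive in HTTP.
        if name.lower() == "x-pingback":
            return value.strip()
    match = LINK_RE.search(body)
    if match:
        # Attribute values may contain entities such as &amp;
        return html.unescape(match.group(1))
    return None
```

Note that nothing here depends on a fixed path: the resource itself says
where its endpoint lives.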

> 3. Case-insensitive longest matching subsequence
> URIs are not case-sensitive on - at least - Win/IIS systems.

URIs are _by definition_ case sensitive. That a particular Web server
happens to consider all case variants of a URI as pointing to the same
resource does not change this.

> As of 0.1.2, all DOMAIN tags are thought of as being subdirectories:
> http://foo/bar and http://foo/bar/ are considered exactly alike and won't
> match http://foo/barcode.php

That is technically a violation of the URI specs, I believe. According to
the HTTP and HTTP URI specs,

   http://example.com/foo
   http://example.com/foo/

...are distinct URIs pointing to distinct resources. (The first typically
results in a 301 redirect to the second.)
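
To show the difference between the two views, here is a rough Python
sketch. same_uri is the strict comparison I am describing;
matches_as_directory is only my reading of the 0.1.2 rule quoted above,
not code from the SR draft, and both names are made up for this example.
Host comparison is left case-sensitive to keep the sketch short.

```python
from urllib.parse import urlsplit


def same_uri(a: str, b: str) -> bool:
    """Strict view: any character difference means a different URI."""
    return a == b


def matches_as_directory(entry: str, uri: str) -> bool:
    """Assumed 0.1.2 behaviour: treat `entry` as a directory prefix.

    A trailing slash on the entry is ignored, and the match must end on
    a path-segment boundary, so /bar does not match /barcode.php.
    """
    e, u = urlsplit(entry), urlsplit(uri)
    if (e.scheme, e.netloc) != (u.scheme, u.netloc):
        return False
    base = e.path.rstrip("/")
    return u.path == base or u.path.startswith(base + "/")
```

Under the strict comparison the two example.com URIs above are different;
under the directory rule an entry for one of them silently covers the
other, which is the behaviour I am questioning.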

> Let's see how SR handles the following case in a real-life situation. As
> you say, you can have a number of slightly different referers, like:
>
> http://example.org
> http://example.org/
> http://www.example.org/
> http://www.example.org/index.html
> http://example.org/?lastModified=2089420986
>
> I agree with you - the easiest and most logical way to obtain information
> would be to query the URI itself. But this means any URI of our website
> must be equipped with the ability to answer such question, and this means
> that such information is redundant. I don't want to change my existing
> PHP-Nuke setup, and don't want to edit anything. So IMHO this is not the
> way to go if you want to offer a low cost - simple conversion method.

I disagree -- adding Pingback support to all 40 of my domains and
subdomains took no more than a _single_ line in a _single_ configuration
file. You don't need to make each file know information about itself, you
only need to make each file be able to point to a referrer authentication
server or file.

> The second way to go would be to query a central repository for the SR
> port.

Avoid centralisation at all costs.

> [...] SR base port [...]

I feel I should point out again that these files are neither "smart",
"base", nor "ports" -- each time I read that term I feel like correcting
it. :-)

Smart implies some live code, which you explicitly want this system to
avoid, and a port is a term already used in an HTTP context to mean the
TCP port of the connection. The term "base" is also already overloaded,
especially in a URI context (it typically refers to the absolute URI from
which a relative URI should be resolved).

> PS. I guess I can reuse my own words from what I post here in SR's FAQ section;
> am I right or is there any limit on this? Thanks.

I am not a lawyer, and a lawyer would be the best person to advise you
here, but I believe that barring any license agreements or contracts to
the contrary, you own the copyright of any text you write, and can reuse
that text anywhere.

Ideas:

   To get around the multiple host problem: make the mapping file contain
   only the path part of the URI, optionally with that section being
   labelled as applying to a certain list of domains.

   To get around the fixed path problem: require the server to associate
   an HTTP header for referrer mapping to each file, giving the URI of a
   mapping file. (Also allow a link element for sites without header
   configuration access?)

   Make this file have a URI to which the two link endpoints (referrer
   and requested resource) must be connected.

   Make the mapping file a very simple format which only serves one
   purpose: mapping from a pair of URIs to a list of URIs.

   For the metadata stuff: leave this up to the RDF people. RDF embedded
   in the canonical URI's resource would seem the most obvious solution.
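
As a very rough illustration of how small such a mapping file could be,
here is a Python sketch. The line format is purely an assumption made up
for this example (referrer path, requested path, then one or more URIs,
separated by whitespace, with "#" for comments); nothing like it is
specified anywhere, and the function names are hypothetical too.

```python
from typing import Dict, List, Tuple

# (referrer path, requested path) -> list of URIs
MappingTable = Dict[Tuple[str, str], List[str]]


def parse_mapping(text: str) -> MappingTable:
    """Parse the hypothetical line-based mapping format."""
    table: MappingTable = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # a pair with no URIs maps to nothing; ignore it
        referrer, requested, *targets = fields
        table.setdefault((referrer, requested), []).extend(targets)
    return table


def lookup(table: MappingTable, referrer: str, requested: str) -> List[str]:
    """Return the URIs mapped to this pair, or an empty list."""
    return table.get((referrer, requested), [])
```

The whole thing is a static file plus a dictionary lookup, so no live
code is needed on the server.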

I can't help but feel that this problem should be solved at a lower level
though. Something like a new HTTP method seems more appropriate (although
granted that would require a lot more work to implement).

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
"meow"                                          /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
Message sent over the Blogite mailing list.
Archives:     http://www.aquarionics.com/misc/archives/blogite/
Instructions: http://www.aquarionics.com/misc/blogite/


This archive was generated by hypermail 2.1.5 : Thu Jan 16 2003 - 16:05:01 GMT