Arbitrary definition of class names by user agents

Steven Simpson - 26 Oct 2008 13:42 GMT
Stefan Ram wrote (in "More than one language in a page"):
> In this case, one might even use Google's new attribute value:
>
> <p lang="en">The word
> <q><span lang="fr" class="notranslate">chef</span></q>
> is of French origin.</p>
>
>   See
>
> http://googlewebmastercentral.blogspot.com/2008/10/helping-you-break-language-barrier.htm

Is this a new trend of user-agent writers (Microformats, and now Google)
staking claims on the @class namespace?  I'm surely not the only one
disturbed by this.  Somehow, an author publishing on the web, with no
control over which user agents will access his page, has to avoid
clashes with the union of all names deemed special by all those user
agents, now and in the future?

I suppose the proponents justify this practice by a line in the HTML
spec (HTML4.01 §7.5.2), that class names are also for "general purpose
processing by user agents" as well as stylesheet selectors.  It doesn't
go into any further detail, but I don't think it was the intention that
applications which the author has no control over (e.g. once a page is
published) should define class names willy-nilly.  More likely, the
author would have opted in to some scheme, such as a company's internal
robot to do some advanced indexing on all its own pages.

Here are some ideas for external interpretation, i.e. by some 'third
party' such as Google:

   * Opt in to a third party's scheme.  Register one's URIs with Google,
     so they know that 'notranslate' means what they think on those
     pages.  I don't fancy doing that with a lot of third parties, though.
   * Third parties register class names with an authority (e.g. W3C).
     But still, authors have to watch out for future uses of names.
     And third parties shouldn't have to register with W3C when they've
     already registered (for example) DNS names.
   * Define a sub-namespace not used by CSS to form DNS-like names,
     e.g. ':com:google:notranslate'.  Okay, but potentially verbose if
     used a lot.  And it doesn't generally sidestep non-CSS mechanisms
     of defining class names.
   * Use head/@profile with a URI owned by the third party.  This is
     what Microformats seem to be doing, but I don't think it is
     adequate.  Independent microformats used in the same page still
     have to avoid clashing with each other, which means going back to
     some authority's third-party register.  Plus, the author doesn't
     have control over the class names - it's all or nothing for a
     particular format.
   * Extend CSS with properties not related to style.  There's nothing
     in the framework of CSS that limits it to just style (right?).  I
     favour this, and shall elaborate on it...

Google could define a CSS property which turns translation on or off,
and the author could associate any class he chooses (indeed, any CSS
selector) with that property:

.notranslate { /* Okay, so he chose the same one after all!  ;-) */
 -google-translation: disable;
}

Then, to avoid Google having to scan his stylesheets just to find this
rule, the author links it in with:

<link rel="stylesheet" media="translator" href="...">

Other user agents won't touch it, because they don't recognise
"translator".  Google won't touch other stylesheets because they're not
labelled with "translator".
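
The translator's side of this proposal could be sketched roughly as follows. This is only an illustration: the "translator" media value comes from the post above, while the class name, the function name, and the file names are hypothetical, and no real crawler is known to work this way. The sketch picks out only those linked stylesheets whose media list mentions "translator".

```python
from html.parser import HTMLParser


class TranslatorSheetFinder(HTMLParser):
    """Collect hrefs of <link rel="stylesheet"> elements labelled for 'translator'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        # rel is a space-separated list; media is a comma-separated list.
        rels = (a.get("rel") or "").lower().split()
        media = [m.strip().lower() for m in (a.get("media") or "").split(",")]
        if "stylesheet" in rels and "translator" in media and a.get("href"):
            self.hrefs.append(a["href"])


def translator_sheets(html):
    """Return the stylesheet URLs a hypothetical translator would fetch."""
    finder = TranslatorSheetFinder()
    finder.feed(html)
    finder.close()
    return finder.hrefs


# Hypothetical page: one ordinary stylesheet, one meant for the translator.
page = """
<head>
<link rel="stylesheet" media="screen, print" href="style.css">
<link rel="stylesheet" media="translator" href="translation.css">
</head>
"""
print(translator_sheets(page))  # ['translation.css']
```

The ordinary stylesheet is skipped without being fetched, which is the point of labelling the sheet by media.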

A few issues raised by this approach are:

   * It's not style/presentation, which is what CSS was designed for.
     But I think this is a superficial problem - just regard the name
     "CSS" and rel="stylesheet" as historical accidents, and CSS
     becomes an application of arbitrary properties, that happens to
     include ones related to style.
   * It's now invading the CSS-property and media-type namespaces.  But
     both of these could go the same way as XML namespaces and
     link/@rel schemas, if necessary.

To summarise: Rather than user agents stomping over the heretofore
author-defined namespace of class names, they should fit into it in the
same way that CSS properties do.  This would scale better, and would be
less intrusive on the author's ability to choose.

Jukka K. Korpela - 26 Oct 2008 14:55 GMT
> Is this a new trend of user-agent writers (Microformats, and now
> Google) staking claims on the @class namespace?

It surely is, and all the warnings seem to get ignored. The idea of
assigning fixed meanings to class names sounds _so_ cool and useful, and you
don't need anybody's permission or time-wasting discussions!

And it probably looks obvious that "notranslate" won't accidentally be used
for something else by someone else, so it looks safe to define it as you
like. It might be different with shorter and more vague class names like
"date" - does it refer to date notations, or dating, or something else? You
cannot possibly know what the string "date" might intuitively mean to
billions of people speaking hundreds of different languages. So by
declaring, say, "date" as predefined, you would assign arbitrary meanings to
an unknown number of constructs in documents, meanings that need not have
anything to do with the intentions of their authors.

In fact, "notranslate" is potentially very risky too. It is true that in any
existing document, it probably relates to someone's intentions of not having
something translated. But it might also mean that something _has not_ been
translated. Or it might mean 'do not translate (the content)' in a very
specific and limited technical meaning, _not_ a universal declaration that
the content should not be translated. For example, in some bilingual site
maintenance approach, it might be an instruction to human translators to
leave the content untranslated, since it shall be the same in both
languages - without meaning that it should be the same in _all_ languages.

The only sensible approach in using class attributes for purposes like
"notranslate" in the Google technique would have been to use a class name
that is syntactically malformed by existing specifications. That way, no
legitimate existing usage of the string as class attribute would have been
affected.

Even better, a new attribute (or element) should have been introduced.

Someone might say that from the viewpoint of generalized markup, a
processing instruction might have been the most adequate approach. But
generalized markup is water under the bridge, and we live with tag sets that
everyone can use as he likes and sees fit.

And on the realistic side, translation instructions should not really be
merged into markup. They are process-oriented, not data-oriented or
structure-oriented. You typically have words or phrases that should not be
translated, and would you really like to be forced to add
non-translatability markup into each and every occurrence in each document,
instead of having e.g. a site-wide glossary of terms that specifies them,
among other things?

Besides, the most common case for non-translatability that I can imagine
right now is English words and phrases in non-English text. For them, common
sense might say that it should suffice to declare their language as English.
When translating, say, some text from Dutch to French, you are normally not
supposed to translate any English words and phrases in them. If they are OK
in the original, they're usually the right choice in the translation as
well. So the only thing needed would be language markup.
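
A minimal sketch of that idea, using the markup quoted at the top of the thread: a translator that relies only on language markup would leave alone any text whose inherited lang differs from the source language. The class and function names are hypothetical, and the sketch assumes well-nested markup with explicit end tags and compares only the primary language subtag.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not touch the stack.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


def primary(lang):
    """Reduce a language code such as 'en-GB' to its primary subtag 'en'."""
    return lang.lower().split("-")[0]


class LangAwareText(HTMLParser):
    """Split text into (text, should_translate) pieces using inherited lang."""

    def __init__(self, source_lang):
        super().__init__()
        self.source = primary(source_lang)
        # Assumption: text outside any lang attribute is in the source language.
        self.lang_stack = [self.source]
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        lang = dict(attrs).get("lang")
        # An element without lang inherits the language of its parent.
        self.lang_stack.append(primary(lang) if lang else self.lang_stack[-1])

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.lang_stack) > 1:
            self.lang_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.pieces.append((text, self.lang_stack[-1] == self.source))


def split_for_translation(html, source_lang):
    parser = LangAwareText(source_lang)
    parser.feed(html)
    parser.close()
    return parser.pieces


markup = ('<p lang="en">The word <q><span lang="fr" class="notranslate">'
          'chef</span></q> is of French origin.</p>')
for text, translate in split_for_translation(markup, "en"):
    print(translate, text)
```

Here the word marked lang="fr" is left untranslated without the class attribute being consulted at all.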

> Google could define a CSS property which turns translation on or off,

That would be even more wrong than using "predefined" class names, since
translation issues are not presentational in the sense that CSS is supposed
to be.

>     * It's not style/presentation, which is what CSS was designed for.
>       But I think this is a superficial problem - just regard the name
>       "CSS" and rel="stylesheet" as historical accidents, and CSS
>       becomes an application of arbitrary properties, that happens to
>       include ones related to style.

Excuse me while I fall into despair.

> To summarise: Rather than user agents stomping over the heretofore
> author-defined namespace of class names, they should fit into it in
> the same way that CSS properties do.

I cannot recognize parody any more, sorry.

Signature

Yucca, http://www.cs.tut.fi/~jkorpela/

Ben Bacarisse - 26 Oct 2008 15:14 GMT
>> Is this a new trend of user-agent writers (Microformats, and now
>> Google) staking claims on the @class namespace?
<snip>
> In fact, "notranslate" is potentially very risky too. It is true that
> in any existing document, it probably relates to someone's intentions
[quoted text clipped - 6 lines]
> untranslated, since it shall be the same in both languages - without
> meaning that it should be the same in _all_ languages.

Agreed.  It could also relate to the other meaning of "translate" --
the geometric one.  A paragraph which is to be left in its normal
position, not translated in any direction, might well be marked
"notranslate".

Signature

Ben.

Steven Simpson - 26 Oct 2008 20:05 GMT
>> Google could define a CSS property which turns translation on or off,
>
[quoted text clipped - 9 lines]
>
> Excuse me while I fall into despair.

What's wrong?  I'm not suggesting that we abandon the distinction
between content and presentation, merely recognising that only two
things constrain CSS technically to presentation:

   * the set of properties defined by various specs,
   * the media type/query filter,

...and by extending these together, you get a framework still capable of
separating presentation from content, but also capable of separating
other kinds of (erm) "interpretation" from content.

Looking at it another way, if you wanted to devise a framework for the
latter separation, you could easily come up with one identical to that
used for the former, except that:

   * the file format's property set would differ from CSS's,
   * you'd have a different set of @media,
   * you wouldn't call the format CSS,
   * your @rel type wouldn't mention 'style'.

It would be technically sufficient to continue using @rel="stylesheet",
and rely on @media to distinguish between presentation and 'other kinds
of interpretation'.  But if that really is a problem, just use
@rel="propertysheet".

Harlan Messinger - 27 Oct 2008 15:38 GMT
>> Is this a new trend of user-agent writers (Microformats, and now
>> Google) staking claims on the @class namespace?
[quoted text clipped - 30 lines]
> way, no legitimate existing usage of the string as class attribute would
> have been affected.

If Google had specified class="google:notranslate" in place of
class="notranslate", then even though a prefix such as "google:" has no
intrinsic significance in class names, it would have gone a long way
toward eliminating potential conflict.
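
To illustrate, here is a rough sketch of how a user agent might honour only the tokens carrying its own prefix and leave every unprefixed class name to the author. The function name and the grouping scheme are hypothetical; nothing here reflects what Google actually does.

```python
def split_class_tokens(class_attr, separator=":"):
    """Group class tokens by prefix; unprefixed tokens are filed under None."""
    groups = {}
    for token in class_attr.split():
        prefix, sep, name = token.partition(separator)
        if sep and prefix and name:
            groups.setdefault(prefix, []).append(name)
        else:
            # No usable prefix: treat the whole token as author-defined.
            groups.setdefault(None, []).append(token)
    return groups


tokens = split_class_tokens("google:notranslate notranslate sidebar")
print(tokens.get("google", []))  # ['notranslate']
print(tokens[None])              # ['notranslate', 'sidebar']
```

A translator following this scheme would act only on the first list, so an author's own bare "notranslate" keeps whatever meaning the author gave it. One cost for authors is that the colon has to be escaped in a CSS class selector, as in .google\:notranslate.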
 