> ISO-8859-1 or Cp1252 (we mainly used windows editors, mostly CF
> Studio, to create the files). Pages which use non-ASCII characters are
> not displayed
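first off, those are not the same encoding. the windows codepage (cp1252)
is more like a superset of latin-1 (iso-8859-1): the two differ in the
0x80-0x9F range, where cp1252 has printable characters such as the euro
sign and curly quotes. so you really need to know which encoding was used
on those files.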
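
to make the difference concrete, here is a small sketch in python (used purely for illustration, it is not part of the cf setup; the name `differing_bytes` is made up). it decodes every byte value with python's built-in `cp1252` and `iso-8859-1` codecs and collects the values where the two disagree:

```python
# Hypothetical helper: compare how the two codecs interpret each byte value.
UMLAUTS = "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc\u00df"  # a-, o-, u-umlaut (both cases) and sharp s


def differing_bytes():
    """Return the byte values that cp1252 and iso-8859-1 decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("iso-8859-1")  # defined for every byte value
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # a few values are unassigned in Python's cp1252 codec
        if cp1252 != latin1:
            diffs.append(value)
    return diffs


if __name__ == "__main__":
    print([hex(v) for v in differing_bytes()])
    # The umlauts encode to the same bytes in both encodings.
    print(UMLAUTS.encode("cp1252") == UMLAUTS.encode("iso-8859-1"))
```

on a stock python 3 this should report only values in the 0x80-0x9F range, so plain umlauts look identical in both encodings; it is characters like the euro sign or curly quotes that give a cp1252 file away.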
> properly because CFMX opens the files as UTF-8.
no, cf opens files in the server's default encoding. it "displays" them
in its default encoding, utf-8.
> I got a hint, that i can set <cfprocessingdirective
> pageencoding="ISO-8859-1"/> to correct that. The problem with the
> above is, that i would have to do this for every file since the
> application runs through a single index.cfm and calls a lot of
> additional content via <cfinclude ...> or mostly through <cfmodule
> ...>. Changing all files is at the moment is
make it a good practice from now on. in any case, you can force your cf
server to use a different character encoding for HTTP from its default
(UTF-8) by changing the defaultCharset value in the
cf_root/lib/neo-runtime.xml file:
<var name='defaultCharset'><string>UTF-8</string></var>
again, the issue is which encoding to substitute for utf-8. frankly,
given the multilingual nature of your app you might want to convert your
pages' encoding to utf-8 & use that from now on. it will be less painful
in the long run (especially when somebody starts asking for non-latin-1
charsets). there are a couple of tools around to do this en masse (like
unifier).
you might want to read thru this:
http://www.macromedia.com/devnet/coldfusion/articles/globalize.html
Telemedianer - 30 Dec 2005 10:49 GMT
Paul,
thanks again for your response, you are quick! About the currently used
codepage: i do not know how to detect it. i have tried the file utility
on this server, but it returns the mime-type (as expected) and no codepage
info. any idea? We used cfstudio on windows, so i guess it's "Cp1252"
rather than iso-8859-1. however, i guess this doesn't matter much, since
apart from the umlauts (which exist in both Cp1252 and iso-8859-1) all
characters should be coded as html entities.
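
One way to approach the detection question is a byte-level heuristic. The sketch below is in Python (an assumption for illustration, not something used on this server) and `guess_encoding` is a made-up name. It can only narrow things down: a file without bytes in the 0x80-0x9F range is byte-for-byte the same in Cp1252 and iso-8859-1, so no tool can tell those apart.

```python
def guess_encoding(data: bytes) -> str:
    """Best-effort guess for the encoding of a byte string (heuristic only)."""
    try:
        data.decode("ascii")
        return "ascii"  # no high bytes: identical in every candidate encoding
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"  # all high bytes form valid UTF-8 sequences
    except UnicodeDecodeError:
        pass
    if any(0x80 <= b <= 0x9F for b in data):
        # This range holds printable characters in cp1252 but only
        # control codes in iso-8859-1, so cp1252 is the likelier source.
        return "cp1252"
    return "iso-8859-1 or cp1252"  # the bytes alone cannot distinguish these
```

It would be called on the raw bytes of a template, for example `guess_encoding(open("index.cfm", "rb").read())`, where `index.cfm` stands in for any file of the application.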
The server's default encoding seems to be utf-8; that is what the jvm settings
overview in cf admin says. Unfortunately i do not know how to change it under
linux so that the JVM will detect it properly, which is why i have tried to
change it via a java parameter (-Dxxx=xxx).
I would really like to find a setting to change this globally, because i do
not want to put too much effort into this transition from cf5 to cfmx7 (we are
moving away from cf, since it doesn't fit the needs of the customer anymore).
i have changed the cf_root/lib/neo-runtime.xml file, but it didn't change
anything. through google i found some other linux users with the same problem
(changing the neo-runtime.xml file on windows seems to work, on linux it has no
effect in 7.01).
Cheers,
-S
Telemedianer - 30 Dec 2005 13:14 GMT
hi, i got it working:
<var name='defaultCharset'><string>ISO8859_1</string></var> (the java
implementation does not like the notation: "iso8859-1").
What i see is that the response header now has the iso-8859-1 charset by
default. this is great and can also be done in Application.cfm with a <cfheader
...>
The remaining problem seems to be that files are still opened as if they were
utf-8: special characters are sent as "?" to the client.
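
The "?" symptom can be reproduced outside of cf. The following is a minimal Python sketch (Python and the helper name `read_as` are assumptions for illustration; it does not show what CFMX or the JVM do internally). It decodes iso-8859-1 bytes as if they were utf-8 and lets the decoder substitute a replacement character for every byte it cannot interpret:

```python
def read_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD for anything invalid."""
    return raw.decode(encoding, errors="replace")


# A German word with two non-ASCII letters, saved the way a
# cp1252/iso-8859-1 editor would write it: one byte per character.
raw = "Gr\u00fc\u00dfe".encode("iso-8859-1")

# Assuming UTF-8: the two high bytes are not valid UTF-8 sequences,
# so each one is turned into the replacement character.
wrong = read_as(raw, "utf-8")

# Declaring the encoding the file was really written in recovers the text.
right = read_as(raw, "iso-8859-1")

if __name__ == "__main__":
    print(ascii(wrong))
    print(ascii(right))
```

A response header that says iso-8859-1 does not help at this point, because the characters are already lost when the file is read with the wrong encoding.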
in "cfadmin > server settings > Setting Summary" under "JVM Details" there is
an entry:
Java File Encoding : UTF8
i think if it were possible to change this, it would solve my problem, but
i do not know how. i have tried to pass "-Dfile.encoding=ISO8859_1" to the JVM,
which (after a restart of the cf service) results in the above posted error.
some help ... anyone?
Cheers,
Simon
Telemedianer - 30 Dec 2005 13:58 GMT
i found the following temporary solution (there are several thousand cfm
files + another bunch of html/txt files using iso-8859-1). as mentioned, this
doesn't solve my problem because users continue to contribute documents in
iso-8859-1, but to get a start you may do the following in bash/sh:
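$ export ext="cfm"
$ # write a utf-8 copy of every file next to the original
$ find . -name "*.${ext}" -exec iconv -f iso-8859-1 -t utf-8 {} -o {}.new \;
$ # keep the originals as .old, then move the converted copies into place
$ find . -name "*.${ext}" -exec mv {} {}.old \;
$ find . -name "*.${ext}.new" | while IFS= read -r f; do mv "$f" "${f%.new}"; done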
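
Since users keep contributing iso-8859-1 documents, the conversion probably has to be repeated. Below is a rough Python sketch of the same idea that can be run more than once (Python, the function name `convert_tree` and the skip rule are assumptions, not part of the commands above). It leaves files that already decode as valid utf-8 untouched, which is a heuristic: a short iso-8859-1 file could in rare cases also be valid utf-8.

```python
import shutil
from pathlib import Path


def convert_tree(root, ext="cfm", source_encoding="iso-8859-1"):
    """Re-encode every *.ext file under root to UTF-8, keeping a .old backup.

    Returns the list of files that were rewritten.
    """
    converted = []
    for path in sorted(Path(root).rglob(f"*.{ext}")):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8 (or plain ASCII): leave untouched
        except UnicodeDecodeError:
            pass
        text = raw.decode(source_encoding)
        # Back up the original bytes before overwriting the file in place.
        shutil.copy2(path, path.with_name(path.name + ".old"))
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted
```

Calling `convert_tree(".")` from the application root would correspond to the three find commands; the backups end in `.cfm.old`, so a second run does not pick them up.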