Webmaster Forum / ColdFusion / Advanced Techniques / September 2006




Characters such as apostrophes and smart quotes turning into boxes or question marks

Grant - 26 Sep 2006 19:14 GMT
We recently upgraded from CF5 to CF7 and are having a problem with previously
saved text that no longer displays correctly.  Some characters (apparently,
non-ASCII characters such as curly apostrophes and smart quotes) are rendering
as boxes or question marks. We recently upgraded to Oracle 10g from Oracle 8i,
but this problem appears to be independent of the database that the text is
stored in. Here is sample code that will illustrate the problem:

<CFSET string1="Department?s">
<CFSET string2="hey?there">

<CFOUTPUT>
string1 is #string1#
<BR>
string2 is #string2#
</CFOUTPUT>

The output looks like this:

string1 is Department?s
string2 is hey?there

These are rendered as boxes when viewed in Internet Explorer. (They show up as
question marks when I copy and paste them here.)

The DeMoronize UDF helps *some* of the time, but this is still happening with
a lot of text, especially text that gets pasted from a website into a form,
then saved to a database.  Does anybody have a solution for this?  This is
breaking my applications and is incredibly annoying.  I'd like to either
replace the problematic characters at the time they are displayed, or replace
them when they are input in the database in the first place (and go back and
update all the previously saved data to replace the problematic characters with
plain text equivalents).

Any suggestions appreciated.
Dan Bracuk - 26 Sep 2006 20:45 GMT
Go to cflib.org and find the safetext function.
Grant - 28 Sep 2006 16:42 GMT
The Safetext UDF strips out scripts and other undesirable tags.  It doesn't do anything to fix the characters I described.
web-spinners - 29 Sep 2006 19:59 GMT
There are two strategies for dealing with unwanted characters:
1. Specify what is not allowed.
2. Specify what is allowed.

The problem with strategy 1 is that you must list ALL the bad characters, so
strategy 2 is better.
The following code uses a regular expression to apply strategy 2.

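<cfscript>
// Delete every character that is not on the whitelist
// (letters, digits, whitespace and = _ - . # $ & @).
function goodChars(str)
{
    return REReplace(str, "[^A-Za-z0-9\=\_\s\-\.##$&@]", "", "ALL");
}

// Return true only if str contains nothing but letters, digits,
// spaces, underscores and hyphens (a stricter list than goodChars).
function hasAllGoodChars(str)
{
    return REFind("[^ A-Za-z0-9\_\-]", str) eq 0;
}
</cfscript>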
jasals - 29 Sep 2006 20:50 GMT
You could run replacements on each variable, though it's probably not the most
efficient approach. See below:

<CFSET gecko = trim(queryname.fieldname)>
<CFSET gecko = replace(gecko, "character to be replaced", "safe character", "ALL")>
<CFSET gecko = replace(gecko, "é", "e", "ALL")>
etc...

<CFOUTPUT>#gecko#</CFOUTPUT>
Adam Cameron - 29 Sep 2006 22:50 GMT
Provided your code, the CF server, the DB server and the browser are all
told to use the correct (and SAME!) character encoding, you should not have
this problem.
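
As a rough illustration of what "told to use the same encoding" can look like on the ColdFusion side, here is a minimal sketch that assumes UTF-8 is the encoding chosen for every layer. The tags and SetEncoding() are standard in CFMX and later, but the choice of UTF-8 is an assumption, and the database driver and column character set still have to be configured separately to match.

```cfm
<!--- Sketch only: assumes UTF-8 everywhere. Place this at the top of the page. --->
<!--- Tell the CF compiler how this source file itself is encoded. --->
<cfprocessingdirective pageencoding="utf-8">
<!--- Tell the browser how the response is encoded. --->
<cfcontent type="text/html; charset=utf-8">
<cfscript>
    // Decode submitted form fields and URL parameters as UTF-8.
    setEncoding("form", "utf-8");
    setEncoding("url", "utf-8");
</cfscript>
```

If text was saved under a different encoding before the upgrade, the stored data may still need a one-time conversion; these settings only affect how new requests are read and written.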

Do a search on these forums for "special characters" for full and
comprehensive discussion on this topic.

Adam

Grant - 29 Sep 2006 22:57 GMT
I finally isolated the problematic characters (Unicode code points in the
range 8208 to 8231: dashes, curly quotes, bullets and ellipses), so I edited
the DeMoronize UDF (available at cflib.org) by adding the following text
replacements at the bottom:

    text = Replace(text, chr(8208), "-", "ALL");
    text = Replace(text, chr(8209), "-", "ALL");
    text = Replace(text, chr(8210), "&ndash;", "ALL");
    text = Replace(text, chr(8211), "&ndash;", "ALL");
    text = Replace(text, chr(8212), "&mdash;", "ALL");
    text = Replace(text, chr(8213), "&mdash;", "ALL");
    text = Replace(text, chr(8214), "||", "ALL");
    text = Replace(text, chr(8215), "_", "ALL");
    text = Replace(text, chr(8216), "&lsquo;", "ALL");
    text = Replace(text, chr(8217), "&rsquo;", "ALL");
    text = Replace(text, chr(8218), ",", "ALL");
    text = Replace(text, chr(8219), "'", "ALL");
    text = Replace(text, chr(8220), "&ldquo;", "ALL");
    text = Replace(text, chr(8221), "&rdquo;", "ALL");
    text = Replace(text, chr(8222), """", "ALL");
    text = Replace(text, chr(8223), """", "ALL");
    text = Replace(text, chr(8226), "&middot;", "ALL");
    text = Replace(text, chr(8227), ">", "ALL");
    text = Replace(text, chr(8228), ".", "ALL");
    text = Replace(text, chr(8229), "..", "ALL");
    text = Replace(text, chr(8230), "...", "ALL");
    text = Replace(text, chr(8231), "&middot;", "ALL");
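
The list above only covers the code points that were found by hand. As a catch-all for anything it misses, one option is to convert every remaining non-ASCII character to a numeric HTML entity at display time. The sketch below is only an illustration: encodeNonAscii is a hypothetical helper name, not part of DeMoronize or cflib.org, and it assumes the text is being written into an HTML page and contains no characters outside the Basic Multilingual Plane.

```cfm
<cfscript>
// Hypothetical helper (not part of DeMoronize): replaces each character
// above the 7-bit ASCII range with a decimal numeric HTML entity.
function encodeNonAscii(text)
{
    var result = "";
    var i = 0;
    var code = 0;
    for (i = 1; i lte Len(text); i = i + 1)
    {
        // Asc() returns the Unicode code point of the character.
        code = Asc(Mid(text, i, 1));
        if (code gt 127)
            result = result & "&##" & code & ";";
        else
            result = result & Mid(text, i, 1);
    }
    return result;
}
</cfscript>

<!--- Usage: a curly apostrophe (code point 8217) is emitted as an entity. --->
<cfoutput>#encodeNonAscii("Department" & chr(8217) & "s")#</cfoutput>
```

Run this after the DeMoronize replacements so the hand-picked substitutions (such as "..." for the ellipsis) take effect first and only the leftovers become numeric entities. String concatenation in a loop is slow on very long text, so treat this as a starting point.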
 