Bug #1505
Closed
wt-3.2.3-rc2 core dump if Latin1 string is interpreted as UTF8
Description
Passing the following Latin-1 encoded string to WString::fromUTF8() causes a segmentation fault and core dump.
I know the input is not valid UTF-8, but it seems to me that the checking code does not work correctly. I would expect each of the two Latin-1 characters to appear as a '?' character instead.
"máquina quente do forró"
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff19c9700 (LWP 5094)]
Wt::WString::checkUTF8Encoding (value=...)
    at /home/ruppert/work/wt/kdeforche-wt-523950c/src/Wt/WString.C:209
209 /home/ruppert/work/wt/kdeforche-wt-523950c/src/Wt/WString.C:
#0  Wt::WString::checkUTF8Encoding (value=...)
    at /home/ruppert/work/wt/kdeforche-wt-523950c/src/Wt/WString.C:209
#1  0x00007ffff7450ad0 in Wt::WString::fromUTF8 (value=<value optimized out>, checkValid=true)
    at /home/ruppert/work/wt/kdeforche-wt-523950c/src/Wt/WString.C:190
Updated by Stefan Ruppert about 12 years ago
Again forgot to log in... That's my report ;-)
Updated by Koen Deforche about 12 years ago
- Status changed from New to InProgress
- Assignee set to Wim Dumon
- Target version set to 3.2.3
Updated by Wim Dumon about 12 years ago
Hey Stefan,
This should fix it:
  void WString::checkUTF8Encoding(std::string& value)
  {
    const char *c = value.c_str();
    for (; c < value.c_str() + value.length();) {
      const char *at = c;
      try {
        char *dest = 0;
        rapidxml::xml_document<>::copy_check_utf8(c, dest);
      } catch (rapidxml::parse_error& e) {
        for (const char *i = at; i < c && i < value.c_str() + value.length();
             ++i)
          value[i - value.c_str()] = '?';
      }
    }
  }
Will be in git soon.
BR,
wim.
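
For readers without Wt or rapidxml at hand, the idea behind the fix can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code that went into Wt: the names validSequenceLength and sanitizeUtf8 are hypothetical, and unlike the snippet above it replaces each offending byte individually instead of relying on copy_check_utf8. One plausible reading of the crash is that the trailing 'ó' (byte 0xF3 in Latin-1) looks like the lead byte of a four-byte UTF-8 sequence, so a checker that trusts the lead byte can step past the end of the buffer. The sketch therefore checks the remaining length before it looks at any continuation byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Wt): returns the length of the
// well-formed UTF-8 sequence starting at s[pos], or 0 if the bytes there
// are invalid or the sequence would run past the end of the string.
static std::size_t validSequenceLength(const std::string& s, std::size_t pos)
{
  const unsigned char lead = static_cast<unsigned char>(s[pos]);
  std::size_t len;
  if (lead < 0x80)                      len = 1;  // ASCII
  else if (lead >= 0xC2 && lead <= 0xDF) len = 2;
  else if (lead >= 0xE0 && lead <= 0xEF) len = 3;
  else if (lead >= 0xF0 && lead <= 0xF4) len = 4;
  else return 0;  // stray continuation byte or illegal lead byte

  // Truncated sequence: bail out before reading past the end.
  if (len > s.size() - pos) return 0;

  for (std::size_t i = 1; i < len; ++i) {
    const unsigned char b = static_cast<unsigned char>(s[pos + i]);
    if ((b & 0xC0) != 0x80) return 0;  // not a continuation byte
  }

  if (len >= 3) {
    const unsigned char b1 = static_cast<unsigned char>(s[pos + 1]);
    if (lead == 0xE0 && b1 < 0xA0) return 0;   // overlong 3-byte form
    if (lead == 0xED && b1 >= 0xA0) return 0;  // UTF-16 surrogate range
    if (lead == 0xF0 && b1 < 0x90) return 0;   // overlong 4-byte form
    if (lead == 0xF4 && b1 >= 0x90) return 0;  // above U+10FFFF
  }
  return len;
}

// Hypothetical stand-in for the checking step: returns a copy of value in
// which every byte that does not start a well-formed UTF-8 sequence is
// replaced by '?'.
std::string sanitizeUtf8(std::string value)
{
  std::size_t pos = 0;
  while (pos < value.size()) {
    const std::size_t len = validSequenceLength(value, pos);
    if (len == 0) {
      value[pos] = '?';
      ++pos;
    } else {
      assert(len <= value.size() - pos);  // never step past the end
      pos += len;
    }
  }
  return value;
}
```

With the string from the report encoded as Latin-1, that is "m\xE1quina quente do forr\xF3", sanitizeUtf8 yields "m?quina quente do forr?", while input that is already valid UTF-8 comes back unchanged.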
Updated by Koen Deforche about 12 years ago
- Status changed from InProgress to Resolved
Updated by Koen Deforche about 12 years ago
- Status changed from Resolved to Closed