
JSON and Unicode

Added by Miguel Revilla over 12 years ago

Hi,

If I try to use the JSON parser with UTF-8 encoded strings, I get a nice exception from Boost Spirit. The same process with plain ASCII works well. It looks like Wt should have a #define BOOST_SPIRIT_UNICODE somewhere. Should I open a bug on this, or is there any workaround?
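
For reference, this is roughly the kind of call that triggers it (a minimal sketch, not my actual code; it assumes Wt 3.x's Wt::Json::parse() from <Wt/Json/Parser>, and the JSON document is just an example with UTF-8 content):

// Minimal sketch of the failing call (assumptions: Wt 3.x headers
// <Wt/Json/Object> and <Wt/Json/Parser>, free function Wt::Json::parse();
// the JSON document below is only an example).
#include <Wt/Json/Object>
#include <Wt/Json/Parser>
#include <string>

int main()
{
    // "café" encoded as UTF-8: the 'é' is the byte pair 0xC3 0xA9.
    std::string json = "{ \"name\": \"caf\xC3\xA9\" }";

    Wt::Json::Object result;
    Wt::Json::parse(json, result); // throws a Boost Spirit exception on the UTF-8 bytes
}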

Thanks

This is the final part of the backtrace with gdb:

#0 0x00007ffff5fd4fa5 in raise () from /lib/libc.so.6
#1 0x00007ffff5fd6428 in abort () from /lib/libc.so.6
#2 0x00007ffff68b3c2d in __gnu_cxx::__verbose_terminate_handler() () from /lib/libstdc++.so.6
#3 0x00007ffff68b1d36 in ?? () from /lib/libstdc++.so.6
#4 0x00007ffff68b1d63 in std::terminate() () from /lib/libstdc++.so.6
#5 0x00007ffff68b1f8e in __cxa_throw () from /lib/libstdc++.so.6
#6 0x00007ffff79b9536 in void boost::throw_exception<boost::spirit::qi::expectation_failure<__gnu_cxx::__normal_iterator<char const*, std::string> > >(boost::spirit::qi::expectation_failure<__gnu_cxx::__normal_iterator<char const*, std::string> > const&) () from /lib/libwt.so.33
#7 0x00007ffff79b9650 in bool boost::spirit::qi::detail::expect_function<__gnu_cxx::__normal_iterator<char const*, std::string>, boost::spirit::context<boost::fusion::cons<boost::spirit::unused_type&, boost::fusion::nil>, boost::fusion::vector0 >, boost::spirit::qi::detail::unused_skipper<boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::space, boost::spirit::char_encoding::ascii> > >, boost::spirit::qi::expectation_failure<__gnu_cxx::__normal_iterator<char const*, std::string> > >::operator()<boost::spirit::qi::literal_char<boost::spirit::char_encoding::standard, true, false> >(boost::spirit::qi::literal_char<boost::spirit::char_encoding::standard, true, false> const&) const () from /lib/libwt.so.33


Replies (2)

RE: JSON and Unicode - Added by Koen Deforche over 12 years ago

Hey,

It seems that we are indeed relying on the wrong 'character parser' class:

using ascii::char_;

Perhaps replacing that with standard::char_ will solve the issue (we do not need special interpretation of the UTF-8 content at that level, since we have a proper fromUTF8() step later on).
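
To illustrate the difference, here is a small standalone sketch (not Wt's actual JSON grammar, just the usual quoted-string pattern) showing how ascii::char_ rejects the bytes of a UTF-8 sequence while standard::char_ passes them through unchanged:

// Standalone sketch: parse a double-quoted string with two different
// character parsers. This is not Wt's real grammar, only the relevant idea.
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

// Parse a quoted string with the given char_ parser; true means a full match.
template <typename CharParser>
bool parseQuoted(const std::string& input, const CharParser& chr, std::string& out)
{
    std::string::const_iterator first = input.begin();
    std::string::const_iterator last = input.end();
    bool ok = qi::parse(first, last, '"' >> *(chr - '"') >> '"', out);
    return ok && first == last;
}

int main()
{
    // "café" as UTF-8: the 'é' is the byte pair 0xC3 0xA9, both >= 0x80.
    std::string utf8 = "\"caf\xC3\xA9\"";
    std::string out;

    std::cout << std::boolalpha;
    // ascii::char_ only matches 7-bit characters, so the UTF-8 bytes break the parse.
    std::cout << "ascii::char_:    "
              << parseQuoted(utf8, boost::spirit::ascii::char_, out) << std::endl;
    out.clear();
    // standard::char_ matches any char value; the raw UTF-8 bytes pass through
    // and can be converted with fromUTF8() afterwards, as Wt does.
    std::cout << "standard::char_: "
              << parseQuoted(utf8, boost::spirit::standard::char_, out) << std::endl;
}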

So yes, please file a bug.

Regards,

koen
