Bug #11321
Closed
Semantics of boost::split and Utils::split are not properly preserved when translating to Java
Start date: 02/09/2023
Due date:
% Done: 100%
Estimated time:
Description
boost::split's behavior is notably different from Java's String#split:
- Splitting "," on the ',' character yields a list of two empty strings with boost::split (and in many other programming languages, e.g. Python, JavaScript, Rust, ...), whereas Java yields an empty list.
- We use boost::is_any_of, which interprets its argument as a sequence of characters to split on, but String#split treats its argument as a regular expression!
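Both differences can be reproduced with plain String#split. The class and method names in this small demonstration are made up for illustration and are not part of JWt:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of JWt.
class JavaSplitPitfalls {
    // Mirrors the naive translation: the separator is passed straight to
    // String#split, which compiles it as a regular expression and drops
    // trailing empty strings.
    static List<String> naive(String input, String separator) {
        return Arrays.asList(input.split(separator));
    }

    public static void main(String[] args) {
        // Trailing empty strings are removed, so nothing is left.
        System.out.println(naive(",", ","));       // []
        // ",;" is the regex "comma followed by semicolon", not "comma or semicolon".
        System.out.println(naive("a,b;c", ",;"));  // [a,b;c]
        // "." is the regex "any character", so every character is a separator.
        System.out.println(naive("a.b", "."));     // []
    }
}
```

With boost::split and boost::is_any_of, the same three inputs would be expected to give two empty strings, then a, b, c, then a, b.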
This means that:
- Translating Utils::split from web/StringUtils.h to eu.webtoolkit.jwt.StringUtils#split is incorrect, since StringUtils#split simply uses String#split
- Translating boost::split($0;,$1;,boost::is_any_of($2;)) to $0; = new ArrayList<String>(Arrays.asList($1;.split($2;))) is incorrect
We should write a StringUtils#split that does have the correct semantics, unit test it, and translate all of our usages of Utils::split and boost::split to use it.
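A minimal sketch of what such a split could look like, assuming we want the default boost::split behavior (no token compression, so empty tokens are kept) combined with boost::is_any_of. The class name SplitSketch and the method signature are hypothetical; this is not the implementation that ended up in JWt:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual eu.webtoolkit.jwt.StringUtils.
class SplitSketch {
    // Splits input at every character that occurs in separators.
    // Empty tokens (leading, trailing, or between adjacent separators)
    // are kept, and separators is never interpreted as a regex.
    static List<String> split(String input, String separators) {
        List<String> result = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < input.length(); i++) {
            if (separators.indexOf(input.charAt(i)) >= 0) {
                result.add(input.substring(start, i));
                start = i + 1;
            }
        }
        // The remainder after the last separator, possibly empty.
        result.add(input.substring(start));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split(",", ",").size());  // 2
        System.out.println(split("a,b;c", ",;"));    // [a, b, c]
        System.out.println(split("a.b", "."));       // [a, b]
    }
}
```

Scanning character by character avoids java.util.regex entirely, so no quoting of the separator set is needed. Note that an empty input yields a single empty string under this sketch; whether that matches every existing call site would need to be covered by the unit tests.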
I noticed this while working on WEmailEdit and WEmailValidator, where JWt validated lists of multiple email addresses differently from C++ (issue #7279).
Updated by Roel Standaert almost 2 years ago
- Status changed from InProgress to Review
- Assignee deleted (Roel Standaert)
Updated by Roel Standaert over 1 year ago
- Status changed from Review to Implemented @Emweb
- Assignee set to Roel Standaert
- % Done changed from 0 to 100
Updated by Roel Standaert over 1 year ago
- Status changed from Implemented @Emweb to Resolved
Updated by Roel Standaert over 1 year ago
- Status changed from Resolved to Closed