Bug #11321

closed

Semantics of boost::split and Utils::split are not properly preserved when translating to Java

Added by Roel Standaert about 1 year ago. Updated 10 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Roel Standaert
Category:
-
Target version:
Start date:
02/09/2023
Due date:
% Done:

100%

Estimated time:

Description

boost::split's behavior is notably different from Java's String#split:

  • Splitting "," on the ',' character yields a list of two empty strings with boost::split (as it does in many other programming languages, e.g. Python, JavaScript, Rust, ...), whereas Java's String#split yields an empty array, because it drops trailing empty strings.
  • We use boost::is_any_of, which interprets its argument as a set of characters to split on, but String#split treats its argument as a regular expression!
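
Both differences can be reproduced with plain String#split. The snippet below is only an illustration: the class name JavaSplitPitfalls and the helper javaSplit are made up for this example and are not part of JWt.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of JWt.
final class JavaSplitPitfalls {
    // Thin wrapper around String#split so the results are easy to inspect as lists.
    static List<String> javaSplit(String input, String regex) {
        return Arrays.asList(input.split(regex));
    }

    public static void main(String[] args) {
        // Trailing empty strings are dropped, so nothing is left of ",".
        System.out.println(javaSplit(",", ","));                 // []
        // A negative limit keeps trailing empty strings.
        System.out.println(Arrays.asList(",".split(",", -1)));   // [, ]
        // The separator is a regular expression: "." matches every character.
        System.out.println(javaSplit("a.b", "."));               // []
        // Escaping the dot gives the split a C++ reader would expect.
        System.out.println(javaSplit("a.b", "\\."));             // [a, b]
    }
}
```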

This means that:

  1. Translating Utils::split from web/StringUtils.h to eu.webtoolkit.jwt.StringUtils#split is incorrect, since StringUtils#split simply uses String#split
  2. Translating boost::split($0;,$1;,boost::is_any_of($2;)) to $0; = new ArrayList<String>(Arrays.asList($1;.split($2;))) is incorrect

We should write a StringUtils#split that does have the correct semantics, unit test it, and translate all of our usages of Utils::split and boost::split to use it.
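
A minimal sketch of what such a split could look like, assuming the goal is to mimic boost::split with boost::is_any_of and without token compression. The class name SplitSketch and the method signature are hypothetical; this is not the actual eu.webtoolkit.jwt.StringUtils implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual JWt StringUtils.
final class SplitSketch {
    private SplitSketch() {}

    // Splits input at every character that occurs in separators.
    // The separators are treated as literal characters, not as a regex,
    // and empty tokens (leading, trailing, or between adjacent separators) are kept.
    static List<String> split(String input, String separators) {
        List<String> result = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < input.length(); i++) {
            if (separators.indexOf(input.charAt(i)) >= 0) {
                result.add(input.substring(start, i));
                start = i + 1;
            }
        }
        // The remainder after the last separator is always a token, even if empty.
        result.add(input.substring(start));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split(",", ","));        // [, ]
        System.out.println(split("a.b", "."));      // [a, b]
        System.out.println(split("a, b;c", ",;"));  // [a,  b, c]
    }
}
```

Because the separator check uses String#indexOf on single characters, regex metacharacters such as '.' or '|' need no escaping, which is the property the translation of boost::is_any_of relies on.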

I noticed this while working on WEmailEdit and WEmailValidator: JWt would validate lists of multiple email addresses differently from C++ (issue #7279).

Actions #1

Updated by Roel Standaert about 1 year ago

  • Status changed from InProgress to Review
  • Assignee deleted (Roel Standaert)
Actions #2

Updated by Roel Standaert about 1 year ago

  • Status changed from Review to Implemented @Emweb
  • Assignee set to Roel Standaert
  • % Done changed from 0 to 100
Actions #3

Updated by Roel Standaert 10 months ago

  • Status changed from Implemented @Emweb to Resolved
Actions #4

Updated by Roel Standaert 10 months ago

  • Status changed from Resolved to Closed