
Order Of String Replacement Function Invocations When Used With RTL Languages

When calling String.replace with a replacement function we're able to retrieve offsets of the matched substrings. var a = []; 'hello world'.replace(/l/g, function (m, i) { a.push(i

Solution 1:

I searched the ECMA-262 5.1 Edition (June 2011) for the keywords "format control", "right to left" and "RTL". None of them is mentioned, except where the specification says that format control characters are allowed in string literals and regular expression literals.

From section 7.1

It is useful to allow format-control characters in source text to facilitate editing and display. All format control characters may be used within comments, and within string literals and regular expression literals.

Annex E

7.1: Unicode format control characters are no longer stripped from ECMAScript source text before processing. In Edition 5, if such a character appears in a StringLiteral or RegularExpressionLiteral the character will be incorporated into the literal where in Edition 3 the character would not be incorporated into the literal

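With this, I conclude that JavaScript doesn't treat right-to-left characters any differently. It only knows about the UTF-16 code units stored in the string and works in logical (storage) order, so the replacement function is called for the matches in increasing offset order, regardless of the direction in which the text is displayed.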
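
To check this on a concrete string, here is a minimal sketch. The Hebrew sample text, the choice of U+200F (RIGHT-TO-LEFT MARK) as the format control character, and the variable names are my own assumptions for illustration; they are not taken from the question. The characters are written as \u escapes so the logical order is visible in the source no matter how an editor renders them.

```javascript
// "shalom olam" in Hebrew: shin, lamed, vav, final mem, space, ayin, vav, lamed, final mem.
// It is displayed right-to-left, but the string stores the letters in this logical order.
var text = '\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD';

var offsets = [];
text.replace(/\u05DC/g, function (match, index) {
  // index is the position of the match counted in UTF-16 code units
  offsets.push(index);
  return match; // leave the text unchanged
});
console.log(offsets); // [1, 7]

// A format control character is an ordinary code unit as far as the string is concerned:
// prepending U+200F shifts every offset by one.
var marked = '\u200F' + text;
var shiftedOffsets = [];
marked.replace(/\u05DC/g, function (match, index) {
  shiftedOffsets.push(index);
  return match;
});
console.log(shiftedOffsets); // [2, 8]
```

The callback sees the first lamed (offset 1) before the second one (offset 7), even though the first one appears further to the right on screen.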

