

5.4.1 String encoding

: native_bytes = unicode2native (utf8_str, codepage)
: native_bytes = unicode2native (utf8_str)

Convert UTF-8 string utf8_str to byte stream using codepage.

The character vector utf8_str is converted to a byte stream native_bytes using the code page given by codepage. The string codepage must be an identifier of a valid code page. Examples of valid code pages are "ISO-8859-1", "Shift-JIS", and "UTF-16". For a list of supported code pages, see https://www.gnu.org/software/libiconv. If codepage is omitted or empty, the system default code page is used.

If any character cannot be mapped into the code page given by codepage, it is replaced with the appropriate substitution sequence for that code page.
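As an illustration, the following minimal sketch encodes a short string to "ISO-8859-1". The input is built from explicit byte values so that the result does not depend on the encoding of the source file; the sample text "aäbc" is an arbitrary choice made for this example.

```octave
% "aäbc" in UTF-8: the character ä occupies two bytes (195 164).
utf8_str = char ([97 195 164 98 99]);

% Encode to ISO-8859-1, where ä is the single byte 228.
native_bytes = unicode2native (utf8_str, "ISO-8859-1");

disp (double (native_bytes))   % byte values: 97 228 98 99
```

The UTF-8 input is five bytes long, while the encoded stream has four, because ISO-8859-1 represents ä with a single byte.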

See also: native2unicode.

: utf8_str = native2unicode (native_bytes, codepage)
: utf8_str = native2unicode (native_bytes)

Convert byte stream native_bytes to UTF-8 using codepage.

The numbers in the vector native_bytes are rounded and clipped to integers between 0 and 255. This byte stream is then interpreted according to the code page given by the string codepage, converted to UTF-8, and returned in the string utf8_str. Octave uses UTF-8 as its internal encoding. The string codepage must be an identifier of a valid code page. Examples of valid code pages are "ISO-8859-1", "Shift-JIS", and "UTF-16". For a list of supported code pages, see https://www.gnu.org/software/libiconv. If codepage is omitted or empty, the system default code page is used.

If native_bytes is a string vector, it is returned as is.
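As an illustration, the following minimal sketch decodes four "ISO-8859-1" bytes into a UTF-8 string and then encodes the result again to check the round trip. The byte values are an arbitrary sample chosen for this example.

```octave
% ISO-8859-1 bytes for "aäbc"; 228 is ä in that code page.
native_bytes = uint8 ([97 228 98 99]);

% Decode to Octave's internal UTF-8 representation.
utf8_str = native2unicode (native_bytes, "ISO-8859-1");
disp (double (utf8_str))       % UTF-8 bytes: 97 195 164 98 99

% Encoding the result again with the same code page restores the input.
round_trip = unicode2native (utf8_str, "ISO-8859-1");
disp (isequal (double (round_trip(:)), double (native_bytes(:))))

% A character vector is passed through unchanged.
same_str = native2unicode ("abc", "ISO-8859-1");
```

The decoded string has five elements rather than four, because Octave stores each UTF-8 byte as one char element and ä takes two bytes in UTF-8.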

See also: unicode2native.

