Using ICU strings

There are several ways to initialize an ICU Unicode string in C, but the one I find most convenient is the u_strFromUTF8() function.

The following example initializes a Unicode string and writes it to a file with the write_utf8_file() function developed in a previous article.

#include "unicode/ustring.h"

int main() {
    /* create a UChar array large enough to hold the text (8 UTF-16 code units) and the terminating NUL */
    UChar str[9];

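    /* initialize the string; err must be U_ZERO_ERROR before the call */
    UErrorCode err = U_ZERO_ERROR;
    u_strFromUTF8(str, 9, NULL, "1000 さくら", -1, &err);
    if (U_FAILURE(err)) {
        return 1;   /* conversion failed, e.g. the buffer was too small */
    }

    /* write the Unicode string to a file */
    write_utf8_file("utf8_test_out.txt", str);

    return 0;
}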

The parameters to the u_strFromUTF8() function are:

  1. the destination buffer, str
  2. the capacity of the destination buffer in UChars (UTF-16 code units), which must cover the converted text plus one terminating NUL; here that is 8 + 1 = 9. This is not the UTF-8 byte length: the source string occupies 14 bytes in UTF-8
  3. a pointer to an int32_t that receives the number of UChars written, not counting the terminating NUL (passing NULL skips this)
  4. the source string, encoded in UTF-8
  5. the length of the source string in bytes, or -1 to have the length of a NUL-terminated string counted automatically
  6. a pointer to a UErrorCode variable that receives the error code; it should be initialized to U_ZERO_ERROR before the call
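
Since the capacity in parameter 2 is counted in UTF-16 code units rather than UTF-8 bytes, it can help to see how the two relate. The following is a minimal sketch in plain C, not part of ICU: utf16_units_needed() is a hypothetical helper name, and it assumes the input is well-formed, NUL-terminated UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper (not an ICU function): counts the UTF-16 code units
   needed to hold a NUL-terminated UTF-8 string, excluding the terminator.
   Assumes the input is well-formed UTF-8. */
size_t utf16_units_needed(const char *utf8)
{
    size_t units = 0;
    const unsigned char *p = (const unsigned char *)utf8;

    for (; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80) {
            continue;   /* continuation byte: counted with its lead byte */
        }
        /* A 4-byte sequence (lead byte 0xF0 or above) encodes a code point
           beyond U+FFFF, which takes a surrogate pair in UTF-16; every
           shorter sequence fits in a single code unit. */
        units += (*p >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For "1000 さくら" this returns 8: the five ASCII bytes map to five code units, and each three-byte hiragana character maps to one. Adding one for the terminating NUL gives the capacity of 9 used in the example. ICU can also report the required length itself: calling u_strFromUTF8() with a NULL destination and a capacity of 0 stores the length in the variable passed as parameter 3 and sets the error code to U_BUFFER_OVERFLOW_ERROR, which must be reset to U_ZERO_ERROR before the real conversion.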