svn_utf.h File Reference

UTF-8 conversion routines.

#include <apr_xlate.h>
#include "svn_error.h"
#include "svn_string.h"

Functions

svn_error_t * svn_utf_stringbuf_to_utf8 (svn_stringbuf_t **dest, const svn_stringbuf_t *src, apr_pool_t *pool)
 Set *dest to a utf8-encoded stringbuf from native stringbuf src; allocate *dest in pool.

svn_error_t * svn_utf_string_to_utf8 (const svn_string_t **dest, const svn_string_t *src, apr_pool_t *pool)
 Set *dest to a utf8-encoded string from native string src; allocate *dest in pool.

svn_error_t * svn_utf_cstring_to_utf8 (const char **dest, const char *src, apr_pool_t *pool)
 Set *dest to a utf8-encoded C string from native C string src; allocate *dest in pool.

svn_error_t * svn_utf_cstring_to_utf8_ex (const char **dest, const char *src, const char *frompage, const char *convset_key, apr_pool_t *pool)
 Set *dest to a utf8-encoded C string from frompage C string src; allocate *dest in pool.

svn_error_t * svn_utf_stringbuf_from_utf8 (svn_stringbuf_t **dest, const svn_stringbuf_t *src, apr_pool_t *pool)
 Set *dest to a natively-encoded stringbuf from utf8 stringbuf src; allocate *dest in pool.

svn_error_t * svn_utf_string_from_utf8 (const svn_string_t **dest, const svn_string_t *src, apr_pool_t *pool)
 Set *dest to a natively-encoded string from utf8 string src; allocate *dest in pool.

svn_error_t * svn_utf_cstring_from_utf8 (const char **dest, const char *src, apr_pool_t *pool)
 Set *dest to a natively-encoded C string from utf8 C string src; allocate *dest in pool.

svn_error_t * svn_utf_cstring_from_utf8_ex (const char **dest, const char *src, const char *topage, const char *convset_key, apr_pool_t *pool)
 Set *dest to a topage-encoded C string from utf8 C string src; allocate *dest in pool.

const char * svn_utf_cstring_from_utf8_fuzzy (const char *src, apr_pool_t *pool)
 Return a fuzzily native-encoded C string from utf8 C string src, allocated in pool.

svn_error_t * svn_utf_cstring_from_utf8_stringbuf (const char **dest, const svn_stringbuf_t *src, apr_pool_t *pool)
 Set *dest to a natively-encoded C string from utf8 stringbuf src; allocate *dest in pool.

svn_error_t * svn_utf_cstring_from_utf8_string (const char **dest, const svn_string_t *src, apr_pool_t *pool)
 Set *dest to a natively-encoded C string from utf8 string src; allocate *dest in pool.


Detailed Description

UTF-8 conversion routines.

Definition in file svn_utf.h.


Function Documentation

svn_error_t* svn_utf_cstring_from_utf8_ex (const char **dest, const char *src, const char *topage, const char *convset_key, apr_pool_t *pool)

Set *dest to a topage-encoded C string from utf8 C string src; allocate *dest in pool.

Use convset_key as the cache key for the charset converter; if it's NULL, don't cache the converter.

const char* svn_utf_cstring_from_utf8_fuzzy (const char *src, apr_pool_t *pool)

Return a fuzzily native-encoded C string from utf8 C string src, allocated in pool.

A fuzzy recoding leaves all 7-bit ASCII characters unchanged and substitutes "?\XXX" for all others, where XXX is the unsigned decimal code for that character.

This function cannot error; it is guaranteed to return something. First it will recode as described above and then attempt to convert the (new) 7-bit UTF-8 string to native encoding. If that fails, it will return the raw fuzzily recoded string, which may or may not be meaningful in the client's locale, but is (presumably) better than nothing.
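
The recoding step described above can be sketched in plain C. This is only an illustration, not Subversion's implementation: fuzzy_escape is a hypothetical helper that uses malloc instead of an APR pool, and it assumes that the "code for that character" is the value of each individual non-ASCII byte rather than of a decoded code point.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of svn_utf.h: return a malloc'd copy of
   src in which every byte outside 7-bit ASCII is replaced by "?\XXX",
   XXX being the byte's value as an unsigned decimal number.
   Assumption: escaping is done per byte, not per decoded code point.
   Returns NULL if allocation fails; the caller frees the result. */
char *fuzzy_escape(const char *src)
{
    size_t len, i;
    char *out, *p;

    assert(src != NULL);
    len = strlen(src);

    /* Worst case: every byte expands to the 5 characters "?\XXX". */
    out = malloc(len * 5 + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 128)
            *p++ = (char)c;  /* 7-bit ASCII passes through unchanged */
        else
            p += sprintf(p, "?\\%03u", (unsigned int)c);
    }
    *p = '\0';
    return out;
}
```

Under these assumptions, passing the UTF-8 bytes for "café" ("caf\xc3\xa9" as a C literal) yields caf?\195?\169, while a pure ASCII input comes back unchanged.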

Notes:

Improvement is possible, even imminent. The original problem was that if you converted a UTF-8 string (say, a log message) into a locale that couldn't represent all the characters, you'd just get a static placeholder saying "[unconvertible log message]". Then Justin Erenkrantz pointed out that on platforms that didn't support conversion at all, "svn log" would still fail completely when it encountered unconvertible data.

Now, in both cases, the caller can at least fall back on this function, which converts the message as best it can, substituting ?\XXX escape codes for the non-ASCII characters.

Ultimately, some callers may prefer the iconv "//TRANSLIT" option, so when we can detect that at configure time, things will change. Also, this should (?) be moved to apr/apu eventually.

See http://subversion.tigris.org/issues/show_bug.cgi?id=807 for details.

svn_error_t* svn_utf_cstring_to_utf8_ex (const char **dest, const char *src, const char *frompage, const char *convset_key, apr_pool_t *pool)

Set *dest to a utf8-encoded C string from frompage C string src; allocate *dest in pool.

Use convset_key as the cache key for the charset converter; if it's NULL, don't cache the converter.
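
The convset_key rule (reuse a converter when a key is given, never cache when the key is NULL) can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: the converter and converter_cache types, the fixed-size array, and get_converter are inventions for illustration and say nothing about how Subversion or APR actually store converters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a charset converter handle. */
typedef struct converter {
    const char *frompage;
    const char *topage;
} converter;

#define CACHE_SLOTS 8

/* Hypothetical cache: a small fixed array of (key, converter) pairs.
   Key strings are assumed to outlive the cache. */
typedef struct converter_cache {
    const char *keys[CACHE_SLOTS];
    converter *values[CACHE_SLOTS];
    int used;    /* number of occupied slots */
    int opened;  /* number of converters created, for illustration */
} converter_cache;

/* Create a new converter and count it. */
static converter *open_converter(converter_cache *cache,
                                 const char *frompage, const char *topage)
{
    converter *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->frompage = frompage;
    c->topage = topage;
    cache->opened++;
    return c;
}

/* Return a converter for frompage -> topage.  A non-NULL convset_key that
   was seen before reuses the cached converter; a NULL key always creates
   a fresh converter and never stores it (the caller owns that one). */
converter *get_converter(converter_cache *cache, const char *convset_key,
                         const char *frompage, const char *topage)
{
    int i;
    converter *c;

    assert(cache != NULL);

    if (convset_key != NULL) {
        for (i = 0; i < cache->used; i++)
            if (strcmp(cache->keys[i], convset_key) == 0)
                return cache->values[i];
    }

    c = open_converter(cache, frompage, topage);
    if (c != NULL && convset_key != NULL && cache->used < CACHE_SLOTS) {
        cache->keys[cache->used] = convset_key;
        cache->values[cache->used] = c;
        cache->used++;
    }
    return c;
}
```

In this sketch, two calls with the same non-NULL key return the same converter, whereas two calls with a NULL key each create a new one. A caller that converts many strings between the same pair of charsets would therefore presumably want to pass a stable key.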


Generated on Mon Oct 18 17:33:14 2004 for Subversion by doxygen 1.3.5