[How-to] Convert multi-byte characters to Unicode.

I was working on an integration that required passing data to another application in Unicode format whenever it contained multi-byte characters (Chinese, Japanese, etc.). I searched php.net but found no standard PHP function for this purpose.

In the end, I found a solution on the web for the Unicode conversion.

function str2unicode2hex($data) {
	$mb_hex = '';
	$len = mb_strlen($data, 'UTF-8');
	for ($i = 0; $i < $len; $i++) {
		$c = mb_substr($data, $i, 1, 'UTF-8');
		// Convert the character to UCS-4 (big-endian) and read its code point.
		$o = unpack('N', mb_convert_encoding($c, 'UCS-4BE', 'UTF-8'));
		$ord = $o[1];
		if ($ord == 10 || $ord == 13) {
			// Every LF or CR is written out as a CR+LF pair.
			$mb_hex .= "000D000A";
		} else {
			// Pad to at least 4 hex digits, e.g. "41" becomes "0041".
			$tmp = hex_format($ord);
			$mb_hex .= (strlen($tmp) == 2) ? "00{$tmp}" : $tmp;
		}
	}
	return $mb_hex;
}
 
function hex_format($o) {
	$h = strtoupper(dechex($o));
	$len = strlen($h);
	if ($len % 2 == 1)
		$h = "0$h";
	return $h;
}

Include the two PHP functions above and call str2unicode2hex(), passing it the message to encode.

//Sample of function call
$msg = "生日快乐 Happy Birthday";
$unicode_msg = str2unicode2hex($msg);
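
For characters in the Basic Multilingual Plane, the hex string built above is the same as the text encoded in UTF-16BE, so PHP's built-in functions may be enough on their own. The following is only a minimal sketch under that assumption: utf16be_hex() is a hypothetical helper name that is not part of the original solution, it needs the mbstring extension, and unlike str2unicode2hex() it does not turn a lone LF or CR into a CR+LF pair.

```php
<?php
// Hypothetical helper (not from the original post): hex-encode a UTF-8
// string as UTF-16BE code units using only built-in functions.
function utf16be_hex(string $data): string {
    // mb_convert_encoding() produces the raw UTF-16BE bytes,
    // bin2hex() turns them into lowercase hex digits.
    return strtoupper(bin2hex(mb_convert_encoding($data, 'UTF-16BE', 'UTF-8')));
}

$hex = utf16be_hex("生日 Hi");
echo $hex, PHP_EOL; // 751F65E5002000480069

// Decoding reverses the two steps: hex to bytes, then UTF-16BE to UTF-8.
$back = mb_convert_encoding(hex2bin($hex), 'UTF-8', 'UTF-16BE');
echo $back, PHP_EOL;
```

Characters outside the Basic Multilingual Plane (most emoji, for example) come out of this sketch as UTF-16 surrogate pairs, whereas str2unicode2hex() writes their code point as six hex digits, so check which form the receiving application expects.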
