Chinese filenames

Hi everyone,

Firstly, I would like to know whether you can open Chinese filenames under
Windows 2000 using PHP 5.0. I have a file named 中国.php, and try to open it with
fopen('中国.php','r');. I save the source file as UTF-8. I get this warning:

Warning: fopen(中国.php) [function.fopen]: failed to open stream: No such file or directory in E:\Translation\Website Development\webroot\testing\zhongguo.php on line 8

I have triple-checked that the file exists. Changing the source code
encoding to 'Unicode (UCS-2)' leads to no output in the browser.

Does PHP even support opening Chinese filenames?

Secondly, I'm having some trouble accessing filenames with Chinese
characters that have been uploaded via HTTP. I'm using PHP 5 and
Apache 2.2. When I attempt to upload a file with a Chinese filename,
the file name gets mutated into dashes, pretty much matching the
behaviour described at ' '. However, I need the original
filename (to store in a DB). The post on the manual website by
kweechang at yahoo dot com at
features.file-upload.php describes using JavaScript to set a hidden
field. This would work fine, but for now I'm trying not to resort to
JavaScript on my webpage. Does anyone know how the original filename
can be retrieved without using JavaScript? Maybe there is a setting in



Re: Chinese filenames

I don't know exactly how fopen() handles strings containing characters
other than ASCII, but for portability reasons it is better not to rely on
the underlying file system. Always use simple ASCII characters. For
files uploaded via HTTP, store their original name in a DB table properly
created to support UTF-8.
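
As for the fopen() failure itself, one possible explanation, offered as an assumption and not verified in this thread: PHP 5 on Windows hands the file name bytes to the non-Unicode (ANSI) file functions, so a UTF-8 literal is read in the system code page and no longer matches the name on disk. A minimal sketch of the usual workaround is below. The helper name to_code_page() is hypothetical, and GBK is only an assumed stand-in for the code page of a Simplified Chinese Windows installation.

```php
<?php
// Hypothetical helper: re-encode a UTF-8 file name into the code page
// the file system API is assumed to expect. Returns false on failure.
function to_code_page($utf8Name, $codePage)
{
    // iconv() returns false (and raises a notice) when a character
    // cannot be represented in the target encoding.
    return @iconv('UTF-8', $codePage, $utf8Name);
}

// "中国.php" written as explicit UTF-8 bytes, so the example does not
// depend on how this source file itself is saved.
$utf8Name = "\xE4\xB8\xAD\xE5\x9B\xBD.php";

// Assumption: the system code page is GBK-compatible.
$localName = to_code_page($utf8Name, 'GBK');

if ($localName !== false && file_exists($localName)) {
    $handle = fopen($localName, 'r');
    // ... read from $handle ...
    fclose($handle);
}
```

This only helps when every character of the name exists in that code page, which is one more reason to prefer ASCII-only names on disk.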

0. Ensure your PHP script is properly UTF-8 encoded. This is important
if it contains literal strings.

1. Ensure the page containing the FORM is UTF-8 encoded. For example:

<?php header("Content-Type: text/html; charset=UTF-8"); ?>
<!-- enctype="multipart/form-data" is required for file uploads -->
<FORM method=post enctype="multipart/form-data">
Photo (accepted GIF, PNG or JPEG, max 500 KB):
<INPUT type=hidden name=MAX_FILE_SIZE value=512000>
<INPUT type=file name=PHOTO size=50 maxlength=512000><p>
<INPUT type=submit name=save_button value=Save>
</FORM>

The file name returned from the client will have the same encoding as the
page containing the FORM, that is, UTF-8.

2. The name of the file can be acquired as a UTF-8 string:

$field = "PHOTO";

if( ! isset($_FILES) || ! isset($_FILES[$field]) )
    die("No file uploaded.");

$error    = (int)    $_FILES[$field]['error'];
$name     = (string) $_FILES[$field]['name'];
$type     = (string) $_FILES[$field]['type'];
$size     = (int)    $_FILES[$field]['size'];
$tmp_name = (string) $_FILES[$field]['tmp_name'];

if( $error !== 0 )
    die("Upload error code $error.");

# Here: check actual UTF-8 encoding and max length for $name.
# Here: check actual MIME $type against the allowed MIME types.
# Here: check actual $size limit.
# Here: store the file $tmp_name in a proper place with a proper name.
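
The first three "# Here:" checks could be filled in along the following lines. This is only a sketch: the function name check_upload() and the limits passed to it are hypothetical, and the UTF-8 test relies on the PCRE /u modifier rejecting invalid byte sequences.

```php
<?php
// Hypothetical helper: returns null when the upload passes all checks,
// or a short error message otherwise. $file is one entry of $_FILES.
function check_upload(array $file, array $allowedTypes, $maxBytes, $maxNameChars)
{
    $name = (string) $file['name'];

    // preg_match() with the /u modifier returns false on invalid UTF-8.
    if ($name === '' || preg_match('//u', $name) !== 1) {
        return 'File name is empty or not valid UTF-8.';
    }

    // Count characters (code points), not bytes.
    if (preg_match_all('/./us', $name, $unused) > $maxNameChars) {
        return 'File name is too long.';
    }

    // $file['type'] is supplied by the browser, so treat it as a hint only.
    if (!in_array((string) $file['type'], $allowedTypes, true)) {
        return 'File type not allowed.';
    }

    if ((int) $file['size'] > $maxBytes) {
        return 'File is too large.';
    }

    return null;
}
```

In the script above it would be called after the error check, for example as check_upload($_FILES[$field], array('image/gif', 'image/png', 'image/jpeg'), 512000, 255), where the 255-character limit is an arbitrary choice.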

3. Ensure the DB you are using really has support for UTF-8. For example,
retrieve the file name once saved and compare it with the string just
acquired from the POST.

4. Don't save the file on the underlying file system under the name
provided by the client; always use some other identifier, for example
the primary key assigned by the DB (typically a simple number). Since the
file name now contains only simple ASCII chars, fopen() should not give
Best regards,
/_|_\  Umberto Salsi

Re: Chinese filenames

Hey Umberto,

I actually thought of doing this, but was wondering if PHP could in  
fact do it. Never mind, I'll go along with your suggestion.

As for the file names after uploading, I had done everything you
suggested. I noticed that after choosing the file in Firefox 1.5,
the filename got mangled in the text box of the file input
(where the filename goes after you choose a file when browsing)! I
tried it in IE 6 and it worked correctly! This suggests to me that
there might be a bug in Firefox. I'll file a bug report with Mozilla and
see what happens.

