593 – %U macro doesn't take non-ascii chars

Status:	RESOLVED LATER

Alias:	None

Product:	Samba 3.0
Classification:	Unclassified
Component:	Extended Characters
Version:	3.0.20
Hardware:	All All

Importance:	P3 normal
Target Milestone:	none
Assignee:	Jeremy Allison
QA Contact:	Samba QA Contact

URL:
Keywords:

Depends on:
Blocks:	2345

Reported:	2003-10-10 00:22 UTC by Shiro Yamada
Modified:	2007-08-28 11:52 UTC
CC List:	2 users

Description Shiro Yamada 2003-10-10 00:22:17 UTC

When the name substituted for the %U macro contains non-ASCII characters, Samba
replaces them with a run of '_' characters.

For instance, if I map a user with a Japanese name in the username.map file
and set the comment on the [data] share to:

-- smb.conf -----------------------------------
[global]
		:
	username map = /etc/samba/username.map

[data]
	path = /opt/data
	comment = "User name is: %U"
-----------------------------------------------

and then I use rpcclient to display the comment on [data], I get
something like:

$ rpcclient _SERVER_ -U _NAME_IN_JAP_%_PASSWD_ -c 'netshareenum'
netname: data
        remark: User name is: ____
        path:   C:\opt\data
        password:

where _NAME_IN_JAP_ is the username I specified in username.map

I've also observed that %m has the same problem. Samba creates a faulty log
file name when it is accessed by a client whose NetBIOS name contains non-ASCII
characters.

        -- smb.conf ----------------------------
        log file = /var/log/samba/log.%m
        ----------------------------------------

        outcome: /var/log/samba/log.______

This problem may apply to other macros as well.
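
The output reported above is what a byte-wise ASCII whitelist would produce.
The following is a minimal Python sketch of such a filter, not Samba's actual
code; the function name sanitize_bytewise and the set of allowed characters are
assumptions made for illustration only.

```python
import string

# Assumed whitelist for illustration only; Samba's real list may differ.
SAFE_ASCII = set((string.ascii_letters + string.digits + "_-.").encode("ascii"))


def sanitize_bytewise(raw: bytes) -> str:
    """Replace every byte outside the whitelist with '_'."""
    return "".join(chr(b) if b in SAFE_ASCII else "_" for b in raw)


if __name__ == "__main__":
    name = "山田"  # a two-character Japanese name
    # Each character is three bytes in UTF-8, so six bytes are replaced.
    print(sanitize_bytewise(name.encode("utf-8")))
    # A plain ASCII name passes through unchanged.
    print(sanitize_bytewise(b"alice"))
```

Because the filter works on bytes rather than characters, one multibyte
character turns into several underscores, which would explain the runs of '_'
in the remark and in the log file name.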

Comment 1 Andrew Bartlett 2003-10-22 16:28:01 UTC

This is by design, for security reasons.  Remember the ../../ exploit in earlier
samba versions?  We don't want a repeat, and the %U macro is the name the user
specified, not always a valid name on the system.
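
The risk described in this comment can be sketched as follows. This Python
fragment is an illustration only, not Samba code: expand_log_path is a
hypothetical helper that pastes a client-supplied name into the log file
pattern without any filtering, and the hostile name is an invented example.

```python
import posixpath


def expand_log_path(pattern: str, client_name: str) -> str:
    """Naively substitute %m and resolve the path lexically (no filtering)."""
    return posixpath.normpath(pattern.replace("%m", client_name))


if __name__ == "__main__":
    pattern = "/var/log/samba/log.%m"
    print(expand_log_path(pattern, "client1"))
    # A name containing '/' and '..' escapes the log directory.
    print(expand_log_path(pattern, "/../../../tmp/x"))
```

With an unfiltered name the second call resolves to a path outside
/var/log/samba, which is the kind of result the '_' replacement is meant to
prevent.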

Comment 2 Shiro Yamada 2003-10-24 00:46:06 UTC

Yes, I am aware that there are security risks in the expansion of macros.
However, the current implementation throws away all the harmless multibyte
characters along with the illegal ASCII characters, leaving only the valid
ASCII characters.

I am currently working out whether any multibyte characters are potentially
dangerous. If they turn out to be safe, there is no need to prohibit multibyte
characters; if some turn out to be dangerous, appropriate action should be
taken for those only.

This problem is encoding dependent, and you may argue that it should be dealt
with under iconv(), but give me some time to do more research in this area.
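
One way to read the proposal in this comment is to decode the name in the
configured charset first and then filter whole characters instead of bytes.
The following is a minimal Python sketch under stated assumptions:
sanitize_chars and its whitelist are hypothetical, and treating every
non-ASCII character as harmless is exactly the open question here, not an
established fact.

```python
import string

# Assumed set of harmless ASCII characters; illustration only.
SAFE_ASCII = set(string.ascii_letters + string.digits + "_-.")


def sanitize_chars(raw: bytes, charset: str) -> str:
    """Decode first, then replace only unsafe ASCII characters with '_'."""
    text = raw.decode(charset)
    return "".join(c if ord(c) >= 0x80 or c in SAFE_ASCII else "_" for c in text)


if __name__ == "__main__":
    # Non-ASCII characters survive; '/' is still replaced.
    print(sanitize_chars("山田/tmp".encode("utf-8"), "utf-8"))
```

Note that decode() raises UnicodeDecodeError on input that is not valid in the
given charset, so a real implementation would also need a policy for malformed
names.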

Comment 3 Alexander Bokovoy 2003-12-09 01:22:53 UTC

Have you had a chance to perform the promised research?

Comment 4 Shiro Yamada 2003-12-10 21:21:21 UTC

Here they are.
We've focused on the CJK encodings only, but I guess they are the most
problematic of all. As you have suggested, some of these characters
do contain dangerous ASCII codes within themselves.

In CP932 (Japanese), the second byte may be an ASCII code in the range
(0x40-0x7E). These codes also appear in the second byte of Big5 (Taiwanese).
  40-7E = @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

The second byte of GB18030 (Chinese) may fall in the range (0x40-0x7E).
The fourth byte of GB18030 (Chinese) may also fall in the range (0x30-0x39).
  40-7E = @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
  30-39 = 0123456789

UHC (Korean) may contain (0x41-0x5A, 0x61-0x7A) in its second byte.
  41-5A = ABCDEFGHIJKLMNOPQRSTUVWXYZ
  61-7A = abcdefghijklmnopqrstuvwxyz

Reference: ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
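
The ranges listed above can be spot-checked with the codecs that ship with
Python. This is a small sketch, not part of the original research:
ascii_trail_bytes is a hypothetical helper, and the codec names (cp932, big5,
gb18030, and cp949 for UHC) are Python's names, not Samba's.

```python
def ascii_trail_bytes(text: str, encoding: str) -> set:
    """Return ASCII-range byte values that occur after the first byte
    of any multibyte character in `text` under `encoding`."""
    found = set()
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) > 1:
            found.update(b for b in encoded[1:] if b < 0x80)
    return found


if __name__ == "__main__":
    # KATAKANA LETTER SO is 0x83 0x5C in CP932; 0x5C is the ASCII backslash.
    print([hex(b) for b in sorted(ascii_trail_bytes("ソ", "cp932"))])
```

A byte-wise scan of a CP932 string would therefore see a backslash inside this
character, whereas the same text in UTF-8 has no ASCII-range trail bytes at
all.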

Comment 5 Gerald (Jerry) Carter (dead mail address) 2005-02-08 21:27:38 UTC

ab, any progress on this one?

Comment 6 Gerald (Jerry) Carter (dead mail address) 2005-02-15 09:34:37 UTC

Updating to 3.0.11 and assigning to jeremy.  He seems to
think this must work.

Comment 7 Gerald (Jerry) Carter (dead mail address) 2005-02-15 09:35:16 UTC

*** Bug 2345 has been marked as a duplicate of this bug. ***

Comment 8 Pascal Terjan 2005-02-15 10:15:09 UTC

(In reply to comment #7)
> *** Bug 2345 has been marked as a duplicate of this bug. ***

I agree this is strongly related; however, my bug was not asking to stop
replacing with __ but about the fact that it does not work after the
replacement (i.e. the replacement does not seem to occur everywhere).

Comment 9 Gerald (Jerry) Carter (dead mail address) 2005-06-09 05:40:53 UTC

got another report of this on the samba ml.
http://lists.samba.org/archive/samba/2005-June/106675.html

Comment 10 Gerald (Jerry) Carter (dead mail address) 2006-04-08 07:43:39 UTC

Cleaning up versions.  There was no 3.0.15, so leaving it in bugzilla
is causing some confusion.  Moving these under 3.0.20.
Originally filed against 3.0.15preX.

Comment 11 Gerald (Jerry) Carter (dead mail address) 2007-08-28 11:52:29 UTC

This is currently by design, I think.