784 – unix charset = UTF-8 Problem in a workgroup name

Status:	CLOSED FIXED

Alias:	None

Product:	Samba 3.0
Classification:	Unclassified
Component:	nmbd
Version:	3.0.0
Hardware:	All Linux

Importance:	P3 normal
Target Milestone:	none
Assignee:	Jeremy Allison
QA Contact:

URL:
Keywords:

Depends on:
Blocks:

Reported:	2003-11-17 00:29 UTC by MORIYAMA Masayuki
Modified:	2005-08-24 10:24 UTC
CC List:	1 user

See Also:

Attachments
A patch for extending NETBIOS name (6.33 KB, patch) 2004-03-11 23:49 UTC, Shiro Yamada	no flags	Details

Description MORIYAMA Masayuki 2003-11-17 00:29:37 UTC

1. smb.conf.ja.utf8 is created using the following script.

--- cut here ---
workgroup_hex="A<b1><b2><b3><b4><b5><b6><b7><b8><b9><ba><bb><bc><bd><be>"

workgroup_cp932="`echo -n $workgroup_hex | perl -pe 's/<(..)>/chr(hex($1))/ge;'`"
workgroup_utf8="`echo -n $workgroup_cp932 | iconv -f cp932 -t utf-8`"

echo "Length of workgroup_cp932 = " "`echo -n $workgroup_cp932 | wc -c`"
echo "Length of workgroup_utf8  = " "`echo -n $workgroup_utf8 | wc -c`"

echo "Creating smb.conf.ja.utf8"
cat <<EOF > smb.conf.ja.utf8
[global]
    dos charset = CP932
    unix charset = UTF-8
    workgroup = $workgroup_utf8
    preferred master = Yes
    local master = Yes
    log level = 3
EOF
--- cut here ---

2. nmbd is started.

% $PREFIX/sbin/nmbd -s smb.conf.ja.utf8 -D

3. The following errors are written to log.nmbd.

<snip>
[2003/11/17 16:58:00, 0] nmbd/nmbd_workgroupdb.c:create_workgroup(63)
  create_workgroup: workgroup name A<ef><bd><b1><ef><bd><b2><ef><bd><b3><ef><bd><b4><ef><bd><b5><ef><bd><b6><ef><bd><b7><ef><bd><b8><ef><bd><b9><ef><bd><ba><ef><bd><bb><ef><bd><bc><ef><bd><bd><ef><bd><be> is too long. Truncating to A<ef><bd><b1><ef><bd><b2><ef><bd><b3><ef><bd><b4><ef><bd>
[2003/11/17 16:58:00, 3] nmbd/nmbd_namelistdb.c:add_name_to_subnet(236)
  add_name_to_subnet: Added netbios name *<00> with first IP 10.1.0.140 ttl=0 nb_flags= 0 to subnet 10.1.0.140
[2003/11/17 16:58:00, 3] nmbd/nmbd_namelistdb.c:add_name_to_subnet(236)
  add_name_to_subnet: Added netbios name *<20> with first IP 10.1.0.140 ttl=0 nb_flags= 0 to subnet 10.1.0.140
[2003/11/17 16:58:00, 3] nmbd/nmbd_namelistdb.c:add_name_to_subnet(236)
  add_name_to_subnet: Added netbios name __SAMBA__<20> with first IP 10.1.0.140 ttl=0 nb_flags= 0 to subnet 10.1.0.140
[2003/11/17 16:58:00, 3] nmbd/nmbd_namelistdb.c:add_name_to_subnet(236)
  add_name_to_subnet: Added netbios name __SAMBA__<00> with first IP 10.1.0.140 ttl=0 nb_flags= 0 to subnet 10.1.0.140
[2003/11/17 16:58:00, 3] lib/charcnv.c:convert_string_allocate(444)
  convert_string_allocate: Conversion error: Illegal multibyte sequence
(<ef><bd>)
<snip>

Comment 1 Shiro Yamada 2004-01-29 23:52:33 UTC

Let me add some points to his report.

As you may be aware, in Windows networking the dos charset is used
for communication among servers, and Samba converts these strings to the unix
charset whenever necessary. It is also worth mentioning that NetBIOS names
are handled in the `nstring' data format, which is effectively a 16-byte
character string with a terminating null attached at the end.

This is problematic for any conversion where the size of a string in the dos
charset differs from the size of the same string in the unix charset.
(Note: a typical example is conversion between CP932 and UTF-8, where
CP932 characters generally take 2 bytes, whereas some UTF-8 characters
take 3 bytes.)

If a NetBIOS name is 15 bytes long in the dos charset, then converting it
to the unix charset can overflow the `nstring' buffer.
As a consequence, the string in the unix charset is cut off in the middle,
and further conversions of it fail.
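
To make the size mismatch concrete, here is a small Python sketch (not Samba code) built from the workgroup name in the original report. The variable names are made up for illustration, and the 15-byte limit is an assumption chosen to match the truncated name shown in the log above. Note that the characters in this particular name are half-width katakana, which take 1 byte each in CP932 but 3 bytes each in UTF-8.

```python
# Illustrative sketch only; none of these names come from Samba.
# Workgroup name from the report: "A" followed by CP932 bytes 0xB1..0xBE
# (14 half-width katakana), i.e. 15 bytes in the dos charset.
name_cp932 = bytes([0x41]) + bytes(range(0xB1, 0xBF))
name = name_cp932.decode("cp932")
name_utf8 = name.encode("utf-8")

print(len(name_cp932))  # byte length in CP932
print(len(name_utf8))   # byte length in UTF-8

# Assumption: 15 usable bytes per name, matching the truncation in the log.
MAX_NAME_BYTES = 15
truncated = name_utf8[:MAX_NAME_BYTES]
print(truncated[-2:].hex())  # trailing bytes left after the byte-wise cut

# A byte-wise cut can land inside a multibyte character, so the
# truncated string may no longer be valid in the unix charset.
try:
    truncated.decode("utf-8")
    truncation_is_valid = True
except UnicodeDecodeError:
    truncation_is_valid = False
print(truncation_is_valid)
```

The dangling pair of bytes at the end of the truncated string is the same `<ef><bd>` sequence that the conversion error in the log complains about.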

Comment 2 Shiro Yamada 2004-03-11 23:49:53 UTC

Created attachment 434 [details]
A patch for extending NETBIOS name

Currently, the NetBIOS name is limited to 16 bytes in the unix charset. But this
limit is not always valid, because the limit in the SMB protocol is 16 bytes in
the dos charset.

If a string in the dos charset is converted to the unix charset and the resulting
string exceeds the 16-byte limit, nmbd fails.
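
As a rough illustration of the point (again a Python sketch, not the attached patch), the same name can pass a length check in the dos charset and fail it in the unix charset. The helper names and the 15-byte constant are hypothetical; the constant assumes a 16-byte field with one byte reserved.

```python
# Hypothetical helpers for illustration; this is not Samba's implementation.
DOS_CHARSET = "cp932"
UNIX_CHARSET = "utf-8"
MAX_NETBIOS_NAME_BYTES = 15  # assumption: 16-byte field with one byte reserved


def fits_in_dos_charset(name: str) -> bool:
    """Check the length limit where the protocol applies it."""
    return len(name.encode(DOS_CHARSET)) <= MAX_NETBIOS_NAME_BYTES


def fits_in_unix_charset(name: str) -> bool:
    """Check the same limit against the unix charset encoding."""
    return len(name.encode(UNIX_CHARSET)) <= MAX_NETBIOS_NAME_BYTES


# "A" plus the 14 half-width katakana U+FF71..U+FF7E from the report.
name = "A" + "".join(chr(c) for c in range(0xFF71, 0xFF7F))
print(fits_in_dos_charset(name), fits_in_unix_charset(name))
```

A plain ASCII name such as "WORKGROUP" passes both checks, which is why the problem only shows up with multibyte workgroup names.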

Comment 3 Gerald (Jerry) Carter (dead mail address) 2004-03-12 08:23:02 UTC

ooohh....nmbd and mb strings.  has jra written all over it :-)

Comment 4 Jeremy Allison 2004-03-12 16:09:36 UTC

The only problem I can see with this patch is this change:

+#define NSTRING_LEN 128
+typedef char nstring[NSTRING_LEN];

Why is this done? I don't think this is correct. If you disagree,
can you please explain?

Thanks,

Jeremy.

Comment 5 Jeremy Allison 2004-03-12 16:57:16 UTC

The right way to do this is to change pull_ascii_nstring to take a destination
length, not to change the length of nstring to be 128 bytes. I will fix this
before applying the rest of the patch.
Jeremy.

Comment 6 Jeremy Allison 2004-03-12 18:44:44 UTC

I think my current patch fixes this in SAMBA_3_0 and HEAD.
Jeremy.

Comment 7 Shiro Yamada 2004-03-17 17:58:20 UTC

I've tested it using the latest SAMBA_3_0 branch and it seems as though
you have fixed it.

Comment 8 SATOH Fumiyasu 2004-03-17 19:30:40 UTC

buffer_len must hold the number of UCS2 characters in the buffer,
not the number of bytes. One of the following patches is needed:

Index: source/lib/charcnv.c
===================================================================
RCS file: /cvsroot/samba/source/lib/charcnv.c,v
retrieving revision 1.55.2.59
diff -u -p -r1.55.2.59 charcnv.c
--- source/lib/charcnv.c	17 Mar 2004 19:23:48 -0000	1.55.2.59
+++ source/lib/charcnv.c	18 Mar 2004 03:25:38 -0000
@@ -834,10 +834,10 @@ size_t push_ascii_nstring(void *dest, co
 	smb_ucs2_t *buffer;
 
 	conv_silent = True;
-	buffer_len = push_ucs2_allocate(&buffer, src);
-	if (buffer_len == (size_t)-1) {
+	if (push_ucs2_allocate(&buffer, src) == (size_t)-1) {
 		smb_panic("failed to create UCS2 buffer");
 	}
+	buffer_len = strlen_w(buffer);
 
 	dest_len = 0;
 	for (i = 0; buffer[i] != 0 && (i < buffer_len); i++) {

or:

Index: source/lib/charcnv.c
===================================================================
RCS file: /cvsroot/samba/source/lib/charcnv.c,v
retrieving revision 1.55.2.59
diff -u -p -r1.55.2.59 charcnv.c
--- source/lib/charcnv.c	17 Mar 2004 19:23:48 -0000	1.55.2.59
+++ source/lib/charcnv.c	18 Mar 2004 03:24:28 -0000
@@ -838,6 +838,7 @@ size_t push_ascii_nstring(void *dest, co
 	if (buffer_len == (size_t)-1) {
 		smb_panic("failed to create UCS2 buffer");
 	}
+	buffer_len /= sizeof(smb_ucs2_t);
 
 	dest_len = 0;
 	for (i = 0; buffer[i] != 0 && (i < buffer_len); i++) {
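
The difference between the two counts can be sketched in Python as follows. This is only an illustration, not the Samba code: it assumes a UCS2 code unit (smb_ucs2_t) is 2 bytes and uses a UTF-16-LE encoding of a BMP-only string as a stand-in for the UCS2 buffer.

```python
import struct

# Illustrative sketch only; names are hypothetical.
UCS2_UNIT_SIZE = 2  # assumption: one UCS2 code unit occupies 2 bytes

# "A" plus the 14 half-width katakana U+FF71..U+FF7E from the report.
name = "A" + "".join(chr(c) for c in range(0xFF71, 0xFF7F))

# Stand-in for the UCS2 buffer (all characters are in the BMP).
buffer_bytes = name.encode("utf-16-le")

byte_len = len(buffer_bytes)              # size of the buffer in bytes
char_len = byte_len // UCS2_UNIT_SIZE     # number of UCS2 characters

# View the buffer as 16-bit code units, the way a per-character loop does.
code_units = struct.unpack("<%dH" % char_len, buffer_bytes)

print(byte_len, char_len, len(code_units))
```

A loop that indexes the buffer per character has to be bounded by the character count; using the byte count as the bound would allow an index twice as large as the number of code units actually present.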

Comment 9 Jeremy Allison 2004-03-18 09:59:32 UTC

Correct patch - thanks, applied.
Jeremy.

Comment 10 SATOH Fumiyasu 2004-03-19 04:05:45 UTC

You added some odd lines.

In source/lib/charcnv.c:

X          buffer_len = strlen_w(buffer);
 
Y         /* We're using buffer_len below to count ucs2 characters, not bytes. */
Y         buffer_len /= sizeof(smb_ucs2_t);

Please remove either the 'X' line or the 'Y' lines.

Comment 11 SATOH Fumiyasu 2004-03-19 09:34:05 UTC

Sorry about Additional Comment #10. It was my mistake.

Comment 12 Gerald (Jerry) Carter (dead mail address) 2005-08-24 10:24:16 UTC

sorry for the same, cleaning up the database to prevent unnecessary reopens of bugs.