Bug 1822 - Samba shares access fails to report valid errno to multithreaded Solaris program
Summary: Samba shares access fails to report valid errno to multithreaded Solaris program
Status: CLOSED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: libsmbclient (show other bugs)
Version: 3.0.6
Hardware: Sparc Solaris
: P3 major
Target Milestone: none
Assignee: Derrell Lipman
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks: 5024
  Show dependency treegraph
 
Reported: 2004-09-24 19:05 UTC by David S. Collier-Brown
Modified: 2007-10-18 07:00 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David S. Collier-Brown 2004-09-24 19:05:16 UTC
narayana.pattipati@wipro.com reported this as follows:

I have issues in accessing samba shares from an MT application in
Solaris. 

The program architecture is something like this:

  Nautilus (GNOME file manager)
        |
  gnome-vfs (GNOME virtual file system library)
        |
  libsmbclient.so (samba libraries) 

Nautilus and gnome-vfs are MT enabled and they depend on threads heavily
to show any file system contents. I am facing an issue when gnome-vfs is
trying to connect to a samba server to access shares. The problem is
seen in Solaris only. Its not seen in Linux, as thread implementation is
different there. The issue is: 

When gnome-vfs tries to open any samba share by calling opendir function 
of samba context, samba library give the following error on the command 
line:

Error connecting to xxx.xxx.xxx.xxx (Error 0)
Error connecting to xxx.xxx.xxx.xxx (Error 0)

Where xxx.xxx.xxx.xxx is the ip address of master server in the windows 
domain. This error comes because of socket connection failure in the code
samba/sources/lib/util_sock.c:open_socket_out()

    ret = connect(res,(struct sockaddr *)&sock_out,sizeof(sock_out));

  /* Some systems return EAGAIN when they mean EINPROGRESS */
    if (ret < 0 && (errno == EINPROGRESS || errno == EALREADY ||
                  errno == EAGAIN) && (connect_loop < timeout) ) {
            smb_msleep(connect_loop);
            timeout -= connect_loop;
            connect_loop += increment;
            if (increment < 250) {
            /* After 8 rounds we end up at a max of 255 msec */
            increment *= 1.5;
            }
        goto connect_again;
    }

In the above code, socket is set to non-blocking before connect. Since 
it is non-blocking, while connecting, first try will give "operation in
progress" (EINPROGRESS) error. But when the code makes comparison of 
errno with the error codes, it is failing to match EINPROGRESS. This is 
because errno is set to "0". Hence, the code does not attempt to connect
again and fails to connect to any windows server permanently.

Root cause for this is that samba libraries are linked from 
nautilus/gnome-vfs, which is a multi-threaded program. Since samba is not
built with "-mt" compiler flag, accessing errno gives inconsistent values.
The thread which calls samba code can only update its thread specific 
"errno", not the global "errno". Since the samba library is not built
for MT, the errno will point to global errno, which a thread can't
update.

If samba is build with "-mt" flags, then the errno will be pointed 
to __errno() macro, instead of global errno, and thread executing samba 
code will be able to access correct errno.

---

Following a dicussion on samba-tchnical, I proposed a
necessary and sufficient cirection would be

1) for the caller to use suitable lcoking to make
   it serially reusable (a classic approach), and
2) on Solaris, to compile with the -mt option,
   which ensures corerect behavior on threaded as
Comment 1 David S. Collier-Brown 2004-09-24 19:07:18 UTC
   ... well as non-threaded programs.

--dave (playing scribe) c-b
Comment 2 David S. Collier-Brown 2004-09-28 08:37:09 UTC
  Just a followup to this fix for the team:  in the discussion
below, we've used the Solaris cc option "-mt" as a short form.
Setting it does two things:
    it sets -D_REENTRANT on the compiler command line
    it adds -lthread to the library list
  Setting -D_REENTRANT causes a number of small changes to
support reentrant and/or threaded code, the relevant part
being defining errno to be ___errno(), the thread-safe
errno function.

  So in principle, all that needs to change is the generated
make line for libsmbclient.so, from

---
bin/libsmbclient.so: $(LIBSMBCLIENT_PICOBJS)
    @echo Linking libsmbclient shared library $@
    @$(SHLD) $(LDSHFLAGS) -o $@ $(LIBSMBCLIENT_PICOBJS) \
        $(LDFLAGS) $(DYNEXP) $(LIBS) \
              $(KRB5LIBS) $(LDAP_LIBS) \
        -h `basename $@`.$(LIBSMBCLIENT_MAJOR)

to
---
bin/libsmbclient.so: $(LIBSMBCLIENT_PICOBJS)
    @echo Linking libsmbclient shared library $@
    @$(SHLD) $(LDSHFLAGS) -D_REENTRANT \
        -o $@ $(LIBSMBCLIENT_PICOBJS) \
        $(LDFLAGS) $(DYNEXP) $(LIBS) \
              $(KRB5LIBS) $(LDAP_LIBS) -lthread \
        -h `basename $@`.$(LIBSMBCLIENT_MAJOR)

... for Solaris versions up to 9. On Solaris 10, you don't
need to add -lthread, as it was folded into the standard
c library.
Comment 3 David S. Collier-Brown 2004-09-28 09:27:49 UTC
The previous comemnt was in error: 
The -D_REENTRANT needs to be on the compilation of the main
members of LIBSMBCLIENT_OBJ, which are
    libsmb/libsmbclient.o
    libsmb/libsmb_compat.o and
    libsmb/libsmb_cache.o
if all the errno references are confined to them.
Comment 4 David S. Collier-Brown 2004-10-08 09:37:17 UTC
[Just so the fix is in the big report --dave]


On Tue, 28 Sep 2004, Sean McGrath - Sun Microsystems Ireland wrote:$ diff -u
configure configure.new
> --- configure   Wed Sep 22 16:29:40 2004
> +++ configure.new       Tue Sep 28 17:35:10 2004
> @@ -3553,7 +3553,8 @@
>                                 rm -fr conftest.c
>                                 case "$ac_cv_gcc_compiler_version_number" in
>                                         *"gcc version 2.6"*|*"gcc version 2.7"*)
> -                                               CPPFLAGS="$CPPFLAGS
-D_LARGEFILE64_SOURCE"
> +                                               CPPFLAGS="$CPPFLAGS
-D_LARGEFILE64_SOURCE -D_REENTRANT"
> +                                               LDFLAGS="$LDFLAGS -lthread"
>
>  cat >>confdefs.h <<\_ACEOF
>  #define _LARGEFILE64_SOURCE 1
> @@ -3561,7 +3562,8 @@
>
>                                                 ;;
>                                         *)
> -                                               CPPFLAGS="$CPPFLAGS
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64"
> +                                               CPPFLAGS="$CPPFLAGS
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT"
> +                                               LDFLAGS="$LDFLAGS -lthread"
>
>  cat >>confdefs.h <<\_ACEOF
>  #define _LARGEFILE64_SOURCE 1
> @@ -3575,7 +3577,8 @@
>                                                 ;;
>                                 esac
>                         else
> -                               CPPFLAGS="$CPPFLAGS -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64"
> +                               CPPFLAGS="$CPPFLAGS -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -D_REENTRANT"
> +                               LDFLAGS="$LDFLAGS -lthread"
>
>  cat >>confdefs.h <<\_ACEOF
>  #define _LARGEFILE64_SOURCE 1
>

Richard Sharpe wrote: 
OK, I believe I know where to fix this in configure.in, 
Comment 5 Richard Sharpe 2004-10-10 22:15:13 UTC
Applied the change above to configure.in.

Change #2905.
Comment 6 Richard Sharpe 2004-10-10 22:27:34 UTC
Please try out the new changes on Solaris.
Comment 7 Derrell Lipman 2005-03-29 12:18:30 UTC
Richard made the requested changes to configure on 2004-10-10 22:27 with no 
comments received since then.  Presumed fixed.  Re-open if there is still a problem.
Comment 8 Gerald (Jerry) Carter (dead mail address) 2005-08-24 10:20:35 UTC
sorry for the same, cleaning up the database to prevent unecessary reopens of bugs.