10269 – byteorder.h macros break strict aliasing on 32bit

Bug 10269 - byteorder.h macros break strict aliasing on 32bit

Summary: byteorder.h macros break strict aliasing on 32bit

Status:	RESOLVED FIXED

Alias:	None

Product:	Samba 4.1 and newer
Classification:	Unclassified
Component:	Build (show other bugs)
Version:	4.1.0
Hardware:	All All

Importance:	P5 normal (vote)
Target Milestone:	---
Assignee:	Karolin Seeger
QA Contact:	Samba QA Contact

URL:
Keywords:

Depends on:
Blocks:

Reported:	2013-11-14 17:40 UTC by Andreas Schneider
Modified:	2013-11-18 09:42 UTC (History)
CC List:	2 users (show)

See Also:

Attachments
v4-1-test patch (4.46 KB, patch) 2013-11-15 09:43 UTC, Andreas Schneider	vl: review+	Details
v4-0-test patch (4.46 KB, patch) 2013-11-15 09:44 UTC, Andreas Schneider	vl: review+	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andreas Schneider 2013-11-14 17:40:43 UTC

The byteorder.h macros which don't use careful alignment break strict aliasing on 32bit. Cause of this the compiler isn't able to optimize the code.

If you compile Samba with optimization on i686 you get 342 warnings like this:

lib/util/genrand.c:294:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

These warnings normally mean that the compiler can't optimize the code. For the long answer read [1].

The reason is that we have a define CAREFUL_ALIGNMENT in byteorder.h. There is a comment which states:

--------------
The distinction between 386 and other architectures is only there as
an optimisation. You can take it out completely and it will make no
difference. The routines (macros) in byteorder.h are totally byteorder
independent. The 386 optimsation just takes advantage of the fact that
the x86 processors don't care about alignment, so we don't have to
align ints on int boundaries etc. If there are other processors out
there that aren't alignment sensitive then you could also define
CAREFUL_ALIGNMENT=0 on those processors as well.
--------------

Well, to show that this is not an optimization but breaks the compiler optimization I wrote a test with it:

----- schnipp -----
asn@samba32:~/workspace/projects/test> cat g.c

#include <stdint.h>
#include <stdio.h>

#include "byteorder.h"

uint32_t generate_random(void)
{
    uint8_t v[4];

    v[0] = 1;
    v[1] = 2;
    v[2] = 3;
    v[3] = 4;

    return IVAL(v, 0);
}

int main(void)
{
    uint32_t i;

    i = generate_random();

    printf("i=%u", i);

    return 0;
}
----- schnapp -----

You need lib/util/byteorder.h from Samba and then compile it:

----- schnipp -----
asn@samba32:~/workspace/projects/test> gcc -O3 -Wall -S g.c -o g
g.c: In function ‘generate_random’:
g.c:15:5: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     return IVAL(v, 0);
     ^
----- schnapp -----

So the next step was to compile an assembler file

----- schnipp -----
asn@samba32:~/workspace/projects/test> gcc -O3 -Wall -S g.c -o g_without_careful_alignment.s 
g.c: In function ‘generate_random’:
g.c:15:5: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     return IVAL(v, 0);
     ^
asn@samba32:~/workspace/projects/test> vi byteorder.h
asn@samba32:~/workspace/projects/test> gcc -O3 -Wall -S g.c -o g_with_careful_alignment.s
----- schnapp -----

If you create a diff, you can see that if we use the careful alignment functions, we don't have any strict aliasing problems and the compiler can optimize the source code. generate_random() is reduced by 4 instructions.

----- schnipp -----
asn@samba32:~/workspace/projects/test> diff -u g_without_careful_alignment.s g_with_careful_alignment.s
--- g_without_careful_alignment.s       2013-11-14 18:31:40.609949754 +0100
+++ g_with_careful_alignment.s  2013-11-14 18:32:16.371947782 +0100
@@ -6,11 +6,7 @@
 generate_random:
 .LFB11:
        .cfi_startproc
-       subl    $16, %esp
-       .cfi_def_cfa_offset 20
        movl    $67305985, %eax
-       addl    $16, %esp
-       .cfi_def_cfa_offset 4
        ret
        .cfi_endproc
 .LFE11:
@@ -31,10 +27,9 @@
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        andl    $-16, %esp
-       subl    $32, %esp
+       subl    $16, %esp
        movl    $67305985, 4(%esp)
        movl    $.LC0, (%esp)
-       movl    $67305985, 28(%esp)
        call    printf
        xorl    %eax, %eax
        leave
----- schnapp -----



[1] http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html

Comment 1 Andreas Schneider 2013-11-15 09:43:29 UTC

Created attachment 9423 [details]
v4-1-test patch

Comment 2 Andreas Schneider 2013-11-15 09:44:44 UTC

Created attachment 9424 [details]
v4-0-test patch

Comment 3 Andreas Schneider 2013-11-15 10:41:13 UTC

Karolin, please add the patches to the relevant branches. Thanks!

Comment 4 Karolin Seeger 2013-11-15 10:49:42 UTC

Pushed to autobuild-v4-0-test and autobuild-v4-0-test.

Comment 5 Karolin Seeger 2013-11-18 09:42:41 UTC

(In reply to comment #4)
> Pushed to autobuild-v4-0-test and autobuild-v4-0-test.

Pushed to v4-1-test and v4-0-test.
Closing out bug report.

Thanks!