Bug 13302 - smbd does not support AESNI on AMD platforms
Summary: smbd does not support AESNI on AMD platforms
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: unspecified
Hardware: x86 All
: P5 normal (vote)
Target Milestone: 4.7
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-27 07:17 UTC by eric.vannier
Modified: 2018-04-10 07:23 UTC (History)
4 users (show)

See Also:


Attachments
Fixes bug (adds AESNI support for AMD processors) (2.40 KB, patch)
2018-03-22 07:07 UTC, eric.vannier
bjacke: review-
Details
git-am fix for master. (1.56 KB, patch)
2018-03-22 23:49 UTC, Jeremy Allison
no flags Details
Revised git-am fix for master. (1.39 KB, patch)
2018-03-22 23:54 UTC, Jeremy Allison
no flags Details
Proposed patch for all processors (3.57 KB, patch)
2018-03-23 04:36 UTC, eric.vannier
bjacke: review+
Details
git-am fix for 4.7.next, 4.8.next. (1.79 KB, patch)
2018-03-27 18:33 UTC, Jeremy Allison
bjacke: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description eric.vannier 2018-02-27 07:17:34 UTC
AESNI acceleration is not enabled on any AMD processors (and hence on Ryzens). As a result, when configuring samba with encryption the performance is very poor.

Looking at the code, I can see that the problems like in lib/crypto/aes.c where the code checks for cpuid reg0 (has_intel_aes_instructions) (pseudo code):
if (cpu_results != "IntelGenuine")
  return false

Obviously on AMD processors, the value is AuthenticAMD, and therefore the code fails to detect the acceleration. I changed the code to recognize the processor, and I was then able to use AESNI (or at least performance went up, CPU usage down, and perf record shows the aesni assembly being invoked ;-)) on either my Ivy Bridge (where it was working before the patch), and Ryzen machines.

I am not sure why the code is specifically checking for Intel instead of accepting AMD and Intel (or for that matter any other clone), so I might be missing something, but either removing the if on genuineintel, or or'ing with authenticamd should work just fine (I did not see such checks in the linux kernel, but I might have missed it).
One concern with my patch is that I do have access to an Ivy Bridge and Ryzen, but I do not have access to a battery of various processors to see if they exhibit a weird behavior that the previous code was handling.

related to bug 13008 which added support for Intel processors (but not amd's).
Comment 1 Ralph Böhme 2018-02-27 14:04:11 UTC
*scratches head*
ENOPATCH?
Comment 2 eric.vannier 2018-02-28 06:12:24 UTC
Sorry, my employer has not agreed yet to my contributing the patch. Though, the patch is trivial, so should somebody have the time to write the couple of lines that I suggested, it might get done before my employer approves my contribution. Thanks for understanding.
Comment 3 eric.vannier 2018-03-22 07:07:45 UTC
Created attachment 14067 [details]
Fixes bug (adds AESNI support for AMD processors)
Comment 4 Björn Jacke 2018-03-22 07:58:15 UTC
Comment on attachment 14067 [details]
Fixes bug (adds AESNI support for AMD processors)

you already mentioned that the check for the vendor should be removed or extended. As the cpuid flag 25 is always the aes instruction set on x86, the check should just be removed actually here.
Comment 5 Jeremy Allison 2018-03-22 23:44:08 UTC
Björn, as the __cpuid(cpuid_results, 1) seem to be standard, as 

https://en.wikipedia.org/wiki/CPUID

show cpuid(1) with is supported on a pentium or above. Looks like we could just remove the check for vendor and always just directly check for bit 25 and if so assume AES instructions - correct ?
Comment 6 Jeremy Allison 2018-03-22 23:49:09 UTC
Created attachment 14076 [details]
git-am fix for master.

Bjorn and Eric, does this patch work for you ?
Comment 7 Jeremy Allison 2018-03-22 23:54:56 UTC
Created attachment 14077 [details]
Revised git-am fix for master.

Slightly better fix that doesn't zero out cpuid_results[] each time we check the has_aes_instructions bool, but only once on initial check.
Comment 8 eric.vannier 2018-03-23 04:36:58 UTC
Created attachment 14078 [details]
Proposed patch for all processors

Jeremy,

lgtm. Though, I am not sure the zero'ing is necessary: cpuid asm will store into the four locations (maybe I am missing something).
Maybe, as well, rename has_intel ... to has_aes since this is no longer about intel (and the instructions are called AESNI) ?
See proposed patch.
Comment 9 Jeremy Allison 2018-03-26 21:47:48 UTC
Comment on attachment 14078 [details]
Proposed patch for all processors

OK, LGTM I'm ok with this. Björn or Ralph can you +1 and I'll push to master & cherry-pick for back-ports.
Comment 10 Björn Jacke 2018-03-26 22:48:36 UTC
Comment on attachment 14078 [details]
Proposed patch for all processors

lgtm, thanks!
Comment 11 Jeremy Allison 2018-03-27 03:45:46 UTC
Comment on attachment 14078 [details]
Proposed patch for all processors

Hang on a minute. I didn't stop this initially as it doesn't break on gcc, but this patch renames the has_intel_aes_instructions() function to has_aes_instructions(), which can conflict with the static static int has_aes_instructions = -1;

I don't think this breaks on gcc, but might on clang or other proprietary compilers, plus it's very bad form.

I'll commit a version of this that doesn't rename has_intel_aes_instructions().

Jeremy.
Comment 12 Jeremy Allison 2018-03-27 18:33:14 UTC
Created attachment 14087 [details]
git-am fix for 4.7.next, 4.8.next.

Cherry-picked from master. Applies cleanly to 4.7.next, 4.8.next.
Comment 13 Karolin Seeger 2018-04-04 10:16:10 UTC
(In reply to Jeremy Allison from comment #12)
Pushed to autobuild-v4-[8,7]-test.
Comment 14 Karolin Seeger 2018-04-10 07:23:12 UTC
(In reply to Karolin Seeger from comment #13)
Pushed to both branches.
Closing out bug report.

Thanks!