Bug 15591 - Investigating NT_STATUS_IO_TIMEOUT Error in Samba 4.19.5
Summary: Investigating NT_STATUS_IO_TIMEOUT Error in Samba 4.19.5
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: AD: LDB/DSDB/SAMDB
Version: 4.19.5
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-02-25 19:38 UTC by Elias
Modified: 2025-05-26 16:13 UTC

See Also:


Attachments
drs_repl log (2.00 MB, text/x-log)
2024-02-25 19:38 UTC, Elias

Description Elias 2024-02-25 19:38:19 UTC
Created attachment 18259
drs_repl log

Hi,

I've been encountering an NT_STATUS_IO_TIMEOUT error on all four of my DCs, and I'm looking for insight into why the result reported by kcc_periodic.c differs with and without log level 5.

Here's a snippet of the error in question:

[2024/01/19 14:02:05.917120, 0] source4/dsdb/kcc/kcc_periodic.c:790(samba_kcc_done)
  source4/dsdb/kcc/kcc_periodic.c:790: Failed samba_kcc - NT_STATUS_IO_TIMEOUT

Interestingly, when I enabled log level 5 and monitored DC1, the KCC run completed successfully with the message:

[2024/01/19 16:04:30.732009, 3] ../../source4/dsdb/kcc/kcc_periodic.c:793(samba_kcc_done)
  Completed samba_kcc OK

I'm using Samba version 4.19.5. I noticed that /usr/sbin/samba_kcc is written in Python. Does it have a connection to source4/dsdb/kcc/kcc_periodic.c, which is written in C?

I recall reading some older posts suggesting that Samba was moving to a new KCC implementation, possibly in Python, closer to what Microsoft uses. Can someone confirm this?

I've also configured the audit log for drs_repl in smb.conf. The log level was set to 5.

I'd appreciate it if someone could take a look at the log and help me understand its contents and potentially shed light on the NT_STATUS_IO_TIMEOUT error.
Comment 1 Douglas Bagnall 2025-05-23 03:06:01 UTC
Yes, samba_kcc is a Python script that tries to follow the Microsoft KCC algorithm. It is run periodically in a subprocess.

As to why it would time out at log level 0 and not at 5, well, that is a mystery.

You can trigger a run with 

  samba-tool drs kcc

(you may need the -U, -H, or -s options; I don't know). Does it take a long time? The timeout is supposed to be 40 seconds.

  samba-tool drs kcc -d10 

will give you debug logs from the samba-tool client point of view.

If you can find the samba_kcc script, you can probably run that directly.
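
To answer the "does it take a long time?" question with a number, the run can be timed from a small wrapper. The following is only a sketch: the helper name time_command is hypothetical, the 40-second limit is the figure quoted above rather than a value read from Samba, and the measurement covers the samba-tool client round trip, which may differ from how long the samba_kcc subprocess itself takes on the server.

```python
import subprocess
import sys
import time

# Assumption: 40 seconds is the timeout quoted in this thread,
# not a value read from the Samba source or configuration.
KCC_TIMEOUT_SECONDS = 40.0


def time_command(cmd, limit=KCC_TIMEOUT_SECONDS):
    """Run cmd and return (returncode, elapsed_seconds, exceeded_limit)."""
    start = time.monotonic()
    # Capture output so that debug logging does not flood the terminal.
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    elapsed = time.monotonic() - start
    return result.returncode, elapsed, elapsed > limit


def report(cmd):
    """Print a one-line summary of how long cmd took."""
    rc, elapsed, too_slow = time_command(cmd)
    print(f"{' '.join(cmd)}: exit={rc} elapsed={elapsed:.1f}s over_limit={too_slow}")
```

On a DC this could be called as report(["samba-tool", "drs", "kcc"]), adding whatever -U, -H, or -s options the setup needs. Running it several times, at different log levels, would show whether the run time is anywhere near the limit.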
Comment 2 Elias 2025-05-26 16:13:21 UTC
(In reply to Douglas Bagnall from comment #1)
Hello Douglas,

I ran the command and, in principle, the result is OK.

[0000] 00 00 00 00 ....
Consistency check on dc1.campus.sertao.ifrs.edu.br successful.

TIMEOUT errors have been occurring for a long time. I ended up leaving them aside because I couldn't find out anything about them.

Today's logs:

root@dc1:~# tail -f /var/log/samba/log.samba
[2025/05/25 07:55:10.207968, 0] source4/dsdb/kcc/kcc_periodic.c:790(samba_kcc_done)
  source4/dsdb/kcc/kcc_periodic.c:790: Failed samba_kcc - NT_STATUS_IO_TIMEOUT
[2025/05/25 09:38:11.533829, 0] source4/dsdb/repl/drepl_out_helpers.c:1314(dreplsrv_update_refs_done)
  UpdateRefs failed with NT_STATUS_IO_TIMEOUT
[2025/05/25 15:36:21.396763, 0] source4/dsdb/repl/drepl_out_helpers.c:1314(dreplsrv_update_refs_done)
  UpdateRefs failed with NT_STATUS_IO_TIMEOUT
[2025/05/25 23:00:22.735122, 0] source4/dsdb/repl/drepl_out_helpers.c:1314(dreplsrv_update_refs_done)
  UpdateRefs failed with NT_STATUS_IO_TIMEOUT
[2025/05/26 09:39:34.061783, 0] source4/dsdb/repl/drepl_out_helpers.c:1314(dreplsrv_update_refs_done)
  UpdateRefs failed with NT_STATUS_IO_TIMEOUT
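
To see how often each caller hits the timeout, and at what times, the log can be tallied with a short script. This is a rough sketch, not a Samba tool: the function names are hypothetical, and the regular expression assumes the two-line entry layout seen in the excerpts above (a bracketed header line with timestamp, level, file:line and function, followed by an indented message line).

```python
import re
from collections import Counter

# Header line, e.g.:
# [2025/05/25 07:55:10.207968, 0] source4/dsdb/kcc/kcc_periodic.c:790(samba_kcc_done)
HEADER_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*(?P<level>\d+)\]\s+"
    r"(?P<file>\S+?):(?P<line>\d+)\((?P<func>\w+)\)\s*$"
)


def find_timeouts(lines, needle="NT_STATUS_IO_TIMEOUT"):
    """Return (timestamp, function) for each entry whose message contains needle."""
    hits = []
    current = None  # (timestamp, function) from the most recent header line
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            current = (match.group("ts"), match.group("func"))
        elif current and needle in line:
            hits.append(current)
            current = None  # count each entry at most once
    return hits


def count_by_function(hits):
    """Count the matching entries per reporting function."""
    return Counter(func for _ts, func in hits)
```

Fed the lines of /var/log/samba/log.samba (for example via open(path) and passing the file object), find_timeouts returns the timestamps, which can then be compared against the KCC and replication schedule or against load on the DC. For the excerpt above, it would report one hit from samba_kcc_done and four from dreplsrv_update_refs_done.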