Re: [PATCH v2 4/4] driver core: Disable driver deferred probe timeout by default
From: Andrew Halaney
Date: Wed Nov 16 2022 - 14:16:11 EST
On Wed, Nov 16, 2022 at 01:02:36PM +0100, Javier Martinez Canillas wrote:
> The driver_deferred_probe_timeout value has a long history. It was first
> set to -1 when it was introduced by commit 25b4e70dcce9 ("driver core: allow
> stopping deferred probe after init"), meaning that the driver core would
> defer the probe forever unless a subsystem would opt-in by checking if the
> initcalls were done using the driver_deferred_probe_check_state() helper,
> or if a timeout was explicitly set with a "deferred_probe_timeout" param.
The "or" here reads as if a subsystem either opts in, or is simply affected
by the timeout (at least that's how I read it).
A subsystem has to opt in by using driver_deferred_probe_check_state() to
get either result!
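To spell out how I read the semantics, here is a rough userspace model of
what an opted-in subsystem sees for a missing supplier. Everything in it
(model_check_state(), its parameters, the fallback EPROBE_DEFER define) is
made up for illustration and is not the dd.c code. It also ignores the
CONFIG_MODULES handling and treats the timeout as a plain elapsed-seconds
comparison.

```c
#include <errno.h>
#include <stdbool.h>

/* The kernel defines EPROBE_DEFER internally; userspace errno.h does not. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/*
 * Hypothetical model, not the kernel implementation: the result an
 * opted-in subsystem gets back when a supplier is still missing.
 *   timeout_s < 0  : never give up, keep deferring
 *   timeout_s == 0 : give up as soon as the initcalls are done
 *   timeout_s > 0  : give up timeout_s seconds after the initcalls are done
 * A subsystem that never calls the helper is not affected by any of this.
 */
static int model_check_state(int timeout_s, bool initcalls_done,
			     int secs_since_initcalls)
{
	if (!initcalls_done)
		return -EPROBE_DEFER;	/* too early to give up on anything */
	if (timeout_s < 0)
		return -EPROBE_DEFER;	/* timeout disabled, defer forever */
	if (secs_since_initcalls >= timeout_s)
		return -ETIMEDOUT;	/* timeout expired, stop waiting */
	return -EPROBE_DEFER;		/* timeout still running */
}
```

In this model a default of -1 never produces -ETIMEDOUT, a default of 0
produces it as soon as the initcalls are done, and a default of 10 produces
it ten seconds later.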
>
> Only the power domain, IOMMU and MDIO subsystems currently opt-in to check
> if the initcalls have completed with driver_deferred_probe_check_state().
>
> Commit c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
> logic") then changed the driver_deferred_probe_check_state() helper logic,
> to take into account whether modules have been enabled or not and also to
> return -EPROBE_DEFER if the probe deferred timeout work was still running.
>
> Then in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a
> longer default if CONFIG_MODULES is set"), the timeout was increased to 30
> seconds if modules are enabled. Because it seems that some of the subsystems
> that were opt-in to not return -EPROBE_DEFER after the initcall where done
s/where/were/
> could still have dependencies whose drivers were built as a module.
>
> This commit did a fundamental change to how probe deferral worked though,
> since now the default was not to attempt probing for drivers indefinitely
> but instead to time out after 30 seconds, unless a different timeout is set
> using the "deferred_probe_timeout" command line parameter.
>
> The behavior was changed even more with commit ce68929f07de ("driver core:
> Revert default driver_deferred_probe_timeout value to 0"), since the value
> was set to 0 by default, meaning that the probe deferral would be disabled
> after the initcalls were done unless a timeout was set in the cmdline.
>
> Notice that the commit said that it was reverting the default value to 0,
> but this was never 0. The default was -1 at the beginning and then changed
> to 30 in a later commit.
>
> This default value of 0 was reverted again by commit f516d01b9df2 ("Revert
> "driver core: Set default deferred_probe_timeout back to 0."") and set to
> 10 seconds instead. Which was still less than the 30 seconds that was set
> at some point, to allow systems with drivers built as modules and loaded
> later by user-land to probe drivers that were still in the deferred list.
>
> The 10 seconds timeout isn't enough in some cases. For example, the Fedora
> kernel builds as many drivers as possible as modules, and this leads to a
> Snapdragon SC7180 based HP X2 Chromebook not having a display, due to the
> DRM driver failing to probe if CONFIG_ARM_SMMU=y and CONFIG_SC_GPUCC_7180=m.
>
> So let's change the default again to -1 as it was at the beginning. That's
> how probe deferral always worked. The kernel should try to avoid guessing
> when it should be safe to give up on deferred drivers to be probed.
>
> The reason why the default "deferred_probe_timeout" was changed from -1 to
> the other values was to allow drivers that have only optional dependencies
> to probe even if the suppliers are not available.
>
> But now there is a "fw_devlink.timeout" parameter to time out the links and
> allow drivers to probe even when the dependencies are not present. Let's
> set the default for that timeout to 10 seconds, to give the same behaviour
> as expected by these drivers with optional device links.
>
> Signed-off-by: Javier Martinez Canillas <javierm@xxxxxxxxxx>
This sounds like a reasonable solution to me:
Acked-by: Andrew Halaney <ahalaney@xxxxxxxxxx>
> ---
>
> Changes in v2:
> - Mention in the commit message the specific machine and drivers that
> are affected by the issue (Greg).
> - Double check the commit message for accuracy (John).
> - Add a second workqueue to timeout the devlink enforcing and allow
> drivers to probe even without their optional dependencies available.
>
> drivers/base/dd.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index ea448df94d24..5f18cd497850 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -256,12 +256,8 @@ static int deferred_devs_show(struct seq_file *s, void *data)
> }
> DEFINE_SHOW_ATTRIBUTE(deferred_devs);
>
> -#ifdef CONFIG_MODULES
> -static int driver_deferred_probe_timeout = 10;
> -#else
> -static int driver_deferred_probe_timeout;
> -#endif
> -static int fw_devlink_timeout = -1;
> +static int driver_deferred_probe_timeout = -1;
> +static int fw_devlink_timeout = 10;
>
> static int __init deferred_probe_timeout_setup(char *str)
> {
> --
> 2.38.1
>