GKI 16-6.12 android-mainline errata

This page describes important issues and bug fixes found on android-mainline that might be significant to partners.

November 1, 2024

  • Linux 6.12-rc4 landing
    • Summary: CONFIG_OF_DYNAMIC potentially causing severe regressions for faulty drivers.
    • The details: While merging Linux 6.12-rc1 into android-mainline, we noticed that out-of-tree drivers were failing to load. The change that exposed the driver bugs was identified as commit 274aff8711b2 ("clk: Add KUnit tests for clks registered with struct clk_parent_data"), which we temporarily reverted in aosp/3287735. The change selects CONFIG_OF_OVERLAY, which in turn selects CONFIG_OF_DYNAMIC. With !OF_DYNAMIC, reference counting through of_node_get() and of_node_put() is effectively disabled because both are implemented as no-ops. Enabling OF_DYNAMIC again exposes drivers that implement reference counting for struct device_node incorrectly. The resulting errors include memory corruption, use-after-free, and memory leaks.
    • Inspect all uses of OF parsing APIs. The following list is partial, but contains the cases we have observed:
      • Use after free (UAF):
        • Reuse of the same device_node argument: These functions call of_node_put() on the node passed to them. You might need to call of_node_get() on that node before calling them (for example, when calling them repeatedly with the same node as argument):
          • of_find_compatible_node()
          • of_find_node_by_name()
          • of_find_node_by_path()
          • of_find_node_by_type()
          • of_get_next_cpu_node()
          • of_get_next_parent()
          • of_get_next_child()
          • of_get_next_available_child()
          • of_get_next_reserved_child()
          • of_find_node_with_property()
          • of_find_matching_node_and_match()
        • Use of device_node after any type of exit from certain loops:
          • for_each_available_child_of_node_scoped()
          • for_each_available_child_of_node()
          • for_each_child_of_node_scoped()
          • for_each_child_of_node()
        • Keeping direct pointers to char * properties from device_node around, for example, using:
          • const char *foo = struct device_node::name
          • of_property_read_string()
          • of_property_read_string_array()
          • of_property_read_string_index()
          • of_get_property()
      • Memory leaks:
        • Getting a device_node and forgetting to drop the reference with of_node_put(). Nodes returned from these functions must be released with of_node_put() at some point:
          • of_find_compatible_node()
          • of_find_node_by_name()
          • of_find_node_by_path()
          • of_find_node_by_type()
          • of_find_node_by_phandle()
          • of_parse_phandle()
          • of_find_node_opts_by_path()
          • of_get_next_cpu_node()
          • of_get_compatible_child()
          • of_get_child_by_name()
          • of_get_parent()
          • of_get_next_parent()
          • of_get_next_child()
          • of_get_next_available_child()
          • of_get_next_reserved_child()
          • of_find_node_with_property()
          • of_find_matching_node_and_match()
      • Keeping a device_node from a loop iteration. If you return or break from within the following loops, you must drop the reference still held on the current node with of_node_put() at some point:
        • for_each_available_child_of_node()
        • for_each_child_of_node()
        • for_each_node_by_type()
        • for_each_compatible_node()
        • of_for_each_phandle()
    • The change mentioned earlier was restored while landing Linux 6.12-rc4 (see aosp/3315251), which enables CONFIG_OF_DYNAMIC again and potentially exposes faulty drivers.
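
The reference-counting rules listed above can be illustrated with a small userspace model. This is a sketch only, not kernel code: every mock_* name, the node layout, and the "vendor,..." compatible strings are hypothetical stand-ins chosen for illustration. It assumes only the contract described on this page: a lookup or iterator function drops the reference on the node passed in and returns the next node with a new reference held.

```c
/* Userspace model only: all mock_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_node {
	const char *compatible;
	int refcount;           /* stands in for the device_node refcount */
	struct mock_node *next; /* stands in for the tree ordering */
};

/* Models of_node_get(): take a reference. */
static struct mock_node *mock_node_get(struct mock_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Models of_node_put(): drop a reference. */
static void mock_node_put(struct mock_node *np)
{
	if (np)
		np->refcount--;
}

/*
 * Models the contract of a "find" helper: the search starts after 'from',
 * the reference on 'from' is dropped, and the result carries a new reference.
 */
static struct mock_node *mock_find_compatible(struct mock_node *head,
					      struct mock_node *from,
					      const char *compat)
{
	struct mock_node *np = from ? from->next : head;

	for (; np; np = np->next) {
		if (strcmp(np->compatible, compat) == 0) {
			mock_node_get(np);
			break;
		}
	}
	mock_node_put(from);
	return np;
}

/* Models a "get next" iterator: puts 'prev', returns the next node referenced. */
static struct mock_node *mock_get_next(struct mock_node *head,
				       struct mock_node *prev)
{
	struct mock_node *np = mock_node_get(prev ? prev->next : head);

	mock_node_put(prev);
	return np;
}

/* Reuse of the same start node; returns the start node's final refcount. */
static int count_after_reuse(int take_ref_first)
{
	struct mock_node b = { "vendor,bar", 1, NULL };
	struct mock_node a = { "vendor,foo", 1, &b };
	struct mock_node start = { "vendor,bus", 1, &a };
	const char *wanted[] = { "vendor,foo", "vendor,bar" };

	for (size_t i = 0; i < 2; i++) {
		struct mock_node *found;

		if (take_ref_first)
			mock_node_get(&start); /* compensates for the put in the finder */
		found = mock_find_compatible(&start, &start, wanted[i]);
		mock_node_put(found);          /* release the returned node when done */
	}
	return start.refcount;
}

/* Early break from an iteration; returns the matched node's final refcount. */
static int count_after_break(int put_on_break)
{
	struct mock_node b = { "vendor,bar", 1, NULL };
	struct mock_node a = { "vendor,foo", 1, &b };
	struct mock_node *np;

	for (np = mock_get_next(&a, NULL); np; np = mock_get_next(&a, np)) {
		if (strcmp(np->compatible, "vendor,foo") == 0) {
			if (put_on_break)
				mock_node_put(np); /* drop the iterator's reference */
			break;
		}
	}
	return a.refcount;
}
```

In this model every node starts with a refcount of 1. count_after_reuse(0) ends with the start node below zero, which corresponds to the use-after-free case, while count_after_reuse(1) leaves it balanced at 1. count_after_break(0) leaves one extra reference on the matched node, which corresponds to the memory leak case, while count_after_break(1) returns it to 1.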