🚀 Matrix Kernel Technical Changelog Project: GKI 5.10.249

zixine

Version: Precision Build


🛡️ Anti-Random Reboot Guard (CRITICAL)

STACK PROTECTOR PER TASK EVALUATED

Description: Disables the per-task stack canary. On many MediaTek devices, the stack protector implementation conflicts with vendor-specific register usage (such as x18).

Purpose: This is the primary fix for the "Orange State" and sudden random reboots. With the per-task canary disabled, the kernel no longer panics when it misinterprets valid firmware behavior as a stack attack.


⚡ Performance & Smoothness


CONFIG HZ 500

Note: switch to the 300Hz version of the build immediately if you experience problems.

Description: Increases the kernel tick rate to 500Hz.

Purpose: Balances responsiveness against battery life. A shorter tick makes the UI feel "snappier" and touch input more immediate than with the stock 300Hz.
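The difference between the two tick rates is plain arithmetic: one tick lasts 1/HZ seconds. A minimal Python sketch of that calculation (the helper name tick_period_ms is made up for illustration and is not part of the kernel):

```python
def tick_period_ms(hz: int) -> float:
    """Return the length of one timer tick in milliseconds for a given HZ."""
    if hz <= 0:
        raise ValueError("HZ must be positive")
    return 1000.0 / hz


# Compare the stock rate mentioned above with the rate used in this build.
for hz in (300, 500):
    print(f"HZ={hz}: one tick every {tick_period_ms(hz):.2f} ms")
```

At 500Hz a tick lasts 2.00 ms, against roughly 3.33 ms at 300Hz.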


add NO HZ

Description: Enables a Tickless Kernel.

Purpose: Reduces the number of timer interrupts when the CPU is idle. This improves power efficiency and reduces system overhead.


TCP CONG BBR & DEFAULT BBR

Description: Builds in Google's BBR congestion control (Bottleneck Bandwidth and Round-trip propagation time) and selects it as the default TCP algorithm.

Purpose: Can improve network throughput and reduce latency (ping) on congested networks, which helps with online gaming.
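One way to confirm on a running device that BBR is actually in use is to read the two standard Linux files under /proc/sys/net/ipv4. The sketch below is only an illustration: the function names are invented here, and it assumes a Python interpreter is available on the device (for example through Termux).

```python
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")


def describe_bbr(available_text, current_text):
    """Summarise BBR status from the contents of the two /proc files."""
    available = available_text.split()
    current = current_text.strip()
    if "bbr" not in available:
        return "bbr not available"
    if current == "bbr":
        return "bbr is the default"
    return f"bbr available, default is {current}"


def read_status(proc_dir=PROC_IPV4):
    """Read the live values; returns None when the files cannot be read."""
    try:
        available = (proc_dir / "tcp_available_congestion_control").read_text()
        current = (proc_dir / "tcp_congestion_control").read_text()
    except OSError:
        return None
    return describe_bbr(available, current)


if __name__ == "__main__":
    print(read_status() or "could not read /proc/sys/net/ipv4")
```

Keeping the parsing in describe_bbr separate from the file access makes the logic easy to check on a PC without a device attached.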


CPU FREQ GOV SCHEDUTIL

Description: Enables the Schedutil CPU governor.

Purpose: Uses kernel scheduler data to make smarter frequency scaling decisions, resulting in a smoother experience and better power management.


🧬 BPF Performance (System Optimization)


BPF SYSCALL & BPF JIT, etc.

Description: Enables Berkeley Packet Filter with Just-In-Time compilation.

Purpose: Accelerates system services and networking tasks by compiling BPF instructions into native machine code. This reduces CPU cycles used for monitoring and filtering.


🏗️ Base Stability & Memory Management


Control groups (CPU/Memory/IO): used to manage background tasks better.

MEMCG & MEMCG SWAP: Enables Memory Control Groups. Allows the system to manage RAM usage per-app more granularly, preventing a single process from hogging all memory.

ANDROID LOW MEMORY KILLER=y: Ensures the kernel handles low-memory situations correctly according to Android standards.

LOCALVERSION AUTO=n: Keeps the kernel version string clean by not appending random Git hash suffixes.
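The options listed above can be checked against a built .config file. The sketch below is a rough illustration, not the build's own tooling. It assumes the spaced names in this changelog correspond to the usual underscore Kconfig symbols (for example "BPF JIT" to CONFIG_BPF_JIT), and the EXPECTED table and function names are made up for this example. A symbol that is missing from the file is treated as "n".

```python
import re

# Assumed mapping from the changelog headings to Kconfig symbols.
EXPECTED = {
    "CONFIG_STACKPROTECTOR_PER_TASK": "n",
    "CONFIG_HZ": "500",
    "CONFIG_TCP_CONG_BBR": "y",
    "CONFIG_DEFAULT_BBR": "y",
    "CONFIG_CPU_FREQ_GOV_SCHEDUTIL": "y",
    "CONFIG_BPF_SYSCALL": "y",
    "CONFIG_BPF_JIT": "y",
    "CONFIG_MEMCG": "y",
    "CONFIG_MEMCG_SWAP": "y",
    "CONFIG_LOCALVERSION_AUTO": "n",
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Parse .config text into a {symbol: value} dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            values[match.group(1)] = match.group(2).strip('"')
            continue
        match = _UNSET.match(line)
        if match:
            values[match.group(1)] = "n"
    return values


def mismatches(text, expected=EXPECTED):
    """Return {symbol: (expected, actual)} for every option that differs."""
    actual = parse_config(text)
    return {
        name: (want, actual.get(name, "n"))
        for name, want in expected.items()
        if actual.get(name, "n") != want
    }
```

Calling mismatches() on the text of out/.config (the path depends on your build setup) returns an empty dict when every listed option has the expected value.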


1. The Core Problem: RCU Stall and Watchdog Timers

The primary cause of "Random Reboots" on MediaTek GKI kernels is often a Hard Lockup or Soft Lockup. When you enable CONFIG_RCU_EXPERT and CONFIG_RCU_BOOST, you are fundamentally changing how the kernel handles internal cleanup and synchronization.

How it leads to Reboots: If the RCU booster thread cannot complete its task within the allotted time (due to MediaTek’s aggressive core migration or power-saving states), the kernel triggers an RCU Stall.

The Result: The System Health Watchdog sees that the CPU is "stuck" in a grace period for too long. To prevent data corruption, the Watchdog force-reboots the device. This is the "Random Reboot" your members are experiencing.
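To tell these failure modes apart after a reboot, a saved kernel log (for example from dmesg or pstore) can be scanned for the corresponding messages. The sketch below is a rough triage helper under stated assumptions: the patterns are deliberately loose because the exact wording differs between kernel versions, and the category and function names are invented for this example.

```python
import re

# Loose patterns; exact kernel message wording varies by version.
PATTERNS = {
    "rcu_stall": re.compile(r"rcu.*detected stall", re.IGNORECASE),
    "soft_lockup": re.compile(r"soft lockup", re.IGNORECASE),
    "hard_lockup": re.compile(r"hard lockup", re.IGNORECASE),
}


def classify(log_text):
    """Count log lines that look like RCU stalls or watchdog lockups."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

A non-zero rcu_stall count shortly before the reboot supports the explanation above; a log with no hits points to a different cause.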


2. RCU_NOCB_CPU: The Affinity Trap

You previously considered CONFIG_RCU_NOCB_CPU=y to offload callbacks. On a heterogeneous architecture like the Helio G99 (2x Cortex-A76, 6x Cortex-A55):

The Conflict: Offloading RCU callbacks to worker threads requires precise CPU affinity. If the Android ramdisk or the MediaTek scheduler pushes these heavy RCU tasks to the "Little" cores that are already throttled for thermal savings, the callback queue overflows.

The Crash: When the queue overflows, the kernel runs out of memory or enters a deadlock state. Since MT6789 firmware is not designed to handle custom RCU offloading, the result is an immediate "Orange State" or a silent crash to the bootloader.


3. RCU_BOOST_DELAY: Timing Misalignment

You suggested CONFIG_RCU_BOOST_DELAY=500 (and later 100). The value is in milliseconds.

The Risk: MediaTek's proprietary Power Management Integrated Circuit (PMIC) and its associated drivers expect specific timing for hardware interrupts.

The Mismatch: By forcing a delay in RCU boosting, you are essentially telling the kernel to wait before cleaning up "dead" memory pointers. In a high-pressure environment (like gaming), this creates a race condition where the hardware tries to access a memory address that the kernel hasn't finished cleaning up yet. This leads to an Alignment Fault or Instruction Fetch Error, triggering a reboot.


4. Why Default (GKI Standard) is Superior for Stability

By removing these "Expert" tweaks, you revert to the Generic Kernel Image (GKI) standard.

Tested Synchrony: The default RCU settings in GKI 5.10 are tested across millions of devices. They are tuned to be "conservative," meaning they prioritize integrity over micro-speed.

Firmware Compatibility: MediaTek's binary blobs (camera, modem, GPU drivers) are compiled against a standard RCU behavior. When you change the RCU logic, these closed-source drivers often fail to synchronize properly, leading to the "Random Reboots" you see in the logs.
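A quick way to confirm that a build really is back on the GKI defaults is to compare its RCU symbols with those of a baseline config. The sketch below assumes you have both files as text (for example the baseline generated from gki_defconfig and your own .config); the function names are made up for this example.

```python
def rcu_symbols(config_text):
    """Collect every CONFIG_RCU_* assignment from .config text."""
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_RCU_") and "=" in line:
            name, value = line.split("=", 1)
            found[name] = value
    return found


def rcu_deviations(baseline_text, custom_text):
    """Return {symbol: (baseline, custom)} where the two configs differ.

    None means the symbol is not set in that config.
    """
    base = rcu_symbols(baseline_text)
    custom = rcu_symbols(custom_text)
    return {
        name: (base.get(name), custom.get(name))
        for name in sorted(set(base) | set(custom))
        if base.get(name) != custom.get(name)
    }
```

An empty result means the custom build carries no RCU changes relative to the baseline; any entry such as CONFIG_RCU_BOOST_DELAY is a tweak that still needs to be removed.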


Conclusion

Removing aggressive RCU configs is not "losing performance"; it is gaining uptime. On the MT6789, the marginal gains in "smoothness" from RCU boosting are invisible to the user but highly visible to the kernel's error-checking mechanisms. To reach the next level of stability, you must stop treating the kernel like a playground for every "tweak" you find on GitHub and start respecting the architecture's constraints.
