## Revised Analysis: Host-Specific Intel Ultra 7 Issue

That the crash is host-specific is a **very significant clue**, and it reframes the entire problem.

---

### What This Tells Us

Because the crash **only occurs on the Intel Core Ultra 7 (Meteor Lake/Arrow Lake) host**, the likely diagnosis shifts from a software/build issue to a **host CPU feature exposure problem**.

---

### Intel Core Ultra 7: Relevant Architecture Details

```
Intel Core Ultra 7 (Meteor Lake / Arrow Lake)
├── New hybrid architecture (P-cores + E-cores + LP E-cores)
├── New TSC/clock behaviour
├── New CPUID leaves not present on older CPUs
├── Changes to FSGSBASE, WAITPKG, HRESET, etc.
└── Different handling of 32-bit legacy compatibility
```

---

### The Likely Mechanism

#### QEMU CPU Model Mismatch

```
QEMU Standard PC (Q35 + ICH9)
│
├── If using -cpu host → exposes Ultra 7 features to guest
│   └── Guest kernel (32-bit compat) may not handle
│       new CPUID/clock features correctly
│
└── If using default QEMU cpu model → may behave differently
    on Ultra 7 host than older hosts
```

When QEMU runs on an Ultra 7 with `-cpu host` or similar, it exposes CPU features that **older QEMU CPU models don't present**. The guest kernel's 32-bit `clock_gettime` path may then select a **clock source or vDSO path that doesn't work correctly** under that configuration.

---

### Specific Suspects on Ultra 7

#### 1. TSC Deadline / Always-Running TSC Behaviour

```
Ultra 7 has:
├── TSC_ADJUST MSR behaviour changes
├── Invariant TSC (but with new topology awareness)
└── Different TSC sync behaviour across P/E cores
    │
    └── QEMU may expose an inconsistent TSC to the guest,
        causing the clock_gettime function pointer to be
        set up incorrectly or not at all
```

#### 2. CPUID Leaf Exposure

```
New CPUID leaves on Meteor Lake/Arrow Lake:
├── May confuse guest kernel clock source selection
├── 32-bit compat path may hit untested code branch
└── Function pointer left NULL after failed init
```

#### 3. KVM Behaviour Difference

```
KVM on Ultra 7:
├── Different KVM clock calibration
├── Possible KVM bug with new CPU features
└── kvmclock or pvclock init may fail silently
    │
    └── Leaving clock_gettime dispatch pointer as NULL
```

---

### Confirming the Theory

#### Check KVM/CPU Exposure in QEMU Config

```bash
# On the Ultra 7 host, check how QEMU is invoking the VM
ps aux | grep qemu
# or
cat /proc/$(pgrep qemu)/cmdline | tr '\0' ' '

# Look specifically for:
# -cpu host      ← exposes all Ultra 7 features
# -cpu qemu64    ← generic, less likely to cause this
# -enable-kvm    ← KVM path vs pure emulation
```

#### Compare CPUID Exposed to Guest

```bash
# Inside the VM on crashing host:
cpuid -1

# Inside the VM on working host:
cpuid -1

# diff the two outputs: feature flags present only on the crashing host are the prime suspects
```

#### Check KVM Clock on Ultra 7 Host

```bash
# On Ultra 7 host:
dmesg | grep -i "kvm\|tsc\|clock\|kvmclock"
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

# Compare with working host
```

#### Check for KVM Bugs

```bash
# On Ultra 7 host:
dmesg | grep -i "kvm\|vmx\|unrestricted"

# Check KVM module version
modinfo kvm_intel | grep version
```

---

### Specific Test to Isolate

```bash
# Test 1: Force a specific CPU model in QEMU (not 'host')
# Change QEMU launch to use:
-cpu Skylake-Client-v4   # known stable older model
# If this fixes it → CPU feature exposure is the problem

# Test 2: Disable KVM
-accel tcg   # or remove -enable-kvm (older QEMU also accepted -no-kvm)
# Pure emulation, no host CPU bleed-through
# If this fixes it → KVM interaction with Ultra 7 is the problem

# Test 3: Steer the guest away from the TSC clocksource
# Add to kernel cmdline:
clocksource=hpet   # or acpi_pm
# If this fixes it → TSC behaviour on Ultra 7 is the problem

# Test 4: Pin QEMU to P-cores only (not E-cores)
# CPU IDs are host-specific; confirm which IDs are P-cores (e.g. with lscpu --extended)
taskset -c 0,2,4,6 qemu-system-i386 ...
# Targets Ultra 7 TSC sync issues between P and E cores
# If this fixes it → hybrid core TSC skew is the problem
```

---

### Most Probable Cause Ranking

| Rank | Cause | Confidence |
|------|-------|------------|
| 1 | **KVM exposing Ultra 7 TSC/clock features that confuse 32-bit guest clock init** | High |
| 2 | **CPUID leaves from Meteor Lake causing wrong clock source selection** | High |
| 3 | **TSC skew between P-cores and E-cores causing clock calibration failure** | Medium |
| 4 | **KVM bug specific to Ultra 7 VMX implementation** | Medium |
| 5 | **QEMU version not yet updated for Ultra 7 quirks** | Medium |

---

### Recommended Fix Path

```
1. Identify QEMU -cpu argument being used on Ultra 7 host
   │
   ├── If -cpu host → change to explicit model (e.g. Skylake-Client-v4)
   │
   └── If already explicit → check QEMU and KVM versions

2. Update QEMU and kernel on Ultra 7 host
   (Meteor Lake support improved in QEMU 8.1+, kernel 6.6+)

3. If using -cpu host, add specific feature suppression:
   -cpu host,-tsc-deadline,-kvmclock,-kvm_pv_unhalt

4. Check for KVM errata for Meteor Lake on your host kernel version
   dmesg | grep -i "errata\|microcode\|VMX"
```

---

### Bottom Line

> The crash is almost certainly caused by **KVM or QEMU exposing Intel Core Ultra 7 (Meteor Lake/Arrow Lake) CPU features** that the **32-bit compatibility syscall path in the Yocto guest kernel doesn't handle correctly**, resulting in a NULL clock function pointer. The fix is likely a **QEMU CPU model configuration change** or a **QEMU/host kernel update** with proper Meteor Lake support.
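---

### Scripted Check of the `-cpu` Argument

Step 1 of the fix path (identify the `-cpu` argument) is easy to get wrong by eye when the QEMU command line is long, so here is a minimal sketch of how it could be scripted. The function names `qemu_cpu_model` and `suggest_next_step` are hypothetical helpers, not part of QEMU, and the sketch assumes the command line is passed as a single space-separated string with no arguments containing spaces.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not a QEMU tool): print the CPU model named by the
# -cpu option in a QEMU command line, or "default" if there is no -cpu option.
qemu_cpu_model() {
    local prev="" arg
    local -a args
    # Split the single command-line string on whitespace without glob expansion.
    read -r -a args <<< "$1"
    for arg in "${args[@]}"; do
        if [ "$prev" = "-cpu" ]; then
            # Drop feature flags after the first comma: host,-tsc-deadline -> host
            printf '%s\n' "${arg%%,*}"
            return 0
        fi
        prev=$arg
    done
    printf 'default\n'
}

# Hypothetical helper: map the detected model to the next step in the fix path.
suggest_next_step() {
    case "$1" in
        host)    echo "switch to an explicit model, e.g. -cpu Skylake-Client-v4" ;;
        default) echo "no -cpu option found; QEMU's default CPU model is in use" ;;
        *)       echo "explicit model already set; check QEMU and KVM versions" ;;
    esac
}

# Example with a made-up command line (illustrative only).
model=$(qemu_cpu_model "qemu-system-i386 -enable-kvm -cpu host,-tsc-deadline -m 512")
echo "detected model: $model"
suggest_next_step "$model"
```

On the Ultra 7 host, the same function could be fed the live command line, for example `qemu_cpu_model "$(tr '\0' ' ' < /proc/$(pgrep -o qemu)/cmdline)"`, assuming a single QEMU process of interest. Running it on both the crashing and the working host gives a quick way to confirm whether the two VMs are launched with the same CPU model.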