install.sh bug fixes, autostart cgroup detachment, vault key security #20

Merged
obel1x merged 9 commits from unbrot/fedora-OEMDRV:main into main 2026-05-01 17:59:24 +02:00
Contributor

install.sh: four bugs fixed during live testing

Free-space start alignment, locale-independent partition creation, correct new-partition detection on disks with gap-numbered partitions, infinite loop on EOF stdin.

Autostart service cgroup detachment

KDE Plasma runs each autostart .desktop entry as a systemd user unit with KillMode=control-group. Long-running daemons started inside the logon script were being killed when logon_script.sh exited. Fixed for gocryptfs (systemd-run --scope), kwalletd6 (systemd-run --scope), Nextcloud Desktop Client and Nextcloud Talk (systemd-run --no-block transient service; --scope left systemd-run itself in the autostart cgroup). Talk also needed Delegate=yes for Electron's zygote.

Security: vault key in memory-only storage

mount_ecrypt_home.sh wrote the IPA KRA vault key to /var/tmp (persistent on-disk). Replaced with ${XDG_RUNTIME_DIR}/IPAVAULTKEY (per-user tmpfs, memory-only, mode 0700, wiped on logout).

install.sh: four bugs fixed during live testing Free-space start alignment, locale-independent partition creation, correct new-partition detection on disks with gap-numbered partitions, infinite loop on EOF stdin. Autostart service cgroup detachment KDE Plasma runs each autostart .desktop entry as a systemd user unit with KillMode=control-group. Long-running daemons started inside the logon script were being killed when logon_script.sh exited. Fixed for gocryptfs (systemd-run --scope), kwalletd6 (systemd-run --scope), Nextcloud Desktop Client and Nextcloud Talk (systemd-run --no-block transient service; --scope left systemd-run itself in the autostart cgroup). Talk also needed Delegate=yes for Electron's zygote. Security: vault key in memory-only storage mount_ecrypt_home.sh wrote the IPA KRA vault key to /var/tmp (persistent on-disk). Replaced with ${XDG_RUNTIME_DIR}/IPAVAULTKEY (per-user tmpfs, memory-only, mode 0700, wiped on logout).
unbrot added 8 commits 2026-05-01 17:37:16 +02:00
Free-space start alignment
  parted reports free space starting at 0,02 MiB (before the GPT
  alignment boundary). The collect_free_space awk now rounds the start
  up to the next whole MiB (ceiling) and enforces a minimum of 1 MiB,
  then recomputes the usable size from the adjusted start. This prevents
  parted from being asked to create a partition at 0 MiB, which it
  cannot do.

Locale-independent partition creation
  The previous `printf 'Yes\n' | parted mkpart` relied on parted
  accepting an English answer to its alignment-confirmation prompt.
  On a German-locale system parted asks "Ist dies noch akzeptabel?"
  and ignores "Yes", causing mkpart to fail. Replaced with `parted -s`
  (script/non-interactive mode), consistent with every other parted
  call in the script.

Correct new-partition detection on disks with gaps
  The old heuristic took the highest partition number after partprobe.
  On a disk where existing partitions are numbered 2/3/4, a new
  partition in the gap before them receives number 1 — making the
  old heuristic point at partition 4 (the existing btrfs volume) and
  subsequently run mkfs.btrfs on it. The new awk matches by start
  position (OEMDRV_START ± 1 MiB) instead, which is unambiguous
  regardless of how numbers are assigned.

Infinite loop on EOF stdin
  When the selection while-loop's `read` hits EOF (e.g. stdin exhausted
  after sudo consumed a piped password), it returns exit code 1 with an
  empty INPUT, which falls through to "Invalid input." and spins
  forever. Added `|| { echo; echo "Aborted."; exit 0; }` to all three
  read calls in the loop.

install.md: drop stale install_from_repo.sh reference from title;
clarify that REPO_URL/REPO_BRANCH overrides are optional.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
KDE Plasma runs each autostart .desktop entry as a systemd user unit.
systemd tracks service liveness by cgroup membership, not just the
main PID. Any process forked inside the service — even via setsid or &
— stays in the service's cgroup and keeps app-logon_script.sh@autostart
in active (running) state indefinitely after logon_script.sh exits.

mount_ecrypt_home.sh: wrap the gocryptfs mount call with
  systemd-run --user --scope --unit=gocryptfs-home
The FUSE daemon that gocryptfs forks now lives in its own transient
scope cgroup. Exit-code propagation is unchanged because systemd-run
--scope returns the main process's exit code.

0050_nextcloud_desktopclient/user_run.sh: replace
  /usr/bin/setsid ... &
with
  systemd-run --user --scope --unit=nextcloud-client ... &
setsid creates a new session but does not move the process out of the
cgroup; systemd-run --scope does.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Same root cause as the gocryptfs and Nextcloud fixes: kwalletd6 is a
long-running daemon that stays alive for the entire KDE session.
Launching it with setsid keeps it in the autostart service cgroup,
preventing app-logon_script.sh@autostart from reaching finished state.

Replace setsid with systemd-run --user --scope so kwalletd6 runs in
its own transient scope cgroup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
setsid -f forks the process into a new session but leaves it in the
calling service's cgroup. systemd-run --user --scope moves it into its
own transient scope cgroup so the autostart service can finish normally.

Added & to background the launch, replacing the fork that setsid -f
was providing. Tested: scope is created and Talk starts correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
/var/tmp is persistent on-disk storage. The encryption key must never
be written to disk, even temporarily. Replaced all occurrences of
/var/tmp/IPAVAULTKEY.txt with ${XDG_RUNTIME_DIR}/IPAVAULTKEY, which
is a per-user tmpfs directory (/run/user/<UID>) created by
systemd-logind: guaranteed memory-only, mode 0700, wiped on logout.

Also removed the TODO comment that tracked this exact issue.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Nextcloud Talk is an Electron app. Electron uses a zygote process to
fork sandboxed child processes (GPU, renderer, network service) into
their own sub-cgroups. systemd-run --scope without Delegate=yes locks
down the cgroup — sub-cgroups cannot be created — so the zygote fails,
causing the GPU process to crash immediately on startup.

Adding --property=Delegate=yes hands cgroup management to the scope,
allowing flatpak/bubblewrap and Electron's zygote to create the
sub-cgroups they need. Tested: no GPU crash with this flag set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
systemd-run --scope ... & left the systemd-run binary running as a
background process inside the autostart service's cgroup. When
logon_script.sh exited, systemd's KillMode=control-group sent SIGTERM
to all remaining cgroup processes, including systemd-run. systemd-run,
on receiving SIGTERM while monitoring a scope, stopped the scope and
killed the Nextcloud client -- at exactly the same moment the autostart
service ended.

--no-block with --scope is not supported. Switch to a transient service
unit (drop --scope, add --no-block). systemd-run registers the unit and
returns immediately, leaving the cgroup before logon_script.sh ends.
The Nextcloud process then runs as an independent systemd user service,
unaffected by the autostart service lifecycle. Tested: Nextcloud keeps
running after systemd-run exits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
--scope ... & had two problems:
1. systemd-run stayed alive in the autostart service cgroup;
   KillMode=control-group sent it SIGTERM when logon_script.sh exited,
   tearing down the scope and killing Talk mid-initialization.
2. The scope lacked Delegate=yes, preventing Electron's zygote from
   creating sub-cgroups for the GPU/renderer processes.

The previous commit added Delegate=yes but kept --scope, so problem 1
remained: the scope was still torn down on service exit, causing the
GPU/network service crash visible in talk.log.

Switch to a transient service unit identical to the Nextcloud Desktop
Client fix: --no-block returns immediately so systemd-run is gone from
the cgroup before the service ends; --property=Delegate=yes is retained
for Electron's zygote. Tested: service active, zygote and network
service running, no GPU crash.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
unbrot added 1 commit 2026-05-01 17:53:19 +02:00
obel1x merged commit 0ad82ac4e9 into main 2026-05-01 17:59:24 +02:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: obel1x/fedora-OEMDRV#20