chris_skyflier Profile Banner
Christoph Lutz Profile
Christoph Lutz

@chris_skyflier

Followers
178
Following
2K
Media
52
Statuses
888

Those who don‘t jump will never fly.

Joined August 2010
Don't wanna be here? Send us removal request.
@chris_skyflier
Christoph Lutz
15 hours
7/7.Fast Sync wait behavior can be observed with the following bpftrace scipt: All of the above was tested on 19.26 (with RAC on Exadata).
Tweet media one
0
0
1
@chris_skyflier
Christoph Lutz
15 hours
6/7.Computation of the adaptive sleep duration is quite complex and based on additional runtime counters maintained by fg and bg processes in the sga struct kcrf_alfs_info_ (these are not exposed). Anyway, that's a rabbit hole for another day . 🕵️‍♂️
Tweet media one
1
0
0
@chris_skyflier
Christoph Lutz
15 hours
5/7.Once the backoff sleep time reaches 160 us, it no longer increases (=> 10, 20, 40, 80, 160, 160, 160, . ). The min and max sleep times of 10 and 160 us are hardcoded and cannot be configured.
1
0
0
@chris_skyflier
Christoph Lutz
15 hours
4/7.Exponential backoff: the fg finds the on-disk scn still lower than its sync scn after sleeping and spinning. So it starts sleeping again using an exponential backoff scheme: sleep for 10 us and double the duration on every iteration until it reaches a max of 160 us.
1
0
0
@chris_skyflier
Christoph Lutz
15 hours
3/7.Spin: the fg sleeps, finds the on-disk scn lower than its sync scn when it wakes up, and starts to spin for 100 us (_fg_fast_sync_sleep_usecs). After the spin, it finds that lgwr has advanced the on-disk scn beyond its sync scn.
1
0
0
@chris_skyflier
Christoph Lutz
15 hours
2/7.Sleep: the fg sleeps and finds the on-disk scn greater than its sync scn when it wakes up. The sleep time is an adaptive moving avg, dynamically re-calculated and updated by ckpt every 3 sec (in kcrfw_alfs_con_job).
1
0
0
@chris_skyflier
Christoph Lutz
15 hours
1/7.On Exadata with pmemlog, the Fast Log File Sync dynamically tunes the log file sync sleep duration to balance responsiveness vs cpu ovherhead (spinning after wakeup). Oracle tracks three wait variants via different session stat counters:. 1. Sleep.2. Spin.3. Backoff Sleeps
Tweet media one
Tweet media two
1
0
1
@chris_skyflier
Christoph Lutz
2 days
If you've ever felt the need to manually control adaptive lgwr features, this gdb script's got you covered: 😎. It lets you enable and disable adaptive scalable lgwr, fast log file sync, and log parallelism. Highly experimental, of course!. Example 👇
Tweet media one
Tweet media two
1
1
6
@chris_skyflier
Christoph Lutz
14 days
RT @ScyllaDB: Learn how xCapture, an eBPF-based Linux thread activity management tool, can capture thread activity of all apps in your syst….
0
2
0
@chris_skyflier
Christoph Lutz
15 days
RT @bsdaemon: AMD still learning the basics about spec side-channels (the things that Intel forgot at this point). what a world.
0
3
0
@chris_skyflier
Christoph Lutz
1 month
11/10.For the curious, a bunch of script to simulate and analyze the behavior:.- Enable public redo strands using gdb: - Simulate RAL contention in a fg using gdb: - Trace lgwr strand deactivation (bpftrace):
0
0
2
@chris_skyflier
Christoph Lutz
1 month
10/10.All of the above was tested on Oracle 19.26 (Exadata with RAC and no private redo strands, and no in-memory undo). On a related note, the log parallelism feature is only available in Oracle Enterprise Edition.
1
0
0
@chris_skyflier
Christoph Lutz
1 month
9/10.lgwr uses async i/o, but if the storage subsystem can't handle the async i/o reqs in parallel, log file parallel write and log file sync times may increase. This effect is described in a great post by savvinov ( and in MOS doc id 1980199.1.
Tweet card summary image
savvinov.com
Log parallelism is an optimization introduced in 9.2 that reduces latch contention due to redo copy to the log buffer by enabling multiple public redo buffers (or “strands”). In many ca…
1
0
1
@chris_skyflier
Christoph Lutz
1 month
8/10.Reason for this is that multiple active strands increase the number of lgwr i/o requests, as lgwr must write buffers from each active strand; in other words, more active strands result in more lgwr i/o requests.
1
0
0
@chris_skyflier
Christoph Lutz
1 month
7/10.When multiple strands are active, Oracle tends to deactivate them fairly aggressively: lgwr disables one strand on every 10th log write (hard-coded in kcrfw_redo_write_driver). Example:
Tweet media one
1
0
0
@chris_skyflier
Christoph Lutz
1 month
6/10.rand_nr is a session-specific random number stored in thread local storage (tls) that is initialized at session creation (by ksurandinit - I haven't investigated this further).
1
0
0
@chris_skyflier
Christoph Lutz
1 month
5/10.With multiple active strands, the strand in step 1 is chosen randomly: .strand = rand_nr % active_strands. If the RAL get fails (in 1. or 2.), rand_nr is incremented to retry with the next strand (wrapping to strand 0 if needed). Examples:
Tweet media one
Tweet media two
1
0
0
@chris_skyflier
Christoph Lutz
1 month
4/10.This is how it works with one active strand:. 1. Attempt a no-wait RAL get on strand 0.2. If this fails, retry once.3. On second failure, activate a new strand.4. Randomly select one of the active strands and acquire its RAL in wait mode.
1
0
0
@chris_skyflier
Christoph Lutz
1 month
3/10.The number of strands is defined by _log_parallelism_max and they are pre-allocated at startup. Each strand has its own redo allocation latch (RAL) and at runtime, sessions activate and deactivate strands dynamically in response to RAL contention (_log_parallelism_dynamic).
1
0
0
@chris_skyflier
Christoph Lutz
1 month
2/10.The log parallelism feature divides the redo log buffer into multiple smaller buffers, called public redo strands. This reduces contention during redo buffer allocation by spreading activity across strands, allowing sessions to allocate space for redo entries in parallel.
1
0
0