and with this, i'm now pretty much done modifying grub's
crappy code. this experiment, started in 2023, has now
pretty much concluded.
the original GNU code was poorly written, hardcoded
everywhere, and not documented or commented at all.
i had to learn what the code is doing through inference
instead, and i'm pretty sure that these explanations
cover everything. i hope?
maybe the frenchman can explain anything i missed. haha.
Signed-off-by: Leah Rowe <leah@libreboot.org>
yet another optimisation for weaker compilers - but
some modern compilers may not optimise well for this
code either.
this reduces the number of references to the struct,
which are very expensive (they happen 48000 times per
second) on very old CPUs.
Signed-off-by: Leah Rowe <leah@libreboot.org>
the frame[] array is never actually used meaningfully.
frame[ringpos] on the decode_state is only ever set
here, and the value is never read back. the only other
use is the size of the whole array, for sizeof in
print_stats, but we can just declare that manually.
since we also know that this value never changes, we
can use a global define for the sizeof entry in
print_stats, thereby simplifying operation further.
Signed-off-by: Leah Rowe <leah@libreboot.org>
make it easier to read by clearer variable naming.
this change also reduces memory accesses (fewer struct
dereferences - see: struct decoder_state), when using
much weaker/older compilers that don't optimise
properly. this, in the most active part of the code,
which is called.... 48000 times a second. peanuts on
modern CPUs, but on old (early 90s) CPUs it makes a
big difference.
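a rough sketch of the idea, not the real code: the
struct, its fields and the function name below are all
made up for illustration. the hot loop copies the
fields it needs into locals once, works on those, and
writes them back once at the end.

```c
#include <stddef.h>

/* hypothetical stand-in; NOT the real struct decoder_state */
struct demo_state {
	int pulse[8];
	size_t ringpos;
	long total;
};

/*
 * read st->ringpos and st->total once, work on locals,
 * then write both back once. a weak compiler no longer
 * has to reload them through the pointer on every sample.
 */
static void
demo_step(struct demo_state *st, const int *samples, size_t n)
{
	size_t pos = st->ringpos;
	long total = st->total;
	size_t i;

	for (i = 0; i < n; i++) {
		/* running sum over the last 8 samples */
		total += samples[i] - st->pulse[pos];
		st->pulse[pos] = samples[i];
		if (++pos == 8)
			pos = 0; /* wrap without modulo */
	}

	st->ringpos = pos;
	st->total = total;
}
```

a modern optimising compiler will often do this on its
own; the point is not to rely on that.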
Signed-off-by: Leah Rowe <leah@libreboot.org>
only use the old fallback, or /dev/urandom
/dev/random blocks on some older unix machines,
or in embedded environments that may never
have enough entropy, causing the code to hang.
urandom is most certainly expected to exist on
pretty much anything since the mid 90s.
i could probably re-add the arc4random setup
for BSDs. i'll think about it. gotta do that
portably too.
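for illustration only, a minimal sketch of the urandom
path. demo_urandom is a made-up name, and the fallback
itself is not shown; the caller is assumed to use it
whenever this returns -1.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * fill buf with len bytes from /dev/urandom.
 * returns 0 on success, or -1 if the device is missing
 * or the read came up short (caller then falls back).
 */
static int
demo_urandom(unsigned char *buf, size_t len)
{
	FILE *fp;
	size_t got;

	if ((fp = fopen("/dev/urandom", "rb")) == NULL)
		return -1;

	got = fread(buf, 1, len, fp);
	fclose(fp);

	return (got == len) ? 0 : -1;
}
```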
Signed-off-by: Leah Rowe <leah@libreboot.org>
if someone calls rhex fast enough, the timestamp
may not change. this mitigates that by adding
a counter value to the mix.
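roughly what that looks like, as a sketch. demo_seed
and the multiplier are my own illustration here, not
what rhex actually does: the counter changes on every
call, so two calls in the same second still differ.

```c
#include <time.h>

/*
 * hypothetical mixer: timestamp xor a per-call counter.
 * the odd multiplier just spreads the counter over more
 * bits; multiplying by an odd constant never maps two
 * different counters to the same value.
 */
static unsigned long
demo_seed(void)
{
	static unsigned long counter;
	unsigned long t = (unsigned long)time(NULL);

	counter++;
	return t ^ (counter * 2654435761UL);
}
```

this is only about avoiding repeats, it does not make
the value any harder to guess.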
Signed-off-by: Leah Rowe <leah@libreboot.org>
the last change was good, but this code, again,
has to do these calculations 48,000 times a second.
trivial on new computers. but now try it on a
computer from 1992.
we should try to make this as fast as possible :)
older compilers especially don't optimise these
checks. this patch shifts it to one subtraction and
one unsigned comparison, rather than two separate
less-than and greater-than checks. this trick is often
used in... literally exactly this type of program.
on a good compiler this will compile to an add, cmp
and conditional jump.
less readable, but the results (set 1 or 0) make it
pretty obvious what it does, after a few seconds.
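the trick in general form, as a sketch (demo_in_range
is a made-up name, and it assumes lo <= hi): if v is
below lo, the unsigned subtraction wraps around to a
huge number, so the single comparison fails for both
the too-low and the too-high case.

```c
/*
 * 1 if lo <= v <= hi, else 0. assumes lo <= hi.
 * one subtraction and one unsigned comparison.
 */
static int
demo_in_range(int v, int lo, int hi)
{
	return ((unsigned)v - (unsigned)lo) <=
	    ((unsigned)hi - (unsigned)lo);
}
```

when lo and hi are constants, the right-hand side is
computed at compile time.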
Signed-off-by: Leah Rowe <leah@libreboot.org>
i turned this into an abs() call earlier, but that
isn't obviously readable to some people.
make it absolutely clear what this does. also reduces
use of library calls.
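the spelled-out form is something like this sketch
(demo_abs_diff is a hypothetical name). it assumes
a - b does not overflow, which holds for small
values like audio samples.

```c
/* absolute difference, written out instead of abs() */
static int
demo_abs_diff(int a, int b)
{
	int d = a - b;

	if (d < 0)
		d = -d;

	return d;
}
```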
Signed-off-by: Leah Rowe <leah@libreboot.org>
fread() may return short reads, whereas the current
code assumes either EOF or a full read.
change if to a while. really, it's that simple.
just loop until it's done. i probably b0rked this
myself when refactoring the GNU code.
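the general shape of that loop, as a sketch rather
than the actual patch (demo_read_full is a made-up
name):

```c
#include <stddef.h>
#include <stdio.h>

/*
 * keep calling fread until len bytes have arrived.
 * returns the number of bytes read; less than len
 * only on EOF or error (caller checks feof/ferror).
 */
static size_t
demo_read_full(FILE *fp, unsigned char *buf, size_t len)
{
	size_t got = 0;
	size_t n;

	while (got < len) {
		n = fread(buf + got, 1, len - got, fp);
		if (n == 0)
			break; /* EOF or error */
		got += n;
	}

	return got;
}
```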
Signed-off-by: Leah Rowe <leah@libreboot.org>
i treated ftell errors as fatal, but if ftell fails
with ESPIPE, and someone's using -d, the program may
exit immediately, even though there's no problem.
instead, skip printing the offset (basically no debug).
this fixes a bug that i introduced myself, when i forked
this code, because i added that error check; the GNU
code didn't have any check whatsoever.
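a sketch of the check, assuming a POSIX system (ESPIPE
comes from POSIX, not ISO C). demo_offset and its
return values are invented for this example.

```c
#include <errno.h>
#include <stdio.h>

/*
 * returns 1 and sets *off when an offset is available,
 * 0 when the stream is not seekable (pipe, fifo, tty),
 * -1 on any other ftell error.
 */
static int
demo_offset(FILE *fp, long *off)
{
	long pos;

	errno = 0;
	pos = ftell(fp);
	if (pos >= 0) {
		*off = pos;
		return 1;
	}

	if (errno == ESPIPE)
		return 0; /* not an error: just skip the offset */

	return -1;
}
```

the debug printing would then only show the offset
when this returns 1.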
Signed-off-by: Leah Rowe <leah@libreboot.org>
we currently read small amounts of data with fread,
repeatedly, which is quite taxing on the CPU, on
very old systems.
48khz audio. 48000 calls to fread() per second?
yeah. let's optimise this.
performance now should be roughly O(1) in practice.
this and the other recent changes mean no modulo
or division, reduced branching, fewer memory loads,
and lots of buffering.
the buffering here is quite conservative, so the human
won't notice any difference. we're cutting the number
of times we call fread by a factor of several thousand,
but you'll still see text scrolling down pretty quick,
with minimal lag.
the old GNU code i forked was terrible at this.
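the buffering idea, sketched with made-up names and a
made-up buffer size (DEMO_BUF, struct demo_reader and
demo_next_sample are not from the real code): one
fread fills a chunk, and samples are then handed out
of memory until the chunk runs dry.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_BUF 4096 /* hypothetical size, in samples */

struct demo_reader {
	FILE *fp;
	short buf[DEMO_BUF];
	size_t len; /* samples currently in buf */
	size_t pos; /* next sample to hand out */
};

/*
 * returns 1 and stores the next sample in *out,
 * or 0 at EOF/error. fread is only called when the
 * buffer is empty, not once per sample.
 */
static int
demo_next_sample(struct demo_reader *r, short *out)
{
	if (r->pos == r->len) {
		r->len = fread(r->buf, sizeof(r->buf[0]),
		    DEMO_BUF, r->fp);
		r->pos = 0;
		if (r->len == 0)
			return 0;
	}

	*out = r->buf[r->pos++];
	return 1;
}
```

the caller sets len and pos to 0 before the first
call. on a live pipe, fread returns once the requested
chunk is full or the input ends, so a smaller DEMO_BUF
means less lag before output appears.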
Signed-off-by: Leah Rowe <leah@libreboot.org>