Age | Commit message | Author |
|
For string cases, turn `int()` into a thin `strtoll()` wrapper which
attempts to parse the initial portion of the string as a decimal integer
literal, optionally preceded by white space and a sign character.
Also introduce an optional `base` argument for string cases while we're
at it and adjust the existing stdlib test case accordingly.
The function now behaves mostly the same as ECMAScript `parseInt(val, 10)`
for string cases, meaning it will recognize `012` as `12` and not `10` and
it will accept trailing non-digit characters after the initial portion
of the input string.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
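
The parsing behaviour described above can be sketched in plain C. This is a minimal illustration built on `strtoll()`, not the actual ucode implementation; the helper name `parse_leading_int()` and its return convention are assumptions made for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not ucode's int(): parse the leading integer
 * portion of `s` in the given base. strtoll() itself skips leading
 * white space, accepts an optional sign and stops at the first
 * character that is not a valid digit. */
bool parse_leading_int(const char *s, int base, long long *out)
{
    char *end;
    long long v;

    errno = 0;
    v = strtoll(s, &end, base);

    /* No digits consumed, or the value does not fit into long long */
    if (end == s || errno == ERANGE)
        return false;

    *out = v;
    return true;
}
```

With base 10, the input `012` yields 12 because the leading zero is not treated as an octal prefix, and `  -42abc` yields -42 because parsing stops at the first non-digit.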
|
|
- Fix segfault on passing string haystack with non-string needle argument
- Perform strict equality tests against array haystacks
- Make string searches binary safe
- Improve left index string search performance
- Improve right index array search performance
- Add missing test coverage for index() and rindex()
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Add two new functions to deal with encoding and decoding of hexadecimal
digit strings:
- hexenc() - convert the given input value into a lower case hex digit
string, implicitly converting the input argument to a string value
if needed
- hexdec() - decode the given input hex digit string into a byte string,
skipping whitespace or optionally specified characters in the input
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
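
As a rough illustration of the two operations, the following C sketch encodes bytes as lower case hex digits and decodes a hex digit string while skipping white space. The function names and signatures are hypothetical and only model the described behaviour; the optional "characters to skip" argument of `hexdec()` is not covered.

```c
#include <ctype.h>
#include <stddef.h>

/* Encode `len` bytes as lower case hex digits. `out` must provide
 * room for 2 * len + 1 bytes. */
void hex_encode(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i] = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 15];
    }

    out[2 * len] = '\0';
}

/* Map a single hex digit to its value, or -1 if it is not a hex digit */
int hex_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';

    c = tolower(c);

    return (c >= 'a' && c <= 'f') ? c - 'a' + 10 : -1;
}

/* Decode hex digits from `in` into `out`, skipping white space.
 * Returns the number of decoded bytes, or -1 on an invalid character,
 * an odd number of digits or insufficient room in `out`. */
long hex_decode(const char *in, unsigned char *out, size_t outlen)
{
    size_t n = 0;
    int hi = -1, v;

    for (; *in; in++) {
        if (isspace((unsigned char)*in))
            continue;

        v = hex_value((unsigned char)*in);

        if (v < 0)
            return -1;

        if (hi < 0) {
            hi = v;    /* first digit of a pair */
            continue;
        }

        if (n >= outlen)
            return -1;

        out[n++] = (unsigned char)(hi << 4 | v);
        hi = -1;
    }

    return (hi < 0) ? (long)n : -1;
}
```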
|
|
Do not expose the json-c compat functions in ucode's public headers to
avoid clashes when building on systems with modern json-c.
Also remove some explicit json-c/json-c.h includes in places where it is
not needed.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend the `uc_json()` implementation to accept readable objects in
addition to plain input strings. This allows parsing JSON input directly
from open file handles, sockets or other kinds of producer objects without
the need to store the entire JSON source string intermediately in memory.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
lib: add date and time related functions
|
|
Add five new functions to deal with date calculation and timing:
- localtime(), gmtime() - return a broken down calendar date and time
specification from the given epoch (or now, if absent) in local and
UTC time respectively
- timelocal(), timegm() - the inverse operation for the former functions,
taking a date and time specification (interpreted as local or UTC time
respectively) and turning it into an epoch value
- clock() - return the second and nanosecond values of the system clock,
useful for time/performance measurements
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Provide a new API function `uc_stdlib_function()` which allows fetching
the C implementation of the given named standard library function.
This is useful for loadable modules or applications that embed ucode which
want to reuse core functions such as `sprintf()`.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
lib: add argument position support (`%m$`) to `sprintf()` and `printf()`
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
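
For readers unfamiliar with the `%m$` notation: it is the POSIX extension to the `printf()` family that selects an argument by position instead of by order. The C sketch below shows the libc form of the syntax only; how ucode maps it internally is not shown here, and the helper name is made up for the example.

```c
#include <stdio.h>

/* Format "<second> <first>" by referring to the arguments by position.
 * "%2$s" selects the second argument, "%1$s" the first one. Numbered
 * arguments are a POSIX extension and not part of ISO C. */
int format_swapped(char *buf, size_t len, const char *first, const char *second)
{
    return snprintf(buf, len, "%2$s %1$s", first, second);
}
```

Calling `format_swapped(buf, sizeof(buf), "world", "hello")` stores `hello world` in `buf`.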
|
|
Filter the zero padding `0` flag for `%s` formats to achieve consistent
output on Linux and OS X systems.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
OS X does not implement `sigtimedwait()` used by `uc_system()` - add a
simple implementation of it.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Let `require()` always evaluate the executed code in raw mode
- Let `render()` always evaluate the executed code in template mode
- Let `include()` inherit the raw mode semantics of the calling scope
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The current `uc_render()` implementation uses a `fseek()` call on the
`open_memstream()` provided `FILE *` stream to reserve headroom for the
`uc_string_t` header. The `fseek()` call alone does not guarantee that
the underlying buffer length is updated on all libc implementations though.
This may lead to an integer underflow later on when the `uc_string_t`
header length is subtracted from the buffer length after invoking a
template that did not produce any output write operations. In such a
case, a very large value is assigned to `ustr->length` leading to
uninitialized or out-of-bounds memory accesses later on.
Solve this issue by writing the header structure as data using `fwrite()`
which should yield the expected behaviour on all libc environments.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
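
The fix can be illustrated with a small standalone C sketch. The `struct str_header` type below is a hypothetical stand-in for the `uc_string_t` header and `render_with_header()` is not a ucode function; the point is only that writing the header with `fwrite()` makes it part of the stream contents, so the reported size can never be smaller than the header.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the uc_string_t header */
struct str_header {
    size_t refcount;
    size_t length;
};

/* Write a header followed by `body` into a memory stream and return
 * the payload length (total size minus header size), or (size_t)-1 on
 * error. The allocated buffer is returned through `bufp` and must be
 * released with free() by the caller. */
size_t render_with_header(const char *body, char **bufp)
{
    struct str_header hdr = { 1, 0 };
    size_t size = 0;
    FILE *fp = open_memstream(bufp, &size);

    if (!fp)
        return (size_t)-1;

    /* Writing the header as data guarantees size >= sizeof(hdr) after
     * fclose(), even when the body is empty. */
    if (fwrite(&hdr, sizeof(hdr), 1, fp) != 1 || fputs(body, fp) == EOF) {
        fclose(fp);
        free(*bufp);
        *bufp = NULL;
        return (size_t)-1;
    }

    fclose(fp);

    /* Safe subtraction: cannot wrap around to a huge unsigned value */
    return size - sizeof(hdr);
}
```

For an empty body the function returns 0, which is exactly the case in which the `fseek()` based approach could underflow.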
|
|
The most common use case is extracting the value of a single byte at a
specific offset, e.g. to scan a string char-by-char to construct a hash.
Furthermore, constructing an array which contains the results of multiple
`ord()` invocations is trivial while efficiently extracting a single byte
value without the overhead of an intermediate array is not.
Due to that, change `ord()` to always return a single integer byte value
at the offset specified as second argument or at offset 0 in case no
argument was supplied.
That means that `ord("Abc", 0, 1, 2)` will now return `65` instead of the
former `[ 65, 98, 99 ]` result.
Code relying on the former behaviour should either perform multiple calls
to `ord()`, passing different offsets each time or switch to the `struct`
module which allows efficient unpacking of string data.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Instead of treating individual program functions as managed ucode types,
demote uc_function_t values to pointers into a uc_program_t entity
- Promote uc_program_t to a managed type
- Let uc_closure_t claim references to the owning program of the enclosed
uc_function_t
- Redefine the public APIs uc_compile() and uc_vm_execute() to return and
expect a uc_program_t object respectively
- Remove vallist indirection for function loading and let the compiler
emit the function id directly when producing function construction code
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The regular expression `/()/` will match the empty string, causing the
match loop to never advance. Add extra logic to deal with this case,
similar to the empty separator string logic.
Apply a similar exception to replacements of empty search strings, those
should yield the same result as empty regexp matches.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
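
The general technique of forcing progress after an empty match can be sketched with the POSIX regex API. This is an illustration of the idea only, not the ucode match loop; `count_matches()` is a hypothetical helper and the pattern `x*` is used as an example of an expression that matches the empty string.

```c
#include <regex.h>
#include <stddef.h>

/* Count the non-overlapping matches of `pattern` in `subject`.
 * After an empty match the position is advanced by one byte, so the
 * loop terminates even for patterns matching the empty string.
 * Returns -1 if the pattern fails to compile. */
int count_matches(const char *pattern, const char *subject)
{
    const char *p = subject;
    regmatch_t m;
    regex_t re;
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (regexec(&re, p, 1, &m, (p == subject) ? 0 : REG_NOTBOL) == 0) {
        count++;
        p += m.rm_eo;

        if (m.rm_so == m.rm_eo) {
            /* Empty match: stop at the end of the input, otherwise
             * skip one byte to guarantee forward progress. */
            if (*p == '\0')
                break;

            p++;
        }
    }

    regfree(&re);

    return count;
}
```

Without the empty match branch, a pattern such as `x*` applied to `abx` would match the empty string at offset 0 forever.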
|
|
The regular expression `/()/` will match the empty string, causing the
match loop to never advance. Add extra logic to deal with this case.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The regular expression `/()/` will match the empty string, causing the
match loop to never advance. Add extra logic to deal with this case,
similar to the empty separator string logic.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend source objects with a `runpath` field which contains the original
path of the source being executed by the VM.
When instantiating source objects from file paths, the `runpath` will be
set to the `filename`. When instantiating source buffers using
`uc_source_new_buffer()`, the runpath is initially unset.
A new function `uc_source_runpath_set()` can be used to adjust the runtime
path being associated with a source object.
Extend bytecode loading logic to set the source buffer runtime path to the
precompiled bytecode file path being loaded and executed. This is required
for `sourcepath()` and relative paths in `include()` to function correctly
when executing precompiled programs.
Finally rename `uc_program_from_file()` and `uc_program_to_file()` to
`uc_program_load()` and `uc_program_write()` respectively since the load
part now operates on an `uc_source_t` input buffer instead of a plain
`FILE *` handle.
Adjust users of these API functions accordingly.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
We need to release the compiled module function after we executed it in
our VM context.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
A performance shortcut in `ucv_is_equal()` incorrectly led to `NaN === NaN`
being true. Fix the issue by only comparing pointers when the involved
types are not doubles.
Due to fixing `NaN !== NaN`, the `uniq()` function now requires a special
case to treat multiple NaNs as equal for the sake of generating an array
of unique values.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
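
A reduced C model of the problem and its fix is shown below. The `struct val` type and both functions are hypothetical simplifications, not the real `uc_value_t` or `ucv_is_equal()` code; they only demonstrate why a pointer identity shortcut is unsafe for doubles and how a `uniq()` style comparison can special-case NaN.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical tagged value, far simpler than ucode's value types */
enum val_type { VAL_INT, VAL_DOUBLE };

struct val {
    enum val_type type;
    union {
        long long i;
        double d;
    } u;
};

/* Strict equality: the pointer shortcut is only taken for non-doubles,
 * since a NaN must compare unequal even to itself. */
bool val_is_equal(const struct val *a, const struct val *b)
{
    if (a == b && a->type != VAL_DOUBLE)
        return true;

    if (a->type != b->type)
        return false;

    return (a->type == VAL_DOUBLE) ? (a->u.d == b->u.d) : (a->u.i == b->u.i);
}

/* Comparison for deduplication: all NaNs are treated as one value */
bool val_is_same_for_uniq(const struct val *a, const struct val *b)
{
    if (a->type == VAL_DOUBLE && b->type == VAL_DOUBLE &&
        isnan(a->u.d) && isnan(b->u.d))
        return true;

    return val_is_equal(a, b);
}
```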
|
|
The current implementation incorrectly returned `false` which got treated
as `NULL` instead of a boolean `false` value.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of extracting and forwarding recognized conversion directives
from the user supplied format string, properly parse the format string
into its components and reassemble a canonical representation of the
conversion directive internally before handing it to the libc's sprintf()
implementation.
Also take care of selecting the proper conversion specifiers for signed
and unsigned 64bit integer values to fix broken `%d`, `%i`, `%u`, `%o`,
`%x` and `%X` formats on 32bit systems.
While reworking the format logic, also slightly improve `%s` argument
handling by not duplicating the given value if it already is a string,
which reduces the amount of required heap memory.
Ref: https://bugs.openwrt.org/index.php?do=details&task_id=4234
Ref: https://git.openwrt.org/3d3d03479d5b4a976cf1320d29f4bd4937d5a4ba
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
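
The portability issue with 64bit integers is easy to reproduce in plain C: a bare `%d` or `%x` expects an `int`, which is too narrow for `int64_t` values on 32bit systems. A common remedy, sketched below under the assumption that the standard `<inttypes.h>` macros are used (the commit does not state which mechanism ucode chose), is to build the conversion specifier from `PRId64` and friends.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a signed 64bit value; PRId64 expands to the length modifier
 * and conversion that match int64_t on the current platform. */
int format_i64(char *buf, size_t len, int64_t v)
{
    return snprintf(buf, len, "%" PRId64, v);
}

/* Format an unsigned 64bit value as lower case hexadecimal */
int format_x64(char *buf, size_t len, uint64_t v)
{
    return snprintf(buf, len, "%" PRIx64, v);
}
```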
|
|
Previous refactoring of the code led to an invalid internal format pattern
being used to output the formatted JSON data.
Fixes: 9041e24 ("lib: fix uninitialized memory access on handling %J string formats")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Move source object pointer into program entity which is referenced by
each function
- Move lineinfo related routines into source.c and use them from lexer.c
since lineinfo encoding does not belong in the lexical analyzer
- Implement initial infrastructure for detecting source file type,
this is required later to differentiate between plaintext and
precompiled bytecode files
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The uniq() function allows extracting all unique values from a given input
array in an efficient manner. It is roughly equivalent to the following
ucode idiom:
let seen = {};
let unique = filter(array, item => !seen[item]++);
In contrast to the code above, `uniq()` does not rely on implicit
stringification of item values but performs strict equality tests internally.
If equivalence of stringified results is desired, the following code can
be used:
let unique = uniq(map(array, item => "" + item));
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Parse integer literals as unsigned numeric values in order to be able
to represent the entire unsigned 64bit value range
- Stop parsing minus-prefixed integer literals as negative numbers but
treat them as separate minus operator followed by a positive integer
instead
- Only store unsigned numeric constants in bytecode
- Rework numeric comparison logic to be able to handle full 64bit
unsigned integers
- If possible, yield unsigned 64 bit results for additions
- Simplify numeric value conversion API
- Compile code with -fwrapv for defined signed overflow semantics
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
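
Two of the points in the list above, comparing across the signed and unsigned 64bit ranges and yielding unsigned results for additions, can be sketched in C as follows. Both helpers are hypothetical and only illustrate the arithmetic, not the actual ucode comparison or addition code.

```c
#include <stdint.h>

/* Compare a signed with an unsigned 64bit value without losing range.
 * Returns a negative value, zero or a positive value if `a` is less
 * than, equal to or greater than `b`. */
int cmp_i64_u64(int64_t a, uint64_t b)
{
    uint64_t ua;

    /* Any negative number is smaller than every unsigned value */
    if (a < 0)
        return -1;

    ua = (uint64_t)a;

    return (ua > b) - (ua < b);
}

/* Add two non-negative signed values and yield an unsigned result, so
 * that sums above INT64_MAX remain representable. */
uint64_t add_nonneg(int64_t a, int64_t b)
{
    return (uint64_t)a + (uint64_t)b;
}
```

A naive cast of the unsigned operand to `int64_t` before comparing would turn values above `INT64_MAX` into negative numbers and invert the result.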
|
|
Fix various misspellings of "resource".
This commit changes the exported libucode ABI.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Subsequent requires of the same module returned the cached module instance
without increasing the refcount, leading to use-after-free on VM tear down
or garbage collection cycles.
Solve this issue by properly incrementing the refcount before returning
the cached module instance.
Fixes: #25
Fixes: 96f140b ("lib, vm: ensure that require() compiles modules only once")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
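
The ownership rule behind this fix can be modelled with a tiny reference counting sketch. The `struct module` type and the helper functions are invented for the illustration and do not mirror ucode's value or registry API: the cache keeps one reference of its own and every caller receiving the cached instance gets an additional one.

```c
#include <stddef.h>

/* Hypothetical reference counted module object */
struct module {
    int refcount;
};

/* Take an additional reference */
struct module *module_get(struct module *m)
{
    if (m)
        m->refcount++;

    return m;
}

/* Drop a reference; a real implementation would free the object once
 * the count reaches zero. */
void module_put(struct module *m)
{
    if (m)
        m->refcount--;
}

/* Return the cached instance with a new reference for the caller. The
 * reference held by the cache slot itself stays untouched, so the
 * caller dropping its reference cannot free the cached object. */
struct module *require_cached(struct module **slot)
{
    return module_get(*slot);
}
```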
|
|
When parsing the padding size specification of a `J` format, e.g. `%.4J`,
the internally called `atoi()` function might read beyond the end of the
initialized memory within the format buffer, leading to non-deterministic
results.
Avoid overreading the initialized memory by parsing the padding length
manually digit-by-digit.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
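
A bounded digit parser of the kind described can look like the following C sketch. The function name and the choice to stop before overflowing are assumptions made for this example; the relevant property is that the loop never reads beyond the `len` bytes known to be initialized, whereas `atoi()` keeps reading until it hits a non-digit.

```c
#include <limits.h>
#include <stddef.h>

/* Parse decimal digits from at most `len` bytes of `s`. Parsing stops
 * at the first non-digit, at the end of the given range, or before the
 * result would overflow an int. No terminating NUL byte is required. */
int parse_digits(const char *s, size_t len)
{
    size_t i;
    int v = 0;

    for (i = 0; i < len && s[i] >= '0' && s[i] <= '9'; i++) {
        if (v > (INT_MAX - 9) / 10)
            break;

        v = v * 10 + (s[i] - '0');
    }

    return v;
}
```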
|
|
- Ensure that most functions follow the subject_verb naming schema
- Move type related function from value.c to types.c
- Rename value.c to vallist.c
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that all custom typedef and vector declaration type names end with
a "_t" suffix.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of invoking exit(3) from uc_exit(), use a new EXCEPTION_EXIT
exception type to instruct the VM to shutdown cleanly.
This is required to not terminate the host program in case libucode
is embedded and loaded scripts invoke the exit() function.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The naming is an artifact from before the introduction of the new type
system. In the current code, there is nothing special about prototypes,
they're simple object values.
Also introduce a new singular uc_add_function() convenience macro while
we're at it.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that the refcount of the module scope value is increased when caching
it in the global module registry to avoid a double free on VM teardown.
Fixes: 96f140b ("lib, vm: ensure that require() compiles modules only once")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Cache the result of a successful require operation in a global.modules
object and return the cached value for subsequent require calls on the
same module name.
This ensures that a given module is only compiled and executed once,
regardless of how many times it is used.
A reload of the module can be forced by deleting the corresponding
key in the global module table.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
This allows us to drop some token->instruction mapping case switches
in the VM.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The new functions allow encoding and decoding base64 values respectively.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Do not walk up the entire call stack but specifically use the context of the
callframe calling the C function.
Fixes: 3e893e6 ("lib: pass-through "this" context to library function callbacks")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that callbacks invoked by filter(), map(), sort() and replace()
inherit the "this" context that invoked the respective C function.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The sourcepath() function allows querying the filesystem path of the source
file currently being executed by ucode.
The optional depth argument can be used to walk up the include stack to
determine the path of the file that included the current file, the path of
the parent file of the parent file and so on.
By specifying a truish value as second argument, only the directory portion
of the source file path is returned. This is useful to e.g. discover
resources relative to the current source file directory.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The numeric return value was incorrectly stored in an unsigned size_t
variable which later got wrapped in a ucode signed 64bit integer value.
This worked by accident on 64bit systems since (int64_t)(size_t)(-1) == -1,
but it failed on 32bit ones where (int64_t)(size_t)(-1) yields 4294967295
due to the different sizes of the size_t and int64_t types.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
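
The width dependence can be reproduced on any machine by using fixed width types in place of `size_t`. In the sketch below, `uint32_t` models a 32bit `size_t` and `uint64_t` a 64bit one; the function names are made up for the example.

```c
#include <stdint.h>

/* Models a 32bit system: -1 becomes 0xffffffff, which is then zero
 * extended when widened to 64 bits. */
int64_t widen_via_size32(int ret)
{
    return (int64_t)(uint32_t)ret;
}

/* Models a 64bit system: -1 becomes 0xffffffffffffffff, which maps
 * back to -1 on two's complement implementations. */
int64_t widen_via_size64(int ret)
{
    return (int64_t)(uint64_t)ret;
}

/* Keeping the value in a signed type preserves it regardless of the
 * platform word size. */
int64_t widen_signed(int ret)
{
    return (int64_t)ret;
}
```

`widen_via_size32(-1)` yields 4294967295, while the other two variants yield -1.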
|