path: root/compiler.c
Age  Commit message  Author
2022-01-04  treewide: rework numeric value handling  Jo-Philipp Wich
- Parse integer literals as unsigned numeric values in order to be able to represent the entire unsigned 64bit value range
- Stop parsing minus-prefixed integer literals as negative numbers but treat them as separate minus operator followed by a positive integer instead
- Only store unsigned numeric constants in bytecode
- Rework numeric comparison logic to be able to handle full 64bit unsigned integers
- If possible, yield unsigned 64 bit results for additions
- Simplify numeric value conversion API
- Compile code with -fwrapv for defined signed overflow semantics
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
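Comparison logic that covers the full unsigned 64-bit range has to order values that may be either signed or unsigned. A plain cast gives wrong answers here, since `-1` converted to `uint64_t` compares greater than everything. The following is a minimal sketch of one way to do such a comparison. The `num_t` type and the `num_s()`, `num_u()` and `num_cmp()` helpers are hypothetical names chosen for illustration and are not ucode's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number type for illustration; not ucode's real representation. */
typedef struct {
	bool is_unsigned;
	union { int64_t s; uint64_t u; } v;
} num_t;

num_t num_s(int64_t n)  { num_t r; r.is_unsigned = false; r.v.s = n; return r; }
num_t num_u(uint64_t n) { num_t r; r.is_unsigned = true;  r.v.u = n; return r; }

/* Three-way compare of two unsigned values: yields -1, 0 or 1. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Three-way compare that stays correct for mixed signedness. */
int num_cmp(num_t a, num_t b)
{
	if (a.is_unsigned && b.is_unsigned)
		return cmp_u64(a.v.u, b.v.u);

	if (!a.is_unsigned && !b.is_unsigned)
		return (a.v.s > b.v.s) - (a.v.s < b.v.s);

	/* Mixed case: a negative value is smaller than any unsigned value,
	 * a non-negative signed value can be compared as unsigned safely. */
	if (a.is_unsigned)
		return (b.v.s < 0) ? 1 : cmp_u64(a.v.u, (uint64_t)b.v.s);

	return (a.v.s < 0) ? -1 : cmp_u64((uint64_t)a.v.s, b.v.u);
}
```

With this sketch, `num_cmp(num_u(UINT64_MAX), num_s(-1))` orders the unsigned maximum above `-1` instead of treating both as the same bit pattern.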
2021-12-01  syntax: disallow keywords in object property shorthand notation  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-10-11  syntax: introduce optional chaining operators  Jo-Philipp Wich
Introduce new operators `?.`, `?.[…]` and `?.(…)` to simplify looking up deeply nested property chains in a secure manner.

The `?.` operator behaves like the `.` property access operator but yields `null` if the left hand side is `null` or not an object.

Like `?.`, the `?.[…]` operator behaves like the `[…]` computed property access but yields `null` if the left hand side is `null` or neither an object nor an array.

Finally, the `?.(…)` operator behaves like the function call operator `(…)` but yields `null` if the left hand side is `null` or not a callable function.

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-09-19  compiler: properly handle jumps to offset 0  Jo-Philipp Wich
When compiling certain expressions as the first statement of a ucode program, e.g. a while loop in raw mode, a jump instruction to offset zero is emitted, which was incorrectly treated as a placeholder by the compiler.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
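The class of bug described above arises when "target is zero" doubles as "target not yet known". The sketch below shows one common way to keep the two apart, using a dedicated sentinel value for unpatched jumps. It is an illustration only: `chunk_t`, `JMP_PLACEHOLDER` and the helper functions are hypothetical names, and the commit's actual fix may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for "jump target not known yet".
 * Offset 0 remains a perfectly valid jump target. */
#define JMP_PLACEHOLDER UINT32_MAX

enum { CHUNK_MAX = 64 };

/* Hypothetical bytecode chunk that stores only jump targets. */
typedef struct {
	uint32_t code[CHUNK_MAX];
	size_t len;
} chunk_t;

/* Append a jump with the given target and return its position. */
size_t emit_jump(chunk_t *c, uint32_t target)
{
	assert(c->len < CHUNK_MAX);
	c->code[c->len] = target;
	return c->len++;
}

/* A jump counts as patched unless it still holds the sentinel. */
int jump_is_patched(const chunk_t *c, size_t pos)
{
	return c->code[pos] != JMP_PLACEHOLDER;
}

/* Fill in the real target of a previously emitted placeholder jump. */
void patch_jump(chunk_t *c, size_t pos, uint32_t target)
{
	assert(!jump_is_patched(c, pos));
	c->code[pos] = target;
}
```

A backward jump to the very start of the program, emitted as `emit_jump(&c, 0)`, is then recognised as a finished instruction rather than as a pending placeholder.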
2021-07-11  treewide: harmonize function naming  Jo-Philipp Wich
- Ensure that most functions follow the subject_verb naming schema
- Move type-related functions from value.c to types.c
- Rename value.c to vallist.c
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11  treewide: move header files into dedicated directory  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11  treewide: consolidate typedef naming  Jo-Philipp Wich
Ensure that all custom typedef and vector declaration type names end with a "_t" suffix. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11  treewide: eliminate dead code and unused functions  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-05  compiler: don't segfault on invalid declaration expressions  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-08  compiler: improve mapping of binary operator tokens to instructions  Jo-Philipp Wich
Instead of relying on a switch/case mapping of token values to corresponding VM instructions, infer the instruction number arithmetically. This shrinks the compiled size on x86/64 by about 250 bytes. Also emit I_LE and I_GE instructions for `<=` and `>=` comparisons instead of transforming these into I_GT and I_LT negations. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
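Inferring the instruction arithmetically works when the operator tokens and their instructions are declared in the same relative order, so that a single offset maps one range onto the other. A minimal sketch under that assumption follows. The enum members and `binop_insn()` are hypothetical and far smaller than the real token and instruction sets.

```c
#include <assert.h>

/* Hypothetical token and instruction enums. The binary operators
 * appear in the same relative order in both lists. */
typedef enum { TK_LPAREN, TK_ADD, TK_SUB, TK_MUL, TK_DIV, TK_MOD, TK_RPAREN } tok_t;
typedef enum { I_NOP, I_LOAD, I_ADD, I_SUB, I_MUL, I_DIV, I_MOD } insn_t;

/* Guard the ordering assumption at compile time (C11). */
_Static_assert(TK_MOD - TK_ADD == I_MOD - I_ADD,
               "operator tokens and instructions must stay aligned");

/* Map a binary operator token to its instruction, or -1 if the
 * token is not a binary operator. No switch/case table needed. */
int binop_insn(int tok)
{
	if (tok < TK_ADD || tok > TK_MOD)
		return -1;

	return I_ADD + (tok - TK_ADD);
}
```

The trade-off is that the two enums become coupled: reordering either list silently breaks the mapping unless a check such as the static assertion above catches it.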
2021-05-25  lexer, compiler: separate TK_BOOL token into TK_TRUE and TK_FALSE tokens  Jo-Philipp Wich
The token type split allows us to drop the token value union in the reserved word list in a subsequent commit.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-18  syntax: introduce `const` support  Jo-Philipp Wich
Introduce support for declaring constant variables through the `const` keyword. Variables declared with `const` follow the same scoping rules as `let` declared ones. In contrast to normal variables, `const` ones may not be assigned to after their declaration. Any attempt to do so will result in a syntax error during compilation. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-18  compiler, lexer: add NO_LEGACY define to disable legacy syntax features  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-18  syntax: implement `delete` as proper operator  Jo-Philipp Wich
Turn `delete` into a proper operator mimicking ECMAScript semantics.

Also transparently turn deprecated `delete(obj, propname)` function calls into `delete obj.propname` expressions during compilation. When strict mode is active, legacy delete() calls throw a syntax error instead.

Finally, drop the `delete()` function from the stdlib as it is now shadowed by the delete operator syntax.

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-11  compiler: fix local for-loop initializer variable declarations  Jo-Philipp Wich
In a loop statement like `for (let x = 1, y = 2; ...)` the initialization statement was incorrectly interpreted as `let x = 1; y = 2` instead of the correct `let ..., y = 2`, triggering reference error exceptions in strict mode.

Solve the issue by continuing to parse the rest of the comma expression sequence as a declaration list expression when the initializer is compiled in local mode.

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-11  compiler: properly parse slashes in parenthesized division expressions  Jo-Philipp Wich
Due to the special code path parsing the leading label portion of a parenthesized expression, slashes following a label were improperly treated as regular expression literal delimiters, emitting a syntax error when an otherwise valid expression such as `a / 1` was being parsed as the first subexpression of a parenthesized expression.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-05  compiler: properly handle break/continue in nested scopes  Jo-Philipp Wich
When emitting byte code for break or continue statements, ensure that local variables in all containing scopes up to the loop body scope are popped, not just those in the same scope the statement is located in. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
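The difference between the old and the fixed behaviour can be sketched by counting how many locals must be popped before the jump is emitted. The code below is an illustration under stated assumptions: `local_t`, the scope-depth bookkeeping and both helper functions are hypothetical, and locals are assumed to be recorded in declaration order so that the deepest ones sit at the end of the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compiler-side record of one declared local variable. */
typedef struct {
	const char *name;
	int depth;   /* scope nesting depth at the point of declaration */
} local_t;

/* Fixed behaviour: pop every local declared deeper than the scope
 * that contains the loop statement, across all nested scopes. */
size_t pops_for_loop_exit(const local_t *locals, size_t n, int loop_depth)
{
	size_t pops = 0;

	while (n > 0 && locals[n - 1].depth > loop_depth) {
		pops++;
		n--;
	}

	return pops;
}

/* Old behaviour for contrast: only locals of the scope holding the
 * break/continue statement itself are counted. */
size_t pops_same_scope_only(const local_t *locals, size_t n, int cur_depth)
{
	size_t pops = 0;

	while (n > 0 && locals[n - 1].depth == cur_depth) {
		pops++;
		n--;
	}

	return pops;
}
```

For a program shaped like `let a; while (…) { let b, c; if (…) { let d; { let e; break; } } }`, the first function reports four pops (`b`, `c`, `d`, `e`) while the second reports only one (`e`), which would leave three stale values on the stack after the loop.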
2021-05-04  compiler: properly handle keyword in parenthesized property access expression  Jo-Philipp Wich
Due to the special code path parsing the leading label portion of a parenthesized expression, keywords following a property access operator (TK_DOT, `.`) weren't properly handled, emitting a syntax error when an otherwise valid expression such as `value.default` was being parsed as the first subexpression of a parenthesized expression.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-04  compiler: fix stack mismatch on compiling `use strict` statements  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-04  syntax: implement support for 'use strict' pragma  Jo-Philipp Wich
Support per-file and per-function `"use strict";` statements to opt into strict variable handling from ucode source code.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-04-29  compiler: fix segfault on parsing invalid pre/post increment expressions  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-04-29  compiler, lexer: improve lexical state handling  Jo-Philipp Wich
- Instead of disambiguating division operator vs. regexp literal by looking at the preceding token, raise a "no regexp" flag within the appropriate parser states to tell the lexer how to treat a forward slash when parsing the next token
- Introduce another "no keyword" flag which disables parsing labels into keywords when reading the next token and set it in the appropriate parser states. This allows using reserved names in object declarations and property access expressions
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
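The two flags described above can be pictured as hints that the parser raises before asking the lexer for the next token. The following is a minimal sketch of that idea. The flag names, token names, keyword list and both functions are hypothetical and only model the decision, not a full lexer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags the parser raises before requesting a token. */
enum {
	LEX_NO_REGEXP  = 1 << 0,  /* a '/' must be the division operator */
	LEX_NO_KEYWORD = 1 << 1   /* words are plain labels, never keywords */
};

typedef enum { TOK_DIV, TOK_REGEXP, TOK_KEYWORD, TOK_LABEL } tok_t;

/* Decide how to treat a forward slash based on the parser's hint. */
tok_t lex_slash(unsigned flags)
{
	return (flags & LEX_NO_REGEXP) ? TOK_DIV : TOK_REGEXP;
}

/* Classify a word, honouring the "no keyword" hint. */
tok_t lex_word(const char *word, unsigned flags)
{
	static const char *const keywords[] = { "default", "if", "while" };
	size_t i;

	if (!(flags & LEX_NO_KEYWORD))
		for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
			if (strcmp(word, keywords[i]) == 0)
				return TOK_KEYWORD;

	return TOK_LABEL;
}
```

After a `.` operator the parser would set `LEX_NO_KEYWORD`, so that `default` in `value.default` is read as a label. After an operand it would set `LEX_NO_REGEXP`, so that `a / 1` is read as a division.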
2021-04-27  treewide: ISO C / pedantic compliance  Jo-Philipp Wich
- Shuffle typedefs to avoid need for non-compliant forward declarations
- Fix non-compliant empty struct initializers
- Remove use of braced expressions
- Remove use of anonymous unions
- Avoid `void *` pointer arithmetic
- Fix several warnings reported by gcc -pedantic mode and clang 11 compilation
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-04-25  treewide: rework internal data type system  Jo-Philipp Wich
Instead of relying on json_object values internally, use custom types to represent the different ucode value types which brings a number of advantages compared to the previous approach:
- Due to the use of tagged pointers, small integer, string and bool values can be stored directly in the pointer addresses, vastly reducing required heap memory
- Ability to create circular data structures such as `let o; o = { test: o };`
- Ability to register custom `tostring()` function through prototypes
- Initial mark/sweep GC implementation to tear down circular object graphs on VM deinit

The change also paves the way for possible future extensions such as constant variables and meta methods for custom resource types.

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
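Tagged pointers exploit the fact that heap pointers are aligned, which leaves their lowest bits free to carry a type tag. A value whose tag says "integer" or "bool" then keeps its payload in the remaining bits and needs no heap allocation at all. The sketch below illustrates the general technique with a two-bit tag. `val_t`, the tag values and all helpers are hypothetical and do not reflect ucode's actual encoding, and small strings are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value handle: either a real pointer or an inline payload. */
typedef uintptr_t val_t;

/* Hypothetical two-bit tag stored in the lowest bits of the handle. */
enum { TAG_PTR = 0, TAG_INT = 1, TAG_BOOL = 2, TAG_MASK = 3 };

unsigned val_tag(val_t v)
{
	return (unsigned)(v & TAG_MASK);
}

/* Store a small unsigned integer inline; the top two bits must be free. */
val_t val_from_uint(uintptr_t n)
{
	assert(n <= (UINTPTR_MAX >> 2));
	return (n << 2) | TAG_INT;
}

uintptr_t val_to_uint(val_t v)
{
	assert(val_tag(v) == TAG_INT);
	return v >> 2;
}

/* Store a boolean inline. */
val_t val_from_bool(bool b)
{
	return ((uintptr_t)b << 2) | TAG_BOOL;
}

bool val_to_bool(val_t v)
{
	assert(val_tag(v) == TAG_BOOL);
	return (v >> 2) != 0;
}

/* Wrap a heap pointer; it must be at least 4-byte aligned so that
 * its two lowest bits are zero and read back as TAG_PTR. */
val_t val_from_ptr(void *p)
{
	uintptr_t u = (uintptr_t)p;

	assert((u & TAG_MASK) == 0);
	return u;
}

void *val_to_ptr(val_t v)
{
	assert(val_tag(v) == TAG_PTR);
	return (void *)v;
}
```

In this scheme `val_from_uint(42)` costs one machine word and no allocation, whereas a boxed integer object would need a heap block plus a pointer to it.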
2021-04-24  treewide: fix issues reported by clang code analyzer  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-31  compiler: actually expand block scope fix to for/while alt syntax  Jo-Philipp Wich
Fixes: 97bf297 ("compiler: ensure that alternative if/for/while syntax has own block scope")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-31  compiler: ensure that alternative if/for/while syntax has own block scope  Jo-Philipp Wich
The `if ...: endif`, `for ...: endfor`, `while ...: endwhile` etc. syntax statements are supposed to have their own lexical scope, like curly brace blocks in normal statements. Without this, local variable declarations within such blocks would incorrectly shift stack offsets for the remainder of the program. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-29  compiler: rework switch statement code generation  Jo-Philipp Wich
- Initialize stack slots belonging to skipped local variable declarations
- Group switch case value tests after switch statement body
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-24  compiler: fix for/break miscompilation  Jo-Philipp Wich
When patching jump targets for break statements while compiling for-loop statements, jump beyond the instructions popping intermediate loop variables off the stack to fix a stack position mismatch between compiler and vm.

Before that change, local loop body variables got popped twice, breaking the expected stack layout.

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-23  compiler: fix another try/catch miscompilation  Jo-Philipp Wich
When skipping over the catch block of a try/catch statement, make sure to emit the jump after the try scope variables have been popped off the stack in order to prevent a stack position mismatch between compiler and vm.
Fixes: 9ad9afb ("compiler: fix try/catch miscompilation")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-12  compiler: fix parsing of arrow functions with single expression body  Jo-Philipp Wich
Ensure that an arrow function body expression is parsed with P_ASSIGN precedence to not greedily consume comma expressions.

This ensures that an expression like `() => 1, 2` is parsed as function [`() => 1`], integer [`2`] and not as function [`() => 1, 2`].

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-03-11  compiler: fix switch case->default fallthrough  Jo-Philipp Wich
Simplify handling of the default case in switch statements. Instead of jumping over the default block, simply record the start address of the block since the initial switch jump is patched into the first non-default case already. This also leads to slightly smaller bytecode.

Previously, when a case branch fell through into a default block, it hit the default skip jump, which jumped back into the first case, which then fell through into the default skip jump again, leading to an endless loop.

Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-02-26  compiler: fix try/catch miscompilation  Jo-Philipp Wich
When skipping catch blocks with exception variables, jump beyond the instruction popping the exception variable off the stack to fix a stack position mismatch between compiler and vm. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-02-26  lexer: improvements  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-02-17  syntax: support ES2015 computed property names  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-02-17  syntax: support ES2015 shorthand property names  Jo-Philipp Wich
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-02-17  treewide: rewrite ucode interpreter  Jo-Philipp Wich
Replace the former AST walking interpreter implementation with a single pass bytecode compiler and a corresponding virtual machine.

The rewrite lays the groundwork for a couple of improvements which will be subsequently implemented:
- Ability to precompile ucode sources into binary byte code
- Strippable debug information
- Reduced runtime memory usage

Signed-off-by: Jo-Philipp Wich <jo@mein.io>