summaryrefslogtreecommitdiffhomepage
diff options
context:
space:
mode:
authorJo-Philipp Wich <jo@mein.io>2025-02-20 12:09:22 +0100
committerJo-Philipp Wich <jo@mein.io>2025-02-20 12:09:22 +0100
commit338aa2ebf00e370c7b17c9c022571f2fbc858007 (patch)
tree481791574896d4e391be89b289bc74238529198d
parent60f05ac60bfe5c4c5b013b7ae69b9f2913a6aa90 (diff)
docs: add article with array specific information
Ref: #277 Signed-off-by: Jo-Philipp Wich <jo@mein.io>
-rw-r--r--docs/tutorials/04-arrays.md647
-rw-r--r--docs/tutorials/tutorials.json3
2 files changed, 650 insertions, 0 deletions
diff --git a/docs/tutorials/04-arrays.md b/docs/tutorials/04-arrays.md
new file mode 100644
index 0000000..789e990
--- /dev/null
+++ b/docs/tutorials/04-arrays.md
@@ -0,0 +1,647 @@
+Arrays in ucode are ordered collections that can store any ucode value. Unlike
+many other scripting languages where arrays are implemented as hash tables or
+linked lists, ucode arrays are true arrays in memory. This implementation detail
+provides several distinctive characteristics that developers should understand
+when working with arrays in ucode.
+
+## Key Characteristics of Ucode Arrays
+
+### True Memory Arrays
+
+Ucode arrays are implemented as true arrays in memory, which means:
+- They offer fast random access to elements by index
+- They are stored contiguously in memory
+- Memory allocation expands as needed to accommodate the highest used index
+
+### Sparse Array Behavior
+
+Because ucode arrays are true arrays in memory:
+- Memory is always allocated contiguously up to the highest used index
+- All positions (including unused ones) consume memory
+- "Empty" or unused positions contain `null` values
+- There is no special optimization for sparse arrays
+
+### Negative Index Support
+
+Ucode arrays provide convenient negative indexing:
+- `-1` refers to the last element
+- `-2` refers to the second-last element
+- And so on, allowing easy access to elements from the end of the array
+
+### Type Flexibility
+
+Arrays in ucode can hold any ucode value type:
+- Booleans, numbers (integers and doubles), strings
+- Objects and arrays (allowing nested arrays)
+- Functions and null values
+- No type restrictions between elements (unlike typed arrays in some languages)
+
+## Core Array Functions
+
+### Array Information Functions
+
+#### {@link module:core#length|length(x)} → {number}
+
+Returns the number of elements in an array. This is one of the most fundamental
+array operations in ucode.
+
+```
+let fruits = ["apple", "banana", "orange"];
+length(fruits); // 3
+
+let sparse = [];
+sparse[10] = "value";
+length(sparse); // 11 (includes empty slots)
+```
+
+For arrays, `length()` returns the highest index plus one, which means it
+includes empty slots in sparse arrays. If the input is not an array, string, or
+object, `length()` returns null.
+
+#### {@link module:core#index|index(arr_or_str, needle)} → {number}
+
+Searches for a value in an array and returns the index of the first matching occurrence.
+
+```
+let colors = ["red", "green", "blue", "green"];
+index(colors, "green"); // 1 (returns first match)
+index(colors, "yellow"); // -1 (not found)
+```
+
+Unlike many other languages where array search functions return -1 or null for
+non-matching items, `index()` in ucode specifically returns -1 when the value
+isn't found. It returns null only if the first argument is neither an array nor
+a string.
+
+#### {@link module:core#rindex|rindex(arr_or_str, needle)} → {number}
+
+Similar to `index()`, but searches backward from the end of the array:
+
+```
+let colors = ["red", "green", "blue", "green"];
+rindex(colors, "green"); // 3 (last occurrence)
+```
+
+#### Checking if a Value is an Array
+
+To determine if a value is an array, use the `type()` function:
+
+```
+function isArray(value) {
+ return type(value) == "array";
+}
+
+isArray([1, 2, 3]); // true
+isArray("string"); // false
+isArray({key: "value"}); // false
+isArray(null); // false
+```
+
+The `type()` function is extremely useful for defensive programming, especially
+in ucode where functions often need to determine the type of their arguments to
+process them correctly.
+
+### Manipulation Functions
+
+#### {@link module:core#push|push(arr, ...values)} → {*}
+
+Adds one or more elements to the end of an array and returns the last pushed
+value.
+
+```
+let x = [1, 2, 3];
+push(x, 4, 5, 6); // 6
+print(x); // [1, 2, 3, 4, 5, 6]
+```
+
+Returns null if the array was empty or if a non-array argument was passed.
+
+#### {@link module:core#pop|pop(arr)} → {*}
+
+Removes the last element from an array and returns it.
+
+```
+let x = [1, 2, 3];
+let lastItem = pop(x); // 3
+print(x); // [1, 2]
+```
+
+Returns null if the array was empty or if a non-array argument was passed.
+
+#### {@link module:core#unshift|unshift(arr, ...values)} → {*}
+
+Adds one or more elements to the beginning of an array and returns the last
+value added.
+
+```
+let x = [3, 4, 5];
+unshift(x, 1, 2); // 2
+print(x); // [1, 2, 3, 4, 5]
+```
+
+#### {@link module:core#shift|shift(arr)} → {*}
+
+Removes the first element from an array and returns it.
+
+```
+let x = [1, 2, 3];
+let firstItem = shift(x); // 1
+print(x); // [2, 3]
+```
+
+Returns null if the array was empty or if a non-array argument was passed.
+
+### Transformation Functions
+
+#### {@link module:core#map|map(arr, fn)} → {Array}
+
+Creates a new array populated with the results of calling a provided function on
+every element in the calling array.
+
+```
+let numbers = [1, 2, 3, 4];
+let squares = map(numbers, x => x * x); // [1, 4, 9, 16]
+```
+
+Note: The callback function receives three arguments:
+1. The current element value
+2. The current index
+3. The array being processed
+
+```
+let values = map(["foo", "bar", "baz"], function(value, index, array) {
+ return `${index}: ${value} (from array of length ${length(array)})`;
+});
+```
+
+##### Important Pitfall with Built-in Functions
+
+A common mistake when using `map()` is passing a built-in function directly as
+the callback. Consider this example attempting to convert an array of strings to
+integers:
+
+```
+// ⚠️ INCORRECT: This will not work as expected!
+let strings = ["10", "32", "13"];
+let nums = map(strings, int); // Results will be unpredictable!
+```
+
+This fails because the `map()` function calls the callback with three arguments:
+1. The current value (`"10"`, `"32"`, etc.)
+2. The current index (`0`, `1`, `2`)
+3. The original array (`["10", "32", "13"]`)
+
+So what actually happens is equivalent to:
+```
+int("10", 0, ["10", "32", "13"]) // Interprets 0 as base parameter!
+int("32", 1, ["10", "32", "13"]) // Interprets 1 as base parameter!
+int("13", 2, ["10", "32", "13"]) // Interprets 2 as base parameter!
+```
+
+The second argument to `int()` is interpreted as the numeric base, causing
+unexpected conversion results:
+
+- `"10"` in base 0 is interpreted as decimal 10 (base 0 is a special case that auto-detects the base)
+- `"32"` in base 1 produces `NaN` because base 1 is invalid (a numeral system needs at least 2 distinct digits)
+- `"13"` in base 2 produces `1` because in binary only `0` and `1` are valid digits - it converts `"1"` successfully and stops at the invalid character `"3"`
+
+The actual result would be `[10, NaN, 1]`, which is certainly not what you'd
+expect when trying to convert string numbers to integers!
+
+To fix this, wrap the function call in an arrow function or a regular function
+that controls the number of arguments:
+
+```
+// ✓ CORRECT: Using arrow function to control arguments
+let strings = ["10", "32", "13"];
+let nums = map(strings, x => int(x)); // [10, 32, 13]
+
+// Alternative approach using a named function
+function toInt(str) {
+ return int(str);
+}
+let nums2 = map(strings, toInt); // [10, 32, 13]
+```
+
+This pattern applies to many other built-in functions like `length()`, `trim()`,
+`b64enc()`, etc. Always wrap built-in functions when using them with `map()` to
+ensure they receive only the intended arguments.
+
+#### {@link module:core#filter|filter(arr, fn)} → {Array}
+
+Creates a new array with all elements that pass the test implemented by the
+provided function.
+
+```
+let numbers = [1, 2, 3, 4, 5, 6];
+let evens = filter(numbers, x => x % 2 == 0); // [2, 4, 6]
+```
+
+The callback function receives the same three arguments as in `map()`.
+
+#### {@link module:core#sort|sort(arr, fn)} → {Array}
+
+Sorts the elements of an array in place and returns the sorted array, optionally
+using a custom compare function.
+
+```
+let numbers = [3, 1, 4, 2];
+sort(numbers); // [1, 2, 3, 4]
+```
+
+With a custom compare function:
+
+```
+let people = [
+ { name: "Alice", age: 25 },
+ { name: "Bob", age: 30 },
+ { name: "Charlie", age: 20 }
+];
+
+sort(people, (a, b) => a.age - b.age);
+// [{ name: "Charlie", age: 20 }, { name: "Alice", age: 25 }, { name: "Bob", age: 30 }]
+```
+
+#### {@link module:core#reverse|reverse(arr)} → {Array}
+
+Returns a new array with the order of all elements reversed.
+
+```
+let arr = [1, 2, 3];
+reverse(arr); // [3, 2, 1]
+```
+
+This function also works on strings:
+
+```
+reverse("hello"); // "olleh"
+```
+
+Returns null if the argument is neither an array nor a string.
+
+#### {@link module:core#uniq|uniq(array)} → {Array}
+
+Creates a new array with all duplicate elements removed.
+
+```
+let array = [1, 2, 2, 3, 1, 4, 5, 4];
+uniq(array); // [1, 2, 3, 4, 5]
+```
+
+Returns null if a non-array argument is given.
+
+### Helper Functions and Recipes
+
+The ucode standard library provides essential array functions, but many common
+operations must be implemented manually. Below, you'll find example
+implementations for frequently needed array operations that aren't built into
+the core library.
+
+These recipes demonstrate how to leverage ucode's existing functions to build
+more complex array utilities.
+
+#### Array Intersection
+
+Returns a new array containing elements present in all provided arrays.
+
+```
+function intersect(...arrays) {
+ if (!length(arrays))
+ return [];
+
+ let result = arrays[0];
+
+ for (let i = 1; i < length(arrays); i++) {
+ result = filter(result, item => item in arrays[i]);
+ }
+
+ return uniq(result);
+}
+
+// Example usage:
+let a = [1, 2, 3, 4];
+let b = [2, 3, 5];
+let c = [2, 3, 6];
+intersect(a, b, c); // [2, 3]
+```
+
+This implementation takes advantage of ucode's `in` operator, which checks if a
+value exists in an array using strict equality comparison. This makes the code
+more concise than using `index()` and checking if the result is not -1.
+
+#### Array Merge/Concatenation
+
+Combines multiple arrays into a new array. Taking advantage of ucode's variadic
+`push()` function with the spread operator provides an elegant solution:
+
+```
+function merge(...arrays) {
+ let result = [];
+
+ for (arr in arrays) {
+ push(result, ...arr); // Spreads all elements from the array directly into push
+ }
+
+ return result;
+}
+
+// Example usage:
+let a = [1, 2];
+let b = [3, 4];
+let c = [5, 6];
+merge(a, b, c); // [1, 2, 3, 4, 5, 6]
+```
+
+This implementation leverages the variadic nature of `push()`, which accepts any
+number of arguments. The spread operator (`...`) unpacks each array, passing all
+its elements as individual arguments to `push()`. This is both more efficient
+and more readable than nested loops.
+
+For processing very large arrays, you might want to use a batching approach to
+avoid potential call stack limitations:
+
+```
+function mergeWithBatching(...arrays) {
+ let result = [];
+ const BATCH_SIZE = 1000;
+
+ for (arr in arrays) {
+ // Handle array in batches to avoid excessive function arguments
+ for (let i = 0; i < length(arr); i += BATCH_SIZE) {
+ let batch = slice(arr, i, i + BATCH_SIZE);
+ push(result, ...batch);
+ }
+ }
+
+ return result;
+}
+```
+
+#### Array Difference
+
+Returns elements in the first array not present in subsequent arrays.
+
+```
+function difference(array, ...others) {
+ return filter(array, item => {
+ for (other in others) {
+ if (item in other)
+ return false;
+ }
+ return true;
+ });
+}
+
+// Example usage:
+let a = [1, 2, 3, 4, 5];
+let b = [2, 3];
+let c = [4];
+difference(a, b, c); // [1, 5]
+```
+
+This implementation uses the `in` operator for concise and efficient membership
+testing, filtering out any elements from the first array that appear in any of
+the subsequent arrays.
+
+#### Array Chunk
+
+Splits an array into chunks of specified size.
+
+```
+function chunk(array, size) {
+ if (size <= 0)
+ return [];
+
+ let result = [];
+
+ for (let i = 0; i < length(array); i += size) {
+ push(result, slice(array, i, i + size));
+ }
+
+ return result;
+}
+
+// Example usage:
+let nums = [1, 2, 3, 4, 5, 6, 7, 8];
+chunk(nums, 3); // [[1, 2, 3], [4, 5, 6], [7, 8]]
+```
+
+This implementation uses a counting `for` loop combined with `slice()`, which is
+both more idiomatic and more efficient. The approach:
+
+1. Iterates through the array in steps of `size`
+2. Uses `slice()` to extract chunks of the appropriate size
+3. Automatically handles the last chunk being smaller if the array length isn't divisible by the chunk size
+
+This pattern leverages ucode's built-in functions for cleaner, more maintainable
+code. No temporary variables are needed to track the current chunk or count,
+making the implementation more straightforward.
+
+#### Array Sum
+
+Calculates the sum of all numeric elements in an array.
+
+```
+function sum(array) {
+ let result = 0;
+
+ for (item in array) {
+ if (type(item) == "int" || type(item) == "double")
+ result += item;
+ }
+
+ return result;
+}
+
+// Example usage:
+let nums = [1, 2, 3, 4, 5];
+sum(nums); // 15
+```
+
+#### Array Flatten
+
+Flattens a nested array structure.
+
+```
+function flatten(array, depth) {
+ if (depth === undefined)
+ depth = 1;
+
+ let result = [];
+
+ for (item in array) {
+ if (type(item) == "array" && depth > 0) {
+ let flattened = flatten(item, depth - 1);
+ for (subItem in flattened) {
+ push(result, subItem);
+ }
+ } else {
+ push(result, item);
+ }
+ }
+
+ return result;
+}
+
+// Example usage:
+let nested = [1, [2, [3, 4], 5], 6];
+flatten(nested); // [1, 2, [3, 4], 5, 6]
+flatten(nested, 2); // [1, 2, 3, 4, 5, 6]
+```
+
+## Advanced Array Techniques and Considerations
+
+### Memory Management
+
+When working with arrays in ucode, you should understand several important
+memory characteristics that affect performance and resource usage:
+
+#### Sparse Array Memory Implications
+
+Since ucode arrays are true arrays in memory, array memory consumption scales
+linearly with the highest index used, regardless of how many elements are
+actually stored:
+
+```
+let arr = [];
+arr[1000000] = "value"; // Allocates memory for 1,000,001 pointers
+```
+
+Important technical details about ucode array memory usage:
+- Each array element consumes pointer-sized memory (4 bytes on 32-bit systems, 8 bytes on 64-bit systems)
+- No optimizations exist for sparse arrays - every position up to the highest index is allocated
+- When an array grows beyond its capacity, it's reallocated with a growth factor of 1.5
+- Memory is allocated even for "empty" slots (which contain the `null` value)
+
+For example, on a 64-bit system, creating an array with a single element at
+index 1,000,000 would consume approximately 8MB of memory (1,000,001 * 8 bytes),
+even though only one actual value is stored.
+
+```
+// Demonstrates memory consumption
+let smallArray = [];
+for (let i = 0; i < 10; i++)
+ smallArray[i] = i;
+
+let sparseArray = [];
+sparseArray[1000000] = "far away";
+
+print(`Small array has ${length(smallArray)} elements\n`);
+print(`Sparse array has ${length(sparseArray)} elements\n`);
+// Even though only one value is actually set, memory is allocated for all positions
+```
+
+This behavior makes ucode arrays efficient for random access but potentially
+wasteful for very sparse data structures. For data with large gaps or when
+working on memory-constrained systems, consider alternative approaches like
+objects with numeric string keys.
+
+#### Negative Index Implementation
+
+While negative indices provide convenient access to elements from the end of an
+array, they involve an internal conversion process:
+
+```
+let arr = [1, 2, 3, 4, 5];
+arr[-1]; // Internally converted to arr[length(arr) - 1], or arr[4]
+arr[-3]; // Internally converted to arr[length(arr) - 3], or arr[2]
+```
+
+This conversion adds a small computational overhead compared to direct positive
+indexing. For performance-critical code processing large arrays, consider using
+positive indices when possible.
+
+#### Mixed-Type Arrays and Sorting
+
+Arrays in ucode can contain mixed types, offering great flexibility but
+requiring careful handling, especially with operations like `sort()`:
+
+```
+let mixed = ["apple", 10, true, {name: "object"}, [1, 2]];
+
+// Sort behaves differently with mixed types
+// Numbers come first, then arrays, then strings, then booleans, then objects
+sort(mixed); // [10, [1, 2], "apple", true, {name: "object"}]
+```
+
+When sorting mixed-type arrays, consider implementing a custom comparison
+function to define the sorting behavior explicitly:
+
+```
+function mixedTypeSort(a, b) {
+ // Sort by type first, then by value
+ let typeA = type(a);
+ let typeB = type(b);
+
+ if (typeA != typeB) {
+ // Define a type precedence order
+ let typePrecedence = {
+ "int": 1,
+ "double": 2,
+ "string": 3,
+ "bool": 4,
+ "array": 5,
+ "object": 6
+ };
+ return typePrecedence[typeA] - typePrecedence[typeB];
+ }
+
+ // If same type, compare values appropriately
+ if (typeA == "string" || typeA == "array")
+ return length(a) - length(b);
+ return a - b;
+}
+
+// Now sorting is more predictable
+sort(mixed, mixedTypeSort);
+```
+
+### Performance Optimization
+
+When working with large arrays, consider these optimization techniques:
+
+1. **Pre-allocation**: Where possible, create arrays with known capacity rather than growing incrementally
+2. **Batch operations**: Minimize individual push/pop/shift/unshift calls by processing in batches
+3. **Avoid unnecessary copies**: Use in-place operations when possible
+4. **Filter early**: Filter arrays early in processing pipelines to reduce subsequent operation sizes
+
+### Array Deep Copying
+
+Since arrays are reference types, creating true copies requires special handling:
+
+```
+function deepCopy(arr) {
+ if (type(arr) != "array")
+ return arr;
+
+ let result = [];
+ for (item in arr) {
+ if (type(item) == "array")
+ push(result, deepCopy(item));
+ else if (type(item) == "object")
+ push(result, deepCopyObject(item));
+ else
+ push(result, item);
+ }
+ return result;
+}
+
+function deepCopyObject(obj) {
+ if (type(obj) != "object")
+ return obj;
+
+ let result = {};
+ for (key in keys(obj)) {
+ if (type(obj[key]) == "array")
+ result[key] = deepCopy(obj[key]);
+ else if (type(obj[key]) == "object")
+ result[key] = deepCopyObject(obj[key]);
+ else
+ result[key] = obj[key];
+ }
+ return result;
+}
+```
+
+This approach ensures all nested arrays and objects are properly copied rather
+than referenced.
diff --git a/docs/tutorials/tutorials.json b/docs/tutorials/tutorials.json
index df4e339..2cab5e1 100644
--- a/docs/tutorials/tutorials.json
+++ b/docs/tutorials/tutorials.json
@@ -7,5 +7,8 @@
},
"03-memory": {
"title": "Memory Management"
+ },
+ "04-arrays": {
+ "title": "Working with Arrays"
}
}