Diffstat (limited to 'src/js/README.md')
-rw-r--r-- src/js/README.md | 115
1 files changed, 102 insertions, 13 deletions
diff --git a/src/js/README.md b/src/js/README.md
index c5f75eeec..8f270ed2a 100644
--- a/src/js/README.md
+++ b/src/js/README.md
@@ -1,30 +1,119 @@
# JS Modules
+**TLDR**: If anything here changes, re-run `make js`. If you add/remove files, `make regenerate-bindings`.
+
- `./node` contains all `node:*` modules
- `./bun` contains all `bun:*` modules
- `./thirdparty` contains npm modules we replace like `ws`
+- `./internal` contains modules that aren't assigned to the module resolver
+
+Each `.ts`/`.js` file above is assigned a numeric id at compile time and inlined into an array of lazily initialized modules. References between internal modules are heavily optimized and skip the module resolver entirely.
+
+## Builtins Syntax
+
+Within these files, the `$` prefix on variables can be used to access private property names as well as JSC intrinsics.
-When you change any of those folders, run this to bundle and minify them:
+```ts
+// Many globals have private versions which are impossible for the user to
+// tamper with. Though, these global variables are auto-prefixed by the bundler.
+const hello = $Array.from(...);
-```bash
-$ make esm
+// Similar situation with prototype values. These aren't autoprefixed since it depends on type.
+something.$then(...);
+map.$set(...);
+
+// Internal variables we define
+$requireMap.$has("elysia");
+
+// JSC engine intrinsics. These usually translate directly to bytecode instructions.
+const arr = $newArrayWithSize(5);
+// A side effect of this is that using an intrinsic incorrectly like
+// this will fail to parse and cause a segfault.
+console.log($getInternalField)
```
-These modules are bundled into the binary, but in debug mode they are loaded from the filesystem, so you do not need to rerun `make dev`. If you want to override the modules in a release build, you can set `BUN_OVERRIDE_MODULE_PATH` to the path to the repo:
+V8 has a [similar feature](https://v8.dev/blog/embedded-builtins) to this syntax (they use `%` instead).
+
+On top of this, we have some special functions that are handled by the bundle preprocessor:
+
+- `require` works, but its argument must be a string literal that resolves to a module within `src/js`. The call gets replaced with `$requireId(id)`, a special function that skips the module resolver and loads the module directly by its generated numerical ID.
-```bash
-$ BUN_OVERRIDE_MODULE_PATH=/path/to/bun-repo bun ...
+- `$debug` is exactly like `console.log`, but is stripped in release builds. It is disabled by default, requiring you to pass one of: `BUN_DEBUG_MODULE_NAME=1`, `BUN_DEBUG_JS=1`, or `BUN_DEBUG_ALL=1`. You can also write `if($debug) {}` to check whether a debug env var is set.
+
+- `IS_BUN_DEVELOPMENT` is inlined to be `true` in all development builds.
+
+- `process.platform` is properly inlined and DCE'd. Do use this to run different code on different platforms.
+
+- `$bundleError()` is like Zig's `@compileError`. It will stop a compile from succeeding.
+
+## Builtin Modules
+
+In module files, use `export default` instead of `module.exports`. Due to the internal implementation, the exported value must be a `JSCell` type (function / object).
+
+```ts
+export default {
+ hello: 2,
+ world: 3,
+};
```
-For any private types like `Bun.fs()`, add them to `./private.d.ts`
+Keep in mind that **these are not ES modules**. `export default` is only syntax sugar for assigning to the variable `$exports`, which is how the module actually exports its contents. `export var` and `export function` are banned syntax, and so is `import` (use `require` instead).
-# Builtins
+Wiring one of these modules up to the resolver is done separately in `module_resolver.zig`. Maybe in the future we can do codegen for it.
-- `./builtins` contains builtins that use intrinsics. They're inlined into generated C++ code. It's a separate system, see the readme in that folder.
+## Builtin Functions
-When anything in that is changed, run this to regenerate the code:
+`./functions` contains isolated functions. Each function within is bundled separately, meaning you may not use global variables or non-type `import`s, or even directly reference the other functions in these files. `require` is still resolved the same way as in the modules.
-```make
-$ make regenerate-bindings
-$ make bun-link-lld-debug
+Functions in these files are accessible from C++ via `<file><function>CodeGenerator(vm)`, for example:
+
+```cpp
+object->putDirectBuiltinFunction(
+ vm,
+ globalObject,
+ identifier,
+ // ReadableStream.ts, `function readableStreamToJSON()`
+ // This returns a FunctionExecutable* (extends JSCell*, but not JSFunction*).
+ readableStreamReadableStreamToJSONCodeGenerator(vm),
+ JSC::PropertyAttribute::Function | JSC::PropertyAttribute::DontDelete | 0
+);
```
+
+## Extra Features
+
+`require` is replaced with `$requireId(id)`, which allows these modules to import each other in a way that skips the module resolver. Being written in a synchronous format also makes this faster than ESM. All calls to `require` must be statically known, or else this transformation is not possible.
+
+## Building
+
+Run `make js` to bundle all the builtins. The output is placed in `src/js/out/{modules,functions}/`, and `bun-debug` loads these files dynamically. An exact filepath pointing at where you cloned bun is inlined into the binary, so moving the binary to another machine may not work. In a release build, these get minified and inlined into the binary (please commit those generated headers).
+
+If you change the list of files or functions, you will have to run `make regenerate-bindings`, but otherwise any change can be done with just `make js`.
+
+## Notes on how the build process works
+
+_This isn't really required knowledge to use it, but a rough overview of how ./\_codegen/\* works_
+
+The build process is built on top of Bun's bundler. The first step is scanning all modules and assigning each a numerical ID. The order is determined by an A-Z sort.
+
+The `$` for private names is actually a lie: JSC uses `@`, but that is a syntax error in regular JS/TS, so we opted for `$` to get better IDE support. So first we pre-process the files to spot all instances of `$` at the start of an identifier and convert them to `__intrinsic__`. We also scan for `require(string)` and, after resolving it to the integer id, replace it with `$requireId(n)` (defined in `./functions/Module.ts`). `export default` is transformed into `return ...;`, but this transform is a little more complicated than a string replace because it supports `export default` not being the final statement, access to the underlying variable `$exports`, etc.
+
+The preprocessor is smart enough to not replace `$` in strings, comments, regex, etc. However, it is not a real JS parser but a recursive regex-based nightmare, so it may hit some edge cases. Yell at Dave if it breaks.
+
+The module is then printed like:
+
+```ts
+// @ts-nocheck
+$$capture_start$$(function () {
+ const path = __intrinsic__requireId(23);
+ // user code is pasted here
+ return {
+ cool: path,
+ };
+}).$$capture_end$$;
+```
+
+This capture wrapper is used to extract the function declaration afterwards. It is more useful in the functions case, where functions can have arguments or be async functions.
+
+After bundling, the inner part is extracted, and then `__intrinsic__` is replaced with `@`.
+
+These can then be inlined into C++ headers and loaded with `createBuiltin`. This is done in `InternalModuleRegistry.cpp`.
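
The preprocessing steps described in "Notes on how the build process works" can be sketched roughly as follows. This is a toy illustration and not the code in `./_codegen`: every function name here (`assignIds`, `replaceRequires`, `hideIntrinsics`, `restoreIntrinsics`) is hypothetical, the module specifiers are made up, and unlike the real preprocessor it makes no attempt to skip strings, comments, or regex literals.

```typescript
// Hypothetical sketch; none of these names come from Bun's ./_codegen sources.

/** Assign each module a numeric ID based on an A-Z sort of its path. */
function assignIds(paths: string[]): Map<string, number> {
  const sorted = [...paths].sort();
  return new Map(sorted.map((path, id): [string, number] => [path, id]));
}

/** Rewrite `require("literal")` into `$requireId(n)` using the ID table. */
function replaceRequires(source: string, ids: Map<string, number>): string {
  const pattern = /\brequire\(\s*(["'])([^"']+)\1\s*\)/g;
  return source.replace(pattern, (_match: string, _quote: string, specifier: string) => {
    const id = ids.get(specifier);
    if (id === undefined) {
      throw new Error(`require("${specifier}") does not resolve to a known module`);
    }
    return `$requireId(${id})`;
  });
}

/** Turn a `$` at the start of an identifier into `__intrinsic__` so a normal parser accepts it. */
function hideIntrinsics(source: string): string {
  const pattern = /(^|[^A-Za-z0-9_$])\$(?=[A-Za-z_])/g;
  return source.replace(pattern, (_match: string, before: string) => before + "__intrinsic__");
}

/** After bundling, swap the placeholder for the `@` prefix that JSC expects. */
function restoreIntrinsics(source: string): string {
  return source.replace(/__intrinsic__/g, "@");
}

// Example run over a two-line module body (specifiers are invented for illustration).
const ids = assignIds(["node/path", "bun/sqlite", "internal/shared"]);
const input = 'const path = require("node/path");\nmap.$set(path, $Array.from([]));';
const hidden = hideIntrinsics(replaceRequires(input, ids));
const output = restoreIntrinsics(hidden);
console.log(hidden);
console.log(output);
```

Running `replaceRequires` before `hideIntrinsics` means the generated `$requireId(n)` call picks up the `__intrinsic__` prefix in the same pass as every other `$` name, which matches the `__intrinsic__requireId(23)` form shown in the printed module above.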