Flux159's comments | Hacker News

So in theory it should be possible, but it might require customizing the Dawn or wgpu-native builds if they don't support it (MystralNative provides the JS bindings / wrapper around those two implementations of wgpu.h). But I've already added a special C++ method to handle Draco compression natively, so adding some Mystral-native-only methods is not out of the question (however, I would want to ensure that usage of those via JS is always feature-flagged so that it doesn't break when run on web).

Did you write your WebGPU chessboard using the raw JS APIs? Ideally it should work, but I just fixed up some missing APIs to get Three.js working in v0.1.0, so if there are issues, please open an issue on GitHub - I'll try to get it working so we can close any gaps.


Here's a Dawn implementation with support for ray tracing that was written a number of years ago but never integrated into browsers. Perhaps it will help?

https://github.com/maierfelix/dawn-ray-tracing

Yes, chessboard3d.app is written with raw JS APIs and raw WebGPU. It does use the rapier physics library, which uses WASM, which might be an issue? It implements its own ray tracing but would probably run 10x faster with hardware ray tracing support.

I think you'd get a lot of attention if you had hardware ray tracing, since that's currently only available in DirectX 12 and Vulkan, which require implementation on native desktop platforms. FWIW, if the path looks feasible, I would be interested in contributing.


WASM shouldn't be an issue since the Draco decoder uses it - but it may only work with V8 (it wouldn't work in QuickJS builds, but the default builds use V8 + Dawn). Obviously, with an alpha runtime there may be bugs.

I think it would be super cool to have some sort of extension before WebGPU (web) has it. I was taking a look at the prior example & it seems like there's good ongoing discussion linked here about it: https://github.com/gpuweb/gpuweb/issues/535. Also I believe that Metal has hardware ray tracing support now too?

Re: implementation, a few options exist - a separate Dawn fork with RT is one path (though Dawn builds are slow, 1-2 hours on CI). Another approach would be exposing custom native bindings directly from MystralNative alongside the WebGPU APIs - that might make iteration much faster for testing feasibility. The JS API would need to be feature-flagged so the same code gracefully falls back when running on web (I did this for a native Draco impl too, which avoids having to load wasm: https://mystralengine.github.io/mystralnative/docs/api/nativ...).


Follow-up comment about Apple disallowing JIT - I'll need to confirm whether JSC is allowed to JIT, or only inside of a webview. I was able to get JSC + wgpu-native rendering in an iOS build, but would need to confirm that it can pass app review.

There are two other performance things you can do by controlling the runtime, though. One is adding special perf methods (which I did for Draco decoding - there is currently one non-standard __mystralNativeDecodeDracoAsync API), but the docs clearly lay out that you should feature-gate it if you're going to use it so you don't break web builds: https://mystralengine.github.io/mystralnative/docs/api/nativ...
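As a minimal sketch of that feature gate (only the `__mystralNativeDecodeDracoAsync` name comes from the MystralNative docs; `decodeDracoWasm` is a hypothetical stand-in for whatever wasm decoder a web build already uses):

```javascript
// Sketch of the feature gate described above. Only the
// __mystralNativeDecodeDracoAsync name is from the MystralNative docs;
// decodeDracoWasm is a hypothetical stand-in for a web build's wasm decoder.
async function decodeDraco(buffer, decodeDracoWasm) {
  if (typeof globalThis.__mystralNativeDecodeDracoAsync === "function") {
    // MystralNative build: take the native fast path.
    return globalThis.__mystralNativeDecodeDracoAsync(buffer);
  }
  // Web build: the native hook doesn't exist, so fall back to wasm.
  return decodeDracoWasm(buffer);
}
```

The point of checking `typeof ... === "function"` at the call site is that the same bundle runs unchanged in both environments.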

The other thing is more experimental - writing an AOT compiler for a subset of TypeScript to convert it into C++ and then just compile your code ("MystralScript"). This would be similar to Unity's C# AOT compiler and would kind of be its own separate project, but there is some prior work here with porffor, AssemblyScript, and Static Hermes, so it's not completely just a research project.


Is AssemblyScript good for games, though? Last I checked it lacked too many features for game code coming directly from TS, but it might be better now? No idea how well Static Hermes behaves today (probably far better, due to the RN heritage).

I've been down the TS->C++ road a few times myself, and the big issue often comes down to how "strict" you can keep your TS code for real-life games, as well as how slow/messy the official TS compiler has been (and real life taking time away from such efforts).

It's better now, but I think one should probably target the Go port of the TS compiler directly (both for performance and because Go is a slightly stricter language, probably better suited for compilers).

I guess the point is that the TS->C++ compilation thing is potentially a rabbit hole. It's theoretically not too bad, but TS has moved quickly and has been hard to keep up with without using the official compiler, and even then a "game-oriented" TypeScript mode wants a slightly different semantic model from the official one, so you need either a mapping over the regular type-inference engine, a separate one, or a parallel one.

When mapping regular TS to "game variants", the biggest issue is how to handle numbers efficiently. Even if you go full-double, you need conversion-point checking everywhere doubles go into unions with any other type (meaning you need boxing or a "fatter" union struct). And that's not even accounting for any vector-type accelerations.


AssemblyScript was just mentioned as some prior work, I don't think that AssemblyScript would work as is for games.

I realize the major issues with TS->C++ though (or any language to C++; Facebook has prior work converting PHP to C++, https://en.wikipedia.org/wiki/HipHop_for_PHP, which was eventually deprecated in favor of HHVM). I think the first step would be iteratively improving the JS engine (Mystral.js, the one that is not open source yet but is why MystralNative exists) to work with the compiler, and ensuring that games and examples are built on top with a subset of TS. I don't think the goal for MystralScript should be to support Three.js or any other engine to begin with, as that would end up going down the same compatibility pits that HipHop did.

Being able to update the entire stack here is actually very useful - in theory, parts of Mystral.js could just be embedded into MystralNative (behind separate build flags, probably not a standard build), avoiding any TS->C++ compilation for core engine work, and then ensuring that games built on top use the strict subset of TS that works well with the AOT compilation system. One option for numbers is actually using comment annotations (similar to how JSDoc types work for the TypeScript compiler: using annotations in comments so that web builds don't change).
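A hedged sketch of what such a comment annotation could look like (the `@mystral` annotation name is invented here for illustration; the `| 0` coercion is the standard plain-JS trick for keeping a value int32, so native and web builds stay semantically in sync):

```javascript
// Illustrative only: the @mystral annotation name is invented here.
// An AOT compiler could read the comment to pick a concrete C++ type
// (e.g. int32_t), while web builds just see an ordinary JS number.
/** @mystral i32 */
let frameCount = 0;

function tick(/** @mystral i32 */ delta) {
  // "| 0" truncates to int32 in plain JS too, so a browser build behaves
  // the same as a natively compiled int32 would.
  frameCount = (frameCount + delta) | 0;
  return frameCount;
}
```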

Re: the TS compiler - I do have some basics started here, and I am already seeing that tests are pretty slow. I don't think the tsgo compiler has a similar API for parsing & emitters right now though, so as much as I would like to switch to it (I have for my web projects & the speed is awesome), I don't think I can until the API work is clarified: https://github.com/microsoft/typescript-go/discussions/455


I remember reading about Ejecta a long time ago! I had completely forgotten about it, but it is similar! The funny thing is that to support UI elements, I had to also support Canvas2D through Skia (although not 100% yet), so maybe Impact could even work at some point (it would require extensive testing, obviously).

Phaser is not supported right now because Phaser is still using a WebGL renderer, from my understanding (maybe adding ANGLE + WebGL support in a v2.0.0 is an option, but I'm debating whether that's a good idea).

Pixi 8 has a WebGPU renderer so that should be supported as part of a v1.0.0 release - it's on the roadmap to verify that three and pixi 8 work correctly: https://github.com/mystralengine/mystralnative/issues/7


So I am stubbing parts of the DOM API (input handling like keydown, pointer events, etc.), so you shouldn't need to rewrite any of that.

Three.js and Pixi 8 with the WebGPU renderer are part of the v1.0.0 roadmap (verifying that they can work correctly on all platforms), right now most of the testing was done against my own engine (tentatively called mystral.js which will also be open sourced as part of v1.0.0, it's already used for some of the examples, just as a minified bundle): https://github.com/mystralengine/mystralnative/issues/7


Hi, thanks! Yeah, for controls I'm emulating pointer events and keydown/keyup from SDL3 inputs & events. The goal is that the same JS you write for a browser should "just work". It's still very alpha, but I was able to get my own WebGPU game engine running in it & have a Sponza example that uses the exact key and pointer events to handle WASD / mouse controls: https://mystraldev.itch.io/sponza-in-webgpu-mystral-engine (the web build there is older, but the downloads for Windows, Mac, and Linux are using Mystral Native - you can clearly tell by size that it's not Electron; even Tauri for Mac didn't support webp inside of the WebGPU context, so I couldn't use Draco-compressed assets w/ webp textures).
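As a sketch of what "just work" means here, browser-style WASD handling like the following shouldn't need changes (written against an injected EventTarget so it runs outside a browser; in a real app the target would be `window`, and the runtime synthesizes the standard keydown/keyup events from SDL3 input):

```javascript
// Plain DOM-style WASD handling; the claim above is that this kind of code
// runs unchanged on MystralNative. Takes any EventTarget so it is testable
// outside a browser; in a real app, target would be window.
function attachWasdControls(target) {
  const held = new Set();
  target.addEventListener("keydown", (e) => held.add(e.code));
  target.addEventListener("keyup", (e) => held.delete(e.code));
  // Returns the current movement vector derived from the held keys.
  return function moveVector() {
    let x = 0, z = 0;
    if (held.has("KeyW")) z -= 1;
    if (held.has("KeyS")) z += 1;
    if (held.has("KeyA")) x -= 1;
    if (held.has("KeyD")) x += 1;
    return { x, z };
  };
}
```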

I put up a roadmap to get Three.js and Pixi 8 (WebGPU renderer) fully working as part of a 1.0.0 release, but there's nothing my JS engine is doing that is that different from Three.js or Pixi: https://github.com/mystralengine/mystralnative/issues/7

I did have to bring in Skia for Canvas2D support because I was using it for UI elements inside of the canvas, so right now it's a WebGPU + Canvas2D runtime. I'm debating whether I should also add ANGLE and WebGL bindings in v2.0.0 to support a lot of other use cases too. Font support is built in as part of the Skia support, so that is also covered. WebAudio is another thing that is currently supported, but it may need more testing to be fully compatible.


What Srouji giveth, Federighi taketh away. Apple's version of https://en.wikipedia.org/wiki/Andy_and_Bill%27s_law

This looks useful for people not using Claude Code, but I do think the desktop example in the video could be a bit misleading (particularly for non-developers) - Claude is definitely not taking screenshots of that desktop & organizing; it's using normal file management CLI tools. The reason seems a bit obvious - it's much easier to read file names, types, etc. via an "ls" than to infer them from an image.

But it also gets to one of Claude's (Opus 4.5) current weaknesses - image understanding. Claude really isn't able to understand details of images the way people currently can - this is explained well in an analysis of Claude Plays Pokemon: https://www.lesswrong.com/posts/u6Lacc7wx4yYkBQ3r/insights-i.... I think over the next few years we'll probably see all major LLM companies work on resolving these weaknesses, and then LLMs using UIs will work significantly better (and eventually get to proper video stream understanding as well - not "take a screenshot every 500ms" and call that video understanding).


I keep seeing “Claude image understanding is poor” being repeated, but I’ve experienced the opposite.

I was running some sentiment analysis experiments; describe the subject and the subject's emotional state, that kind of thing. It picked up on a lot of little details: the brand name of my guitar amplifier in the background, what my t-shirt said and that I must enjoy craft beer and/or running (it was a craft beer 5k kind of thing), and it picked up on my movement through multiple frames. This was video sliced into a frame every 500ms; it noticed me flexing, giving the finger, appearing happy, angry, etc. I was really surprised how much it picked up on, and how well it connected those dots together.


I regularly show Claude Code a screenshot of a completely broken UI--lots of cut off text, overlapping elements all over the place, the works--and Claude will reply something like "Perfect! The screenshot shows that XYZ is working."

I can describe what is wrong with the screenshot to make Claude fix the problem, but it's not entirely clear to what extent it's using the screenshot versus my description. Any human with two brain cells wouldn't need the problems pointed out.


This is my experience as well. If CC does something, I get broken results, and I reply with just an image, it will almost always give an "X is working!" response. Sometimes just telling it to look more closely is enough, and sometimes I have to be more specific. It seems to be able to read text from screenshots of logs just fine, though, and always seems to process those as I'd expect.


> Claude is definitely not taking screenshots of that desktop & organizing, it's using normal file management cli tools

Are you sure about that?

Try "claude --chrome" with the CLI tool and watch what it does in the web browser.

It takes screenshots all the time to feed back into the multimodal vision and help it navigate.

It can look at the HTML or the JavaScript but Claude seems to find it "easier" to take a screenshot to find out what exactly is on the screen. Not parse the DOM.

So I don't know how Cowork does this, but there is no reason it couldn't be doing the same thing.


I wonder if there's something to be said about screenshots preventing context poisoning vs parsing. In other words, the "poison" would have to be visible and obvious on the page, whereas it could be easily hidden in the DOM.

And I do know there are ways to hide data like watermarks in images but I do not know if that would be able to poison an AI.


Considering that very subtle not-human-visible tweaks can make vision models misclassify inputs, it seems very plausible that you can include non-human-visible content the model consumes.

https://cacm.acm.org/news/when-images-fool-ai-models/

https://arxiv.org/abs/2306.13213


Maybe at one time, but it absolutely understands images now. In VSCode Copilot, I am working on a Python app that generates mesh files that are imported into a Blender project. I can take a screenshot of what the mesh file looks like and ask Claude Code questions about the object, in the context of the Blender file. It even built a test script that would generate the mesh, import it into the Blender project, and render a screenshot. It built me a VSCode task to automate the entire workflow and then compare the image to a mock image. I found its understanding of the images almost spooky.


100% confirm Opus 4.5 is very image smart.


I'm doing extremely detailed and extremely visual JavaScript UIs with Claude Code, with React and Tailwind, driven by lots of screenshots, which often one-shot the solution.


Using any plugins or skills for that work?

Claude Opus 4.5 can understand images: one thing I've done frequently in Claude Code with great success is just showing it an image of weird visual behavior (drag and drop into CC), and it finds the bug near-immediately.

The issue is that Claude Code won't automatically Read images by default as a part of its flow: you have to very explicitly prompt it to do so. I suspect a Skill may be more useful here.


I've done similar while debugging an iOS app I've been working on this past year.

Occasionally it needs some poking and prodding but not to a substantial degree.

I also was able to use it to generate SVG files based on in-app design using screenshots and code that handles rendering the UI and it was able to do a decent job. Granted not the most complex of SVG but the process worked.


"What Andy giveth, Bill taketh away" - https://en.wikipedia.org/wiki/Andy_and_Bill%27s_law

On a more serious note, I really only use Windows for games & I'm still always frustrated with how many updates (& restarts during updates) Windows needs. My fans are always constantly spinning on Windows too (laptop or desktop) whereas my Mac & Linux machines are generally silent outside of heavy load.


This is common for any self-updating software that you use infrequently.

A friend of mine complained that he hated how Firefox "always" wants to restart with an update. I couldn't understand what he was on about. Turned out I use Firefox daily and he uses it like once every 2 months to test something and yeah, Firefox has an update out every 2 months or so, so that fits.

It's the same with Windows (and, I assume, macOS). Use Windows more and the updater will disappear out of sight.


I update Linux maybe once a year. Sure, there are security vulnerabilities. But I'm behind a firewall. And meanwhile, I don't have to spend any time dealing with update issues.


But Windows is made for the masses. It's definitely a good thing that Microsoft forces auto-updates, because otherwise 95% of people would run around with devices that have gaping security holes. And 90% of these people are not behind a firewall 100% of the time.

Side effect unfortunately is that they are shoving ad- and bloatware down your throat through these updates.

But that is, because Microsoft does not care about the end user at all. It's not the fault of auto-updates.


My Mac doesn’t randomly reboot, doesn’t force updates on shutdown, and doesn’t have weekly updates that require reboots. IMO Apple handles updates much better than Windows.

Windows still reboots instead of shutting down when you do "Update and shut down".



It must have regressed again; it just happened this morning.


> My fans are always constantly spinning on Windows too (laptop or desktop) whereas my Mac & Linux machines are generally silent outside of heavy load.

Defender seemingly needs to check every 10ms that you still don't have a virus.


I'm always amused by these occasional "you still don't have any viruses" popup notifications from Defender. Well, good to know, thank you very much, I guess.


Cannot even reliably permanently disable real time scanning...


I'm even using Wine & Proton for it now. Thanks to Valve, only very few games don't work.

And it's not that I don't like Windows; it is just too damn slow for me.

And no, I do not want to upgrade my gear every 2 years or so.


I feel like I've been monkey's-pawed with the downfall of Windows for gaming. I.e., rather than being at the point where everything just works best/easiest in my Bazzite install, it's a game of DRM, modding tool support, feature support, and random "this game runs better on Windows, this game runs better on Bazzite" discovery. Also, Windows, SteamOS-clone, and "normal Linux" setups all have their own very awkward corners around the non-gaming portions. I've not found one that does not require substantial tweaking to get a usable all-around experience, unless you buy a device to use as more of a dedicated gaming console (an Xbox / Steam Deck type device).

I miss the mid-Windows 7 era. Not that everything ran perfectly without issues on Windows 7 at the time, particularly old games, but at least there was an option good enough to always go with first, instead of "see if the games you play work best here".


All the games that "don't work" are the games that PEOPLE ACTUALLY PLAY!

It's always the hardcore multiplayer games that have the actual crowd. Using Linux for gaming is a great way to continue down the path to becoming a recluse.


It really depends on what you play. I've been playing online co-op regularly with a bunch of friends since Covid times. We're jumping to new (well, on sale) games regularly, and the only recent time I booted to Windows was because a 4-player mod for Remnant II _might_ not work on Linux. Can't remember the previous game that did not work on Linux. I'm so used to things working without major tinkering that I forget to check protondb most of the time.


I actually don't like to play with random people on the internet.

I prefer the comfort of knowing them, and usually do it in the old basement LAN party way.

In my youth we didn't have internet, and we were actually LESS reclusive overall ;-)


So if you don't play popular multiplayer games you're a recluse?

Plenty of people play single player games, and simply socialize outside of games...


I play Helldivers 2 on Linux, and there are TONS of people playing.


And yet the battery in your Linux laptop dies 2x faster...


AI 2026 prediction from Anthropic cofounder

