Thursday, April 30, 2026

Show HN: TRiP – a complete transformer engine in C built from scratch just by me https://ift.tt/hoJ4UjI

Show HN: TRiP – a complete transformer engine in C built from scratch just by me https://ift.tt/sJtQFvD April 30, 2026 at 11:48PM

Show HN: Phase Router – capacity-aware routing for MoE https://ift.tt/lF2nfvX

Show HN: Phase Router – capacity-aware routing for MoE https://ift.tt/5fpDFEi April 30, 2026 at 11:37PM

Show HN: A programming language where the only token is the word "vibe" https://ift.tt/iQ5j9Ia

Show HN: A programming language where the only token is the word "vibe" Fuzzy opcode windows. You don't need an exact number of vibes, just roughly right. https://wevibe.fyi April 30, 2026 at 11:14PM

Show HN: FusionCore: ROS 2 sensor fusion that outperforms robot_localization https://ift.tt/o29iO5j

Show HN: FusionCore: ROS 2 sensor fusion that outperforms robot_localization I built sensor fusion for a mobile robot and reached for robot_localization like everyone does. After spending too long fighting navsat_transform, UTM zone boundaries, and YAML covariance tuning, I wrote my own. FusionCore is a 22 state UKF that fuses IMU, wheel encoders, and GPS in ECEF directly (no coordinate projection, no extra node). It estimates IMU bias, adapts its noise covariance automatically from the innovation sequence, and gates outliers with a chi squared test on every sensor. I benchmarked it against robot_localization EKF on 6 sequences from the NCLT public dataset (University of Michigan, real robot, real GPS, RTK ground truth). It wins 5 of 6. On the 6th sequence (fall, degraded GPS over a long period) it loses badly. RL UKF diverged to NaN on all six. Configs, methodology, and full reproduce instructions are in the benchmarks/ folder. https://ift.tt/fLhu3qE April 28, 2026 at 08:46PM
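The chi-squared outlier gate mentioned above can be sketched for a two-dimensional measurement. This is a generic textbook gate, not FusionCore's code: the function names, the 2x2 restriction, and the 0.99 confidence level are assumptions made for illustration. For two degrees of freedom the chi-squared quantile has a closed form, so no statistics library is needed.

```python
import math


def chi2_threshold_2dof(confidence):
    """Chi-squared quantile for 2 degrees of freedom.

    For 2 DOF the CDF is 1 - exp(-x / 2), so the quantile is -2 * ln(1 - p).
    """
    return -2.0 * math.log(1.0 - confidence)


def mahalanobis_sq_2d(innovation, S):
    """Squared Mahalanobis distance nu^T S^-1 nu for a 2x2 covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    x, y = innovation
    # S^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (x * (d * x - b * y) + y * (-c * x + a * y)) / det


def accept_measurement(innovation, S, confidence=0.99):
    """Return True when the innovation passes the gate (hypothetical helper)."""
    return mahalanobis_sq_2d(innovation, S) <= chi2_threshold_2dof(confidence)


if __name__ == "__main__":
    S = [[4.0, 0.0], [0.0, 4.0]]  # innovation covariance, assumed values
    print(accept_measurement((1.0, 1.0), S))   # small residual
    print(accept_measurement((10.0, 0.0), S))  # large residual
```

A filter would run a check like this on each sensor update and skip the correction step when the gate rejects the measurement.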

Wednesday, April 29, 2026

Show HN: Generative UI Library for React https://ift.tt/Kd6wCQA

Show HN: Generative UI Library for React https://ift.tt/9bhJ4WV April 30, 2026 at 02:28AM

Show HN: Send your first Peppol e-invoice in 5 minutes (EU mandate live) https://ift.tt/7QHGuOS

Show HN: Send your first Peppol e-invoice in 5 minutes (EU mandate live) https://getpeppr.dev/ April 30, 2026 at 12:36AM

Show HN: A new benchmark for testing LLMs for deterministic outputs https://ift.tt/R8lLrVa

Show HN: A new benchmark for testing LLMs for deterministic outputs When building workflows that rely on LLMs, we commonly use structured output for programmatic use cases like converting an invoice into rows or meeting transcripts into tickets or even complex PDFs into database entries. The model may return the schema you want, but with hallucinated values like `invoice_date` being off by 2 months or the transcript array ordered wrongly. The JSON is valid, but the values are not. Structured output today is a big part of using LLMs, especially when building deterministic workflows. Current structured output benchmarks (e.g., JSONSchemaBench) only validate the pass rate for JSON schema and types, and not the actual values within the produced JSON. So we designed the Structured Output Benchmark (SOB), which fixes this by measuring the JSON schema pass rate, type correctness, and value accuracy across all three modalities: text, image, and audio. For our test set, every record is paired with a JSON Schema and a ground-truth answer that was verified against the source context manually by a human and an LLM cross-check, so a missing or hallucinated value is counted as wrong. Open source is doing pretty well, with GLM 4.7 coming in at number 2 right after GPT 5.4. We noticed the rankings shift across modalities: GLM-4.7 leads text, Gemma-4-31B leads images, Gemini-2.5-Flash leads audio. For example, GPT-5.4 ranks 3rd on text but 9th on images. Model size is not a predictor, either: Qwen3.5-35B and GLM-4.7 beat GPT-5 and Claude-Sonnet-4.6 on Value Accuracy. Phi-4 (14B) beats GPT-5 and GPT-5-mini on text. Structured hallucinations are the hardest bug. Such values are type-correct, schema-valid, and plausible, so they slip through most guardrails. For example, in one audio record, the ground truth is "target_market_age": "15 to 35 years", and a model returns "25 to 35". This is invisible without field-level checks. 
Our goal is to be the best general model for deterministic tasks, and a key aspect of determinism is a controllable and consistent output structure. The first step to making structured output better is to measure it and hold ourselves against the best. https://ift.tt/azlci6e April 29, 2026 at 11:01PM
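To make "field-level checks" concrete, here is a minimal sketch of a per-field comparison against a ground-truth record. It is not SOB's actual scoring code: `flatten`, `value_accuracy`, exact-match comparison, and the `product` field are assumptions for illustration, and a real scorer might normalize strings or tolerate numeric rounding.

```python
_MISSING = object()  # sentinel for fields the model did not return


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a {path: leaf_value} mapping."""
    leaves = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            leaves.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            leaves.update(flatten(child, f"{prefix}[{index}]"))
    else:
        leaves[prefix] = value
    return leaves


def value_accuracy(predicted, truth):
    """Return (share of ground-truth leaves matched exactly, wrong paths)."""
    truth_leaves = flatten(truth)
    predicted_leaves = flatten(predicted)
    if not truth_leaves:
        return 1.0, []
    wrong = sorted(
        path
        for path, expected in truth_leaves.items()
        if predicted_leaves.get(path, _MISSING) != expected
    )
    return 1.0 - len(wrong) / len(truth_leaves), wrong


if __name__ == "__main__":
    truth = {"target_market_age": "15 to 35 years", "product": "energy drink"}
    predicted = {"target_market_age": "25 to 35", "product": "energy drink"}
    print(value_accuracy(predicted, truth))
```

Both records would pass a schema validator, since every field is a string; only the per-field comparison flags `target_market_age`. Because list elements are compared by index, a wrongly ordered array also counts as wrong values.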

Tuesday, April 28, 2026

Show HN: Open Bias – proxy that enforces agent behavior at runtime https://ift.tt/SUf65jN

Show HN: Open Bias – proxy that enforces agent behavior at runtime https://ift.tt/UZCHAo7 April 29, 2026 at 01:32AM

Show HN: I built a dating SIM that prepares you for your date https://ift.tt/lYZwagB

Show HN: I built a dating SIM that prepares you for your date https://ift.tt/UNViIbW April 29, 2026 at 12:16AM

Show HN: Ragnerock, an AI data analysis tool https://ift.tt/jWTU4aX

Show HN: Ragnerock, an AI data analysis tool Hi HN, I’m Matt Mahowald, and together with my cofounder John, we’re launching the public beta of Ragnerock today. As a data scientist, you spend the majority of your time wrangling data. Even though you might have a set of techniques and tricks you like to use, how exactly you treat a particular source of data tends to be fairly bespoke, so you end up writing custom logic each time. Ragnerock was born from the observation that modern LLMs can be used to automate a lot of the grunt work involved in this process, while still allowing for fully customizable pipelines. What’s more, by leveraging techniques like constrained decoding, it’s possible to provide a unified query interface regardless of the data source - bridging raw data sources like text and images with your existing structured data living in your databases. Ragnerock has four main components:
- A workflow designer that lets you build LLM-driven data processing and analysis pipelines
- A job orchestration layer that runs those workflows
- A query interface which lets you inspect the results of those workflows with plain SQL
- A notebook system which is 100% API-compatible with Jupyter and runs on your existing kernels, so you can easily pull data into your existing environments and analyses
Ragnerock also supports bring-your-own AI (OpenAI, Anthropic, and Google APIs), databases, and blob storage, so you can join with your existing datasets and have all outputs flow to your data lake. We’re particularly excited about our web crawling feature, which allows you to scrape websites and trigger workflows on updates: for example, you might point Ragnerock at your favorite blog and run a workflow to assess posts for topics and sentiment. You can try it out at https://ift.tt/gFxXoM8 ; no credit card needed and the first 20 hours of compute are free. It’s an early-stage product so we’re especially interested in feedback. 
Happy to answer any questions - John and I will be around in the comments today. https://ift.tt/gFxXoM8 April 28, 2026 at 11:33PM

Monday, April 27, 2026

Show HN: 49Agents – Infinite canvas IDE for AI agents https://ift.tt/sWCyP60

Show HN: 49Agents – Infinite canvas IDE for AI agents https://ift.tt/No5yS1n April 28, 2026 at 07:36AM

Show HN: Vibe-coding video games with Claude (Day 14: Tetris) https://ift.tt/7hUQtGO

Show HN: Vibe-coding video games with Claude (Day 14: Tetris) I used to run a flash games website (SWF files) years ago. I've made a few games of my own. I'm also an avid gamer and love to play games of all kinds. I'm also a software engineer, and a few days ago I decided I wanted to run a games website again. So I bought the domain gamevibe.us and with the help of Claude I've been vibe-coding one video game every day since. Happy to answer questions, take feedback, etc https://ift.tt/KUuJpWX April 27, 2026 at 11:03PM

Sunday, April 26, 2026

Show HN: WaveletLM – wavelet-based, attention-free model with O(n log n) scaling https://ift.tt/ga6ZAI9

Show HN: WaveletLM – wavelet-based, attention-free model with O(n log n) scaling WaveletLM is a wavelet-based, attention-free architecture that replaces self-attention with learned lifting wavelet decomposition, a Fast Walsh-Hadamard Transform, per-scale gated spectral mixing with SwiGLU activation, an inverse FWHT, and wavelet reconstruction. Combined with expanded MLPs and sparse product-key memory, this yields a model with O(n log n) scaling in sequence length. With 23.8 PPL on WikiText-103, WaveletLM beats both GPT-2 Medium, which was trained on 80× more data, and Transformer-XL Standard, which uses recurrence to extend its effective context. It is undertrained and underregularized due to budget constraints, so there is much room for development and improvement. I invite anyone who is curious to examine the model, test it out, and extend its capabilities further. All code and weights are fully open source, and a PG-19 run will be completed in 2-3 days. Generations can be done in 4-5 GB VRAM at 28.8 tokens/second, and the model is trainable in 16.25 hours with 20 GB of VRAM, both on a 5090. README for comparison tables, instructions, logs, and future plans: https://ift.tt/ItRvPb2 Weights: https://ift.tt/YpVdFUb Generations: https://ift.tt/GQwICRV... The following samples were chosen for coherence, not factual accuracy. Factuality will require scaling and downstream techniques such as RAG and instruction tuning. > The history of the city is reflected in its architecture, which includes the historic Old Town and New Castle County Courthouse Square Historic District. The building was designed by John H. Stevens, who also designed the Albany-Fulton Celebration in 1906 and built a steel-hulled shipyard on the lake shore. > The album was released on August 25, 2007 by Sony Music Entertainment and features several songs from the record including "Never Say Die", "The Show", "Don't Cry for Me Argentina" and a cover of "I Can Only Imagine (But You Are Not Alone)". 
> The species was first described by Swedish zoologist Carl Linnaeus in 1758 as Agaricus adustus. The genus name is derived from the Latin words perma "to tie", and pous ("like") means "with a large head". In 1821, French mycologists Jean-Baptiste de Lacaille placed it in section Cricetae of the order Carnivora. He later renamed it Spongiforma punctata after the Greek kribensis. https://ift.tt/ItRvPb2 April 27, 2026 at 12:48AM
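For readers unfamiliar with the Fast Walsh-Hadamard Transform named above, here is a minimal pure-Python sketch of the textbook butterfly algorithm. It is not taken from the WaveletLM code and covers only the FWHT step, not the lifting wavelets or the gated spectral mixing. The nested loops do n/2 butterflies in each of log2(n) passes, which is where the O(n log n) cost comes from.

```python
def fwht(values):
    """Unnormalized Fast Walsh-Hadamard Transform (length must be a power of two)."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Each pass combines pairs that are h apart: (x, y) -> (x + y, x - y).
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a


def inverse_fwht(values):
    """The transform is its own inverse up to a factor of 1/n."""
    n = len(values)
    return [v / n for v in fwht(values)]


if __name__ == "__main__":
    signal = [1, 0, 1, 0, 0, 1, 1, 0]
    spectrum = fwht(signal)
    print(spectrum)                # [4, 2, 0, -2, 0, 2, 0, 2]
    print(inverse_fwht(spectrum))  # recovers the signal as floats
```

Because the inverse is the same butterfly followed by a division, a forward FWHT, a per-coefficient operation, and an inverse FWHT together stay within O(n log n).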

Show HN: Parlor Jarvis – Realtime AI (audio+screen in, voice out) & multilingual https://ift.tt/36zk8se

Show HN: Parlor Jarvis – Realtime AI (audio+screen in, voice out) & multilingual https://ift.tt/NtDzB6c April 27, 2026 at 12:13AM

Show HN: Lambda ERP – Open-source ERP you can run through chat https://ift.tt/Q0OEYmM

Show HN: Lambda ERP – Open-source ERP you can run through chat Hi HN, I built Lambda ERP, an open-source ERP prototype where chat is the primary interface. It handles sales/purchase flows, invoices, payments, inventory, double-entry accounting, reports, and chat-generated analytics. There’s a live demo in the README with 3 years of simulated data, plus a Docker Compose setup if you want to run it locally. It is not production-ready yet; I’m looking for feedback on the architecture, the chat-first workflow, and whether this direction makes sense for small teams that can’t afford traditional ERP implementation projects. https://ift.tt/M7YRxKL April 26, 2026 at 10:33PM

Saturday, April 25, 2026

Show HN: Odozi – open-source iOS journaling app https://ift.tt/LAFR5bh

Show HN: Odozi – open-source iOS journaling app Yeah I know I hate the name too but I wasn't about to pay up for odyssey.app. It's an open source project so feel free to poke around with it / fork it. I talk about it more on the marketing website, but a few of us have been using it for the past month and it's been kind of fun. Obviously there will be a slew of issues / feedback / nits that come from this, but c'est la vie. GH is here: https://ift.tt/AkcYhWj https://odozi.app April 25, 2026 at 10:52PM

Show HN: Quay – Menu-bar Git sync https://ift.tt/LXfNqWA

Show HN: Quay – Menu-bar Git sync I write Astro blog posts in a text editor; when I'm done I want them pushed to GitHub so Cloudflare deploys the site. To make it comfortable, I built Quay for the menu bar. Also useful for Obsidian vault syncing. Point it at a folder, connect a GitHub repo, and it stages/commits/pushes/pulls. Multiple repos, editable commit messages, branch switching, merges with conflict detection. Shows open issue and PR counts per repo. But it is not a full Git client (no diffs, blame, cherry-pick, or rebase) and it doesn't create remote repos. Native macOS app (Swift/SwiftUI). Wraps the local git binary (prompts to install Xcode Command Line Tools if missing). No custom Git implementation. Sandboxed, no telemetry, GitHub-only. macOS. 7-day trial, €9 one-time on the App Store. https://ift.tt/Mkt4pD8 April 26, 2026 at 01:23AM

Friday, April 24, 2026

Show HN: #1 On This Day https://ift.tt/pyGO0HY

Show HN: #1 On This Day https://onthisday-theta.vercel.app April 24, 2026 at 11:12PM

Show HN: TurbineFi – Build, Backtest, Deploy Prediction Market Strategies https://ift.tt/Pfi4y5h

Show HN: TurbineFi – Build, Backtest, Deploy Prediction Market Strategies Hey HN! We just finished our first major build of TurbineFi, an AI-assisted workflow for building, backtesting, and running prediction market strategies. There are over 1,000 community strategies you can try out, there's a backtesting engine integrated in the workflow, and you get your own sandbox to execute the trades 24/7. Currently live for Kalshi, Polymarket coming soon. We developed a custom DSL to make compiling AI-assisted strategies more deterministic than raw python generation, so creating a strategy takes seconds even on low-tier models (thinking of migrating to a self-hosted model soon to reduce costs). We also worked with Locus (YCF25) to do the sandbox provisioning, so that we never manage keys for users. When a user signs up with their email, Privy creates a wallet for them, and then that wallet uses the X402 agent payment protocol to pay for their own server. We created a deployment harness around it that accepts and runs new code via a hosted API, so once it's up, every deployment is authorized by EIP-712 signatures. It keeps everything non-custodial, and code deployments happen in seconds. And users don't really realize they're using crypto rails. Turbine also includes weather and crypto historical information, so you can do things like fading the BTC-15min UP markets when it's cold in NYC, and backtest and run it in seconds. Adding sports data soon. There's a 7-day trial if you want to poke around. Would appreciate feedback on which strategies you'd want to try first, so we can make sure we have the infra to support them. Thank you! https://ift.tt/E17vHkn April 24, 2026 at 10:17PM

Thursday, April 23, 2026

Show HN: AgentSearch – Self-hosted search and MCP for AI agents, no API keys https://ift.tt/Dz6CWVR

Show HN: AgentSearch – Self-hosted search and MCP for AI agents, no API keys https://ift.tt/cEz6lvN April 24, 2026 at 01:25AM

Show HN: Turning a Gaussian Splat into a videogame https://ift.tt/t2LfW4B

Show HN: Turning a Gaussian Splat into a videogame https://ift.tt/z6F5hAH April 23, 2026 at 09:18PM

Show HN: Core – open-source AI butler that clears your backlog without you https://ift.tt/XhyIjrw

Show HN: Core – open-source AI butler that clears your backlog without you Hi HN, we're Manik, Manoj and Harshith, and we're building CORE ( https://ift.tt/9NWGre2 ), an open source AI butler that acts and clears out your backlog. Write `[ ] Fix the search auth bug` in a scratchpad. Three minutes later, without you at the keyboard, CORE picks it up, pulls the relevant context from your codebase, drafts a plan in the task description, and spins up a Claude Code session in the background to do the work. You review the output in the task chat and unblock it when it gets stuck. Every AI tool today is reactive. You open a chat, brief the agent, it responds. Before anything moves, you've already done the real work: opened the Sentry error, found the commit, read the Slack thread, grabbed the Linear ticket, and stitched it all together into a prompt. The model isn't the bottleneck. You are. Demo Video: https://www.youtube.com/watch?v=PFk4RJvQg1Y CORE removes you from that loop. The interface is a shared scratchpad; think of a page you and a colleague both have open. You write what's on your mind. When you write a checkbox line like `[ ] Fix the search bug`, CORE converts it into a task and starts working on it after a short delay (long enough for you to add context if you want to). No prompt template. No workflow to configure. The reason it can do this without you re-explaining everything: CORE keeps a persistent memory built from your tasks, conversations, and connected apps (Linear, Gmail, GitHub, Slack etc.). When it spins up a Claude Code session, it arrives with your codebase and project context already loaded. A real example: we wrote `[ ] Create a widget in Linear integration`, and about 14 minutes later CORE had opened a PR. What CORE is _not_: it's not Devin (no autonomous web browsing or shell loops you can't see), and it's not "Claude Code with memory bolted on." 
It's the layer above it that decides what should run, gathers the context, hands it to the right agent, and keeps the receipts in one place. Today the agent backend it spins up most often is Claude Code; the orchestration, scratchpad, memory, and integrations are CORE. Open source, self-hostable with `docker compose up` and it supports multiple models. GitHub: https://ift.tt/9NWGre2 Website: https://getcore.me (you can chat with Harshith's butler there) Demo: https://www.youtube.com/watch?v=PFk4RJvQg1Y https://www.getcore.me/ April 23, 2026 at 10:14PM
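The checkbox-to-task step described above can be illustrated with a few lines of Python. This is a hypothetical sketch, not CORE's implementation: the regular expression, the `extract_tasks` name, and the rule that only unchecked boxes become tasks are all assumptions.

```python
import re

# Matches lines such as "[ ] Fix the search bug" or "[x] Already done".
CHECKBOX = re.compile(r"^\s*\[( |x|X)\]\s+(.+?)\s*$")


def extract_tasks(scratchpad):
    """Return the text of every unchecked checkbox line in the scratchpad."""
    tasks = []
    for line in scratchpad.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            tasks.append(match.group(2))
    return tasks


if __name__ == "__main__":
    page = """Notes from standup
[ ] Fix the search auth bug
[x] Update the changelog
  [ ] Create a widget in Linear integration
"""
    for task in extract_tasks(page):
        print(task)
```

A real system would also need the delay before pickup and a way to mark which lines have already become tasks; both are left out here.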

Wednesday, April 22, 2026

Show HN: Netlify for Agents https://ift.tt/yUno4jd

Show HN: Netlify for Agents I launched Netlify with a Show HN more than 11 years ago, for humans. Today we're launching our agent-first version of Netlify. Super early days for this, but I expect it to become as important as our original launch over time. It's as hard to perfect these flows as it was to perfect some of the initial human DX flows, since the agents are non-deterministic and keep changing and evolving, and we'll have more to show soon on our eval tooling for this. Try it out with an agent, and we would love feedback on what works and what doesn't as we keep iterating on making Netlify better for our new agent friends. https://netlify.ai April 22, 2026 at 11:57PM

Show HN: A free tool for non-technical folks to easily publish a website https://ift.tt/aXDjhpx

Show HN: A free tool for non-technical folks to easily publish a website It's easier than ever for anyone to make a website, even without paying for a drag-and-drop builder like Squarespace. But there are still too many barriers for your average non-technical person to publish a site on the web. I'd bet most people don't know there are free ways to host a website, and even if they find an explainer, technical platforms like Cloudflare and GitHub (let alone the command line) can be intimidating. So I made weejur, which is basically a super simple UI front-end for GitHub Pages. You log in with OAuth, and then you can just paste HTML or upload files to publish a website. If you don't have a GitHub account, you can sign up right in the OAuth flow. It's completely free, and you can view the source here [1]. My hope is this makes it easier for people who don't know anything about web hosting to create and share their own websites. Feel free to try it out and please share any questions/ideas/feedback! [1] https://ift.tt/IMGqwFJ https://weejur.com April 22, 2026 at 11:06PM

Tuesday, April 21, 2026

Show HN: Ctx – a /resume that works across Claude Code and Codex https://ift.tt/6Bt2TUw

Show HN: Ctx – a /resume that works across Claude Code and Codex ctx is a local SQLite-backed skill for Claude Code and Codex that stores context as a persistent workstream that can be continued across agent sessions. Each workstream can contain multiple sessions, notes, decisions, todos, and resume packs. It essentially functions as a /resume that can work across coding agents. Here is a video of how it works: https://ift.tt/8Wgb4DE I initially built ctx because I wanted to take a workstream that I started on Claude and continue it from Codex. Since then, I’ve added a few quality-of-life improvements, including the ability to search across previous workstreams, manually delete parts of the context, and branch off existing workstreams. I’ve started using ctx instead of the native ‘/resume’ in Claude/Codex because I often have a lot of sessions going at once, and with the lists that these apps currently give, it’s not always obvious which one is the right one to pick back up. ctx gives me a much clearer way to organize and return to the sessions that actually matter. It’s simple to install: after you clone the repo, run one line, ./setup.sh, which adds the skill to both Claude Code and Codex. After that, you should be able to directly use ctx in your agent as a skill with ‘/ctx [command]’ in Claude and ‘ctx [command]’ in Codex. A few things it does:
- Resume an existing workstream from either tool
- Pull existing context into a new workstream
- Keep stable transcript binding, so once a workstream is linked to a Claude or Codex conversation, it keeps following that exact session instead of drifting to whichever transcript file is newest
- Search for relevant workstreams
- Branch from existing context to explore different tasks in parallel
It’s intentionally local-first: SQLite, no API keys, and no hosted backend. I built it mainly for myself, but thought it would be cool to share with the HN community. https://ift.tt/moFVkZp April 20, 2026 at 11:35PM

Monday, April 20, 2026

Show HN: Agentkit-CLI, one canonical context file for AI coding agents https://ift.tt/E8gBR2a

Show HN: Agentkit-CLI, one canonical context file for AI coding agents https://mikiships.github.io/agentkit-cli/ April 20, 2026 at 10:04PM

Sunday, April 19, 2026

Show HN: A privacy-first, local-LLM note app for iOS (Google Keep alternative) https://ift.tt/KrmxHaX

Show HN: A privacy-first, local-LLM note app for iOS (Google Keep alternative) https://ift.tt/P6rQswD April 19, 2026 at 11:59PM

Show HN: Free PDF redactor that runs client-side https://ift.tt/dbNrwXy

Show HN: Free PDF redactor that runs client-side I recently needed to verify past employment, and to do so I was going to upload paystubs from a previous employer. However, I didn't want to share my salary in that role. I did a quick search online and most sites required sign-up or weren't clear about document privacy. I conceded and signed up for a free trial of Adobe Acrobat so I could use their PDF redaction feature. I figured there should be a dead simple way of doing this that's private, so I decided to create it myself. What this does is rasterize each page to an image with your redactions burned in, then rebuild the PDF so the text layer is permanently destroyed rather than just covered up and easily retrievable. I welcome any and all feedback as this is my first live tool, thanks! https://redactpdf.net April 20, 2026 at 01:39AM

Show HN: Faceoff – A terminal UI for following NHL games https://ift.tt/9gnRhIw

Show HN: Faceoff – A terminal UI for following NHL games Faceoff is a TUI app written in Python to follow live NHL games and browse standings and stats. I got the inspiration from Playball, a similar TUI app for MLB games that was featured on HN. The app was mostly vibe-coded with Claude Code, but not one-shot. I added features and fixed bugs by using it, as I spent way too much time in the terminal over the last few months. Try it out with `uvx faceoff` (requires uv). https://ift.tt/NVuyAMC April 20, 2026 at 12:44AM

Show HN: Google Gemini Is Scanning Your Photos – and the EU Said No https://ift.tt/q2CGbs6

Show HN: Google Gemini Is Scanning Your Photos – and the EU Said No Google has expanded its Personal Intelligence feature so that Gemini can now access your Google Photos face data, Gmail, YouTube history, and search activity to generate personalized AI images — live for US paid subscribers as of April 2026. https://ift.tt/r5T7NsL... April 19, 2026 at 11:36PM

Saturday, April 18, 2026

Show HN: AI Subroutines – Run automation scripts inside your browser tab https://ift.tt/jJLSZRb

Show HN: AI Subroutines – Run automation scripts inside your browser tab We built AI Subroutines in rtrvr.ai. Record a browser task once, save it as a callable tool, replay it at: zero token cost, zero LLM inference delay, and zero mistakes. The subroutine itself is a deterministic script composed of discovered network calls hitting the site's backend as well as page interactions like click/type/find. The key architectural decision: the script executes inside the webpage itself, not through a proxy, not in a headless worker, not out of process. The script dispatches requests from the tab's execution context, so auth, CSRF, TLS session, and signed headers get added to all requests and propagate for free. No certificate installation, no TLS fingerprint modification, no separate auth stack to maintain. During recording, the extension intercepts network requests (MAIN-world fetch/XHR patch + webRequest fallback). We score and trim ~300 requests down to ~5 based on method, timing relative to DOM events, and origin. Volatile GraphQL operation IDs are detected and force a DOM-only fallback before they break silently on the next run. The generated code combines network calls with DOM actions (click, type, find) in the same function via an rtrvr.* helper namespace. Point the agent at a spreadsheet of 500 rows and with just one LLM call parameters are assigned and 500 Subroutines kicked off. 
Key use cases:
- Record sending an IG DM, then have a reusable, callable routine to send DMs at zero token cost
- Create a routine that gets the latest products in a site catalog, then call it to fetch thousands of products via direct GraphQL queries
- Set up a routine to file an EHR form based on parameters passed to the tool; the AI infers the parameters from the current page context and calls the tool
- Reuse a routine daily to sync outbound messages on LinkedIn/Slack/Gmail to a CRM using an MCP server
We think the fundamental reason browser agents haven't taken off is that, for repetitive tasks, going through the inference loop is unnecessary. Better to just record once, and get the LLM to generate a script leveraging all the possible ways to interact with a site and the wider web, like directly calling backend APIs, interacting with the DOM, and calling 3P tools/APIs/MCP servers. https://ift.tt/J5mrUDp April 18, 2026 at 04:03AM

Show HN: Praxis – Lab data to publication-ready figures in one Python package https://ift.tt/itwyvOA

Show HN: Praxis – Lab data to publication-ready figures in one Python package https://ift.tt/u5Nj9xO April 19, 2026 at 01:15AM

Show HN: I built Panda to get up to 99% token savings https://ift.tt/NL73vPK

Show HN: I built Panda to get up to 99% token savings https://ift.tt/dVw9mNM April 18, 2026 at 05:00PM

Friday, April 17, 2026

Show HN: Waputer – The WebAssembly Computer https://ift.tt/nlCwDAr

Show HN: Waputer – The WebAssembly Computer Waputer is an operating system that runs entirely in the browser. When you visit the website at https://waputer.app , a kernel written in JavaScript sets up a filesystem and launches a WebAssembly program, which in turn talks to the kernel to handle the display and input. A purely terminal-based version is at https://waputer.dev . My original intention was to create programs that run in the browser that have a lot more in common with the desktop. The traditional "hello world" program is not really suited for the web. Waputer changes that. The GitHub repo at https://ift.tt/g5z06Up gives a very brief overview of compiling a C program and running it on Waputer. There is a blog available from the main site that has a long-form explanation of Waputer and my motivations if you want some additional reading. https://waputer.app April 18, 2026 at 12:46AM

Show HN: Smol machines – subsecond coldstart, portable virtual machines https://ift.tt/ZBLptF2

Show HN: Smol machines – subsecond coldstart, portable virtual machines https://ift.tt/Ur5cJgS April 18, 2026 at 12:18AM

Show HN: Bird, a CLI for Tired Brains https://ift.tt/3XBzHEO

Show HN: Bird, a CLI for Tired Brains https://ift.tt/hSZ4xpo April 18, 2026 at 12:13AM

Show HN: PanicLock – Close your MacBook lid disable TouchID –> password unlock https://ift.tt/QFPhEV5

Show HN: PanicLock – Close your MacBook lid disable TouchID –> password unlock https://ift.tt/ivusXmS April 17, 2026 at 11:38PM

Thursday, April 16, 2026

Show HN: EDDI – Multi-agent AI engine where agent logic lives in JSON, not code https://ift.tt/FLnKJU5

Show HN: EDDI – Multi-agent AI engine where agent logic lives in JSON, not code I started EDDI in 2006 as a rule-based dialog engine. Back then it was pattern matching and state machines. When LLMs showed up, the interesting question wasn't "how do I call GPT" but "how do I keep control over what the AI does in production?" My answer was: agent logic belongs in JSON configs, not code. You describe what an agent should do, which LLM to use, what tools it can call, how it should behave. The engine reads that config and runs it. No dynamic code execution, ever. The LLM cannot run arbitrary code by design. The engine is strict so the AI can be creative. v6 is the version where this actually became practical. You can have groups of agents debating a topic in five different orchestration styles (round table, peer review, devil's advocate...). Each agent can use a different model. A cascading system tries cheap models first and only escalates to expensive ones when confidence is low. It also implements MCP as both server and client, so you can control EDDI from Claude Desktop or Cursor. And Google's A2A protocol for agents discovering each other across platforms. The whole thing runs in Java 25 on Quarkus, ships as a single Docker image, and installs with one command. Open source since 2017, Apache 2.0. Would love to hear thoughts on the architecture and feature set. And if you have ideas for what's missing or what you'd want from a system like this, I'm all ears. Always looking for good input on the roadmap. https://ift.tt/Rp83Xwo April 16, 2026 at 09:11PM
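The cascading idea (try cheap models first, escalate only when confidence is low) can be sketched in a few lines. This is an illustration in Python, not EDDI's Java engine or its JSON config format: the `cascade` function, the `(answer, confidence)` return convention, the 0.8 threshold, and the stub models are all assumptions.

```python
def cascade(prompt, models, min_confidence=0.8):
    """Try models in order (cheapest first) until one is confident enough.

    `models` is a list of (name, ask) pairs, where ask(prompt) returns
    (answer, confidence). If no model reaches the threshold, the result
    from the last (most expensive) model is returned.
    """
    if not models:
        raise ValueError("at least one model is required")
    for name, ask in models:
        answer, confidence = ask(prompt)
        if confidence >= min_confidence:
            break
    return name, answer, confidence


if __name__ == "__main__":
    # Stub models standing in for real LLM calls.
    def cheap(prompt):
        return "maybe 41?", 0.4

    def expensive(prompt):
        return "42", 0.95

    print(cascade("What is 6 * 7?", [("cheap", cheap), ("expensive", expensive)]))
```

How confidence is obtained is the hard part and is left open here; it could come from token log-probabilities, a verifier model, or a self-reported score.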

Show HN: CodeBurn – Analyze Claude Code token usage by task https://ift.tt/4pnvwDZ

Show HN: CodeBurn – Analyze Claude Code token usage by task Built this after realizing I was spending ~$1400/week on Claude Code with almost no visibility into what was actually consuming tokens. Tools like ccusage give a cost breakdown per model and per day, but I wanted to understand usage at the task level. CodeBurn reads the JSONL session transcripts that Claude Code stores locally (~/.claude/projects/) and classifies each turn into 13 categories based on tool usage patterns (no LLM calls involved). One surprising result: about 56% of my spend was on conversation turns with no tool usage. Actual coding (edits/writes) was only ~21%. The interface is an interactive terminal UI built with Ink (React for terminals), with gradient bar charts, responsive panels, and keyboard navigation. There’s also a SwiftBar menu bar integration for macOS. Happy to hear feedback or ideas. https://ift.tt/dbt8nS1 April 14, 2026 at 05:57AM

Wednesday, April 15, 2026

Show HN: Dependicus, a dashboard for your monorepo's dependencies https://ift.tt/AOLem6p

Show HN: Dependicus, a dashboard for your monorepo's dependencies Late last year, I was digging into some dependency-related tech debt, and struggling with how long it takes to run pnpm's introspection commands like 'pnpm why' in a medium-size monorepo. So I started working on a simple static site generator that would let me view the output of these expensive commands all at once, to make problems clearly visible instead of requiring deep exploration one at a time. Once I had that working, I realized I had enough data to add ticket tracking. It uses the data it gathers from the package manager to keep Linear or GitHub issues updated. And by auto-assigning those issues to coding agents, I get a Dependabot-but-better experience: agents keep up with API updates in addition to just bumping versions, and group related updates automatically. It's still early days, but it's working really well for us and I think people will find value in it, so I'm sharing here! https://descriptinc.github.io/dependicus/ April 16, 2026 at 12:02AM

Show HN: MCP server gives your agent a budget (save tokens, get smarter results) https://ift.tt/zD7ofRZ

Show HN: MCP server gives your agent a budget (save tokens, get smarter results) As a consultant I foot my own Cursor bills, and last month was $1,263. Opus is too good not to use, but there's no way to cap spending per session. After blowing through my Ultra limit, I realized how token-hungry Cursor + Opus really is. It spins up sub-agents, balloons the context window, and suddenly, a task I expected to cost $2 comes back at $8. My bill kept going up, but was I really going to switch to a worse model? No. So I built l6e: an MCP server that gives your agent the ability to budget. It works with Cursor, Claude Code, Windsurf, Openclaw, and every MCP-compatible application. Saving money was why I built it, but what surprised me was that the process of budgeting changed the agent's behavior. An agent that understands the limitations of the resources doesn't try to speculatively increase the context window with extra files. It doesn't try to reach every possible API. The agent plans ahead, sticks to it, and ends work when it should. It works, and we've been dogfooding it hard. After v1 shipped, the rest of l6e was all built with it. We launched the entire docs site using frontier models for $0.99. The kicker was every time l6e broke in development, I could feel the pain. The agent got sloppy, burned through context, and output quality dropped right along with it. Install: pip install l6e-mcp Docs: https://docs.l6e.ai GitHub: https://ift.tt/Ze3Aaxg Website: https://l6e.ai Happy to answer questions about the system design, calibration models, or why I can't go back to coding without it. https://l6e.ai April 15, 2026 at 10:38PM

Tuesday, April 14, 2026

Show HN: A Claude Code–driven tutor for learning algorithms in Go https://ift.tt/wTClJbg

Show HN: A Claude Code–driven tutor for learning algorithms in Go https://ift.tt/9o8brjG April 15, 2026 at 12:41AM

Show HN: LangAlpha – what if Claude Code was built for Wall Street? https://ift.tt/nqNr1Zw

Show HN: LangAlpha – what if Claude Code was built for Wall Street? Some technical context on what we ran into building this. MCP tools don't really work for financial data at scale. One tool call for five years of daily prices dumps tens of thousands of tokens into the context window. And data vendors pack dozens of tools into a single MCP server, schemas alone can eat 50k+ tokens before the agent does anything useful. So we auto-generate typed Python modules from the MCP schemas at workspace init and upload them into the sandbox. The agent just imports them like a normal library. Only a one-line summary per server stays in the prompt. We have around 80 tools across our servers and the prompt cost is the same whether a server has 3 tools or 30. This part isn't finance-specific, it works with any MCP server. The other big thing was making research actually persist across sessions. Most agents treat a single deliverable (a PDF, a spreadsheet) as the end goal. In investing that's day one. You update the model when earnings drop, re-run comps when a competitor reports, keep layering new analysis on old. But try doing that across agent sessions, files don't carry over, you re-paste context every time. So we built everything around workspaces. Each one maps to a persistent sandbox, one per research goal. The agent maintains its own memory file with findings and a file index that gets re-read before every LLM call. Come back a week later, start a new thread, it picks up where it left off. We also wanted the agent to have real domain context the way Claude Code has codebase context. Portfolio, watchlist, risk tolerance, financial data sources, all injected into every call. Existing AI investing platforms have some of that but nothing close to what a proper agent harness can do. We wanted both and couldn't find it, so we built it and open-sourced the whole thing. https://ift.tt/gE3LiU7 April 14, 2026 at 09:48PM
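A minimal sketch of generating Python functions from MCP tool schemas, under stated assumptions. It relies only on the `name`, `description`, and `inputSchema` fields of an MCP tool definition; the `_call_tool` helper that the generated code calls is hypothetical and would have to be provided by the sandbox. This is not LangAlpha's actual generator.

```python
# Map JSON Schema primitive types to Python annotations.
JSON_TO_PY = {
    "string": "str",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "array": "list",
    "object": "dict",
}


def render_tool_function(tool: dict) -> str:
    """Render one MCP tool definition as Python function source."""
    schema = tool.get("inputSchema", {})
    props = schema.get("properties", {})
    required = set(schema.get("required", []))
    # Required parameters first, so defaulted ones follow them.
    names = sorted(props, key=lambda name: name not in required)
    params = []
    for name in names:
        py_type = JSON_TO_PY.get(props[name].get("type"), "object")
        if name in required:
            params.append(f"{name}: {py_type}")
        else:
            params.append(f"{name}: {py_type} | None = None")
    call_args = ", ".join(f"{name!r}: {name}" for name in names)
    return (
        f"def {tool['name']}({', '.join(params)}):\n"
        f"    {tool.get('description', '')!r}\n"
        f"    return _call_tool({tool['name']!r}, {{{call_args}}})\n"
    )


def render_module(tools: list) -> str:
    """Join all tool functions into one importable module source."""
    header = "from __future__ import annotations\n\n"
    return header + "\n\n".join(render_tool_function(t) for t in tools)
```

The sketch assumes tool and parameter names are already valid Python identifiers; a real generator would need to sanitize them. The point it illustrates is that the full schemas end up in a module the agent imports, so they never have to sit in the prompt.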

Monday, April 13, 2026

Show HN: pg_grpc – Call gRPC services directly from PostgreSQL https://ift.tt/qvzCXYM

Show HN: pg_grpc – Call gRPC services directly from PostgreSQL https://ift.tt/XMWEcKJ April 14, 2026 at 12:50AM

Show HN: 15 yrs of Django in prod: patterns I keep using (agent skills) https://ift.tt/f0FaTcP

Show HN: 15 yrs of Django in prod: patterns I keep using (agent skills) https://ift.tt/umWRb9H April 13, 2026 at 10:16PM

Sunday, April 12, 2026

Show HN: Rekal – Long-term memory for LLMs in a single SQLite file https://ift.tt/kNahWFX

Show HN: Rekal – Long-term memory for LLMs in a single SQLite file I got tired of repeating myself to my LLM every session. rekal is an MCP server that stores memories in SQLite and retrieves them with hybrid search (BM25 + vectors + recency decay). One file, local embeddings, no API keys. https://ift.tt/TGSsyj8 April 13, 2026 at 04:25AM
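One way to combine the three signals named above (BM25, vector similarity, recency decay) is sketched below. The weights, the 30-day half-life, and the assumption that both relevance scores are already normalized to the range 0 to 1 are illustrative choices, not rekal's actual formula.

```python
def recency_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: the weight halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def hybrid_score(
    bm25: float,
    vec_sim: float,
    age_days: float,
    *,
    w_text: float = 0.5,
    w_vec: float = 0.5,
    half_life_days: float = 30.0,
) -> float:
    """Blend keyword and vector relevance, then discount by age.

    Both `bm25` and `vec_sim` are assumed to be normalized to [0, 1].
    """
    relevance = w_text * bm25 + w_vec * vec_sim
    return relevance * recency_weight(age_days, half_life_days)


def rank(candidates: list) -> list:
    """Sort candidate memories by hybrid score, best first."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c["bm25"], c["vec"], c["age_days"]),
        reverse=True,
    )
```

With these defaults, a strongly matching memory from three months ago can rank below a weaker match from today, which is the behaviour recency decay is meant to produce.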

Show HN: Claudraband – Claude Code for the Power User https://ift.tt/qvPm9yA

Show HN: T4 – a versioned datastore with branching and time-travel (S3-backed) https://ift.tt/xDNasBV

Show HN: T4 – a versioned datastore with branching and time-travel (S3-backed) Hi HN, I built t4, a datastore that stores its WAL and snapshots in S3. Instead of traditional storage, it writes append-only segments to object storage and reconstructs state from checkpoints + WAL. A side effect of this model is that the database becomes naturally versioned:
- you can restore any past state
- branch from any point (with copy-on-write)
- replay history
I started this as an experiment to replace etcd in Kubernetes, but it’s evolving into a general-purpose versioned state store. Curious what people think about:
- using object storage as the primary persistence layer
- whether branching/time-travel is actually useful in practice
https://ift.tt/S3mcvgj April 13, 2026 at 12:22AM
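The checkpoint + WAL model can be illustrated with a small in-memory sketch. Everything here (class name, method names, the tuple layout of log entries) is hypothetical, nothing is written to S3, and `branch` makes an eager copy rather than true copy-on-write; it only shows why an append-only log makes past states and branches cheap to express.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    # Append-only log of (op, key, value) entries.
    wal: list = field(default_factory=list)
    # Snapshots keyed by the WAL length at which they were taken.
    checkpoints: dict = field(default_factory=dict)

    def put(self, key, value):
        self.wal.append(("put", key, value))

    def delete(self, key):
        self.wal.append(("del", key, None))

    def checkpoint(self):
        version = len(self.wal)
        self.checkpoints[version] = self.state_at(version)

    def state_at(self, version: int) -> dict:
        """Rebuild the state after the first `version` WAL entries."""
        # Start from the newest checkpoint at or before `version`.
        base = max((v for v in self.checkpoints if v <= version), default=0)
        state = dict(self.checkpoints.get(base, {}))
        # Replay only the log entries recorded after that checkpoint.
        for op, key, value in self.wal[base:version]:
            if op == "put":
                state[key] = value
            else:
                state.pop(key, None)
        return state

    def branch(self, version: int) -> "VersionedStore":
        """Fork history at `version` (eager copy, for brevity)."""
        return VersionedStore(
            wal=list(self.wal[:version]),
            checkpoints={
                v: dict(s) for v, s in self.checkpoints.items() if v <= version
            },
        )
```

Restoring a past state is `state_at(n)` for any earlier `n`, and replaying history is just iterating over `wal`.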

Saturday, April 11, 2026

Show HN: A living Vancouver. Connor is walking dogs at the SPCA this morning https://ift.tt/csFKEgx

Show HN: A living Vancouver. Connor is walking dogs at the SPCA this morning I've spent most of my career in marketing, which for the last few years has meant building consumer personas for campaigns. I wanted to see if I could make these real: living in real neighborhoods, with real weather, real budgets, real Saturday lunches. I always wanted to build a world, not a segment. This is that. 140 people so far, split across Vancouver (100), San Francisco (20), and Tokyo (20). Each one is about 1,000 lines of profile: family, finances, daily schedule, health, worldview, media diet, the channels you'd actually reach them through and the ones that will explicitly never work on them. Demographics are census-grounded: income, age, ethnicity, and household composition follow normal distributions against StatsCan, ACS, and Japanese e-Stat data, so the panel is roughly representative of the city instead of representative of whatever's overrepresented in an LLM's training corpus. The specific details come from real stories. They live in real local time on a live map. Right now it's Saturday 11:32 AM in Vancouver. Connor Hughes, a 31-year-old software developer at Clio in Gastown, is on his SPCA volunteer shift; he walks shelter dogs at the Boundary Road location every other Saturday morning. Hassan Khoury is in the morning lunch rush with Tony at his Lebanese café; it's his busiest day of the week. Ahmad Noori is pulling Saturday overtime on a construction site. Jordan Whitehorse is on mid-shift at East Cafe on Hastings. Every day is unique; no two days repeat. A 3 AM job fetches live data: weather from Open-Meteo, grocery CPI from StatsCan food vectors, Metro Vancouver transit delays from the Google Routes API against specific corridors, Vancouver gas prices, sunrise and sunset. Each persona has a modifier file that reacts to all of it.
When Vancouver gas hits $1.85/L, Jaspreet the long-haul trucker's Coquihalla run to Calgary stops feeling worth it: his margins are thin and his mood takes a hit. When food CPI spikes, Gurinder at the Amazon warehouse stops buying the $9 Subway and brings roti from home. A health flare rolls probabilistically each morning: maybe nothing, maybe Tanya's six-month-old had a rough night, maybe Frank's back is acting up. The days stack up and get remembered. Every persona has a journal: today's entry in a markdown file, a week of them compressed into a "dream" of ~30 lines that keeps the shape without the texture, a month compressed into ~15 lines. It's their journal. I'm not writing it; the simulation is. Click any persona to open their detail, or hit "Talk to [name]" to have a conversation; they run on Claude Haiku with their full profile and recent diary entries as context. Not a product, not a startup, just a thing I've been quietly working on. They feel, in a way I didn't expect, like my fully grown kids. Happy to answer questions. https://brasilia-phi.vercel.app April 12, 2026 at 01:42AM

Show HN: We scanned uscis.gov for third-party trackers. The results are jarring https://ift.tt/FXehWsU

Show HN: We scanned uscis.gov for third-party trackers. The results are jarring https://ift.tt/8vgjlKP April 11, 2026 at 08:43PM

Show HN: OpenDescent, decentralised encrypted messenger, no servers, no accounts https://ift.tt/YJX29uN

Show HN: OpenDescent, decentralised encrypted messenger, no servers, no accounts https://ift.tt/3Xrn4ge April 11, 2026 at 11:33PM

Friday, April 10, 2026

Show HN: FluidCAD – Parametric CAD with JavaScript https://ift.tt/w4qkXYN

Show HN: FluidCAD – Parametric CAD with JavaScript Hello HN users, This is a CAD-by-code project I have been working on in my free time for more than a year now. I built it with 3 goals in mind:
- It should be familiar to CAD designers who have used other programs. Same workflow, same terminology.
- Reduce the mental effort required to create models as much as possible. This is achieved by:
  - Providing live rendering and visual guidance as you type.
  - Allowing the user to reference existing edges/faces on the scene instead of having to calculate everything.
  - Providing interactive mouse helpers for features that are hard to write by code. Only 3 interactive modes for now: edge trimming, sketch region extrude, Bezier curve drawing.
  - Implicit coding whenever possible, e.g. there are sensible defaults for most parameters. The program will automatically fuse intersecting objects together so you do not have to worry about what object needs to be fused with what.
- It should be reasonably fast: the scene objects are cached and only the updated objects are re-computed.
I think I have achieved these goals to a good extent. The program is still in its early stages and there are many features I want to add or rewrite, but I think it is already usable for simple models. https://fluidcad.io/ April 11, 2026 at 01:39AM

Show HN: I run AI background removal in the browser–no upload,no server https://ift.tt/4TpcuBX

Show HN: I run AI background removal in the browser–no upload,no server RMBG-1.4 + SAM running client-side via ONNX Runtime WASM. ~2s on laptop, works on mobile. Your image never leaves the browser. Built this as part of allplix.com. 19yo student in France, solo project. Happy to talk about the WASM pipeline or the pain of running ML models in a browser tab. https://ift.tt/AsePEp2 April 11, 2026 at 01:00AM

Show HN: Dynamic Map of YouTube Channels https://ift.tt/08vBwrF

Show HN: Dynamic Map of YouTube Channels https://www.ytmap.xyz/ April 11, 2026 at 12:25AM

Show HN: Figma for Coding Agents https://ift.tt/FJaVyAu

Show HN: Figma for Coding Agents Feels a bit like Figma, but for coding agents. Instead of going back and forth with prompts, you give the agent a DESIGN.md that defines the design system up front, and it generally sticks to it when generating UI. Google Stitch seems to be moving in this direction as a standard, so we put together a small collection of DESIGN.md files based on popular web sites. https://getdesign.md April 10, 2026 at 10:20PM

Thursday, April 9, 2026

Show HN: I built a Cargo-like build tool for C/C++ https://ift.tt/Z6cShWF

Show HN: I built a Cargo-like build tool for C/C++ I love C and C++, but setting up projects can sometimes be a pain. Every time I wanted to start something new I'd spend the first hour writing CMakeLists.txt, figuring out find_package, copying boilerplate from my last project, and googling why my library isn't linking. By the time the project was actually set up I'd lost all momentum. So, I built Craft - a lightweight build and workflow tool for C and C++. Instead of writing CMake, your project configuration goes in a simple craft.toml:
  [project]
  name = "my_app"
  version = "0.1.0"
  language = "c"
  c_standard = 99
  [build]
  type = "executable"
Run craft build and Craft generates the CMakeLists.txt automatically and builds your project. Want to add dependencies? That's just a simple command:
  craft add --git https://ift.tt/ynBjPso --links raylib
  craft add --path ../my_library
  craft add sfml
Craft will clone the dependency, regenerate the CMake, and rebuild your project for you. Other Craft features:
- craft init - adopt an existing C/C++ project into Craft or initialize an empty directory.
- craft template - save any project structure as a template to be initialized later.
- craft gen - generate header and source files with starter boilerplate code.
- craft upgrade - keeps itself up to date.
- CMakeLists.extra.cmake for anything that Craft does not yet handle.
- Cross platform - macOS, Linux, Windows.
It is still early (I just got it to v1.0.0) but I am excited to be able to share it and keep improving it. Would love feedback. Please also feel free to make pull requests if you want to help with development! https://ift.tt/a7lY2sJ April 9, 2026 at 11:04PM

Show HN: LLM-Wiki but for Early Founders https://ift.tt/WrBGOfH

Show HN: LLM-Wiki but for Early Founders https://ift.tt/nfR0S8J April 9, 2026 at 11:27PM

Show HN: I built Dirac, Hash Anchored AST native coding agent, costs -64.8 pct https://ift.tt/sambCd6

Show HN: I built Dirac, Hash Anchored AST native coding agent, costs -64.8 pct Fully open source, a hard fork of Cline. Full evals on the GitHub page compare 7 agents (Cline, Kilo, Ohmypi, Opencode, Pimono, Roo, Dirac) on 8 medium-complexity tasks. Each task, each diff, and correctness + cost info are on the GitHub page. Dirac is 64.8% cheaper than the average of the other 6. https://ift.tt/MqRC7HG April 9, 2026 at 07:06PM

Show HN: Homebutler – I manage my homelab from chat. AI never gets raw shell https://ift.tt/otkgFfm

Show HN: Homebutler – I manage my homelab from chat. AI never gets raw shell https://homebutler.dev April 9, 2026 at 07:09PM

Show HN: CSS Studio. Design by hand, code by agent https://ift.tt/YSRUNgJ

Show HN: CSS Studio. Design by hand, code by agent Hi HN! I've just released CSS Studio, a design tool that lives on your site, runs on your browser, sends updates to your existing AI agent, which edits any codebase. You can actually play around with the latest version directly on the site. Technically, the way this works is you view your site in dev mode and start editing it. In your agent, you can run /studio which then polls (or uses Claude Channels) an MCP server. Changes are streamed as JSON via the MCP, along with some viewport and URL information, and the skill has some instructions on how best to implement them. It contains a lot of the tools you'd expect from a visual editing tool, like text editing, styles and an animation timeline editor. https://cssstudio.ai April 9, 2026 at 06:23PM

Wednesday, April 8, 2026

Show HN: Orange Juice – Small UX improvements that make HN much easier to read https://ift.tt/01qFrlm

Show HN: Orange Juice – Small UX improvements that make HN much easier to read http://oj-hn.com/ April 9, 2026 at 01:08AM

Show HN: OpenMix, open-source computational framework for formulation science https://ift.tt/CKXhxlN

Show HN: OpenMix, open-source computational framework for formulation science I built OpenMix because computational chemistry has great tools for individual molecules (RDKit, DeepChem) but nothing for mixtures. pip install openmix. Apache 2.0. Technical blog: https://ift.tt/3Fz9A04... https://ift.tt/PvsrY3Q April 9, 2026 at 12:12AM

Show HN: I built a navigation app that displays weather along the route https://ift.tt/0BAhb6P

Show HN: I built a navigation app that displays weather along the route Hello HN, I live in the northern part of the USA, where winters are snowy. Whenever I took long trips, I always wondered what the weather along the route was going to be. For those who live in the northern USA, you know weather can change frequently, so when you're traveling matters a lot, not just the route. To solve this problem for myself, I built https://navimodo.com/ . NaviModo calculates the route, then checks the weather along it based on your start time and displays it. Change the start time, and the whole thing is recalculated. I am not expecting any commercialization for this; I just wanted to scratch an itch, and did it. I have ideas for additional features (suggestions for when to take breaks based on bad weather, auto-suggestion of a start time to avoid bad weather, etc.) and will add more as time goes by. Any feedback is welcome! https://navimodo.com/ April 6, 2026 at 08:58PM
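The core idea, looking up the forecast for each point at the time you would actually reach it, can be sketched as follows. The function names and the `lookup` callback are hypothetical stand-ins for a routing service and a weather API; this is not NaviModo's code.

```python
from datetime import datetime, timedelta


def arrival_times(start: datetime, leg_minutes: list) -> list:
    """Arrival time at each waypoint, given the driving time per leg."""
    times = [start]
    for minutes in leg_minutes:
        times.append(times[-1] + timedelta(minutes=minutes))
    return times


def forecast_along_route(start, waypoints, leg_minutes, lookup):
    """Pair every waypoint with its ETA and the forecast for that moment.

    `lookup(waypoint, when)` is a hypothetical callback standing in
    for a real weather API call.
    """
    etas = arrival_times(start, leg_minutes)
    return [(wp, eta, lookup(wp, eta)) for wp, eta in zip(waypoints, etas)]
```

Changing `start` shifts every ETA, so each forecast lookup changes with it, which is why the whole result has to be recalculated when the start time moves.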

Tuesday, April 7, 2026

Show HN: A reasoning hierarchical robotics pipeline you can run in the browser https://ift.tt/MIjEcqW

Show HN: A reasoning hierarchical robotics pipeline you can run in the browser This demo combines the flexible task programming and reasoning of Gemini ER (what is the scene, and what should I do?) and classical camera calibration, kinematics, motion controllers. Each layer is independently swappable, and the AI model doesn't need to know anything about the robot's embodiment. This recreates the modularity of a Sense-Plan-Act architecture while retaining the semantic reasoning of a foundation AI model. A writeup explaining the tradeoffs is linked from the page https://ift.tt/hXN82lY . https://avikde.github.io/vla-pipeline/ April 8, 2026 at 12:35AM

Show HN: Finalrun – Spec-driven testing using English and vision for mobile apps https://ift.tt/TQ4ojar

Show HN: Finalrun – Spec-driven testing using English and vision for mobile apps I wanted to test mobile apps in plain English instead of relying on brittle selectors like XPath or accessibility IDs. With a vision-based agent, that part actually works well. It can look at the screen, understand intent, and perform actions across Android and iOS. The bigger problem showed up around how tests are defined and maintained. When test flows are kept outside the codebase (written manually or generated from PRDs), they quickly go out of sync with the app. Keeping them updated becomes a lot of effort, and they lose reliability over time. I then tried generating tests directly from the codebase (via MCP). That improved sync, but introduced high token usage and slower generation. The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context. I kept the execution vision-based (no brittle selectors), but moved test generation closer to the repo. I’ve open sourced the core pieces: 1. generate tests from codebase context 2. YAML-based test flows 3. Vision-based execution across Android and iOS Repo: https://ift.tt/dVnrNTt Demo: https://youtu.be/rJCw3p0PHr4 In the Demo video, you’ll see the "post-development hand-off." An AI builds a feature in an IDE, and Finalrun immediately generates and executes a vision-based test for it verifying the feature developed by AI. https://ift.tt/dVnrNTt April 7, 2026 at 09:33PM

Monday, April 6, 2026

Show HN: MCP 2000 – Browser-based drum machine with AI-generated sounds https://ift.tt/JcvKdPm

Show HN: MCP 2000 – Browser-based drum machine with AI-generated sounds https://ift.tt/tgqfdoM April 6, 2026 at 10:35PM

Sunday, April 5, 2026

Show HN: A Dad Joke Website https://ift.tt/AtzsOpd

Show HN: A Dad Joke Website A dad joke website where you can rate random dad jokes, 1-5 groans. Sourced from 4 different places, all cited, all categorized, and ranked by top voted. Help me create the worlds best dadabase! https://joshkurz.net/ April 6, 2026 at 12:54AM

Show HN: Gecit – DPI bypass using eBPF sock_ops, no proxy or VPN https://ift.tt/hB3Q1YR

Show HN: Gecit – DPI bypass using eBPF sock_ops, no proxy or VPN https://ift.tt/Yr5ZBVf April 5, 2026 at 11:15PM

Show HN: A Common Lisp implementation in development https://ift.tt/TuxRvmV

Show HN: A Common Lisp implementation in development https://ift.tt/xrXknLR April 5, 2026 at 09:56PM

Show HN: Crabby – Claude Code skill that reviews code like the Rust compiler https://ift.tt/wLd9Z63

Show HN: Crabby – Claude Code skill that reviews code like the Rust compiler I built a Claude Code skill called crabby that makes Claude output diagnostics in rustc error format - severity codes, location arrows, causation spans, and a paste-able fix every time. The twist: Claude becomes Ferris the crab, grumpy and unimpressed, but technically precise. The format works for code review, writing review, architecture, strategy - anything you submit. The "writing review" example tends to surprise people: it flags passive voice in a postmortem with the exact same error[W002] format as a SQL injection. https://ift.tt/vTjqfa1 April 5, 2026 at 08:35PM

Saturday, April 4, 2026

Show HN: Kaoslabs – High-intensity AI video and visual experiments https://ift.tt/MHTX1UJ

Show HN: Kaoslabs – High-intensity AI video and visual experiments "I've been building a sandbox on a Linux VPS to push AI video generation and visualization to the extreme. It's a mix of experimental generative art and high-intensity visuals. Built with Python, running on Debian. Check it out and let me know what you think!" https://kaoslabs.org April 4, 2026 at 11:54PM

Show HN: DocMason – Agent Knowledge Base for local complex office files https://ift.tt/Xv0cLY7

Show HN: DocMason – Agent Knowledge Base for local complex office files I think everyone has already read Karpathy's post about LLM knowledge bases. For the past few weeks I have been working on an agent-native knowledge base for complex research (DocMason), and it runs purely in Codex/Claude Code. I call this paradigm: The repo is the app. Codex is the runtime. In my daily work I have tons of office documents with knowledge from all teams, and as an IT architect I need to combine them to handle complex deep research (which a normal LLM definitely could not help with). That is the original reason I built DocMason, and I use it every day; it supports me on lots of complex topics. I have already open-sourced the repo. I think it takes Karpathy's concept a step further for real-world usage in three ways: 1. It can handle most kinds of office docs (pptx, docx, Excel files, even .eml), and really extracts multimodal information from IT architecture diagrams or Excel sheets. 2. It runs as a real app, not a naive RAG tool. DocMason runs smoothly and intelligently to prepare the environment, auto-update, and incrementally sync the knowledge base. 3. Most importantly, it runs in native AI agents, so it can leverage powerful AI agent engines (e.g. Codex or Claude Code). View the detailed architecture diagram in the DocMason README, then download it and give it a try :) You will find it helps a lot in daily work. Would love to hear your feedback and issues on GitHub! https://ift.tt/r8RkiAE April 4, 2026 at 11:49PM

Show HN: A game where you build a GPU https://ift.tt/en2IUtT

Show HN: A game where you build a GPU Thought the resources for GPU arch were lacking, so here we are https://ift.tt/jU87FJu April 4, 2026 at 11:45PM

Show HN: DocMason – AI Agent Knowledge Base for local complex office files https://ift.tt/v8PLpEd

Show HN: DocMason – AI Agent Knowledge Base for local complex office files https://ift.tt/ljatOG5 April 4, 2026 at 11:41PM

Friday, April 3, 2026

Show HN: Matrix OS, like Lovable, but for personal apps https://ift.tt/cAzC0kJ

Show HN: Matrix OS, like Lovable, but for personal apps hey hn, i built matrix os, a personal ai operating system that generates custom software from natural language. you get your own cloud instance at matrix-os.com. you describe what you want ("build me an expense tracker with categories") and it appears on your desktop as a real app saved as a file. tech stack: node.js, typescript, claude agent sdk as the kernel, next.js frontend, hono gateway, sqlite/drizzle. everything is a file, apps, data, settings, ai memory. git-versioned. what makes it different from chatgpt/claude artifacts: - persistent memory that learns your preferences across sessions - apps are real files you own, not ephemeral chat outputs - runs 24/7 in the cloud, not just when you have a tab open - accessible from web, telegram, whatsapp, discord, slack - open source, self-hostable came out of placing top 20 at anthropic's claude code hackathon. been building it full-time since. 2,800+ tests, 100k+ lines of typescript live: matrix-os.com github: github.com/HamedMP/matrix-os would love feedback on the approach. the core bet is that ai should be an os, not a chat window. https://matrix-os.com/ April 3, 2026 at 11:59PM

Show HN: Speck PBR – A WebGPU molecular visualizer https://ift.tt/ptKZJvQ

Show HN: Speck PBR – A WebGPU molecular visualizer This is the spiritual successor to my Speck project, which was getting a bit long in the tooth. Big improvements are path tracing, video generation, and trajectory support. Right now only imports XYZ, but happy to add more formats as requested. Thanks for looking! https://ift.tt/VfpCmHu April 3, 2026 at 11:08PM

Show HN: Aurion OS, A 1.8MB OS with a browser, try it live (C/x86 ASM) https://ift.tt/37xr15M

Show HN: Aurion OS, A 1.8MB OS with a browser, try it live (C/x86 ASM) I posted Aurion OS a few weeks ago on HN. Since then, the OS has gone from Beta to v1.0 Release with a lot of improvements:
- Blaze Browser: HTML/CSS/JS rendering with tabs and a developer console (local only, no full http/https support for now)
- Installer with user account setup and app selection
- Multi-resolution support (800x600 to 2560x1440, I plan to add 4096x2160 pixels in next versions)
- Unix-style luka@aurion prompt
- Serbian keyboard layout
- Python interpreter and Make build system
- 50+ terminal commands
- Window manager improvements and bug fixes
- 1.8MB ISO (entire OS including the browser and GUI)
- Supports QEMU, VirtualBox, VMware, and v86
You can try it live in the link above, or grab the ISO from GitHub: https://ift.tt/tL9G5IF Built solo as a hobby/learning project. I'm 13. I'd love any feedback, suggestions! https://aurionos.vercel.app/ April 3, 2026 at 11:01PM

Thursday, April 2, 2026

Show HN: Most products have no idea what their AI agents did yesterday https://ift.tt/YTx1DGB

Show HN: Most products have no idea what their AI agents did yesterday We build collaboration SDKs at Velt (YC W22). Comments, presence, real-time editing (CRDT), recording, notifications. A pattern we keep seeing: products add AI agents that write, edit, and approve things. Human actions get logged. Agent actions don't. Same workflow, different accountability. We shipped Activity Logs to fix this. Same record for humans and AI agents. Immutable by default. Auto-captures collaboration events, plus createActivity() for your own. Curious how others are handling this. https://ift.tt/GTH8Cf7 April 3, 2026 at 01:25AM

Show HN: I tested 15 free AI models at building real software on a $25/year VPS https://ift.tt/5i8TjUx

Show HN: I tested 15 free AI models at building real software on a $25/year VPS https://ift.tt/6hoSx0d April 3, 2026 at 12:13AM

Show HN: Portcullis, a review gate for curl|bash https://ift.tt/Of4r6qH

Show HN: Portcullis, a review gate for curl|bash https://ift.tt/tV9aeHd April 2, 2026 at 11:39PM

Wednesday, April 1, 2026

Show HN: Zerobox – Sandbox any command with file and network restrictions https://ift.tt/TBY9NvE

Show HN: Zerobox – Sandbox any command with file and network restrictions I'm excited to introduce Zerobox, a cross-platform, single-binary process sandboxing CLI written in Rust. It uses the sandboxing crates from the OpenAI Codex repo and adds functionality like secret injection, an SDK, etc. Watch the demo: https://www.youtube.com/watch?v=wZiPm9BOPCg Zerobox follows the same sandboxing policy as Deno, which is deny by default. The only operation that the command can run is reading files; all writes and network I/O are blocked by default. No VMs, no Docker, no remote servers. Want to block reads to /etc?
  zerobox --deny-read=/etc -- cat /etc/passwd
  cat: /etc/passwd: Operation not permitted
How it works: Zerobox wraps any command or program, runs an MITM proxy, and uses the native sandboxing solution on each operating system (e.g. Bubblewrap on Linux) to run the given process in a sandbox. The MITM proxy has two jobs: blocking network calls and injecting credentials at the network level. Think of it this way: I want to inject "Bearer OPENAI_API_KEY" but I don't want my sandboxed command to know about it. Zerobox does that by replacing "OPENAI_API_KEY" with a placeholder, then swapping in the real value when the actual outbound network call is made. See this example:
  zerobox --secret OPENAI_API_KEY=$OPENAI_API_KEY --secret-host OPENAI_API_KEY=api.openai.com -- bun agent.ts
Zerobox differs from other sandboxing solutions in that it lets you easily sandbox any command locally and it works the same on all platforms. I've been exploring different sandboxing solutions, including Firecracker VMs locally, and this is the closest I was able to get when it comes to sandboxing commands locally. The next thing I'm exploring is `zerobox claude` or `zerobox openclaw`, which would wrap the entire agent and preload the correct policy profiles. I'd love to hear your feedback, especially if you are running AI agents (e.g. OpenClaw), MCPs, or AI tools locally.
https://ift.tt/tUSBVAc March 30, 2026 at 09:32PM
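The placeholder idea behind the secret injection can be sketched in a few lines. The class and method names are hypothetical and this is not Zerobox's code (which is written in Rust); the sketch only shows the substitution a proxy would perform on outbound headers, restricted to the host the secret was registered for.

```python
import secrets


class SecretInjector:
    """Swap placeholders for real secrets on outbound requests only."""

    def __init__(self):
        # placeholder -> (real value, host allowed to receive it)
        self._by_placeholder = {}

    def register(self, value: str, allowed_host: str) -> str:
        """Return the placeholder the sandboxed process sees instead."""
        placeholder = "placeholder-" + secrets.token_hex(8)
        self._by_placeholder[placeholder] = (value, allowed_host)
        return placeholder

    def rewrite_headers(self, host: str, headers: dict) -> dict:
        """Substitute secrets only when the request targets their host."""
        rewritten = {}
        for name, text in headers.items():
            for placeholder, (value, allowed) in self._by_placeholder.items():
                if host == allowed:
                    text = text.replace(placeholder, value)
            rewritten[name] = text
        return rewritten
```

The sandboxed command only ever holds the placeholder, so a request to any other host carries a value that is useless outside the proxy.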

Show HN: Aphelo – A Redis-like store in C++ with Progressive Rehashing https://ift.tt/VDy37HR

Show HN: Aphelo – A Redis-like store in C++ with Progressive Rehashing https://ift.tt/6J8Yu4x April 1, 2026 at 11:33PM

Show HN: Real-time dashboard for Claude Code agent teams https://ift.tt/Mrd6jJh

Show HN: Real-time dashboard for Claude Code agent teams This project (Agents Observe) started as an exploration into building automation harnesses around Claude Code. I needed a way to see exactly what teams of agents were doing in real time and to filter and search their output. A few interesting learnings from building and using this:
- Claude Code hooks are blocking; performance degrades rapidly if you have a lot of plugins that use hooks
- Hooks provide a lot more useful info than OTEL data
- Claude's jsonl files provide the full picture
- Lifecycle management of MCP processes started by plugins is a bit kludgy at best
The biggest takeaway is how much of a difference it made in Claude's performance when I switched to background (fire and forget) hooks and removed all other plugins. It's easy to forget how many Claude plugins I've installed and how they affect performance. The Agents Observe plugin uses Docker to start the API and dashboard service. This is a pattern I'd love to see used more often for security (think Axios hack) reasons. The tricky bit was handling process management across multiple Claude instances. The solution was to have the server track active connections, then shut itself down automatically when not in use. Then the plugin spins it back up when a new session is started. This tool has been incredibly useful for my own daily workflow. Enjoy! https://ift.tt/g8Ap5FE April 1, 2026 at 11:24PM
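The "track active connections, shut down when idle" approach could be sketched like this. The class name, the timeout, and the injectable clock are assumptions for illustration, not the plugin's actual code.

```python
import time


class IdleShutdownTracker:
    """Decide when a shared server may exit because nobody is connected."""

    def __init__(self, idle_timeout_s: float = 60.0, clock=time.monotonic):
        self._timeout = idle_timeout_s
        self._clock = clock
        self._active = 0
        # Time at which the server last became idle; None while in use.
        self._idle_since = clock()

    def connect(self) -> None:
        self._active += 1
        self._idle_since = None

    def disconnect(self) -> None:
        self._active = max(0, self._active - 1)
        if self._active == 0:
            self._idle_since = self._clock()

    def should_shut_down(self) -> bool:
        """True once there have been no connections for the full timeout."""
        if self._idle_since is None:
            return False
        return self._clock() - self._idle_since >= self._timeout
```

A server loop would call `should_shut_down()` periodically and exit when it returns true; the next session's plugin start-up would then launch a fresh server.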

Show HN: Max Headbox, a local agent that fits on a Raspberry Pi 5 https://ift.tt/9tjBhle

Show HN: Max Headbox, a local agent that fits on a Raspberry Pi 5 https://ift.tt/62WoTOA April 1, 2026 at 09:57PM