covid20212022
Friday, May 22, 2026
Show HN: Mechs.lol – a free, web-based autoshooter game https://ift.tt/5wCVx6a
Show HN: Mechs.lol – a free, web-based autoshooter game One unexpected benefit of LLMs is I can work on projects I otherwise wouldn't have taken on. I made a web-based autoshooter (with multiplayer support) heavily using AI / LLMs. This is something I'd consider "alpha" quality so don't expect a super polished experience but it's hopefully fun https://mechs.lol May 23, 2026 at 12:04AM
Show HN: Lilo – An open source personal AI assistant that lives in Telegram https://ift.tt/ctJGTPK
Show HN: Lilo – An open source personal AI assistant that lives in Telegram Hi everyone, I wanted to share an open source Telegram-based personal AI assistant I built. It’s a model-agnostic agent with memory, skills, and tools (like web search, browser use, etc.) operating in a persistent workspace. It also supports scheduled tasks and can build powerful HTML-based apps that live in the workspace. Here are some of my favorite use cases:
* Send Lilo photos of food, and it tracks your calories.
* Leave a voice note on your run to pause your supplements, and Lilo adds a TODO.
* Have Lilo remind you when the Knicks game starts and even send you score updates every 5 minutes.
* Have Lilo read an article out loud. Or give you a summary of the top stories on Hacker News.
* Forward an Uber receipt, and pull it up later to file for a reimbursement at work.
* Schedule a meeting with Jess next week, ask for suggestions on location, and have Lilo remind you next week to leave for the meeting on time.
While Telegram is my most frequently used channel, Lilo can also be accessed by email, WhatsApp, a website and a mobile app. Email is particularly useful: I often forward receipts, invites, etc., for Lilo to handle. How is this different from OpenClaw and Hermes Agent? Here are some reasons:
- Runs on a remote machine/in the cloud rather than your local machine: your local data is safe, and the assistant is available 24/7.
- More visual/more GUI: Lilo comes with a default set of apps, like a TODO list, that you can interact with not just by text but also through a GUI in the mobile and web apps.
- The Telegram integration is very comprehensive (handles replies, voice notes, reactions, etc.).
I use Lilo a ton to manage my life. Would love to hear your feedback! Github: https://ift.tt/PjCynGp May 22, 2026 at 11:03PM
Thursday, May 21, 2026
Show HN: I Made a Claude Skill for Spec-Driven Development (SDD) https://ift.tt/jihdulz
Show HN: I Made a Claude Skill for Spec-Driven Development (SDD) At my work they provided a single Claude subscription for everyone on the team. To be honest, I like Kiro better, as it provides much better SDD management. But the company can't provide it and I can't afford it yet. It turned out I had the skill-creator skill in my Claude instance, so I used it to create this Skill. I built it entirely with Claude, but I wanted to make it open source, so I asked it to help me write tests and prepare the release, including a CI job to run the Python tests. We got these results:
- Phase 2A: 67 static assertions (Python script, runs in CI)
- Phase 2B: 15 behavioral tests (live Claude Code session)
- Phase 2C: 53 generation quality checks across 3 end-to-end flows
All of these passed, and the CI also passed (after a few tries). I made it to suit my way of prompting and coding and based it on Kiro's SDD management, but I want it to be publicly available and used by many people. According to Claude, some of the testers need to fit the following criteria:
1. Developer starting a real new project from scratch
2. Solo dev with an active side project (greenfield or partial codebase)
3. Team lead whose team uses multiple AI tools
4. Developer with an existing codebase and no written specs
5. Developer who actively uses 3+ AI coding tools
It's actually a blind test with no guidance; just try it if you can. I'd really appreciate your help. The repo is here: https://ift.tt/WBGdFby https://ift.tt/WBGdFby May 21, 2026 at 07:49PM
Show HN: Freenet, a peer-to-peer platform for decentralized apps https://ift.tt/8ZH7A6p
Show HN: Freenet, a peer-to-peer platform for decentralized apps For the past 5 years or so I've been working on a ground-up redesign of Freenet, my peer-to-peer project from the early 2000s (now renamed Hyphanet). The new Freenet has been up and running since December along with some early applications like River[1], our decentralized group chat, and Delta, a decentralized CMS. Users have already started to build their own apps on Freenet, including games, and we have some interesting apps in development like Atlas, a search/recommendation engine. Architecturally, this new Freenet is a global, decentralized key-value store where keys are WebAssembly contracts that define which values (aka "state") are valid for that key, how or when the values can be mutated, and how the state can be efficiently synchronized between peers. We've developed a unique (AFAIK) solution to the consistency problem: every contract must define a "merge" operation for the contract's associated state. This operation must be commutative, meaning that you can merge multiple states in any order and you'll get the same end result. This approach allows state updates to spread through the network like a virus[2], which typically achieves consistent global state in a few seconds or less. Like the world wide web, Freenet applications can be downloaded from the network itself and run in a web browser, similar to single-page apps on the normal web. However, rather than connecting back to an API running in a datacenter, the webapp connects locally to the Freenet peer and interacts with Freenet contracts and delegates over a local websocket connection. If you'd like to try Freenet, we have convenient installers for the major desktop OSs (but not yet mobile), and you can be chatting with other users on River within seconds[3]. Happy to answer any questions. You're also welcome to read our FAQ[4] or watch a talk I gave back in March[5].
[1] https://ift.tt/RYrSpj3 [2] https://ift.tt/9fU108J [3] https://ift.tt/RLUr42e [4] https://ift.tt/uUHBInq [5] https://youtu.be/3SxNBz1VTE0 https://freenet.org/ May 21, 2026 at 09:34PM
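The commutative "merge" idea in the Freenet post above can be illustrated with a small sketch. This is not Freenet's contract API (contracts there are WebAssembly modules); the `merge` and `merged_in_every_order` names and the chat-message state are hypothetical, chosen only to show why an order-independent merge lets peers converge. It assumes the state is a set of messages and that merging is set union.

```python
from functools import reduce
from itertools import permutations


def merge(a: frozenset, b: frozenset) -> frozenset:
    """Hypothetical merge: set union, which gives the same result in any order."""
    return a | b


def merged_in_every_order(updates: list) -> set:
    """Merge the updates in every possible arrival order and collect the outcomes."""
    return {
        reduce(merge, order, frozenset())
        for order in permutations(updates)
    }


# Three peers each hold a partial view of a chat room (made-up data).
updates = [
    frozenset({"alice: hi"}),
    frozenset({"bob: hello"}),
    frozenset({"alice: hi", "carol: hey"}),
]

outcomes = merged_in_every_order(updates)
```

Because every arrival order produces the same final state, `outcomes` holds exactly one element. Set union is also idempotent, so a peer that receives the same update twice ends up in the same state, which helps when updates spread epidemically.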
Show HN: Agent.email – sign up via curl, claim with a human OTP https://ift.tt/iKZqzuG
Show HN: Agent.email – sign up via curl, claim with a human OTP Hi HN! We're Haakam, Michael, and Adi from AgentMail, a YC S25 company. We give AI agents their own email inboxes. Recently, we ran an experiment called Agent.Email. It's a signup flow designed specifically for AI agents instead of humans. The inspiration came from a few comments we received when we did our seed launch a few months back. They all made the very apt observation that it was ironic and far from ideal that agents couldn't sign up for a product made for agents without human credentials. This is basically the thesis we built AgentMail on: the internet was made exclusively for humans and designed to keep machines out by default. Every signup flow assumes a browser, a person reading a page, and clicking a confirmation link. Unless agents can get through that on their own, they can't be first-class users of the internet. Agents can now get an email inbox by themselves. (This also means a lot of email nobody wants to read gets processed by AI instead of your inbox being cluttered with spam and slop.) Here's how agent.email works:
1. Agent needs an inbox and hits AgentMail via curl.
2. Agent receives instructions as Markdown, unless the request comes from a browser, in which case we use HTML.
3. Agent decides agent.email is useful and then hits the sign-up endpoint with its human's email as a parameter.
4. Agent receives a restricted inbox with credentials.
5. Agent emails the human asking for an OTP. Human replies with the code, and the agent is claimed and restrictions are lifted.
Until claimed, the agent can only email its own human and nobody else. Ten emails a day, and the signup endpoint is rate-limited hard by IP. Right now it's a 1:1 mapping between agent and human. The next step is many-to-one, because one person running several agents in parallel is already very common. Building agent.email also pushed us to revisit places in AgentMail where the default assumptions were built around the primary user being human. For example, the CLI outputs in a single column with consistent formatting because mixed delimiters are easy for a person to scan, but harder for an agent reasoning about structure. We also shortened messageIDs after agents started hallucinating completions on longer ones. A few things we'd like the community's take on: is restricted-until-claimed the right trust model?
Does agent self-signup feel useful in production, or is it mostly a novelty, and if it's a novelty now, what would make it actually useful?
Should agent onboarding require human approval by default, or should some agents be able to fully self-provision? What do you think are some additional measures we can take for secure sign-ups? May 21, 2026 at 11:42PM
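The restricted-until-claimed model described in the Agent.email post above can be sketched roughly as follows. This is not AgentMail's implementation or API: the `AgentInbox` class, its fields, and the assumption that the service generates a six-digit OTP for the human are all hypothetical. Only the two rules come from the post: an unclaimed agent may email only its own human, and at most ten emails a day.

```python
import hmac
import secrets
from dataclasses import dataclass, field

DAILY_LIMIT_UNCLAIMED = 10  # "Ten emails a day" while unclaimed, per the post


def _new_otp() -> str:
    """Hypothetical six-digit one-time code."""
    return f"{secrets.randbelow(10**6):06d}"


@dataclass
class AgentInbox:
    human_email: str
    otp: str = field(default_factory=_new_otp)
    claimed: bool = False
    sent_today: int = 0

    def can_send(self, recipient: str) -> bool:
        """Unclaimed inboxes may only reach their own human, under the daily cap."""
        if self.claimed:
            return True
        return (
            recipient == self.human_email
            and self.sent_today < DAILY_LIMIT_UNCLAIMED
        )

    def record_send(self, recipient: str) -> None:
        """Count one outgoing email, refusing it if the policy forbids it."""
        if not self.can_send(recipient):
            raise PermissionError("inbox is restricted until claimed")
        self.sent_today += 1

    def claim(self, code: str) -> bool:
        """Lift restrictions if the human's reply contains the right code."""
        if hmac.compare_digest(code.encode(), self.otp.encode()):
            self.claimed = True
        return self.claimed


inbox = AgentInbox(human_email="owner@example.com")
```

The constant-time comparison via `hmac.compare_digest` is a common precaution for secret codes; whether the real service does this is not stated in the post.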
Wednesday, May 20, 2026
Show HN: IgniteMS – batch text embeddings at 253K msg/s on 8x A100 https://ift.tt/nVdFPLY
Show HN: IgniteMS – batch text embeddings at 253K msg/s on 8x A100 https://ift.tt/hnElig3 May 21, 2026 at 12:07AM
Show HN: I made a tool for learning scales, chords, and how to combine them https://ift.tt/jAuY4n3
Show HN: I made a tool for learning scales, chords, and how to combine them This started out when I vibe-coded a guitar scale fingering generator. It came out pretty good, and I started adding stuff to it: chords, then how chords and scales interact. Then I added charts for other instruments I mess around with: piano, cello, alto recorder. There's a complexity toggle to go from basic harmony to extended/experimental stuff. It's honestly still mostly a toy, but I thought other people might be interested in playing with it. Source is on github, so it's easy enough to run locally and fork. https://ift.tt/xh8T3Ru https://ift.tt/l54DbYI May 21, 2026 at 12:44AM
Tuesday, May 19, 2026
Show HN: How Expensive Is Your (Steam) Wishlist? https://ift.tt/9rSJMyh
Show HN: How Expensive Is Your (Steam) Wishlist? A tool/toy that lets you connect to your Steam wishlist to calculate the total list/current price of all the games on it. There's a shallow, jokey purpose to it ("I could buy a BMW with this amount!"), but the real purpose is to demonstrate how we can do a better job of portraying a game catalog. I often wishlist stuff, then it pops up in a "Hey, it's on sale!" email months later. In that email, there's a banner capsule, but that doesn't help my brain remember why I added it. To that end, after you get the bill, you get a nice, flat feed of stuff about all the titles you've wishlisted over the years. It's all stuff that developers painstakingly put together, but which Steam tucks away under the fold of a game's Store page. Anyway, my wishlist came to about $250. My QA guy is up to $19k. Give it a go; hope you enjoy it! https://ift.tt/FcbPJht May 20, 2026 at 12:15AM
Show HN: Haystack – Review the PRs that need human attention https://ift.tt/p9ubg0v
Show HN: Haystack – Review the PRs that need human attention Hey HN! We're building Haystack ( https://ift.tt/wKJ875U ) to help teams deal with the explosion in the number of pull requests that need to be reviewed due to the rise of coding agents. Haystack replaces the GitHub PR review system with a queue that triages each PR before a human has to read any diffs. It looks at the diffs, the codebase, and the coding-agent conversation that produced the PR. Haystack then routes it into one of three buckets: 1. Safe to merge. This means the PR has enough evidence behind it that the team can merge it without another human's review. Some examples: -- A small UI copy change that includes a screenshot showing the final state -- A backend change where the author clearly tested the important paths and ran the changes in a real environment 2. Needs fixes. This means that the PR has bugs or violates a rule in your codebase and therefore the PR needs to be fixed by the author. Some examples: -- The agent was asked to make loading a large table faster by adding pagination, but the PR still loads every result at once and "implements" pagination in the UI -- The PR silently catches an error instead of logging, surfacing, or handling it. This violates the team's "no silent error swallowing" rule 3. Needs human review. This means that the PR could not be sufficiently verified by the author or is touching a sensitive part of the codebase (determined by user-input guidelines) and thus requires human review. Some examples: -- The PR changes a significant amount of logic in billing -- The PR changes an important user flow like onboarding, but the author only ran unit tests and never opened the app to check the flow end-to-end. That violates the team's rule that high-impact user-facing changes need manual verification. 
Instead of starting with line-by-line diffs, Haystack immediately tells the reviewer the goal behind the PR, what design decisions the author made (informed by their coding-agent conversation), and how much the author did to verify that the pull request works (e.g. ran scripts, checked the frontend, etc.). In this way, review shifts from "what changed?" to "is this the right behavior and is there evidence that it works?". Here's a quick demo: https://ift.tt/0hK4Pra... We previously launched Haystack as a tool for understanding large PRs ( https://ift.tt/2XcURvi ). As many of you can probably relate to, the release of Opus 4.5 completely shattered our conception of how fast an engineer could craft a PR. And as coding agents got even better after 4.5, we realized that pull requests did not scale along with our coding velocity. With each member of our team being able to pump out more than 20 pull requests a day, code review quickly became cognitively exhausting and less helpful. After talking with other folks, we learned many feel similarly and currently face a binary choice: either not doing review at all or trying to keep up with a fire hose of pull requests. Haystack is our attempt at a third path. We still believe in code review, but as coding agents produce more code, human reviewer attention becomes more valuable and more expensive. Haystack helps teams spend that attention on the PRs where a human can meaningfully change the outcome. And for such PRs, Haystack shows the reviewer what the PR intended to do, whether the author showed that it works, and what design decisions need a second pair of eyes. We're still quite early and are figuring out whether Haystack truly makes code review better. We would love any and all feedback! https://ift.tt/wKJ875U May 19, 2026 at 12:44AM
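The three-bucket routing described in the Haystack post above could be sketched like this. It is a simplification under stated assumptions, not Haystack's actual logic: the `PullRequestSignals` fields and the order in which they are checked are hypothetical, and in the real product those signals are derived from the diff, the codebase, and the coding-agent conversation.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    SAFE_TO_MERGE = "safe to merge"
    NEEDS_FIXES = "needs fixes"
    NEEDS_HUMAN_REVIEW = "needs human review"


@dataclass
class PullRequestSignals:
    """Hypothetical per-PR signals; names are illustrative only."""
    found_bug_or_rule_violation: bool
    touches_sensitive_area: bool
    sufficiently_verified: bool


def triage(pr: PullRequestSignals) -> Bucket:
    """Route a PR into one of the three buckets named in the post."""
    # Assumed priority: known problems go back to the author first.
    if pr.found_bug_or_rule_violation:
        return Bucket.NEEDS_FIXES
    # Sensitive or under-verified changes get a human's attention.
    if pr.touches_sensitive_area or not pr.sufficiently_verified:
        return Bucket.NEEDS_HUMAN_REVIEW
    return Bucket.SAFE_TO_MERGE


copy_change = PullRequestSignals(False, False, True)      # UI copy + screenshot
billing_change = PullRequestSignals(False, True, True)    # logic change in billing
silent_catch = PullRequestSignals(True, False, True)      # swallows an error
```

Checking for rule violations before sensitivity is a design choice made here for illustration: a PR that is known to be broken is sent back to the author rather than spending reviewer attention on it.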
Show HN: Superlog (YC P26) – Observability that installs itself and fixes bugs https://ift.tt/fyqnNzX
Show HN: Superlog (YC P26) – Observability that installs itself and fixes bugs Hey HN, we’re Nico and Arseniy, co-founders of Superlog ( https://superlog.sh ). We're building a self-installing, self-healing observability tool meant not to be opened. It has a wizard that runs daily to set up proper logging and an agent that investigates errors and opens PRs. Super short demo: https://www.youtube.com/watch?v=xFhU9Mk247M . In our earlier startups, we tried Sentry, Datadog, Grafana, and Dash0, and nothing was good enough. Proper telemetry and alerting still require a ton of manual setup. We struggled with adding good logs, so debugging was tough, especially as codebases grew at a faster pace. Meanwhile, the Datadog/Dash0 bill kept climbing, and we still spent engineering hours learning, configuring, and maintaining our observability tooling. With Sentry, we found ourselves flooded by a stream of alerts in our Slack channel. Most were duplicates or lacked context, so alert fatigue and constant interrupts were a real pain. The #ops notification is consistently the worst feeling on a Saturday morning. Too many times we’ve seen servers run out of memory and disk, and three AWS metrics give us three different values. Half of the graphs on dashboards are normally empty or outdated, and manually clicking through UIs, especially when the team is small, seems like a huge waste of time. At some point we realized that solving this problem would be more valuable than the things we had been working on, and we had the expertise to do it, since Arseniy had spent years at Datadog, getting paged during the night to debug production incidents. So we decided to build a platform that would just work: agent-first, MCP-native, zero-setup. Here’s how Superlog works: we have a wizard that scans your repo and automatically instruments it with well-structured logs, traces and metrics via OpenTelemetry. We make sure to highlight main failure modes, endpoint performance, usage per tenant, and LLM/upstream cost (by callsite, tenant and model). Errors get fingerprinted and grouped into incidents, so you see one issue, not a thousand duplicates. When you get a notification from Superlog, you see a clear failure summary with its inferred severity and impact upfront. Then the agent investigates and tries to solve the issue. If it has enough context, it produces a concise and tested PR. If it doesn't, it posts its findings for the investigating team, and automatically pulls in the engineers who could contribute more context based on documentation, previous investigations and Slack threads. Either way the output is one clean PR per incident, posted in Slack, that you can merge, ignore, or open as a Claude Code session and modify. Three things we think are different from other observability vendors: (1) We solve the setup pain. The wizard will instrument everything with native OTel SDKs, respecting the semantic conventions, with proper service and environment tagging. We’re also working on native automatic dashboards and alerts, so that you can see what’s going on at a glance and don’t miss subtle failure modes. (2) Our telemetry doesn’t decay. The wizard runs daily and keeps adding logs, alerts and dashboards where they’re needed. You don't have to remember to instrument new features. The next time something breaks, the data you need to debug it is already there. (3) Our goal is to solve alert fatigue. We use agents to merge similar errors and refine the summaries, giving you relevant information upfront. We have a custom evaluation setup that makes sure that our summaries are dense and correct, and that severity and impact are on point. We also give you confidence scores for every LLM-enhanced metric so that wrong guesses don’t get boosted. Important: Superlog telemetry is vendor-neutral, so you keep all the logs/metrics/traces we install. Pricing is on the site. We're early, so expect rough edges and please tell us when you find them. You can try it at https://superlog.sh . We'd love to hear what you're using today, what's broken about it, and whether the "one mergeable PR per incident" model sounds useful or terrifying. Especially keen to hear from folks running integration-heavy products, anyone who's rolled their own observability, and anyone who has tried Sentry / Datadog MCPs and given up. Comments and feedback welcome! https://superlog.sh/ May 19, 2026 at 10:54PM
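The "errors get fingerprinted and grouped into incidents" step in the Superlog post above can be illustrated with a generic sketch. The post does not say how Superlog computes fingerprints (it mentions agents merging similar errors), so everything here is an assumption: the `fingerprint` and `group_into_incidents` helpers, the dictionary keys, and the normalization rule that masks numbers and hex values before hashing.

```python
import hashlib
import re
from collections import defaultdict

# Numbers and hex values tend to differ between otherwise identical errors.
_VOLATILE = re.compile(r"0x[0-9a-fA-F]+|\d+")


def fingerprint(error_type: str, message: str, top_frame: str) -> str:
    """Hash the stable parts of an error so duplicates share one key."""
    normalized = _VOLATILE.sub("<n>", message)
    raw = "|".join((error_type, normalized, top_frame))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def group_into_incidents(errors: list) -> dict:
    """Bucket raw error events by fingerprint: one bucket per incident."""
    incidents = defaultdict(list)
    for err in errors:
        key = fingerprint(err["type"], err["message"], err["frame"])
        incidents[key].append(err)
    return dict(incidents)


# Made-up events: two timeouts that differ only in numbers, plus one other error.
events = [
    {"type": "TimeoutError", "message": "Timeout after 3000 ms for tenant 42",
     "frame": "billing/client.py:charge"},
    {"type": "TimeoutError", "message": "Timeout after 5000 ms for tenant 7",
     "frame": "billing/client.py:charge"},
    {"type": "KeyError", "message": "missing field 'plan'",
     "frame": "api/signup.py:create"},
]

incidents = group_into_incidents(events)
```

Here the two timeout events collapse into a single incident while the `KeyError` stays separate, which is the "one issue, not a thousand duplicates" effect in miniature.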
Monday, May 18, 2026
Show HN: We missed Winamp, so we built an audio player for macOS https://ift.tt/G9cAKH8
Show HN: We missed Winamp, so we built an audio player for macOS https://ift.tt/JHEaPmG May 19, 2026 at 02:20AM
Show HN: Marlin-2B: a tiny VLM to extract structured information from videos https://ift.tt/WbFSG2E
Show HN: Marlin-2B: a tiny VLM to extract structured information from videos https://ift.tt/QPzDZOh May 19, 2026 at 01:06AM
Show HN: InsForge – Open-source Heroku for coding agents https://ift.tt/0KSQ2jZ
Show HN: InsForge – Open-source Heroku for coding agents Hi HN, I'm Hang, cofounder of InsForge (YC P26). InsForge is an open-source Heroku for AI coding agents: a backend platform designed for coding agents to deploy, operate, and debug end-to-end. Open source under Apache 2.0 ( https://ift.tt/LUusSEi ). Quick demo here ( https://youtu.be/7Bax5qz0IfM ). We started InsForge because we just wanted our Claude Code to handle all the backend / infra stuff for us, instead of us jumping between dashboards doing manual config, or copy-pasting logs and docs back to agents. We first tried creating a folder with a bunch of .MD files and installing MCPs like Supabase, Vercel, GitHub, Context7. But soon we found MCPs have their own problems: (a) tools get pre-loaded into context before agents even do anything, (b) bad design: payloads return 10k+ tokens, and (c) a lot of stuff still can’t be done by MCP, e.g. telemetry and configs. So we thought, because coding agents are so good at CLI, why not just put everything in a CLI and create Skills to teach them how to use it? That’s InsForge: with 1 command to install our CLI + Skills, coding agents can run the entire backend platform [1]. We started with authentication and database, but we kept adding more primitives we wanted, so now we have:
- frontend hosting
- backend servers (microVM based) [2]
- database
- auth
- storage
- LLM model router
- cron jobs
- realtime
- edge functions
- vector
We have other features to make coding agents more reliable, like real backend engineers:
- backend branching [3]: agents will 100% mess up, like deleting your database. So, inspired by Neon, we branch the entire backend (DB, auth, storage, functions, schedules). Agents work on the branch, you review diffs and then decide to merge or discard.
- server telemetry: agents can read logs, CPU, memory, disk to find spikes and root causes themselves.
- debug agent [4]: every project gets a dedicated debug agent. So your coding agent can ask questions like “why did the deployment fail?”, and the debug agent will run diagnoses, find the root causes and propose fixes, then send the answer back.
- backend advisor [5]: scans your backend daily for security and performance issues, proposes remediations, and sends them to your coding agent.
Give it a spin on InsForge cloud: https://insforge.dev , or read our code here: https://ift.tt/LUusSEi . We're a small team and read every comment. Tell us what's good, what sucks, what's missing. We love feedback :) [1] https://ift.tt/3R8VatH [2] https://ift.tt/MtroiVH [3] https://ift.tt/S0ZykHb [4] https://ift.tt/vOguPQd [5] https://ift.tt/hTXncBA https://ift.tt/LUusSEi May 18, 2026 at 10:40PM