Tuesday, May 19, 2026
Show HN: Haystack – Review the PRs that need human attention https://ift.tt/p9ubg0v
Hey HN! We're building Haystack ( https://ift.tt/wKJ875U ) to help teams deal with the explosion in the number of pull requests that need to be reviewed due to the rise of coding agents.

Haystack replaces the GitHub PR review system with a queue that triages each PR before a human has to read any diffs. It looks at the diffs, the codebase, and the coding-agent conversation that produced the PR. Haystack then routes it into one of three buckets:

1. Safe to merge. This means the PR has enough evidence behind it that the team can merge it without another human's review. Some examples:
-- A small UI copy change that includes a screenshot showing the final state
-- A backend change where the author clearly tested the important paths and ran the changes in a real environment

2. Needs fixes. This means that the PR has bugs or violates a rule in your codebase and therefore the PR needs to be fixed by the author. Some examples:
-- The agent was asked to make loading a large table faster by adding pagination, but the PR still loads every result at once and "implements" pagination in the UI
-- The PR silently catches an error instead of logging, surfacing, or handling it. This violates the team's "no silent error swallowing" rule

3. Needs human review. This means that the PR could not be sufficiently verified by the author or is touching a sensitive part of the codebase (determined by user-input guidelines) and thus requires human review. Some examples:
-- The PR changes a significant amount of logic in billing
-- The PR changes an important user flow like onboarding, but the author only ran unit tests and never opened the app to check the flow end-to-end. That violates the team's rule that high-impact user-facing changes need manual verification.
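To make the three buckets concrete, here is a minimal sketch of how such a routing decision could look. It is an illustration only, not Haystack's actual implementation: the `PullRequestSignals` fields, the `triage` function, and the order of the checks are all assumptions made for this example, and the post does not say how Haystack derives these signals.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    SAFE_TO_MERGE = "safe to merge"
    NEEDS_FIXES = "needs fixes"
    NEEDS_HUMAN_REVIEW = "needs human review"


@dataclass
class PullRequestSignals:
    # Every field here is hypothetical; the post does not describe a data model.
    rule_violations: list[str]       # e.g. ["no silent error swallowing"]
    goal_met: bool                   # does the diff do what the agent was asked to do?
    touches_sensitive_area: bool     # per team-supplied guidelines, e.g. billing
    high_impact_user_flow: bool      # e.g. onboarding
    manually_verified: bool          # author exercised the change end-to-end
    has_verification_evidence: bool  # screenshots, test runs, script output


def triage(pr: PullRequestSignals) -> Bucket:
    """Route a PR into one of the three buckets (assumed ordering of checks)."""
    # Bugs or rule violations go back to the author before anyone reviews.
    if pr.rule_violations or not pr.goal_met:
        return Bucket.NEEDS_FIXES
    # Sensitive areas always get a human, regardless of evidence.
    if pr.touches_sensitive_area:
        return Bucket.NEEDS_HUMAN_REVIEW
    # High-impact user flows need manual verification, not just unit tests.
    if pr.high_impact_user_flow and not pr.manually_verified:
        return Bucket.NEEDS_HUMAN_REVIEW
    # Without any evidence that the change works, a human has to look.
    if not pr.has_verification_evidence:
        return Bucket.NEEDS_HUMAN_REVIEW
    return Bucket.SAFE_TO_MERGE
```

Under these assumptions, a UI copy change with a screenshot and no violations lands in "safe to merge", while an onboarding change backed only by unit tests lands in "needs human review".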
Instead of starting with line-by-line diffs, Haystack immediately tells the reviewer the goal behind the PR, what design decisions the author made (informed by their coding-agent conversation), and how much the author did to verify that the pull request works (e.g. ran scripts, checked the frontend, etc.). In this way, review shifts from "what changed?" to "is this the right behavior and is there evidence that it works?"

Here's a quick demo: https://ift.tt/0hK4Pra...

We previously launched Haystack as a tool for understanding large PRs ( https://ift.tt/2XcURvi ). As many of you can probably relate to, the release of Opus 4.5 completely shattered our conception of how fast an engineer could craft a PR. And as coding agents kept improving after 4.5, we realized that reviewing pull requests did not scale with our coding velocity. With each member of our team able to pump out more than 20 pull requests a day, code review quickly became cognitively exhausting and less helpful. After talking with other folks, we learned that many feel similarly and currently face a binary choice: either skip review entirely or try to keep up with a fire hose of pull requests.

Haystack is our attempt at a third path. We still believe in code review, but as coding agents produce more code, human reviewer attention becomes more valuable and more expensive. Haystack helps teams spend that attention on the PRs where a human can meaningfully change the outcome. And for such PRs, Haystack shows the reviewer what the PR intended to do, whether the author showed that it works, and what design decisions need a second pair of eyes.

We're still quite early and are figuring out whether Haystack truly makes code review better. We would love any and all feedback!

https://ift.tt/wKJ875U
May 19, 2026 at 12:44AM