
Friday, January 24, 2025

Show HN: WebMarker – Mark web pages for use with vision-language models https://ift.tt/TrsJxIM

WebMarker is a JavaScript library for adding visual markers and labels to elements on a web page. These markers can be used for Set-of-Mark prompting, which improves the visual grounding abilities of vision-language models such as GPT-4o, Claude 3.5, and Google Gemini 1.5.

This library aims to:

- Improve LLM performance on vision tasks referencing web pages
- Enable reliable web page interactions based on LLM responses

https://ift.tt/fepzAsV

January 25, 2025 at 12:59AM
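To make the Set-of-Mark idea concrete, here is a minimal sketch of the labeling and lookup steps in plain JavaScript. It is not WebMarker's API: the function names `assignMarks` and `resolveMark`, the input shape, and the sample selectors are all hypothetical, chosen only for illustration. The sketch assumes each element's rectangle has already been measured (in a browser, for example, with `getBoundingClientRect()`).

```javascript
// Hypothetical sketch of Set-of-Mark labeling; NOT WebMarker's actual API.
// In a browser, each rect could come from element.getBoundingClientRect().

// Assign a numeric label to every non-empty element, in reading order.
function assignMarks(elements) {
  return elements
    .filter((el) => el.rect.width > 0 && el.rect.height > 0) // skip zero-size elements
    .sort((a, b) => a.rect.y - b.rect.y || a.rect.x - b.rect.x) // top-to-bottom, then left-to-right
    .map((el, index) => ({
      label: String(index),
      selector: el.selector,
      // Place the label at the element's top-left corner.
      labelPosition: { x: el.rect.x, y: el.rect.y },
    }));
}

// Map a label mentioned in a model response back to the marked element.
function resolveMark(marks, label) {
  return marks.find((mark) => mark.label === String(label)) ?? null;
}

// Sample data (made up for this example).
const marks = assignMarks([
  { selector: "#search", rect: { x: 200, y: 10, width: 120, height: 30 } },
  { selector: "#logo", rect: { x: 10, y: 10, width: 80, height: 30 } },
  { selector: "#hidden", rect: { x: 0, y: 0, width: 0, height: 0 } },
  { selector: "#submit", rect: { x: 10, y: 60, width: 90, height: 30 } },
]);

// Suppose the model answers "click element 1".
const target = resolveMark(marks, "1");
console.log(target.selector); // prints "#search"
```

The point of the label table is that the model only has to name a short label it can see in the screenshot, and the page-side code turns that label back into a specific element, which is what makes interactions based on LLM responses dependable.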
