Some UIs (Like Open Web UI) have built in “agents” or extensions that can fetch and parse search results as part of the context, allowing LLMs to “research.” There are in fact some finetunes specializing in this, though these days you are probably best off with regular Qwen3.
This is sometimes called tool use.
I also (sometimes) use a custom python script (modified from another repo) for research, getting the LLM to search a bunch of stuff and work through it.
But fundamentally the LLM isn’t “searching” anything, you are just programmatically feeding it text (and maybe fetching its own requests for search terms).
The backend for all this is a TabbyAPI server, with 2-4 parallel slots for fast processing.
The front end.
Some UIs (Like Open Web UI) have built in “agents” or extensions that can fetch and parse search results as part of the context, allowing LLMs to “research.” There are in fact some finetunes specializing in this, though these days you are probably best off with regular Qwen3.
This is sometimes called tool use.
I also (sometimes) use a custom python script (modified from another repo) for research, getting the LLM to search a bunch of stuff and work through it.
But fundamentally the LLM isn’t “searching” anything, you are just programmatically feeding it text (and maybe fetching its own requests for search terms).
The backend for all this is a TabbyAPI server, with 2-4 parallel slots for fast processing.