
[Bug]: QueryFusionRetriever._aretrieve blocks the event loop during query generation #21159

@gautamvarmadatla

Description


Bug Description

_aretrieve() calls the synchronous _get_queries(), which blocks on self._llm.complete() instead of awaiting an async equivalent. When num_queries > 1 (the default), this blocks the current event loop during query generation and prevents other coroutines on that same loop from making progress until query expansion finishes.
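The blocking behavior described here is general to any synchronous call made directly inside a coroutine. A minimal standalone sketch (no LlamaIndex required; the `time.sleep` stands in for the synchronous `self._llm.complete()` HTTP call) shows the same effect:

```python
import asyncio
import time

async def blocking_query_generation():
    # Simulates a synchronous LLM call (like self._llm.complete())
    # made directly inside a coroutine: it never yields to the loop.
    time.sleep(0.5)

async def other_work():
    await asyncio.sleep(0.1)

async def main():
    task = asyncio.ensure_future(other_work())
    await blocking_query_generation()
    # other_work had 0.5s of wall time to finish its 0.1s sleep,
    # but was never scheduled: the loop was blocked the whole time.
    return task.done()

print(asyncio.run(main()))  # → False
```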

Version

0.14.19

Steps to Reproduce

import asyncio
from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle, TextNode
from llama_index.llms.openai import OpenAI

class MockRetriever(BaseRetriever):
    def _retrieve(self, query_bundle):
        return [NodeWithScore(node=TextNode(text="Hi I am Gautam!"), score=1.0)]

retriever = QueryFusionRetriever(
    retrievers=[MockRetriever()],
    llm=OpenAI(model="gpt-5"),  # requires a valid OPENAI_API_KEY
    num_queries=4,
)

async def other_work():
    await asyncio.sleep(0.1)
    print("other work ran")

async def main():
    task = asyncio.ensure_future(other_work())
    await retriever.aretrieve("What is the capital of France?")
    print(f"other_work done: {task.done()}")

asyncio.run(main())

Relevant Logs/Tracebacks

other_work done: False

The task had the entire duration of a real OpenAI HTTP call to complete its 0.1s sleep, yet it was never scheduled, because the event loop was blocked for that whole time.
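Until `_aretrieve` awaits an async completion internally, one possible caller-side workaround (not from the issue, just a sketch) is to offload the fully synchronous `retrieve()` onto a worker thread with `asyncio.to_thread`, so the event loop stays free. The stand-in below uses `time.sleep` in place of the real retriever's blocking query expansion:

```python
import asyncio
import time

def sync_retrieve(query: str) -> list[str]:
    # Stand-in for retriever.retrieve(): a blocking call
    # (query expansion hitting the LLM synchronously).
    time.sleep(0.5)
    return [f"result for {query!r}"]

async def other_work():
    await asyncio.sleep(0.1)
    return "other work ran"

async def main():
    task = asyncio.ensure_future(other_work())
    # Offload the blocking call to a worker thread so the
    # event loop can keep scheduling other coroutines.
    results = await asyncio.to_thread(sync_retrieve, "capital of France")
    return task.done(), results

done, results = asyncio.run(main())
print(done)  # → True
```

With the loop unblocked, `other_work` completes its 0.1s sleep well before the 0.5s "retrieval" finishes, so `task.done()` is `True`.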

Metadata

Labels

bug (Something isn't working), triage (Issue needs to be triaged/prioritized)
