Hey all AI geniuses, I have hundreds of documents in PDF, PPTX, Word… which I want to “chat” with as a whole. I’m looking for a tool where I can upload them all and then ask the AI questions about all my documents at once. A few requirements:

  • It has to check all documents, and not stop after finding an answer in a single document. Indeed, many documents will answer the question.

  • It has to cite, within the answer, exactly where each element of the answer was found: the document name, but also the specific page of the document, with a link to it. Some documents have 400 pages, so giving just the name is not useful enough.

Example: “What do my documents say about treatment of myocardial infarction in renal failure patients?” should output something like “[Roberts.pdf pages 1, 5, 9] says that myocardial infarction treatment should follow these rules: ABC. [Michael.pdf pages 4, 5] adds that in case of associated acute kidney failure, treatment doses should be adjusted.” Ideally, it would even highlight the passage within the document, as some tools do. Alternatively, it would give a link to the exact page of the document.

So far, I tried:

  • Odin AI: doesn’t cite within the answer, but just gives [1], [2], … at the end of the compiled answer. Moreover, the links to documents don’t work for some reason.
  • TextCortex: the best option I found so far, but the links are buggy and it doesn’t give the page.
  • ChatPDF: doesn’t really allow for a knowledge base.
  • : only uses keywords for search, no chat / reasoning.

What you want to achieve is still a daydream. It is certainly feasible, and the tools to build it already exist, but the paid apps that offer it have not been invented yet. OpenAI’s API limitations probably make it uneconomical to do something like this on large volumes of data, in addition to the technical limitations you have already mentioned.
There are probably ad hoc apps, but only for industrial purposes on some private intranet, not available to the general public.
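
To give an idea of why the page-level citation part is feasible, here is a rough Python sketch of just the retrieval step. It assumes the text has already been extracted page by page (that extraction is not shown), and it uses plain keyword scoring, not an LLM, so it does no reasoning. The names `build_index`, `search`, `cite`, the sample pages and the base URL are all made up for illustration. The `#page=N` link suffix is the usual PDF open parameter, which most browser viewers honor, but check it against your own viewer.

```python
import math
import re
from collections import Counter, defaultdict
from urllib.parse import quote


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Index (doc_name, page_number, text) tuples, one entry per page."""
    index = []
    doc_freq = Counter()  # number of pages each term appears on
    for doc, page, text in pages:
        term_freq = Counter(tokenize(text))
        index.append((doc, page, term_freq))
        doc_freq.update(term_freq.keys())
    return index, doc_freq


def search(query, index, doc_freq):
    """Scan every page of every document; return {doc: [matching pages]}."""
    n_pages = len(index)
    terms = tokenize(query)
    hits = defaultdict(list)
    for doc, page, term_freq in index:
        # Simple TF-IDF style score; a term found on every page scores 0.
        score = sum(
            term_freq[t] * math.log((1 + n_pages) / (1 + doc_freq[t]))
            for t in terms
            if t in term_freq
        )
        if score > 0:
            hits[doc].append(page)
    return {doc: sorted(found) for doc, found in hits.items()}


def cite(hits, base_url):
    """Turn hits into labels like '[Roberts.pdf pages 1, 5]' plus page links."""
    citations = []
    for doc in sorted(hits):
        pages = hits[doc]
        label = f"[{doc} pages {', '.join(str(p) for p in pages)}]"
        links = [f"{base_url.rstrip('/')}/{quote(doc)}#page={p}" for p in pages]
        citations.append({"label": label, "links": links})
    return citations


# Hypothetical sample data, standing in for real per-page extracted text.
PAGES = [
    ("Roberts.pdf", 1, "Myocardial infarction treatment follows rules ABC."),
    ("Roberts.pdf", 2, "Unrelated chapter on dermatology."),
    ("Michael.pdf", 4, "In acute kidney failure, adjust the dose of infarction treatment."),
    ("Michael.pdf", 7, "Appendix and references."),
]

if __name__ == "__main__":
    index, doc_freq = build_index(PAGES)
    hits = search("infarction treatment", index, doc_freq)
    for citation in cite(hits, "https://example.invalid/docs"):
        print(citation["label"], citation["links"])
```

Because every page of every document gets scored, the search does not stop at the first matching document, and because each hit carries its page number, the citation can point at the page and not just the file. A real tool would swap the keyword scoring for embeddings and pass the matching pages to a language model to write the answer.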

