Posted Wed, 23 Oct 2019 20:50:59 GMT by Larry Stover President
How do I get full text to index all of the pages in a document, not just the first 100? The first 100 pages are indexed in all the docs. Starting at page 101 there is no full text for any doc in the file cabinet. Using DW 6.12.0.766.
Posted Wed, 23 Oct 2019 21:43:59 GMT by Josef Zayats

Larry,

There is an obscure FC setting (very few of us have ever heard of it) in DocuWare Configuration - File Cabinets - the File Cabinet - General - More Options - Configure fulltext search - Indexed Pages per file. It is set to a default of 100 pages.

The explanation of this setting is at the following link. I think you need to re-run FT indexing after making the change, so existing documents get processed.

http://help.docuware.com/en/#b64082t62382n88325

Posted Thu, 24 Oct 2019 12:59:55 GMT by Joe Kaufman Bell Laboratories Inc Application Development Manager
Josef,

I am a little confused by the 100-page limit... Our setup looks to be at the default 100 "shots", which the help states is equivalent to 100 pages (one shot per page). But I just ran a full-text search for a very specific number in our Production file cabinet (it is on a sticker they put on a batch sheet, and DocuWare does a good job of OCRing them), and it came right up with the single document in the cabinet, moving me straight to the page where the sticker was -- page 205 out of 212.

How am I finding text on page 205 if the text shots are stopping after 100 pages? Or is the 100-page limit only for an initial shot-take? Does it run a longer full-text analysis later and get everything indexed?

I never knew about this, but am now concerned, since we have full-text turned on for several file cabinets where the documents are over 100 pages...

Thanks,
Joe Kaufman
Posted Thu, 24 Oct 2019 14:45:02 GMT by Simon H. Hellmann Toshiba TGIS GmbH | IT-Consultant - Document Management Solutions
Hi Joe,

long time no see!

As far as I know, the fulltext indexes both the beginning and the end of the document - at least that is what the colleagues from DocuWare told me in Munich two years ago (prior to 6.12 release). So the default setting of 100 would index the first 100 and the last 100 pages of your document.

In your example, your fulltext search would find anything on pages 1-100 and on pages 113-212, but pages 101-112 would not be searchable.
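To make that arithmetic easy to check for other document lengths, here is a minimal Python sketch. It only models the "first N and last N pages" behaviour described above, which is second-hand information and not confirmed DocuWare behaviour. The function names are made up for illustration and nothing here calls DocuWare.

```python
def indexed_ranges(total_pages, limit=100):
    """Page ranges (1-based, inclusive) covered IF the first `limit` and
    the last `limit` pages are indexed. This is an assumption, not
    documented DocuWare behaviour."""
    if total_pages <= 2 * limit:
        # The two windows touch or overlap, so every page is covered.
        return [(1, total_pages)]
    return [(1, limit), (total_pages - limit + 1, total_pages)]


def unindexed_range(total_pages, limit=100):
    """The gap between the two windows, or None if there is no gap."""
    if total_pages <= 2 * limit:
        return None
    return (limit + 1, total_pages - limit)


def is_searchable(page, total_pages, limit=100):
    """True if `page` falls inside one of the assumed indexed windows."""
    return any(lo <= page <= hi for lo, hi in indexed_ranges(total_pages, limit))


indexed_ranges(212)      # [(1, 100), (113, 212)]
unindexed_range(212)     # (101, 112)
is_searchable(205, 212)  # True
```

Under this assumption, a hit on page 205 of 212 is expected, while a page in the gap would not be found.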

Greetings from Germany,
Simon H. Hellmann
DocuWare System Consultant
Posted Thu, 24 Oct 2019 16:14:44 GMT by Joe Kaufman Bell Laboratories Inc Application Development Manager
Simon,

I did some further testing...

I found a file with 274 pages and picked one of those sticker-based numbers, slightly atilt, on page 121. Should be in between the first 100 and last 100 pages. I did a search while viewing the document, and it found the number on page 121.

Then I thought that perhaps searching for text while viewing forces DocuWare to do a quick full-text scan of ALL pages. So, I found another file with 240 pages, and just viewed a number to search for on page 103. Went back and reset the search, then searched for the number in the "Fulltext" field. It found the document and went straight to page 103 in the viewer.

Unless DocuWare immediately full-text scans any file that gets viewed, it looks like I have full coverage of all pages, not just the first and last 100, even though our default max. number of shots is still at 100. This is good news (for me), but I can't explain why other folks are stuck at the 100-page limit. I am on DW 6.11, on-premise.

Thanks,
Joe Kaufman

 
Posted Thu, 24 Oct 2019 17:31:55 GMT by Josef Zayats
Joe,
A real test would be to look up the term you want to search for before you store the document in DocuWare, using Acrobat Viewer. Then store the document. Once it is stored and the system has had sufficient time to run OCR and FT indexing on it, perform the fulltext search for that term before accessing the document in any other way.
I was not aware of the 100-page limit until recently, when a customer complained that fulltext searches were not finding everything there was to be found. (Lifting the limit to its max resolved this.)
Your testing is tainted by accessing the document prior to performing the FT search - DocuWare OCRs and FT-indexes all the pages when the document is accessed.
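For picking test terms without opening the document in DocuWare, something like the following Python sketch could help. It assumes you have already extracted the text of each page from the source PDF with whatever tool you prefer (the extraction step is not shown), and it simply lists terms that occur on exactly one page beyond the limit. The function names and the `min_length` cutoff are made up for illustration.

```python
import re
from collections import defaultdict


def unique_terms_by_page(page_texts, min_length=6):
    """Map 1-based page number -> sorted terms found on that page only.

    `page_texts` is a list of strings, one per page, extracted from the
    file BEFORE it is stored (extraction itself is not part of this sketch).
    """
    pages_for_term = defaultdict(set)
    for page_no, text in enumerate(page_texts, start=1):
        for term in re.findall(r"\w+", text.lower()):
            if len(term) >= min_length:
                pages_for_term[term].add(page_no)

    result = defaultdict(list)
    for term, pages in pages_for_term.items():
        if len(pages) == 1:
            result[next(iter(pages))].append(term)
    return {page: sorted(terms) for page, terms in result.items()}


def candidate_terms_beyond(page_texts, limit=100, min_length=6):
    """Only the pages past `limit`, i.e. the ones the default setting might miss."""
    unique = unique_terms_by_page(page_texts, min_length)
    return {page: terms for page, terms in unique.items() if page > limit}
```

Searching the Fulltext field for one of the returned terms, before the document has ever been opened in the viewer, then shows whether that page was really indexed.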
 
Posted Thu, 24 Oct 2019 17:38:18 GMT by Joe Kaufman Bell Laboratories Inc Application Development Manager
Josef,

OK, so viewing the document skews the results, which is what I was afraid of...

So, if I up the number of pages for all full-text indexed cabinets, I have the following questions:
 
  1. How high should (can) I go? Can I go to 999999, for example?
  2. How do I trigger a rescan of the text shots after I up the limit?
  3. Is this going to bloat textshot data in the database and cause me storage issues? (And I mean beyond the obvious growth due to documents with more than 200 pages having more pages getting processed.)

Looks like I need to fix up a bunch of file cabinet configurations!

Thanks,
Joe Kaufman
Posted Thu, 24 Oct 2019 20:01:58 GMT by Joe Kaufman Bell Laboratories Inc Application Development Manager

I see now the max number of shots is 10,000, so that is what I am using.

As far as resetting the full-text information, I click the "Reset" button and just use the defaults on the screen that comes up. It seems to be reindexing a lot of documents, so I think it is re-processing things appropriately.

Thanks,

Joe Kaufman
