Question:
How does the Fulltext processing work?
Answer:
After a document is stored in a file cabinet with Fulltext support enabled, Fulltext indexing for this document begins immediately.
OCR and Fulltext indexing tasks are generated in the dwsystem.dbo.DWTASKS table with TASK_TYPEs 0 and 2 respectively, and are then processed by the Background Process Service.
The Fulltext server itself is responsible only for the Fulltext search and performs no Fulltext indexing.
However, it must be running to receive Fulltext data during Fulltext indexing.
Overview of Fulltext TASK_TYPEs in the dwsystem.dbo.DWTASKS table:
TASK_TYPE 0: OCR
TASK_TYPE 1: Intelligent Indexing textshot creation
TASK_TYPE 2: Fulltext index is updated in the DB and exported to the SOLR core so the Fulltext server has access to it
TASK_TYPE 3: SOLR cleanup after document deletion
TASK_TYPE 4: Upgrade task from an older DocuWare version (no longer in use since DocuWare 7; delete this task manually if you come across it)
TASK_TYPE 5: Fulltext reset task (check the settings column for the current status; documents are processed from highest to lowest dwdocid)
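To get a quick picture of what is waiting in the task table, the TASK_TYPE overview above can be turned into a small lookup. The following Python sketch is only an illustration: the query uses nothing but the table and column named in this article, the helper summarize_tasks and the sample rows are hypothetical, and how you run the query against your database is left to you.

```python
from collections import Counter

# Meanings as listed in the TASK_TYPE overview of this article.
TASK_TYPES = {
    0: "OCR",
    1: "Intelligent Indexing textshot creation",
    2: "Fulltext index update and export to SOLR",
    3: "SOLR cleanup after document deletion",
    4: "Upgrade task (no longer in use since DocuWare 7)",
    5: "Fulltext reset",
}

# Read-only query; only DWTASKS and TASK_TYPE come from the article,
# the alias task_count is chosen here for illustration.
TASK_COUNT_SQL = (
    "SELECT TASK_TYPE, COUNT(*) AS task_count "
    "FROM dwsystem.dbo.DWTASKS GROUP BY TASK_TYPE"
)


def summarize_tasks(rows):
    """Turn (TASK_TYPE, count) rows into a {description: count} dict."""
    summary = Counter()
    for task_type, count in rows:
        label = TASK_TYPES.get(task_type, f"Unknown TASK_TYPE {task_type}")
        summary[label] += count
    return dict(summary)


# Hypothetical result rows, not real data.
sample_rows = [(0, 12), (2, 40), (4, 1)]
for label, count in summarize_tasks(sample_rows).items():
    print(f"{count:5d}  {label}")
```

A TASK_TYPE 4 entry in such a summary would be a hint that a leftover upgrade task is present and can be deleted manually.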
Fulltext data is written into the _PAG and _PGT tables of a file cabinet.
This data can be safely deleted if necessary, as long as you remove it from both _PGT and _PAG.
You will have to do a Fulltext reset if you want to generate your Fulltext index data again afterwards.
Unlike in DocuWare 6, the _SEC table does NOT contain any Fulltext data! NEVER remove rows of existing documents from the _SEC table!
Once a document is fully indexed on the database side, the Fulltext index is transferred to the SOLR core.
These cores can be found in the Fulltext index storage location defined in your Fulltext connection (configured in the admin tool -> data connections).
The cores are named after the GUID of a file cabinet. This GUID can be checked in the file cabinet configuration part of the web configuration under general -> more options.
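If you need to locate the core of a specific file cabinet on disk, a lookup by GUID can be sketched as follows. This is an illustration under assumptions: it presumes each core is a directory directly inside the storage location whose name equals the GUID (compared case-insensitively, with or without braces), and the function name find_core_dir is hypothetical. Verify the actual folder layout on your system before relying on it.

```python
from pathlib import Path


def find_core_dir(storage_location, cabinet_guid):
    """Return the directory named after the GUID, or None if not found.

    Assumption: cores sit directly below the storage location.
    """
    wanted = cabinet_guid.strip("{}").lower()
    for entry in Path(storage_location).iterdir():
        # Compare names case-insensitively and ignore optional braces.
        if entry.is_dir() and entry.name.strip("{}").lower() == wanted:
            return entry
    return None
```

Call it with the storage location from your Fulltext connection and the GUID shown in the file cabinet configuration. A return value of None means no matching directory exists under the assumed layout.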
Overview of STATUS values in the _PGT table:
0 = New
1 = Textshot successfully created
2 = Error during textshot creation
3 = Textshot successfully transferred to SOLR
4 = Error during transfer to SOLR
You will be able to find a document via the Fulltext search only after its Fulltext data has been completely transferred to SOLR.
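The STATUS values above can be expressed as a small enumeration to make checks more readable. The sketch below is illustrative only: the member names and both helper functions are hypothetical, and it assumes you have already read the STATUS values of one document's _PGT rows into a list.

```python
from enum import IntEnum


class PgtStatus(IntEnum):
    """STATUS values of the _PGT table as listed in this article."""
    NEW = 0
    TEXTSHOT_CREATED = 1
    TEXTSHOT_ERROR = 2
    TRANSFERRED_TO_SOLR = 3
    SOLR_TRANSFER_ERROR = 4


def is_searchable(statuses):
    """True only if there is at least one row and every STATUS is 3."""
    statuses = list(statuses)
    return bool(statuses) and all(
        s == PgtStatus.TRANSFERRED_TO_SOLR for s in statuses
    )


def has_error(statuses):
    """True if any row reports a textshot or SOLR transfer error."""
    errors = (PgtStatus.TEXTSHOT_ERROR, PgtStatus.SOLR_TRANSFER_ERROR)
    return any(s in errors for s in statuses)


# Hypothetical STATUS values for the rows of one document.
print(is_searchable([3, 3, 3]))
print(is_searchable([3, 1]))
print(has_error([3, 4]))
```

A document whose rows still show 0 or 1 is simply not finished yet, while 2 or 4 indicates an error that will keep it out of the Fulltext search until it is resolved.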
See also:
Fulltext and SOLR: KBA-35311
Check Fulltext textshot of a document: KBA-34944
This article is valid for DocuWare versions: 7 | fulltext #FAQID_3827