The performance of an Autoindex workflow is slow

When setting up an Autoindex workflow, several factors are decisive for performance. Each of them is explained below:

  1. Type of external data source:
    When choosing an external data source, keep in mind that tables in a well-configured database are significantly faster to query than a file on the file system. If you choose a file connection, store the file on the same machine as the DocuWare server (if possible) to avoid network delays.
  2. Choosing the iterator:
    The iterator option is inconspicuous but very important and can improve performance significantly. The iterator should always be the source (database, file, or archive, depending on your choice of external source) that contains fewer entries. This does not have to be the external data source, because a filtered archive may contain fewer entries.
  3. Insert a filter:
    Make sure that at least one filter is set for the archive in the configuration. Only in very few cases do all documents have to be indexed again. Ideally, a filter is used for both the external data source and the archive. The filter on the external data source does not necessarily have to be set in the Autoindex configuration itself.
  4. Set a database index on the match code field:
    Setting a database index on the match code field of the file cabinet table can improve performance further. Do not skip this step even when a database is used as the external source. The necessary indexes can be identified with appropriate scripts or by simple trial and error, and then created.
  5. Optimizing the database:
    A well-designed and well-tuned database or database server benefits not only the entire DocuWare system but also the Autoindex workflow.
  6. Sufficient system resources:
    During the first runs of the Autoindex workflows, monitor the system load so that a lack of resources is recognized immediately. In particular, observe the processes of the Content Server and the Workflow Server. From these observations you can derive measures such as adding RAM or scheduling the jobs outside the main access times.
  7. Jobs scheduled at the same time:
    Make sure that your Autoindex workflows are not all scheduled at the same time. Ideally, schedule them 15 minutes apart from each other.
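
To illustrate why the database index from point 4 matters, the following minimal Python sketch uses an in-memory SQLite database as a stand-in for the database behind a file cabinet. It only shows the general effect of an index on a match code lookup. The table name `file_cabinet`, the column `invoice_no` (playing the role of the match code field), and the index name are hypothetical; real DocuWare table and column names differ per installation, and the database system in use has its own syntax and tools.

```python
import sqlite3

# Assumption: SQLite stands in for the real database server.
# Table, column, and index names are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_cabinet ("
    "docid INTEGER PRIMARY KEY, invoice_no TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO file_cabinet (invoice_no, status) VALUES (?, ?)",
    ((f"INV-{i:06d}", "new") for i in range(10000)),
)


def query_plan(connection, sql, params):
    """Return SQLite's query plan for the statement as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# The kind of lookup an indexing job repeats for every record it matches.
lookup = "SELECT docid FROM file_cabinet WHERE invoice_no = ?"

# Without an index, the database has to scan the whole table.
plan_before = query_plan(conn, lookup, ("INV-004242",))

# Create the index on the (hypothetical) match code column.
conn.execute(
    "CREATE INDEX idx_file_cabinet_invoice_no ON file_cabinet (invoice_no)"
)

# With the index, the database can search for the value directly.
plan_after = query_plan(conn, lookup, ("INV-004242",))

print("before:", plan_before)
print("after: ", plan_after)
```

On a production system, the comparable check is the execution plan of the database server actually in use. If the plan for a lookup on the match code field still shows a full table scan, an index on that field is a likely candidate.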