David,
I think we come at things from a very different perspective here. Our installation isn't something assembled by end users and resellers/consultants who, no offense, may or may not have a full understanding of proper data modelling. That is not the case for us. We have developed document types that are fairly well normalized. We do not mash a lot of different document types into one file cabinet and then separate them by a "document type" index. To me, that defeats the whole purpose of separate file cabinets and breaks almost all rules of proper normalization. If you need to add a "document type" index to your file cabinet, as Steve Jobs might say, "You're doing it wrong."
There is another reason for not mashing unalike documents into one file cabinet -- parent/child relationships. What if one document type occurs multiple times as it links to a parent document type and you want to make a linkage for it? You can't do that if everything is all in one file cabinet (at least I don't think you can, and if you could it would get real ugly, real fast). That's just another reason for normalization.
Whether or not a document is "unalike" another is, of course, the whole question here. It is the foundation of all data modelling, deciding what should be considered a separate data entity. If two documents share 90% of indexes, then yeah, perhaps we do some sparse indexing on that and have holes in some documents and different holes in others. In such cases we would not use a generic trigger. But if I have a file cabinet that is as specific as "HR - Wage Tax Register" and those documents are unique, then the workflows I build against that file cabinet are going to be very simple because the documents are already inherently distinct.
As far as how documents get into the system, that is highly regimented for us as well: we use scan sheets and tend to do almost everything "after the fact". We don't scan in a purchase order as soon as it is printed and then later attach the AP invoice and check to it. The invoice and PO data all go in together, and then the check goes in later and links back to the AP invoice via automated processes written behind the scenes. I see all the sales videos for DocuWare where people scan in a sales order, then the pick tickets, then the packing list, then the BOL, then the invoice, then the payment record -- and I have no idea how folks can trust that they have all the data they need in the right spot. If something doesn't index properly, who verifies that? If someone goes in six months later and cannot find the invoice, how is that handled?
We wait until the sales order is closed, slap a scan sheet on all materials for the order (sometimes 100+ pages) and scan that into our sales order/AR file cabinet. Then, our folks go page by page and make sure everything scanned in correctly. It doesn't stop there. We then run a homegrown application that matches indexes in DocuWare against the systems the documents came from, both to find documents with improper indexes and to detect when something in our systems has no scanned materials on record in DocuWare. Consider that Auto-Index on steroids, since it can work against our Foxpro data (Auto-Index cannot work for that -- I know it cannot because there is no reliable ODBC driver for the latest Foxpro DBF file format).
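A reconciliation pass of this kind can be sketched roughly as follows. This is only an illustration in Python, not the actual homegrown application: the `Record` fields, the order numbers, and the `reconcile` function are all hypothetical, and the two lists simply stand in for rows pulled from the source system and index values exported from DocuWare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical index fields; a real file cabinet would have its own.
    order_no: str
    customer: str


def reconcile(source_rows, archive_rows):
    """Compare source-system records to archived document indexes by order number."""
    source = {r.order_no: r for r in source_rows}
    archive = {r.order_no: r for r in archive_rows}

    # Orders in the source system with no scanned document on record.
    missing = sorted(source.keys() - archive.keys())
    # Archived documents whose key matches nothing in the source system
    # (often a sign of a mis-keyed index).
    orphaned = sorted(archive.keys() - source.keys())
    # Key matches, but some other index value disagrees.
    mismatched = sorted(
        k for k in source.keys() & archive.keys() if source[k] != archive[k]
    )
    return missing, orphaned, mismatched


# Made-up sample data standing in for the two systems.
source_rows = [
    Record("SO-1001", "ACME"),
    Record("SO-1002", "GLOBEX"),
    Record("SO-1003", "INITECH"),
]
archive_rows = [
    Record("SO-1001", "ACME"),
    Record("SO-1002", "GLOBX"),    # customer index typed incorrectly
    Record("SO-1030", "INITECH"),  # order number typed incorrectly
]

missing, orphaned, mismatched = reconcile(source_rows, archive_rows)
print("No scan on record:", missing)
print("No matching source record:", orphaned)
print("Index values disagree:", mismatched)
```

Keying both sides on a single business identifier keeps the comparison to plain set operations. Note that one mis-keyed order number shows up twice, once as a missing scan and once as an orphaned document, which is usually a useful hint when tracking down the error.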
I know that all sounds burdensome (and it is), but it has been in place for years (in Fortis before DocuWare) and at this point I cannot fathom trusting any other methodology to know for sure that we have documentation for all of our important processes readable and stored. If we didn't do that? If we used DW printer, and Connect to Outlook, and intelligent indexing and just threw documents all over the place into DocuWare? I can't say I would trust that because our users don't check logs etc. and they ignore error warnings (or just don't see them). We had TWO MONTHS where our production folks were scanning in documents and it was erroring out every time because their destination was set to "Inbox" instead of "Auto Identification" and they didn't have an Inbox. Those documents never got into DocuWare because they were not checking that the documents made it like they were supposed to. We were just lucky we still had the hard copies back that far (and I have since given them an Inbox and told them they need to check). They had not been checking the scan log and left the scan unattended, so they missed the popup stating the job had failed (might be nice to show a counter with failed jobs over the little button that displays history – not that they would have noticed that, either…).
Anyway, that is a very long-winded way of saying I think we implement things a bit differently than most people because we have always done things from an on-site, custom programming mindset. Our file cabinets are all carefully designed and normalized to the best of our business-process knowledge. We still have some cabinets we set up a bit funkily, but it all works. And we can always set up a DocID > 0 trigger to get all new documents, so it is no big deal. I just want to make sure folks understand there is definitely a reasonable rationale behind the desire for a generic "everything" trigger, and it doesn't complicate workflows because we have already designed those complications away via (what I consider) proper file cabinet structuring.
Thanks,
Joe Kaufman