Posted Thu, 18 Jan 2018 16:09:12 GMT by Seth Jaco Support Specialist

I have a question about migrating PDF files from a different document management program. They will be exporting the PDFs along with a .CSV file that contains all the data for the previously stored files. I know that I can create an import configuration that looks at the metadata. Would that work with a .CSV file, and would there be any limit on how many files it could handle? So if they had 1k PDFs and a single .CSV file that contained all the data, would that be OK?

 

Thanks

Posted Thu, 18 Jan 2018 21:48:08 GMT by Chris McFarland Sr. Document Workflow Specialist

So cool that you asked this question. I have no help to offer, but I'm eager to see the response. Files plus a CSV is pretty much THE standard for transporting document images and metadata (no secret), and I, too, am curious how best to configure DocuWare to get this done.

Posted Fri, 19 Jan 2018 07:52:20 GMT by David Barlet Technical Manager

Dear all,

Yes, you can do that with ASP.NET and DocuWare. You can use the easy upload for huge files: http://help.docuware.com/sdk/platform/html/c3431d10-b73e-4b98-814a-a0d23...

In my experience there is no limit (or it is very, very high). I have used this approach often, including once for 150,000 documents.

Best regards,

David

Posted Thu, 25 Jan 2018 08:22:13 GMT by Tobias Getz DocuWare GmbH Team Leader Product Management

An idea that avoids writing a single line of code would be to use DocuWare Import to store the files in DocuWare and Autoindex to index them. You need to have a unique link between the PDF files and the data record in the CSV file.

With DocuWare Import: The only index entry would be the file name (or maybe the directory name - depends on the exported file structure).

Afterwards you can use Autoindex to add the index entries from your CSV file to the documents in the file cabinet.
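
Before running Import and Autoindex, it may help to confirm that the "unique link" really holds, i.e. that every exported PDF has exactly one CSV row and every row has a file. The following is only a minimal sketch in C#. It assumes a header row, plain comma-separated values with no quoted commas, and a column holding the PDF file name. The class name, the `LinkReport` type, and the `FileName` column are hypothetical and not part of DocuWare.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvLinkCheck
{
    // Result of comparing CSV rows against the exported PDF files.
    public sealed class LinkReport
    {
        public List<string> FilesWithoutRow { get; } = new List<string>();
        public List<string> RowsWithoutFile { get; } = new List<string>();
        public List<string> DuplicateKeys { get; } = new List<string>();
    }

    // csvLines: all lines of the CSV, header first.
    // keyColumn: hypothetical name of the column that holds the PDF file name.
    // pdfFileNames: file names (or full paths) found in the export folder.
    public static LinkReport Check(IEnumerable<string> csvLines, string keyColumn, IEnumerable<string> pdfFileNames)
    {
        var lines = csvLines.Where(l => !string.IsNullOrWhiteSpace(l)).ToList();
        if (lines.Count == 0) throw new ArgumentException("CSV is empty.", nameof(csvLines));

        // Assumption: plain comma-separated values with no quoted commas.
        string[] header = lines[0].Split(',').Select(h => h.Trim()).ToArray();
        int keyIndex = Array.FindIndex(header, h => string.Equals(h, keyColumn, StringComparison.OrdinalIgnoreCase));
        if (keyIndex < 0) throw new ArgumentException("Column not found: " + keyColumn, nameof(keyColumn));

        var report = new LinkReport();
        var keys = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines.Skip(1))
        {
            string[] cells = line.Split(',');
            string key = keyIndex < cells.Length ? cells[keyIndex].Trim() : "";
            // A key seen twice means the link is not unique.
            if (!keys.Add(key)) report.DuplicateKeys.Add(key);
        }

        var files = new HashSet<string>(pdfFileNames.Select(f => Path.GetFileName(f)), StringComparer.OrdinalIgnoreCase);
        report.FilesWithoutRow.AddRange(files.Where(f => !keys.Contains(f)).OrderBy(f => f));
        report.RowsWithoutFile.AddRange(keys.Where(k => !files.Contains(k)).OrderBy(k => k));
        return report;
    }
}
```

In practice the inputs could come from `File.ReadLines` on the CSV and `Directory.EnumerateFiles` on the export folder. If the export contains quoted fields, a real CSV parser would be needed instead of `Split`.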

Posted Mon, 29 Jan 2018 18:35:05 GMT by Pedro E. Gonzalez-Santini Gm

DocuWare does not provide this functionality. It does offer the DWControl file capability. The CSV can be read to create a DWControl file for each record. If someone has already done this, please let us know.

Posted Mon, 29 Jan 2018 18:44:51 GMT by Joe Kaufman Bell Laboratories Inc No longer there

We converted all of our Fortis documents (over 700,000 documents) to DocuWare using DWControl files. That methodology was used from the start because I thought we would use Import Jobs. That turned out to be really slow in testing, so I ended up writing a .NET application that uploaded all the documents and generated indexes based on DWControl files.

If I had a CSV file, that would actually be even easier using the Platform SDK libraries in .NET. I only used DWControl files because I already had them all generated for the files to upload.

Is Platform SDK programming an option? That would be the most flexible way to go on this.

 

Thanks,

Joe Kaufman

Posted Mon, 29 Jan 2018 18:49:21 GMT by Callum McGlynn Technical Solutions Manager

We have a tool you can purchase which does exactly what you are after.

 

Feel free to contact me - callum@elite-ds.co.uk

Posted Mon, 29 Jan 2018 18:58:31 GMT by Pedro E. Gonzalez-Santini Gm

Thanks. Yes, we are considering using the SDK for this but also any existing tool.

Posted Mon, 17 Aug 2020 06:21:12 GMT by Tiago Matos CoreForm Business Technology - Software Implementation Specialist
Hi all,

I also want to import data from a CSV file into a file cabinet in DocuWare.

I'm looking at this documentation here: https://myCompany.docuware.cloud/DocuWare/Platform/Schema/File/schema-0.xsd
But I would like to know whether DocuWare has HTTP endpoints similar to what Stripe has (https://stripe.com/docs/api/customers/create?lang=curl)

i.e.
POST - https://api.stripe.com/v1/customers
GET - https://api.stripe.com/v1/customers/cus_123456789/
DELETE - https://api.stripe.com/v1/customers/cus_123456789/

Thanks
Posted Thu, 08 Oct 2020 17:42:10 GMT by Mark Massey Cornerstone Building Brands
Joe Kaufman: You note in your response 3 years ago that importing with import jobs was really slow in testing, so you wrote an app to do so. How much faster is that approach? I've got tons of data to move from Fortis.

Thanks in advance.
Posted Thu, 08 Oct 2020 17:59:56 GMT by Joe Kaufman Bell Laboratories Inc No longer there

Mark,

The import was very fast, mainly because I wrote it as a multi-threaded app that did various document types at the same time (wrote the utility in C#/.NET).
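
The idea of handling several document types at the same time can be sketched roughly as below. This is not the application described in this post, only a minimal illustration under assumptions: `MigrationItem`, `ParallelMigration`, and the `upload` delegate are hypothetical names, and `upload` stands in for whatever routine actually stores one document and must itself be safe to call from several threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical description of one document waiting to be migrated.
public sealed class MigrationItem
{
    public string DocumentType { get; set; }
    public string FilePath { get; set; }
}

public static class ParallelMigration
{
    // Processes each document type as its own unit of work, with at most
    // maxParallelTypes types running at once. 'upload' returns true on success.
    // Returns the number of successful uploads per document type.
    public static IDictionary<string, int> Run(
        IEnumerable<MigrationItem> items,
        Func<MigrationItem, bool> upload,
        int maxParallelTypes = 4)
    {
        var successes = new ConcurrentDictionary<string, int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelTypes };

        Parallel.ForEach(items.GroupBy(i => i.DocumentType), options, group =>
        {
            int count = 0;
            // Documents of a single type are handled one after another.
            foreach (MigrationItem item in group)
            {
                if (upload(item)) count++;
            }
            successes[group.Key] = count;
        });

        return successes;
    }
}
```

Splitting by document type is only one possible segmentation. Whether it helps depends on how evenly the documents are spread across types and on how much parallel load the server tolerates.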

It actually took longer to get all the documents out of Fortis, which I did via Foxpro and the Fortis COM object. That job, too, could be split up just by running code on two or three workstations (old-fashioned multi-threading.  *smile*)

I can't really share the multi-threaded nature of the code because that depends on how documents are segmented. But if you need examples of code to upload a document and index it, I can post snippets that might go beyond what is in online documentation.

How much faster than Import Jobs? I recall it being a lot faster, even without multi-threading. But I didn't spend that much time trying to make the import jobs faster. There are various things you can tweak; I just never saw much performance improvement. That's why I decided to roll my own solution, and I never looked back. Plus, we still use parts of the application I wrote for ongoing integrations between DocuWare and our custom-developed apps.

Good luck!

Joe Kaufman

Posted Mon, 12 Oct 2020 13:52:33 GMT by Mark Massey Cornerstone Building Brands
Joe,
I'm already using the old-fashioned multi-threading. Lots of experience with that...

If you wrote code to import multiple doc types simultaneously, then you went all-out. Did it report how many files were imported? For us, single doc types per run are fine. I would like to see a snippet or two, if possible. The code examples look fairly straightforward. Is that the case? 


Thanks,
massey
Posted Mon, 12 Oct 2020 14:30:33 GMT by Joe Kaufman Bell Laboratories Inc No longer there
Mark,

To clarify a few things, the methodology I used required pulling all Fortis documents out and placing them in a directory, along with the dwcontrol files containing their indexes. That was one part of this, and it was written in Foxpro against the Fortis COM object. This process also created a Foxpro table with all document information, including file names, Fortis database locations, Fortis document types, and the Fortis document ID.

Once all the documents and their indexes were in the right place (which you would have to do anyway if you wanted to use Import Jobs), the C#/.NET application put them in the right file cabinets with their indexes. All cabinets involved in the migration had three fields in them so I could back-link them to Fortis:

Fortis DB Location
Fortis Document Type
Fortis Document ID

In that sense, the process was self-logging because at any point I could query DocuWare's SQL Server database and see how many documents were migrated. But the C# program also logged progress, in the GUI and with summary results at the end. Since this involved creating my own code, I could do whatever I wanted with regard to logging and progress-reporting.

As for whether or not the code is straightforward, I would say the actual calls to DocuWare were the easier part. More work was put into deciding how to organize the documents and getting the multi-threading working. The calls to DocuWare followed examples already listed in DocuWare online help. Here are examples of the main routines that stored documents to DocuWare and indexed them (same name, UploadFileToFileCabinet, overloaded static methods):
 
        // Requires: using System; using System.Collections.Generic; using System.IO;
        // and using DocuWare.Platform.ServerClient; (from the DocuWare Platform NuGet packages).
        public static Document UploadFileToFileCabinet(FileCabinet fileCabinet, string uploadFile, string[] indexFieldNames, dynamic[] indexValues, bool timestampsAlreadyUTC = false)
        {
            Document indexInfo = new Document();
            indexInfo.Fields = new List<DocumentIndexField>();
            int numIndexFields = Math.Min(indexFieldNames.Length, indexValues.Length);
            string fieldName = "";
            dynamic value = null;
            Type valueType = null;
            for (int i = 0; i < numIndexFields; i++)
            {
                fieldName = indexFieldNames[i];
                value = indexValues[i];
                // Compare runtime types directly: Type.Name for an int is "Int32", so a
                // name check against "INT" never matches. Null values are skipped.
                valueType = ((object)value)?.GetType();
                if (valueType == typeof(string))
                {
                    indexInfo.Fields.Add(DocumentIndexField.Create(fieldName, (string)value));
                }
                else if (valueType == typeof(int))
                {
                    indexInfo.Fields.Add(DocumentIndexField.Create(fieldName, (int)value));
                }
                else if (valueType == typeof(decimal))
                {
                    indexInfo.Fields.Add(DocumentIndexField.Create(fieldName, (decimal)value));
                }
                else if (valueType == typeof(DateTime))
                {
                    DateTime timestamp = (DateTime)value;
                    if (!timestampsAlreadyUTC)
                    {
                        // We need to make timestamps UTC (Universal Coordinated Time) for storage in DocuWare, as that is how
                        // all date/time fields are stored in the database.
                        timestamp = timestamp.ToUniversalTime();
                    }
                    indexInfo.Fields.Add(DocumentIndexField.Create(fieldName, timestamp));
                }
            }
            return UploadFileToFileCabinet(fileCabinet, uploadFile, indexInfo);
        }

        public static Document UploadFileToFileCabinet(FileCabinet fileCabinet, string uploadFile, Document indexInfo = null)
        {
            LastError = "";
            try
            {
                // We will use a standard upload method for smaller files, but the "Easy" version for larger files, otherwise
                // large file uploads might fail (the "Easy" version is meant for huge file uploads, according to the documentation).
                int largeFileThresholdInBytes = 10 * 1024 * 1024;    // 10 MB
                Document uploadedDoc = null;
                FileInfo fileInfo = new FileInfo(uploadFile);
                if (fileInfo.Length < largeFileThresholdInBytes)
                {
                    // Smaller file.
                    uploadedDoc = fileCabinet.UploadDocument(indexInfo, fileInfo);
                }
                else
                {
                    // Larger file.
                    uploadedDoc = fileCabinet.EasyUploadSingleDocument(fileInfo, indexInfo);
                }
                return uploadedDoc;
            }
            catch (Exception ex)
            {
                LastError = ex.Message;
                return null;
            }
        }

You need to have the DocuWare API NuGet packages installed for the above code to truly make sense. And of course these routines require helper routines to do things like get a FileCabinet object by file cabinet name, etc. These routines were a mix of using DocuWare API classes and hitting SQL Server directly to gather various GUIDs and such. I can't really post all that without posting the whole code base.
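
The first overload above chooses how to create each index field from the runtime type of the value, so values read from a CSV file (which all arrive as strings) would need converting first. A minimal sketch of that conversion follows. The `CsvValueConverter` class, the per-column type hints ("int", "decimal", "date"), and the `yyyy-MM-dd` date format are assumptions for illustration and do not come from DocuWare or from the code above.

```csharp
using System;
using System.Globalization;

public static class CsvValueConverter
{
    // Converts one CSV cell to a typed value. 'typeHint' is a hypothetical
    // per-column setting: "int", "decimal", "date", or anything else for text.
    public static object ToTyped(string cell, string typeHint)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        switch (typeHint)
        {
            case "int":
                return int.Parse(cell, NumberStyles.Integer, inv);
            case "decimal":
                return decimal.Parse(cell, NumberStyles.Number, inv);
            case "date":
                // Assumption: dates were exported as yyyy-MM-dd.
                return DateTime.ParseExact(cell, "yyyy-MM-dd", inv, DateTimeStyles.None);
            default:
                return cell;
        }
    }

    // Converts a whole row; cells and typeHints must have the same length.
    public static object[] ToTypedRow(string[] cells, string[] typeHints)
    {
        if (cells.Length != typeHints.Length)
            throw new ArgumentException("cells and typeHints must have the same length.");

        var values = new object[cells.Length];
        for (int i = 0; i < cells.Length; i++)
        {
            values[i] = ToTyped(cells[i], typeHints[i]);
        }
        return values;
    }
}
```

Parsing with the invariant culture avoids surprises when the export machine and the import machine use different regional settings. Note that dates parsed this way have `DateTimeKind.Unspecified`, which `ToUniversalTime` treats as local time.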

So, maybe not quite as straightforward as I said.  *smile*  But it all originated from DocuWare Platform (also known as the "SDK" or "REST API") examples. That documentation can be found here:

https://developer.docuware.com/dotNet_CodeExamples/d25612d7-49fa-4d66-bfcd-67c5591381f7.html

That's how I came up with most of my code base, and it took a while. Some of that was due to writing routines in both Foxpro and C#, though. If you go with C#/.NET for all of it, and don't worry about multi-threading, it should come together more quickly.

Let me know if you have specific questions or if the overall outline of how I did things remains unclear!

Thanks,
Joe Kaufman
Posted Mon, 12 Oct 2020 18:42:40 GMT by Mark Massey Cornerstone Building Brands
It appears you concentrated on exporting everything first and saving the data in a DB (FoxPro), then imported it all using that DB and creating any needed cabinets or fields on the fly. That's pretty slick. I don't think I'm going to want to do that 'create' step, but I'll definitely look into coding an import processor. It looks pretty straightforward to me. Might have time to try an exporter, also. 

Thanks, Mr. Kaufman. This may save us quite a bit of time.
Posted Mon, 12 Oct 2020 19:08:52 GMT by Joe Kaufman Bell Laboratories Inc No longer there
Mark,

One more point of clarification -- my code does NOT create file cabinets. A FileCabinet object in C# is just a programmatic representation of an existing DocuWare file cabinet. I still had to create all the file cabinets by hand, saving off the XML used to create them as I went along (so I could more quickly create them when ready to go live).

My code just takes a document that is saved as a file, loads the dwcontrol file alongside it (as indexes), then uploads the file and applies indexes using the DocuWare API routines UploadDocument and EasyUploadSingleDocument (depending on document size). In fact, my code does not show the part where indexes are generated from the dwcontrol file -- that was a different routine. As long as you can load indexes into a DocuWare Document class (in code), you can use whatever indexing scheme you want.

C# can manipulate the Fortis COM object, too; I just already had the Foxpro code for that. So, you could, if you wanted, use a single C# program that does everything in one flow, one document at a time:
 
  1. Find the document in Fortis you want to migrate.
  2. Save the document itself as a file to a known location.
  3. Also save the indexes for the document in memory (as variables in your program).
  4. Establish a connection to DocuWare and get a FileCabinet object instanced, representing the destination for the document.
  5. Upload the document and indexing information to DocuWare, assuming the indexes on the DocuWare cabinet match the indexes on the original Fortis document.
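
The steps above could be wired together along these lines. This is only a sketch of the control flow: `OneFlowMigration` and its delegate parameters are hypothetical placeholders for the real export, index-reading, and upload routines, and no Fortis or DocuWare calls are shown.

```csharp
using System;
using System.Collections.Generic;

public static class OneFlowMigration
{
    // documentIds: IDs of the source documents to migrate (step 1).
    // saveToFile:  exports one document and returns the file path (step 2).
    // readIndexes: returns index names and values for one document (step 3).
    // upload:      stores the file with its indexes in the destination,
    //              returning true on success (steps 4 and 5).
    // Returns the IDs that could not be migrated.
    public static List<string> Run(
        IEnumerable<string> documentIds,
        Func<string, string> saveToFile,
        Func<string, IDictionary<string, object>> readIndexes,
        Func<string, IDictionary<string, object>, bool> upload)
    {
        var failedIds = new List<string>();
        foreach (string id in documentIds)
        {
            try
            {
                string path = saveToFile(id);
                IDictionary<string, object> indexes = readIndexes(id);
                if (!upload(path, indexes)) failedIds.Add(id);
            }
            catch (Exception)
            {
                // One bad document should not stop the whole run.
                failedIds.Add(id);
            }
        }
        return failedIds;
    }
}
```

Returning the failed IDs makes it possible to re-run only those documents once the cause has been fixed.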

If any of this helps you, I am glad! Good luck!

Thanks,
Joe Kaufman

 
Posted Mon, 12 Oct 2020 20:14:14 GMT by Mark Massey Cornerstone Building Brands
Thanks!
 
