Posted Thu, 08 Feb 2024 09:47:46 GMT by Simone Sola
Hello everyone,
I'm trying to build, using the official REST API collection, a web interface that allows the user to:
1) Input some date intervals;
2) Select the document type (in fact, this web interface won't use a specific cabinet; it has to navigate through all the cabinets, of which there are many, and then, based on this filter, search a custom DB field that contains the user-entered file type);
3) Review the documents found and decide whether to proceed with the deletion;
4) Finally, delete the documents he/she wants to delete.

The first issue I'm facing is that I can't query custom-made fields in the DB using the REST API collection. Also, as far as I can see in that REST collection, there aren't any "ready-made" APIs to check for duplicates (the only API I found that can do this is deprecated...), so I have to build it from scratch.

Am I missing something? Any suggestions?

Best regards

Simone Sola 
Posted Thu, 08 Feb 2024 10:05:35 GMT by Simon H. Hellmann Toshiba Tec Germany Imaging Systems GmbH IT-Consultant Document Management Solutions
Hello Simone Sola,

if you just want to delete or flag duplicates, you can use Workflow or Autoindex for that.
Check the documentation here: Find Duplicates with Autoindex or Find Duplicates with Workflow

If you need the API, refer to the documentation over at developer.docuware.com and the Platform description on your DocuWare Server: 
https://{DocuWareServer}/DocuWare/Platform/

Using the Endpoint /DocuWare/Platform/FileCabinets/ allows you to retrieve information about all file cabinets the currently logged-in user has access to.
Then you can use /DocuWare/Platform/FileCabinets/{FileCabinetGUID}/ to retrieve information about a single file cabinet, including all existing index fields in this cabinet (which the user has access to).
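
As a rough illustration of the two endpoints above, the following Python sketch builds both URLs and pulls cabinet IDs and field names out of already-parsed JSON responses. The key names "FileCabinet", "Id", "Fields" and "DBFieldName" are assumptions about the response shape, not something stated in this thread, so check them against the Platform description on your own server. Authentication and the HTTP call itself are deliberately left out.

```python
from urllib.parse import quote


def cabinets_url(server: str) -> str:
    """URL of the endpoint that lists all file cabinets visible to the user."""
    return f"https://{server}/DocuWare/Platform/FileCabinets/"


def cabinet_url(server: str, cabinet_guid: str) -> str:
    """URL of the endpoint that describes a single file cabinet."""
    return f"https://{server}/DocuWare/Platform/FileCabinets/{quote(cabinet_guid, safe='')}/"


def cabinet_ids(payload: dict) -> list:
    # ASSUMPTION: the list response holds the cabinets under "FileCabinet",
    # each with an "Id" key. Verify against your server's actual response.
    return [cabinet["Id"] for cabinet in payload.get("FileCabinet", [])]


def field_names(cabinet: dict) -> list:
    # ASSUMPTION: a single-cabinet response lists its index fields under
    # "Fields", each with a "DBFieldName" key.
    return [field["DBFieldName"] for field in cabinet.get("Fields", [])]
```

Collecting the field names of every cabinet this way would show early on which cabinets actually share fields that can be compared.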

I am assuming here that by the term custom-made fields in the DB you mean the index fields of a given file cabinet.

Hope that helps.

Greetings from Germany,
Simon H. Hellmann
DocuWare System Consultant
Posted Thu, 08 Feb 2024 11:47:15 GMT by Simone Sola
Hello and thanks for this fast reply.
Unfortunately, I had already tried everything you suggested, and it doesn't really accomplish what I'm trying to achieve.

I've checked the links you gave me about using Autoindex and Workflow, but I can't access the DB field names that way, and those features can't be embedded in a custom-made web interface. Also, those links didn't provide any information about scanning ALL THE CABINETS and finding duplicates across ALL the various document types.

The APIs I tried to use don't give me access to custom-made fields, which aren't fields used for indexing but are used as document properties. Even if I had access to all of those fields, I would have to prepare a script that checks ALL the fields in order to find a duplicated file; there isn't a dedicated API for that.

Thanks, have a nice day.
Posted Mon, 12 Feb 2024 09:52:46 GMT by Tobias Getz DocuWare GmbH Team Leader Product Management

Hi Simone Sola,

DocuWare does not offer a dedicated API to detect duplicate documents.

I think we need a bit more information on how you want to do the duplicate check and what you mean by "custom-made" fields.
It would also be interesting to know which API you are referring to as deprecated.

If the documents are in different file cabinets, the search gets more complicated, because you have to make sure the database fields share the same name; with your own implementation, though, you could possibly work around this. To give better suggestions we would need more information.
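
One possible way to work around differing field names in your own implementation is a per-cabinet mapping onto common logical names. This is only a sketch: the cabinet IDs, the DB field names and the logical names in FIELD_MAP are invented for illustration and would have to be replaced with the real ones.

```python
# HYPOTHETICAL mapping: cabinet ID -> {DB field name in that cabinet: logical name}.
FIELD_MAP = {
    "cabinet-a": {"COMPANY": "company", "DOC_DATE": "date"},
    "cabinet-b": {"COMPANY_NAME": "company", "DATE": "date"},
}


def clean(value):
    """Make string values comparable regardless of case and outer whitespace."""
    if isinstance(value, str):
        return value.strip().casefold()
    return value


def normalize(cabinet_id: str, fields: dict, field_map: dict = FIELD_MAP) -> dict:
    """Translate one document's fields into the shared logical names.

    Fields that are not listed in the mapping are ignored; mapped fields
    that are missing from the document come out as None.
    """
    mapping = field_map[cabinet_id]
    return {logical: clean(fields.get(db_name)) for db_name, logical in mapping.items()}
```

Two documents from different cabinets can then be compared simply by checking whether their normalized dictionaries are equal.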

Regards
Tobias Getz
Team Leader Product Management
DocuWare GmbH

Posted Mon, 12 Feb 2024 10:28:44 GMT by Simone Sola
Hello Tobias and thanks for your kind reply.
I will try to give you a better overview and explain what I would like to build. We use several file cabinets, and the documents in each of them are quite similar from cabinet to cabinet. For example: 5 different cabinets contain documents that are almost identical, structurally speaking, from one cabinet to another, so despite having 5 distinct file cabinets, the file structure inside every cabinet is almost the same. I would like to search for duplicates in ALL CABINETS by comparing ALL the fields that compose the document (the fields that form the skeleton of the file and allow it to be indexed correctly and stored in the DB) simultaneously, using filter parameters chosen by the user in a web-based interface (see the first post). The fields that I want to compare are the fields of the document, not of the cabinet.
Is it clearer? Do you think that DocuWare will implement a similarly-behaving API in the future?
Best regards

Simone Sola
Posted Mon, 12 Feb 2024 13:18:51 GMT by Tobias Getz DocuWare GmbH Team Leader Product Management
Hi Simone,

thanks for your explanation. Unfortunately, I still do not understand "The fields that I want to compare are the fields of the document, not of the cabinet."
Could you give an example of this?

Regards
Tobias Getz
Team Leader Product Management
DocuWare GmbH
Posted Mon, 12 Feb 2024 13:39:01 GMT by Simone Sola
Hello and thanks again for the reply.
Checking the REST API collection, I came across several APIs to query documents, and the closest to what I'm looking for (but still quite far off...) is "Get All Sections from a Document". If I understand it correctly, by passing the ID of a document, I can retrieve ALL THE FIELDS that compose that document. Let's pretend that our document contains 5 fields: 3 are used for searching and indexing ("Date", "Company Name", "Company Address") and 2 aren't ("Total price" and "Author"). It doesn't make a lot of sense, but it's just to illustrate the logic. By passing the document ID, then, I can obtain all the values of those 5 fields, right? So I can retrieve the IDs of all the documents in all the cabinets and then, using these IDs, compare all the fields that I want to compare across all the files by using that API, right?
Posted Tue, 13 Feb 2024 12:13:15 GMT by Tobias Getz DocuWare GmbH Team Leader Product Management
Hi Simone Sola,

to query for documents, the proper request is "Search for Documents in a Single File Cabinet" or maybe also "Search for Documents in Multiple File Cabinets". With these you get all index entries (except keywords and tables) of the documents found. If you really need all the metadata for a document, you have to use "Get a Specific Document From a File Cabinet".
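
To make the idea concrete, here is a small Python sketch that turns an already-parsed search response into one flat dictionary of index entries per document. The response shape used here ("Items", "Id", and "Fields" entries with "FieldName" and "Item") is an assumption on my side and should be checked against what your server really returns; the search request itself is not shown.

```python
def index_entries(document: dict) -> dict:
    """Map field name -> value for one document of a search result."""
    # ASSUMPTION: each document carries its index entries under "Fields",
    # and each entry has a "FieldName" and, if filled, an "Item" value.
    return {entry["FieldName"]: entry.get("Item") for entry in document.get("Fields", [])}


def flatten_results(payload: dict) -> list:
    """Return (document ID, index entries) pairs for every found document."""
    # ASSUMPTION: the found documents are listed under "Items", each with an "Id".
    return [(doc["Id"], index_entries(doc)) for doc in payload.get("Items", [])]
```

The resulting pairs are easy to compare field by field, which is all a duplicate check on index entries needs.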

The "Get All Sections from a Document" call returns data about all the files that make up one document (e.g. a document that consists of a PDF file and a DOCX file clipped together).

I think the two calls for searching in a file cabinet will give you enough information to find duplicates.

Regards
Tobias Getz
Team Leader Product Management
DocuWare GmbH
 
Posted Tue, 13 Feb 2024 12:26:00 GMT by Simone Sola
Hello,
thanks for your reply. I know what that API does, and that's why I would like to use it. Also, the solution you suggest is really resource-consuming and very heavy: we have almost 200000 files that need to be checked...
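
For a volume like this, one way to keep the check cheap might be to read the search results page by page and group documents by a hash of the compared fields, so each document is looked at once instead of being compared with every other one. The sketch below assumes a hypothetical fetch_page(start, count) function that wraps the actual search call and returns (document reference, fields) pairs; neither that function nor the field names come from DocuWare.

```python
import hashlib
import json
from collections import defaultdict


def fingerprint(fields: dict, compare_on: list) -> str:
    """Hash the values of the compared fields, in a fixed order."""
    values = [fields.get(name) for name in compare_on]
    encoded = json.dumps(values, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def iter_pages(fetch_page, page_size: int):
    """Yield documents page by page until fetch_page returns an empty page.

    fetch_page is a HYPOTHETICAL callable (start, count) -> list of
    (document reference, fields) pairs; it stands in for the real search call.
    """
    start = 0
    while True:
        page = fetch_page(start, page_size)
        if not page:
            return
        yield from page
        start += len(page)


def find_duplicates(documents, compare_on: list) -> list:
    """Group document references whose compared fields are identical.

    One pass over the documents; only the references are kept in memory,
    bucketed by fingerprint. Groups with a single member are dropped.
    """
    groups = defaultdict(list)
    for doc_ref, fields in documents:
        groups[fingerprint(fields, compare_on)].append(doc_ref)
    return [refs for refs in groups.values() if len(refs) > 1]
```

Each returned group is a list of references to documents that agree on all compared fields, which could then be shown to the user for review before anything is deleted.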
Thanks, I will find a solution.

Best regards
