How can I search for all files that CDS has flagged as "document is corrupted"?
Good tips. Thanks. I stumbled upon the same technique of searching for a space.
My main problem is with PDFs flagged as corrupt, which is crippling my work. I sent one at random to tech support, which replied that the file's format was PDF 2.0, which they say CDS does not support. News to me; it's not in the documentation and should be prominently stated.
I then did a search for all PDFs, also using the search-for-a-space technique, and tried copying all of the "corrupt" PDFs to a new folder. (This uncovered another problem, described further below: multi-file copying via right-click no longer works, but there is a clumsy workaround for that.) I have >5,000 "corrupt" PDFs that haven't been indexed (a small fraction of all that I have), although most are valid PDFs containing searchable text.
On the PDF copies in the new folder, I renamed all to *.txt. This allowed me, in my file manager (Dopus), to see the files as text and quickly flip from file to file to view the beginnings of the files in Dopus's viewer pane. The first line of a PDF shows the PDF version used to encode it (e.g. %PDF-1.5, %PDF-2.0, etc.). The vast majority are %PDF-2.0. Of the others (%PDF-1.0 through %PDF-1.7), some are image-only PDFs and the rest may actually be corrupt (perhaps fixable with a repair utility).
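For anyone who would rather not rename thousands of files, the same header check can be scripted. This is only a rough sketch, not anything CDS or Dopus provides: the function names `pdf_version` and `tally_versions` are made up for illustration, and it assumes the `%PDF-x.y` marker appears within the first 1024 bytes of each file.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the version marker in a PDF header, e.g. b"%PDF-1.7" or b"%PDF-2.0".
HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")


def pdf_version(path):
    """Return the version string from a PDF header (e.g. "2.0"), or None."""
    with open(path, "rb") as f:
        head = f.read(1024)  # assumption: the header sits near the start of the file
    match = HEADER_RE.search(head)
    return match.group(1).decode("ascii") if match else None


def tally_versions(folder):
    """Count header versions for every .pdf file under folder, recursively."""
    counts = Counter()
    for p in Path(folder).rglob("*"):
        # Compare the extension case-insensitively so "X.PDF" is also counted.
        if p.is_file() and p.suffix.lower() == ".pdf":
            counts[pdf_version(p) or "no header"] += 1
    return counts
```

Calling `tally_versions` on the folder of copied files returns a count per version, so the 2.0 files can be separated from the ones with no recognizable header, which are the more likely candidates for genuine corruption.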
Tech support says that CDS support for PDF 2.0 files will certainly arrive by a major release next spring, perhaps sooner. I hope it is sooner, since my work is hamstrung until then, unless I can find a way to convert these files to an earlier PDF version (difficult, even if possible, because they are filed among many folders). The PDF 2.0 format was first published as an ISO standard (ISO 32000-2) in 2017 and revised in 2020.
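One possible route for the conversion, untested against CDS, is to batch the files through Ghostscript, a third-party tool whose `pdfwrite` device accepts a `-dCompatibilityLevel` option. The sketch below only illustrates walking many folders and mirroring them into a separate output tree; `gs_command` and `convert_tree` are hypothetical names, the executable name `gs` is an assumption (on Windows the console build is usually named differently, e.g. `gswin64c`), and whether CDS then indexes the rewritten files is something I can't confirm.

```python
import subprocess
from pathlib import Path


def gs_command(src, dst, gs_exe="gs", level="1.7"):
    """Build a Ghostscript command that rewrites src as an older PDF version."""
    return [
        gs_exe,
        "-sDEVICE=pdfwrite",
        f"-dCompatibilityLevel={level}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        str(src),
    ]


def convert_tree(src_root, dst_root, gs_exe="gs", dry_run=True):
    """Mirror src_root into dst_root, converting each PDF; originals are untouched.

    With dry_run=True (the default) nothing is executed or created; the
    commands are only returned so they can be inspected first.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    commands = []
    for src in sorted(src_root.rglob("*")):
        if not (src.is_file() and src.suffix.lower() == ".pdf"):
            continue
        # Keep the same relative folder layout under the output root.
        dst = dst_root / src.relative_to(src_root)
        cmd = gs_command(src, dst, gs_exe)
        commands.append(cmd)
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands
```

Keep the output root outside the source tree, otherwise a second run would pick up the converted copies as well. Rewriting a PDF this way can alter or drop some features, so it is worth trying on a handful of files before touching thousands.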
A WORK-AROUND TO COPY OR DELETE MULTIPLE FILES (as noted above)
In CDS version 8, if multiple files are selected to copy or delete, then as soon as you right-click to display the pop-up menu, CDS immediately reverts to only one file being selected. Thus, this feature, which worked in earlier versions, is now broken (but tech support says they will fix it). The work-around is to use the copy or delete function in the File menu. Although that works, I find I can't break the habit of using right-click (which is so convenient), and frequently have to re-select the files I need to manipulate.
I put a single space in the search box; it should return the same number of documents as reported by Index Status. Afterwards, I sort the results by clicking on the column header that shows whether an item was indexed correctly or not. I then count how many corrupted items are lined up.
Btw, if you are getting a lot of supposedly corrupted emails (e.g. I had 1003 supposedly corrupted emails out of 70K indexed emails in Outlook), try archiving them into a PST file - This seemed to fix the problematic indexing of these emails for me.
The only problem thereafter was that the related attachments weren't getting indexed - This is still sitting with Support... My own temporary workaround (not suggested by anyone) is to include both the standalone PST and the Outlook-opened PST in the indexing scope - Yes, I know it adds dupes, but I prefer to see dupes (for now), which is better than losing attachment visibility.