Electronic Data Discovery - Processing
Before the document set is reviewed in detail, WWD can run an automated processing step that extracts important data and filters out unneeded documents. This eliminates a large percentage of the documents that would otherwise have to be reviewed manually. The processing stage includes the following steps:
-
Email extraction
A large portion of critical ESI is found in email archives. WWD's EDD solution extracts all emails from the archive, including any attachments, while still keeping the logical parent/child relationship in email threads. Supported email archives include Outlook PST files, Lotus Notes NSF files, Unix mail archives, and more.
-
Text/metadata extraction and OCR
Full text and metadata items (author, subject, date, etc.) are extracted and associated with the document they come from, allowing for easy searching and filtering during the review step. Several hundred file types are supported, including all common ones. For documents that have no full text associated with them (e.g. scanned PDF files), high-quality OCR is performed so that full-text searches include these documents. Unicode text can be processed as well, allowing for multi-language support.
-
De-duplication
Duplicate documents can be removed from the data set or tagged as duplicates, reducing the amount of review needed. De-duplication is customizable: documents can be compared by file hash value (a stricter criterion) or by metadata values (a looser criterion).
-
File archive extraction
Files contained in archives such as .zip, .gzip, .rar, .cab, and others are extracted, and the logical relationship between these files is preserved.
-
Filetype restrictions and other filters
In some cases the customer will know beforehand that certain file types contain no relevant data. WWD can process all files except those of the specified file types, reducing the workload during the review stage. Other restrictions and filters can also be applied to further reduce review and processing time, including date restrictions, domain-name filters that eliminate spam messages from email archives, and other customized filters.
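The difference between the two de-duplication criteria described above can be illustrated with a minimal Python sketch. It is not WWD's implementation: the `Document` record, its field names, the choice of SHA-256 as the hash, and the author/subject/date combination used as the metadata key are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    # Hypothetical record type; the field names are illustrative only.
    path: str
    content: bytes
    author: str
    subject: str
    date: str


def hash_key(doc):
    """Stricter criterion: only byte-identical content yields the same key.

    SHA-256 is an assumed choice; the source does not name a hash function.
    """
    return hashlib.sha256(doc.content).hexdigest()


def metadata_key(doc):
    """Looser criterion: matching author, subject, and date (assumed fields)."""
    return (doc.author.strip().lower(), doc.subject.strip().lower(), doc.date)


def deduplicate(docs, key_func):
    """Return (unique, duplicates), keeping the first document seen per key."""
    seen = set()
    unique, duplicates = [], []
    for doc in docs:
        key = key_func(doc)
        if key in seen:
            duplicates.append(doc)  # could be tagged instead of removed
        else:
            seen.add(key)
            unique.append(doc)
    return unique, duplicates
```

Calling `deduplicate(docs, hash_key)` treats two documents as duplicates only when their content is byte-for-byte identical, while `deduplicate(docs, metadata_key)` also catches copies whose content differs slightly but whose metadata matches. Returning the duplicates separately, rather than discarding them, leaves room for either removing or tagging them.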
© Copyright 2009 Worldwide Digital (USA), Inc. All Rights Reserved.