Document Identification and Classification

Expert-defined terms from the Advanced Certification in Legal Document Review (United Kingdom) course at London School of Business and Administration. Free to read, free to share, paired with a professional course.

Affidavit – Sworn statement, evidence – A written declaration made…

Example: A witness provides an affidavit detailing observations of a contract breach. Practical application: Submitted to the court to establish a factual baseline before discovery. Challenge: Ensuring the affidavit is accurate and does not contain hearsay, as any inconsistency can be exploited during cross‑examination.

Amendment – Supplement, revision – A formal change to a pleading o…

Example: Filing an amendment to a complaint to add additional causes of action after new evidence emerges. Practical application: Keeps the case record current and reflective of the evolving factual matrix. Challenge: Timing restrictions and potential prejudice to opposing counsel.

Annotation – Margin note, comment – A note added to a document duri…

Example: Annotating a clause in a lease that may be responsive to a rent‑abatement claim. Practical application: Guides later coding and assists senior counsel in triage. Challenge: Maintaining consistency across reviewers and avoiding inadvertent disclosure of privileged material.

Archive – Repository, storage – A collection of inactive or histor…

Example: An archive of past annual reports stored for statutory retention periods. Practical application: Provides a source for historical precedent during document identification. Challenge: Locating relevant items within large, poorly indexed archives.

Attorney‑Client Privilege – Confidential communication, privilege log…

Example: An email from a client to counsel discussing settlement strategy is privileged. Practical application: Flagged during classification to prevent inadvertent production. Challenge: Determining the privilege scope, especially with multiple recipients or third‑party consultants.

Bates Numbering – Identification, sequential label – A system of…

Example: Bates numbers 001‑0001 to 001‑0500 applied to a batch of PDFs. Practical application: Facilitates precise reference in correspondence and court filings. Challenge: Ensuring continuity when documents are added or removed after initial numbering.
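
A minimal Python sketch of sequential labelling in the style of the example above. The function name `bates_labels`, the zero-padded `prefix-sequence` layout, and the file names are assumptions for illustration; review platforms apply their own numbering schemes.

```python
from typing import Iterable, Iterator, Tuple


def bates_labels(
    documents: Iterable[str], prefix: str = "001", start: int = 1, width: int = 4
) -> Iterator[Tuple[str, str]]:
    """Pair each document with the next sequential label, e.g. 001-0001."""
    for offset, name in enumerate(documents):
        yield name, f"{prefix}-{start + offset:0{width}d}"


batch = ["lease.pdf", "invoice.pdf", "email.pdf"]  # hypothetical file names
labelled = list(bates_labels(batch))
```

Passing `start` as the last used number plus one is one way to keep the sequence continuous when documents are added after the initial run, rather than renumbering the whole batch.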

Chain of Custody – Evidence handling, tracking – The documented pr…

Example: A log detailing each person who accessed a hard drive containing emails. Practical application: Establishes authenticity of electronic evidence. Challenge: Maintaining an unbroken, tamper‑free record, especially with remote access and cloud storage.

Classification – Tagging, taxonomy – The act of assigning a docume…

Example: Classifying a contract as “Financial‑Related” and “Confidential”. Practical application: Enables efficient searching, filtering, and production. Challenge: Developing a clear taxonomy that aligns with client objectives and avoids overlap.

Confidentiality Agreement – Non‑disclosure, NDA – A contract oblig…

Example: A mutual NDA signed before sharing due‑diligence documents in a merger. Practical application: Informs the reviewer about restrictions on handling and producing documents. Challenge: Interpreting the scope of confidentiality, especially when the agreement contains vague language.

Consistent Coding – Standardisation, review protocol – Applying un…

Example: Using the code “R” for responsive and “NR” for non‑responsive uniformly. Practical application: Produces reliable analytics for senior counsel. Challenge: Training reviewers and monitoring for drift over large projects.

Contextual Search – Proximity, phrase search – A search technique…

Example: Searching for “termination” within ten words of “notice period”. Practical application: Narrows results to more relevant passages. Challenge: Configuring distance parameters to balance recall and precision.
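
The proximity example above can be sketched in Python as follows. This is an illustrative approach, not how any particular review platform implements it: the helper `within_distance` is hypothetical, and "within ten words" is assumed to mean at most ten words lying between the term and the phrase. Platforms may count distance differently.

```python
import re
from typing import List


def within_distance(text: str, term: str, phrase: str, max_words: int = 10) -> bool:
    """Return True if `term` occurs within `max_words` words of `phrase`."""
    words: List[str] = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    term_positions = [i for i, w in enumerate(words) if w == term.lower()]
    phrase_starts = [
        i for i in range(len(words) - n + 1) if words[i : i + n] == phrase_words
    ]
    for t in term_positions:
        for p in phrase_starts:
            # Count the words lying strictly between the term and the phrase.
            gap = p - t - 1 if t < p else t - (p + n - 1) - 1
            if 0 <= gap <= max_words:
                return True
    return False


clause = "Either party may effect termination by giving the notice period set out in clause 9."
hit = within_distance(clause, "termination", "notice period")
```

Lowering `max_words` tightens precision at the cost of recall, which is the trade-off named in the challenge above.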

Custodian – Data holder, source – An individual who possesses or c…

Example: A CFO identified as a custodian of financial records. Practical application: Targeted collection of data from key individuals. Challenge: Negotiating custodial obligations and ensuring completeness of the data set.

Data Mapping – Information flow, architecture – The process of doc…

Example: Mapping email archives across on‑premises Exchange servers and cloud Office 365 tenants. Practical application: Guides collection scope and identifies hidden repositories. Challenge: Dealing with dynamic environments and legacy systems.

De‑Duplication – Duplicate removal, dedup – The process of identif…

Example: Removing 3,000 duplicate invoices from a production set. Practical application: Reduces volume and cost of review. Challenge: Ensuring that subtle differences are not lost, especially in near‑duplicate versions.
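
One common way to find exact duplicates is to compare cryptographic hashes of the file contents. The sketch below assumes documents are available as bytes keyed by an identifier; the function name and the sample invoices are hypothetical.

```python
import hashlib
from typing import Dict, List


def deduplicate(documents: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group document IDs by SHA-256 digest; the first ID in each group is kept."""
    groups: Dict[str, List[str]] = {}
    for doc_id, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        groups.setdefault(digest, []).append(doc_id)
    return groups


docs = {
    "INV-1": b"Invoice 1001, total 250.00 GBP",
    "INV-2": b"Invoice 1001, total 250.00 GBP",   # exact duplicate of INV-1
    "INV-3": b"Invoice 1001, total 250.00 GBP ",  # trailing space: near-duplicate
}
groups = deduplicate(docs)
unique_ids = [ids[0] for ids in groups.values()]
```

Exact hashing keeps `INV-3` because a single extra character changes the digest. That is the conservative behaviour the challenge above calls for: near-duplicates stay in the set until a reviewer or a separate near-duplicate method decides otherwise.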

Document Management System (DMS) – Repository, workflow – Software…

Example: A DMS that provides version control for contracts and enables audit trails. Practical application: Centralises document access for review teams. Challenge: Integrating the DMS with e‑discovery platforms and maintaining security controls.

Document Review Platform – e‑Discovery tool, interface – A special…

Example: A platform that supports predictive coding and analytics dashboards. Practical application: Streamlines large‑scale review projects. Challenge: User adoption, licensing costs, and ensuring data integrity during migration.

Document Tagging – Metadata, labelling – Adding descriptive identif…

Example: Tagging a file as “Sensitive – Personal Data”. Practical application: Assists in automated filtering for privileged or confidential material. Challenge: Avoiding tag fatigue and ensuring tags are applied accurately.

Electronic Discovery (e‑Discovery) – Digital evidence, production …

Example: Extracting logs from a corporate server for a fraud investigation. Practical application: Expands the scope of discoverable material beyond paper. Challenge: Managing data volume, preserving metadata, and meeting preservation obligations.

Exhibit – Evidence, attachment – A document or item introduced in…

Example: An email chain submitted as Exhibit A to illustrate breach of contract. Practical application: Must be identified, authenticated, and produced. Challenge: Ensuring the exhibit is complete and free from tampering.

Export Control – Regulatory restriction, cross‑border – Laws gover…

Example: Restricting the export of encryption keys from the UK to a US reviewer. Practical application: Informs the handling and location of data during review. Challenge: Navigating differing jurisdictions and obtaining necessary licences.

Filter – Search query, rule – A predefined set of criteria used to…

Example: A filter that excludes all documents older than five years. Practical application: Reduces the number of documents needing manual review. Challenge: Designing filters that do not inadvertently exclude relevant material.

Forensic Review – Integrity check, preservation – Examination of e…

Example: Using hash values to confirm that a seized hard drive has not been modified. Practical application: Supports admissibility of digital evidence. Challenge: Requiring specialised tools and expertise, especially with encrypted or fragmented files.
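
The hash check in the example above might look like this in Python. It is a sketch under stated assumptions: SHA-256 is chosen for illustration, the function names are hypothetical, and the hash recorded at acquisition is assumed to be stored separately from the evidence itself.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: Path, recorded_hash: str) -> bool:
    """Compare the file's current hash with the one recorded at acquisition."""
    return sha256_of_file(path) == recorded_hash.lower()
```

A matching hash shows only that the bytes are the same as when the hash was recorded; it says nothing about what happened before that point, which is why the recording step belongs in the custody log.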

Full‑Text Search – Keyword search, OCR – Searching the entire text…

Example: Locating the phrase “material adverse effect” across a corpus of contracts. Practical application: Uncovers relevant passages that may be missed by metadata alone. Challenge: OCR accuracy and handling multilingual documents.

Granular Review – Document‑level, detailed coding – Examining each…

Example: Manually reviewing each email in a high‑risk category. Practical application: Ensures precision for highly sensitive or privileged material. Challenge: Time‑intensive and costly for large data sets.

Hold Notification – Preservation notice, litigation hold – Formal…

Example: An email sent to all finance staff to retain all invoices related to a pending lawsuit. Practical application: Prevents spoliation of evidence. Challenge: Ensuring compliance across dispersed teams and remote workers.

Identification – Recognition, flagging – The process of locating d…

Example: Identifying all emails containing the term “settlement”. Practical application: Forms the first step in building a responsive set. Challenge: Distinguishing relevance from noise in high‑volume environments.

Immaterial Document – Irrelevant, non‑responsive – A document that…

Example: A marketing brochure unrelated to the contract dispute. Practical application: Can be excluded from production to reduce cost. Challenge: Correctly classifying borderline documents that may contain hidden relevance.

In‑Camera Review – Private inspection, confidentiality – A review…

Example: Senior counsel reviewing a privileged email chain in‑camera before deciding on production. Practical application: Protects confidentiality while allowing limited disclosure. Challenge: Maintaining strict chain‑of‑custody and documentation of decisions.

Indexing – Cataloguing, search optimisation – Creating a structure…

Example: Indexing contracts by parties, dates, and jurisdiction. Practical application: Improves search speed and accuracy. Challenge: Ensuring index fields are populated consistently across diverse file types.

Information Governance – Policy, compliance – The set of policies…

Example: A corporate policy dictating retention periods for financial records. Practical application: Aligns document handling with legal and regulatory obligations. Challenge: Integrating governance with e‑discovery workflows.

Innocent Error – Clerical mistake, non‑intentional – An unintentio…

Example: Mis‑filing a non‑responsive document in the responsive folder. Practical application: May be remedied without sanctions if promptly corrected. Challenge: Documenting the error to demonstrate lack of bad faith.

Internal Review – Self‑assessment, quality control – A secondary e…

Example: A senior associate reviewing a sample of documents tagged as “Privileged”. Practical application: Ensures consistency before external production. Challenge: Allocating sufficient resources without delaying timelines.

Judgment Review – Post‑trial analysis, precedent – Examining court…

Example: Reviewing a judgment that upheld privilege over email attachments. Practical application: Informs future classification strategies. Challenge: Interpreting nuanced judicial reasoning.

Keyword Search – Term query, Boolean – A search using specific wor…

Example: “(Breach OR default) AND (contract)” to locate relevant contract disputes. Practical application: Quickly isolates potentially responsive material. Challenge: Balancing recall and precision; overly broad terms generate false positives.
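
The Boolean query in the example above can be evaluated against a single document along these lines. The helper names are hypothetical and the tokenisation is deliberately simple (whole lower-case words only); it is a sketch of the logic, not a search engine.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lower-case word tokens, so matching is case-insensitive."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches_query(text: str) -> bool:
    """Evaluate (breach OR default) AND contract against one document."""
    words = tokens(text)
    return bool(words & {"breach", "default"}) and "contract" in words


responsive = matches_query("The Contract was in DEFAULT from March.")
```

Because matching is on whole words, a document that mentions only "contracts" would not hit on "contract". That is a recall gap of the kind the challenge above describes.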

Metadata – File attributes, data about data – Information embedded…

Example: A PDF’s creation date, author, and document title. Practical application: Aids in authentication and chronological ordering. Challenge: Metadata can be inadvertently altered, risking admissibility.

Non‑Responsive Document – Irrelevant, excluded – A document that d…

Example: An HR policy manual unrelated to the contractual claim. Practical application: Can be filtered out to streamline production. Challenge: Determining borderline relevance, especially when documents contain mixed content.

OCR (Optical Character Recognition) – Text extraction, image conversio…

Example: OCR‑processing a batch of scanned contracts to enable full‑text search. Practical application: Turns paper archives into searchable e‑discovery assets. Challenge: Accuracy varies with quality of the original scan and language.

Privilege Log – Privilege register, disclosure – A document listin…

Example: A log describing each email withheld on the basis of attorney‑client privilege. Practical application: Satisfies disclosure obligations while protecting confidentiality. Challenge: Providing sufficient detail without revealing the privileged content itself.

Proactive Production – Early disclosure, voluntary – Supplying doc…

Example: Providing all relevant contracts ahead of a subpoena. Practical application: Can reduce later disputes and focus the litigation. Challenge: Ensuring that the produced set is complete and does not waive privilege inadvertently.

Predictive Coding – Technology‑assisted review, TAR – Using machin…

Example: Training a model on 500 manually coded emails to classify the remaining 50,000. Practical application: Dramatically cuts review time and cost. Challenge: Selecting a representative seed set and defending the methodology in court.
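
To make the idea of training on a coded seed set concrete, here is a toy Python sketch using a naive Bayes word-count model. This technique is swapped in purely for illustration; commercial TAR tools use their own, generally more sophisticated, models. The four seed documents and the labels "R" and "NR" are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Model = Tuple[Dict[str, Counter], Counter]


def train(seed: List[Tuple[str, str]]) -> Model:
    """Count words per label from the manually coded seed set."""
    word_counts: Dict[str, Counter] = {}
    label_counts: Counter = Counter()
    for text, label in seed:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text: str, model: Model) -> str:
    """Pick the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = "", -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_docs)
        denominator = sum(counts.values()) + len(vocabulary)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((counts[word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


seed_set = [
    ("settlement offer for the contract dispute", "R"),
    ("breach of contract notice", "R"),
    ("office party on friday", "NR"),
    ("lunch menu for friday", "NR"),
]
model = train(seed_set)
prediction = classify("contract breach settlement", model)
```

With so few seed documents the model simply echoes the seed vocabulary, which shows in miniature why a representative seed set matters.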

Production Set – Responsive collection, deliverables – The group o…

Example: A production set comprising 10,000 emails and 500 contracts. Practical application: The final output of identification, classification, and quality control. Challenge: Ensuring completeness, correct formatting, and compliance with court orders.

Redaction – Sanitisation, blotting – The process of obscuring or r…

Example: Blacking out personal data in a medical record. Practical application: Protects privacy and complies with data‑protection laws. Challenge: Avoiding over‑redaction that could render a document unintelligible or raise privilege disputes.
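
As one narrow illustration of pattern-based redaction, the Python sketch below replaces email addresses in extracted text with a marker. The pattern, the marker text, and the function name are assumptions; it covers only one kind of personal data and works on plain text, whereas redacting a PDF or image also requires removing the underlying content, not just covering it.

```python
import re

# Simplified email pattern for illustration; it will not catch every valid address.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, marker: str = "[REDACTED]") -> str:
    """Replace each email address with a visible redaction marker."""
    return EMAIL.sub(marker, text)


record = "Patient contact: jane.doe@example.com, referred by the clinic."
redacted = redact_emails(record)
```

Keeping the rest of the sentence intact is deliberate: the surrounding text stays intelligible, which addresses the over-redaction concern above.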

Relevant Document – Responsive, material – A document that has a l…

Example: A contract amendment that directly affects the disputed payment terms. Practical application: Drives the core of the production set. Challenge: Defining relevance in complex, multi‑issue litigation.

Responsive Document – Relevant, required – A document that is both…

Example: An email chain discussing settlement negotiations that falls under a subpoena for communications. Practical application: Must be produced unless a valid privilege or objection applies. Challenge: Distinguishing responsive from merely relevant material that may be outside the request’s temporal or subject limits.

Search Term – Query, keyword – A word or phrase used to locate doc…

Example: Using “Force Majeure” as a search term to find clauses in contracts. Practical application: Forms the basis of both simple and complex queries. Challenge: Selecting terms that capture all pertinent documents without generating excessive noise.

Segregation – Separation, isolation – The practice of separating p…

Example: Creating a privileged folder for attorney‑client communications. Practical application: Prevents accidental disclosure of privileged content. Challenge: Ensuring that segregation is thorough and that privileged documents are not inadvertently included in the production.

Spoliation – Destruction, loss – The intentional or negligent dest…

Example: Deleting emails after a lawsuit is filed. Practical application: Courts may impose sanctions for spoliation. Challenge: Implementing preservation measures early enough to avoid accusations.

Subject Matter Expert (SME) – Domain specialist, consultant – An i…

Example: A tax specialist reviewing financial statements for a tax‑fraud case. Practical application: Provides context to aid accurate classification. Challenge: Coordinating SMEs with legal teams and ensuring they understand review protocols.

Taxonomy – Classification scheme, hierarchy – A structured framewo…

Example: A taxonomy that separates “Corporate Governance” from “Commercial Agreements”. Practical application: Standardises coding across large teams. Challenge: Designing a taxonomy that is both comprehensive and intuitive.

Technical Review – IT assessment, infrastructure – Examination of…

Example: Reviewing the configuration of a SharePoint site to locate relevant documents. Practical application: Uncovers hidden repositories and informs collection strategies. Challenge: Requires specialised technical expertise.

Template Production – Standard format, consistency – Using a prede…

Example: Providing PDFs with Bates numbers, metadata, and a Table of Contents. Practical application: Meets court rules and facilitates review by the recipient. Challenge: Adapting templates to diverse document types while preserving integrity.

Terminated Document – Closed file, finalised – A document that has…

Example: A contract marked as “Reviewed – No Issues”. Practical application: Signals completion and allows QA to focus on pending items. Challenge: Ensuring that termination does not occur prematurely, leaving gaps.

Time‑Based Filtering – Date range, temporal – Limiting a search to…

Example: Filtering emails sent between 1 Jan 2020 and 31 Dec 2020. Practical application: Narrows the set to the relevant period. Challenge: Accurately capturing all relevant dates, especially when timestamps differ across systems.
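
The date-range filter in the example above can be sketched in Python as follows. The sample emails are hypothetical, and the period is assumed to be defined in UTC; which time zone governs the period is a decision for the matter, not something the code can settle.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple

Email = Tuple[str, datetime]  # (identifier, timezone-aware sent time)


def in_period(emails: Iterable[Email], start: datetime, end: datetime) -> List[str]:
    """Return IDs of emails sent in [start, end), compared in UTC."""
    return [
        email_id
        for email_id, sent in emails
        if start <= sent.astimezone(timezone.utc) < end
    ]


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
end = datetime(2021, 1, 1, tzinfo=timezone.utc)  # exclusive upper bound
emails = [
    ("A", datetime(2020, 6, 15, 9, 0, tzinfo=timezone.utc)),
    # 23:30 on 31 Dec 2020 at UTC-5 is already 1 Jan 2021 in UTC.
    ("B", datetime(2020, 12, 31, 23, 30, tzinfo=timezone(timedelta(hours=-5)))),
    ("C", datetime(2019, 12, 31, 23, 59, tzinfo=timezone.utc)),
]
selected = in_period(emails, start, end)
```

Email "B" shows the problem named in the challenge above: whether it falls inside the period depends on which system's clock is treated as authoritative.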

Top‑Level Folder – Root directory, primary – The highest level dir…

Example: A folder named “Project X” containing all sub‑folders for that matter. Practical application: Provides a logical starting point for collection. Challenge: Ensuring that important sub‑folders are not missed due to unconventional naming.

Trusted Agent – Independent auditor, third‑party – A neutral party…

Example: A court‑appointed auditor reviewing the integrity of a data set before production. Practical application: Adds credibility and reduces disputes. Challenge: Coordinating with counsel and maintaining confidentiality.

Undertaking – Promise, guarantee – A formal statement of commitmen…

Example: A solicitor’s undertaking to keep client information confidential. Practical application: May be required before accessing privileged material. Challenge: Ensuring the undertaking is enforceable and clearly scoped.

Unresponsive Document – Irrelevant, excluded – A document that doe…

Example: A marketing flyer unrelated to the contractual dispute. Practical application: Can be excluded from production to conserve resources. Challenge: Differentiating unresponsive from marginally relevant material.

Version Control – Revision tracking, change log – Managing multipl…

Example: Tracking amendments to a lease agreement. Practical application: Prevents the accidental production of superseded drafts. Challenge: Handling versioning across disparate platforms.

Virtual Data Room (VDR) – Secure repository, online – An online pl…

Example: A VDR used to share confidential contracts with opposing counsel. Practical application: Enables secure, audited access. Challenge: Managing user permissions and ensuring no data leakage.

White‑Letter Production – Non‑responsive disclosure, courtesy – Pr…

Example: Sending a “white‑letter” of all corporate minutes even though they are not explicitly requested. Practical application: Can foster goodwill and reduce disputes. Challenge: Ensuring that such production does not waive privilege.

Wildcard Search – Partial term, truncation – Using symbols to repr…

Example: Searching “contract*” to capture “contract”, “contracts”, “contractual”. Practical application: Broadens search to capture variations. Challenge: May generate excessive results, requiring additional filtering.
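
A wildcard term such as "contract*" can be emulated in Python by translating it into a regular expression. The helper names below are hypothetical and only the `*` wildcard is handled; search platforms each define their own wildcard syntax.

```python
import re
from typing import List


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a term containing * into a case-insensitive whole-word regex."""
    escaped = re.escape(pattern).replace(r"\*", r"\w*")
    return re.compile(rf"\b{escaped}\b", re.IGNORECASE)


def find_matches(pattern: str, text: str) -> List[str]:
    """Return every word in `text` that matches the wildcard term."""
    return wildcard_to_regex(pattern).findall(text)


sample = "The Contract and its contractual terms bind all contractors; see subcontracts."
hits = find_matches("contract*", sample)
```

Here `hits` includes "contractors", a word the reviewer may not have intended to capture, which illustrates why wildcard results often need additional filtering.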

Workflow Automation – Process streamlining, macros – Using softwar…

Example: An automated rule that moves all documents flagged as “Privileged” into a dedicated folder. Practical application: Increases efficiency and reduces human error. Challenge: Configuring automation correctly to avoid misrouting.
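
The routing rule in the example above could be expressed as a small Python function. The `Document` class, the folder names, and the tag strings are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Document:
    doc_id: str
    tags: Set[str] = field(default_factory=set)


def route(documents: List[Document]) -> Dict[str, List[str]]:
    """Place each document in exactly one folder; privilege takes priority."""
    folders: Dict[str, List[str]] = {"Privileged": [], "Review": []}
    for doc in documents:
        target = "Privileged" if "Privileged" in doc.tags else "Review"
        folders[target].append(doc.doc_id)
    return folders


queue = [
    Document("D1", {"Privileged", "Responsive"}),
    Document("D2", {"Responsive"}),
    Document("D3"),
]
folders = route(queue)
```

The rule matches the tag string exactly, so a document tagged "privileged" in lower case would be routed to "Review". Small configuration details like this are where misrouting tends to originate.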

XML (eXtensible Markup Language) – Structured data, format – A mar…

Example: Producing documents in XML to meet court‑ordered formatting requirements. Practical application: Facilitates data exchange between systems. Challenge: Converting legacy files to XML without loss of fidelity.

Yielded Document – Produced, disclosed – A document that has been…

Example: An email yielded after a subpoena. Practical application: Becomes part of the official record and may be subject to further scrutiny. Challenge: Ensuring that yielded documents are complete and unaltered.

Zero‑Day Search – Immediate query, on‑the‑fly – A search performed…

Example: A rapid search for “confidential” across a newly uploaded data set. Practical application: Provides a quick initial assessment. Challenge: May miss documents not yet processed or indexed.

Zoom Review – Focused inspection, micro‑analysis – A detailed exam…

Example: Zooming in on a handwritten annotation in a scanned contract. Practical application: Uncovers nuances that broad searches overlook. Challenge: Time‑consuming and may require specialised tools.

Document Identification and Classification – Discovery foundation, tax…

Example: Employing a combination of keyword searches, metadata filters, and SME input to build a comprehensive responsive set. Practical application: Forms the backbone of any e‑discovery project, ensuring that the correct documents are produced and that privileged material is protected. Challenge: Balancing thoroughness with cost, managing large data volumes, and maintaining consistency across multiple reviewers.

Access Control List (ACL) – Permissions, security – A list that de…

Example: An ACL granting the litigation team read‑only access to a corporate repository. Practical application: Protects sensitive documents while allowing necessary review. Challenge: Correctly configuring ACLs across heterogeneous environments.
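
A read-only grant like the one in the example above can be modelled with a nested mapping and a deny-by-default check. The resource name, group names, and actions below are hypothetical; real systems (file shares, SharePoint, cloud storage) each have their own ACL formats.

```python
from typing import Dict, Set

Acl = Dict[str, Dict[str, Set[str]]]  # resource -> group -> permitted actions

ACL: Acl = {
    "corporate-repository": {
        "litigation-team": {"read"},
        "records-admin": {"read", "write"},
    }
}


def is_allowed(acl: Acl, resource: str, group: str, action: str) -> bool:
    """Deny by default: allow only actions explicitly listed for the group."""
    return action in acl.get(resource, {}).get(group, set())


can_read = is_allowed(ACL, "corporate-repository", "litigation-team", "read")
can_write = is_allowed(ACL, "corporate-repository", "litigation-team", "write")
```

Denying anything not explicitly granted means an unknown group or resource gets no access, which is the safer failure mode when configurations differ across environments.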

Amicus Curiae – Friend of the court, third‑party – A non‑party who…

Example: An industry regulator submitting an amicus brief on data‑protection standards. Practical application: May influence the court’s approach to document classification, especially on privilege or confidentiality. Challenge: Ensuring that any amicus‑provided documents are properly identified and classified.

Annotation Layer – Review overlay, markup – A digital layer that s…

Example: An annotation layer showing “Privileged – Attorney‑Client” tags on specific paragraphs. Practical application: Allows multiple reviewers to comment without altering the source file. Challenge: Synchronising the layer with the original when documents are converted or reformatted.

Appeal – Higher court, review – A request for a higher court to ex…

Example: Appealing a judge’s ruling on the admissibility of certain documents. Practical application: May affect the classification decisions made during discovery. Challenge: Preparing a record that accurately reflects the original document handling.

Artificial Intelligence (AI) – Machine learning, automation – Comp…

Example: Using AI to predict document relevance in a large‑scale review. Practical application: Reduces manual effort and speeds up classification. Challenge: Ensuring transparency, explainability, and compliance with legal standards.

Audit Trail – Log, provenance – A chronological record of all acti…

Example: An audit trail showing who accessed, coded, and produced each file. Practical application: Provides evidence of compliance and can be crucial in spoliation disputes. Challenge: Maintaining a comprehensive trail without overwhelming storage resources.

Baseline Set – Initial collection, reference – The first group of…

Example: A baseline set of 1,000 manually reviewed emails. Practical application: Establishes a foundation for technology‑assisted review. Challenge: Ensuring the baseline is representative of the broader corpus.

Beneficial Owner – Ultimate controller, shareholder – The natural…

Example: Identifying the beneficial owner of a shell company in a fraud investigation. Practical application: Helps pinpoint relevant documents tied to the true decision‑maker. Challenge: Uncovering hidden ownership structures across jurisdictions.

Binary Data – Non‑textual, raw – Data stored in a format that is n…

Example: A proprietary database file containing transaction records. Practical application: May require conversion to a readable format before classification. Challenge: Preserving fidelity and metadata during conversion.

Blind Review – Anonymous, unbiased – A review process where the re…

Example: A blind review of emails to reduce bias in relevance determination. Practical application: Promotes objectivity in coding decisions. Challenge: Logistical complexity in anonymising large data sets.

Cache – Temporary storage, performance – A location where frequent…

Example: A browser cache holding recently viewed PDFs. Practical application: Speeds up review but may retain sensitive data unintentionally. Challenge: Clearing caches to avoid inadvertent disclosure.

Chain of Evidence – Custody, continuity – The documented sequence…

Example: A chain of evidence log for a seized hard drive containing email archives. Practical application: Establishes credibility and admissibility. Challenge: Ensuring no gaps, especially with multiple custodians and cross‑border transfers.

Classification Schema – Structure, hierarchy – The organised set o…

Example: A schema with levels “Confidential → Financial → Audit”. Practical application: Standardises coding across reviewers. Challenge: Keeping the schema flexible enough to accommodate emerging issues.

Clustering – Group analysis, unsupervised learning – An AI techniq…

Example: Clustering a set of contracts to identify common clauses. Practical application: Aids reviewers in spotting patterns and potential issues. Challenge: Interpreting clusters and aligning them with legal relevance.

Command‑Line Interface (CLI) – Text‑based control, scripting – A w…

Example: Using a CLI to batch‑process PDF conversions. Practical application: Enables automation of repetitive tasks. Challenge: Requires technical expertise and careful scripting to avoid errors.

Compliance Review – Regulatory check, audit – Examination of docum…

Example: Reviewing financial statements for compliance with the UK Bribery Act. Practical application: Identifies gaps before production. Challenge: Staying abreast of evolving regulations across jurisdictions.

Confidentiality Clause – Non‑disclosure provision, restriction – A…

Example: A clause prohibiting the disclosure of trade secrets during litigation. Practical application: Informs the classification of documents as “Confidential”. Challenge: Interpreting the scope when the clause is broad or ambiguous.

Consolidated Production – Combined set, unified delivery – A singl…

Example: A consolidated production of all emails from five senior managers. Practical application: Simplifies the exchange process. Challenge: Ensuring that no duplicate or omitted documents exist across sources.

Contextual Metadata – Environmental data, usage – Information abou…

Example: GPS coordinates embedded in a photo’s EXIF data. Practical application: Can be used to verify authenticity or locate the source. Challenge: Privacy concerns and the potential for metadata stripping.

Contractual Obligation – Binding duty, agreement – A duty imposed…

Example: A service‑level agreement requiring the maintenance of logs. Practical application: Guides identification of relevant operational records. Challenge: Interpreting vague obligations and linking them to specific documents.

Cross‑Reference – Linkage, citation – A reference within a documen…

Example: A contract clause referencing Schedule 2. Practical application: Aids navigation and ensures completeness of the production. Challenge: Tracking cross‑references across multiple files and versions.

Data Breach – Security incident, unauthorised access – An event wh…

Example: A cyber‑attack exposing client emails. Practical application: May trigger preservation duties and affect privilege claims. Challenge: Quickly identifying affected documents and mitigating legal exposure.

Data Extraction – Harvesting, pull – The process of retrieving spe…

Example: Extracting all invoice numbers from a set of PDFs. Practical application: Enables targeted production of specific data points. Challenge: Handling diverse file formats and ensuring accuracy.
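
Once text has been pulled from the PDFs (by a PDF library or OCR, which is outside this sketch), extracting invoice numbers is often a pattern-matching task. The `INV-` format and the function name below are assumptions; the real pattern depends on how the client numbers its invoices.

```python
import re
from typing import Dict, List

INVOICE_PATTERN = re.compile(r"\bINV-\d{4,}\b")  # hypothetical numbering format


def extract_invoice_numbers(pages: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each document ID to the distinct invoice numbers found in its text."""
    results: Dict[str, List[str]] = {}
    for doc_id, text in pages.items():
        found = INVOICE_PATTERN.findall(text)
        results[doc_id] = sorted(set(found))
    return results


extracted_text = {
    "a.pdf": "Ref INV-10023 and INV-10024; INV-10023 repeated on page 2.",
    "b.pdf": "Cover letter with no invoice references.",
}
invoice_numbers = extract_invoice_numbers(extracted_text)
```

Documents that return an empty list are worth sampling by hand, since a failed match may reflect a different numbering format or poor text extraction and not a genuine absence.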

Data Minimisation – Limited collection, necessity – The principle…

Example: Limiting the collection of employee emails to those containing the keyword “contract”. Practical application: Reduces exposure and compliance risk. Challenge: Balancing thoroughness with privacy obligations.

Data Retention Policy – Archive schedule, governance – A set of ru…

Example: A policy mandating the deletion of marketing emails after three years. Practical application: Influences what documents are available for identification. Challenge: Ensuring the policy aligns with legal hold requirements.

Data Set – Collection, corpus – A group of electronic files gather…

Example: A data set of 2 million emails related to a procurement dispute. Practical application: The primary object of identification and classification. Challenge: Managing size, format diversity, and storage constraints.

Database Export – Data dump, extraction – The process of pulling d…

Example: Exporting a SQL table of client transactions to CSV for review. Practical application: Provides a structured source of documents. Challenge: Preserving relational integrity and metadata.
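
The export in the example above can be sketched with Python's standard `sqlite3` and `csv` modules. SQLite stands in for whatever database the client actually runs, and the `transactions` table and its rows are invented for the illustration.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn: sqlite3.Connection, table: str) -> str:
    """Return the table as CSV text, header row first."""
    # Table names cannot be bound as parameters, so accept only plain identifiers.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(column[0] for column in cursor.description)
    writer.writerows(cursor)
    return buffer.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, client TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "Acme Ltd", 120.5), (2, "Smith, J", 80.0)],
)
csv_text = export_table_to_csv(conn, "transactions")
```

A flat CSV carries the rows but not the keys, constraints, or links to other tables, so those relationships need to be documented separately if they matter to the review.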

Defendant – Respondent, accused – The party against whom a claim i…

Example: The supplier sued for breach of contract. Practical application: Responsible for producing documents in response to the plaintiff’s request. Challenge: Negotiating scope and dealing with privilege objections.

Deposition Transcript – Sworn testimony, record – A written record…

Example: A transcript of a witness’s deposition regarding a disputed contract clause. Practical application: May be used to identify additional responsive documents. Challenge: Ensuring accuracy and dealing with objections to the transcript’s admissibility.

Descriptive Metadata – Labelling, attributes – Information that des…

Example: A PDF’s metadata showing “Title: Purchase Agreement”. Practical application: Assists in search and classification. Challenge: Metadata may be incomplete or intentionally altered.

Digital Forensics – Evidence analysis, preservation – The applicat…

Example: Analysing a corrupted hard drive to retrieve deleted emails. Practical application: Validates authenticity and uncovers hidden data. Challenge: Requires specialised tools and may be time‑intensive.

Document Custody – Control, handling – The responsibility for main…

Example: Assigning a senior associate as custodian of the production set. Practical application: Ensures chain‑of‑custody compliance. Challenge: Managing access across multiple jurisdictions.
