capability

Structure from chaos.

Most documents arriving in your enterprise are unstructured — scanned forms, PDF invoices, handwritten field reports, multi-language regulatory filings, photographed permit sheets. TeamSync's Metadata Extraction capability turns them into structured records: typed metadata, extracted text, classified document type, all stored against the document in the Intelligent Repository.

Talk to an IDP solutions engineer · Compare to ABBYY · Compare to Hyperscience

Diagram: an unstructured document (scan, PDF, photo, handwritten) entering an extraction pipeline (OCR / ICR / classification / field extraction) and emerging as structured metadata bound to the document.

What's in the capability.

Component | Purpose
OCR (Optical Character Recognition) | 100+ languages; printed text from scans, PDFs, photos
ICR (Intelligent Character Recognition) | Handwriting via ML models; checkbox + signature detection
Document classification | Auto-tag document type (invoice / contract / claim / form / report) per the Intelligent Repository taxonomy
Field extraction | Type-specific field extraction (amount + due date for invoice; counterparty + effective date for contract; patient ID + diagnosis for clinical note)
Confidence scoring | Per-field confidence; below-threshold fields flagged for human review
Human-in-the-loop verification | Review interface for low-confidence extractions; corrections feed back into model training (with consent)
Audit ledger | Every extraction event anchored: source document, model version, output, human corrections
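
To make the confidence-scoring and human-review components concrete, here is a minimal sketch of threshold-based routing. It is an illustration only, not TeamSync's implementation: the `ExtractedField` type, the `route_fields` function, and the 0.85 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical shape of one extracted field (not a TeamSync type)."""
    name: str
    value: str
    confidence: float  # model confidence between 0.0 and 1.0


def route_fields(fields, threshold=0.85):
    """Split fields into (accepted, needs_review) by per-field confidence.

    The 0.85 default is an assumed value for illustration only.
    """
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Example: an invoice where the due date was read with low confidence.
invoice_fields = [
    ExtractedField("amount", "1,250.00", 0.97),
    ExtractedField("due_date", "2024-07-01", 0.62),
]
accepted, needs_review = route_fields(invoice_fields)
```

In this sketch the amount is accepted automatically and the due date is queued for a human reviewer; a real deployment would likely tune the threshold per field and per document type.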

How TeamSync compares.

Capability TeamSync ABBYY Vantage Hyperscience Rossum Tungsten Automation
Native to the document platform (no integration) Standalone Standalone Standalone Standalone
Multilingual OCR (100+ languages) ✅ Strong Limited Strong (EU focus)
ICR (handwriting) ✅ Strong Limited
Per-field confidence with human-in-the-loop
Audit ledger Merkle anchor per extraction Standard log Standard log Standard log Standard log
Per-cluster pricing (no per-page metering) Per-page Per-page Per-document Per-page
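
For readers unfamiliar with the term, a Merkle anchor commits a batch of records to a single hash, so any later change to a record changes the anchor. The sketch below shows the general technique using only Python's standard library. The event fields, the function names, and the leaf/node prefix bytes are assumptions for illustration; the page does not describe TeamSync's actual ledger format.

```python
import hashlib
import json


def leaf_hash(event: dict) -> bytes:
    """Hash one extraction event (serialized with sorted keys for stability)."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(b"\x00" + payload).digest()


def merkle_root(leaves: list) -> bytes:
    """Fold leaf hashes pairwise until a single root hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical extraction events; field names are illustrative only.
events = [
    {"doc": "inv-001", "model": "v3", "field": "amount", "value": "1,250.00"},
    {"doc": "inv-001", "model": "v3", "field": "due_date", "value": "2024-07-01"},
    {"doc": "inv-001", "model": "v3", "field": "due_date", "value": "2024-07-10",
     "corrected_by": "reviewer"},
]
root = merkle_root([leaf_hash(e) for e in events])
```

Recomputing the root from the stored events and comparing it with the anchored value is what lets an auditor detect tampering, which a plain append-only log does not offer on its own.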

Read the IDP alternative comparisons →


Frequently asked questions.

What about edge-case OCR accuracy — handwriting, low-quality scans?

For the 90% of enterprise documents that are printed text or standard forms, TeamSync meets the production accuracy bar. For edge cases (handwriting on legacy forms, very low-quality scans, handwriting in less common languages), TeamSync coexists with specialist IDP tools such as ABBYY and Hyperscience, which can be invoked as workflow nodes.

Can I train a custom field extractor?

Yes. Customers train custom extractors for their document types via the human-in-the-loop verification interface. Training data is tenant-isolated.

Does it handle multi-page documents?

Yes, including page-segment classification: a single PDF containing multiple document types is split, and each section is classified separately.
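
One simple way to picture page-segment classification is to label each page and then group consecutive pages that share a label. The sketch below assumes per-page labels are already available from a classifier. The `segment_pages` function and the sample labels are hypothetical, and this is not a description of how TeamSync splits documents.

```python
from itertools import groupby


def segment_pages(page_labels):
    """Group consecutive pages with the same label into segments.

    Returns a list of (label, first_page, last_page) with 1-based page numbers.
    """
    segments = []
    page = 1
    for label, run in groupby(page_labels):
        count = len(list(run))
        segments.append((label, page, page + count - 1))
        page += count
    return segments


# Hypothetical per-page predictions for a six-page PDF.
labels = ["invoice", "invoice", "contract", "contract", "contract", "form"]
segments = segment_pages(labels)
```

A grouping like this merges two adjacent documents of the same type into one segment, so a production splitter would also need a boundary signal such as a first-page detector.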


Talk to us

Bring the question on your desk this week.

A 30-minute conversation with a solutions engineer who already speaks your industry. No pitch deck.