Uploading Regulatory Documents

How to upload regulatory documents with automatic detection, smart chunking, and citation tracking

ChatFlow includes advanced document processing for regulatory and compliance documents. The system automatically detects document types and applies specialized processing to preserve structure and enable precise citations.

How It Works

When you upload a document, ChatFlow automatically:

  1. Detects the document type - Regulatory, standard, or scanned
  2. Applies smart chunking - Preserves sections, chapters, and hierarchy
  3. Generates citation metadata - Section numbers, titles, and page references
  4. Enables source tracking - Shows exact sources in chatbot responses

Document Types

Regulatory Documents

Documents with a formal structure, such as:

  • Government regulations and acts
  • Compliance manuals
  • Policy documents
  • Legal frameworks
  • Industry standards

Detected patterns:

  • PART 1, PART 2, etc.
  • CHAPTER 1, CHAPTER 2, etc.
  • Section 1.1, Section 4.2.1, etc.
  • ARTICLE I, ARTICLE II, etc.
  • Regulation, Rule, Schedule, Appendix sections
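The patterns listed above can be approximated with regular expressions. The following is a minimal sketch, not ChatFlow's actual detector: `SECTION_PATTERNS` and `count_section_patterns` are hypothetical names, and the exact expressions are assumptions modeled on the list above.

```python
import re

# Hypothetical patterns modeled on the documented list; the real rules may differ.
SECTION_PATTERNS = [
    re.compile(r"^PART\s+\d+\b"),
    re.compile(r"^CHAPTER\s+\d+\b"),
    re.compile(r"^Section\s+\d+(?:\.\d+)*\b"),
    re.compile(r"^ARTICLE\s+[IVXLCDM]+\b"),
    re.compile(r"^(?:Regulation|Rule|Schedule|Appendix)\s+\w+"),
]


def count_section_patterns(text: str) -> int:
    """Count lines that begin with one of the heading patterns."""
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if any(p.match(stripped) for p in SECTION_PATTERNS):
            count += 1
    return count


sample = """PART 4 - PILOT LICENSING
Section 4.1 General Requirements
All pilots must hold a valid license.
ARTICLE II
Appendix A
"""
print(count_section_patterns(sample))  # 4
```

Because each pattern is anchored to the start of a line, a heading embedded in the middle of a paragraph is not counted in this sketch.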

Standard Documents

General documents without regulatory structure:

  • Product manuals
  • FAQ documents
  • Training materials
  • General guides

Scanned Documents

Documents with limited extractable text:

  • Image-based PDFs
  • Scanned paper documents

Note: Scanned documents are flagged with a warning because little or no text can be extracted from them. OCR support is planned for a future update.
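One common way to recognize a document with limited extractable text is to measure how much text each page yields. The sketch below illustrates that idea only. `looks_scanned` is a hypothetical helper, and the 50-character threshold is an assumption, not a documented ChatFlow value.

```python
def looks_scanned(page_texts, min_chars_per_page=50):
    """Return True when the average extracted text per page is very low.

    page_texts: list of strings, one per page, as produced by any PDF text extractor.
    min_chars_per_page: assumed threshold, chosen here only for illustration.
    """
    if not page_texts:
        return True  # nothing could be extracted at all
    total_chars = sum(len(text.strip()) for text in page_texts)
    return total_chars / len(page_texts) < min_chars_per_page


print(looks_scanned(["", "  ", ""]))          # True
print(looks_scanned(["x" * 200, "y" * 300]))  # False
```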

Smart Chunking for Regulatory Documents

Standard document chunking splits text at arbitrary points, which can cut a section in half and separate it from its heading. For regulatory documents, ChatFlow uses section-aware chunking that:

  • Keeps logical sections together
  • Preserves hierarchy (Part > Chapter > Section > Subsection)
  • Adds section context to each chunk
  • Enables precise citations

Example

A regulatory document like:

PART 4 - PILOT LICENSING

Section 4.1 General Requirements
All pilots must hold a valid license issued by...

Section 4.2 Medical Certificates
4.2.1 Class 1 Medical Certificate
Required for commercial pilots...

is chunked with its section context attached:

[Section 4.2.1: Class 1 Medical Certificate]

Required for commercial pilots...

Hierarchy: PART 4 > Section 4.2 > 4.2.1
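The example above can be reproduced with a small heading-stack parser. This is a sketch under stated assumptions, not ChatFlow's implementation: `HEADING_RULES`, `Chunk`, and `chunk_by_section` are hypothetical names, and the level assigned to each heading type is an assumption based on the Part > Chapter > Section > Subsection hierarchy.

```python
import re
from dataclasses import dataclass, field

# (level, pattern) pairs; a lower level sits higher in the hierarchy. Assumed rules.
HEADING_RULES = [
    (0, re.compile(r"^(PART\s+\d+)\b\s*-?\s*(.*)$")),
    (1, re.compile(r"^(CHAPTER\s+\d+)\b\s*-?\s*(.*)$")),
    (2, re.compile(r"^(Section\s+\d+(?:\.\d+)*)\s+(.*)$")),
    (3, re.compile(r"^(\d+(?:\.\d+)+)\s+(.*)$")),
]


@dataclass
class Chunk:
    label: str
    title: str
    hierarchy_path: str
    lines: list = field(default_factory=list)


def chunk_by_section(text):
    """Start a new chunk at every heading and record its path from the top level."""
    chunks, stack, current = [], [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        heading = None
        for level, pattern in HEADING_RULES:
            match = pattern.match(line)
            if match:
                heading = (level, match.group(1), match.group(2))
                break
        if heading is None:
            if current is not None:
                current.lines.append(line)  # body text stays with its section
            continue
        level, label, title = heading
        # Drop headings at the same or a deeper level before adding the new one.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, label))
        current = Chunk(label, title, " > ".join(lbl for _, lbl in stack))
        chunks.append(current)
    return chunks


SAMPLE = """PART 4 - PILOT LICENSING

Section 4.1 General Requirements
All pilots must hold a valid license issued by...

Section 4.2 Medical Certificates
4.2.1 Class 1 Medical Certificate
Required for commercial pilots...
"""

last = chunk_by_section(SAMPLE)[-1]
print(last.hierarchy_path)  # PART 4 > Section 4.2 > 4.2.1
```

A known limitation of this sketch is that a body line starting with a dotted number (for example a measurement) would be mistaken for a subsection heading.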

Citation Metadata

Each chunk includes metadata for source tracking:

| Field | Description | Example |
|---|---|---|
| section_number | Section identifier | 4.2.1 |
| section_title | Section heading | Class 1 Medical Certificate |
| page_start | Estimated start page | 12 |
| page_end | Estimated end page | 13 |
| hierarchy_path | Full path | PART 4 > Section 4.2 > 4.2.1 |
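As an illustration, the fields above could be modeled as a small record. The field names come from the table, but the `CitationMetadata` class, the Python types, and the page-order check are assumptions made for this sketch.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CitationMetadata:
    section_number: str
    section_title: str
    page_start: int  # estimated, per the table above
    page_end: int    # estimated, per the table above
    hierarchy_path: str

    def __post_init__(self):
        # Assumed sanity check: a section cannot end before it starts.
        if self.page_end < self.page_start:
            raise ValueError("page_end must not precede page_start")


meta = CitationMetadata(
    section_number="4.2.1",
    section_title="Class 1 Medical Certificate",
    page_start=12,
    page_end=13,
    hierarchy_path="PART 4 > Section 4.2 > 4.2.1",
)
print(asdict(meta))
```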

How Citations Appear

When your chatbot answers questions using regulatory documents, users see:

  • Document title - Which document the answer came from
  • Section reference - Exact section number and title
  • Page numbers - Where to find the original content
  • Confidence score - How relevant the source is to the question

Example citation:

[Source: Aviation Regulations Manual, Section 4.2.1, Pages 12-13]
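A citation string in this format can be assembled from the document title, section number, and page range. The helper below is a hypothetical sketch. The single-page wording ("Page 12") is an assumption, since the source only shows a multi-page example.

```python
def format_citation(document_title, section_number, page_start, page_end):
    """Build a citation string in the format shown in the example above."""
    if page_start == page_end:
        pages = f"Page {page_start}"  # assumed wording for a one-page section
    else:
        pages = f"Pages {page_start}-{page_end}"
    return f"[Source: {document_title}, Section {section_number}, {pages}]"


print(format_citation("Aviation Regulations Manual", "4.2.1", 12, 13))
# [Source: Aviation Regulations Manual, Section 4.2.1, Pages 12-13]
```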

Uploading Regulatory Documents

Step 1: Navigate to Documents

  1. Go to Chatbots in the sidebar
  2. Select your chatbot
  3. Click the Documents tab

Step 2: Upload Your Document

  1. Click Upload Document or drag and drop
  2. Select your PDF or DOCX file
  3. Wait for processing to complete

Step 3: Review Detection Results

After processing, you'll see badges indicating:

| Badge | Meaning |
|---|---|
| Regulatory | Detected as regulatory document with smart chunking |
| Citations | Citation metadata has been generated |
| Standard | Processed with standard chunking |
| Scanned | Limited text extraction (OCR coming soon) |

Step 4: Verify in Playground

  1. Go to the Playground tab
  2. Ask a question about the document
  3. Check that citations appear with section references

Best Practices

Document Preparation

  • Use text-based PDFs - Not scanned images
  • Keep formatting clean - Clear headings and sections
  • Include table of contents - Helps detection accuracy

Naming Conventions

  • Use descriptive file names
  • Include version or date if applicable
  • Example: Aviation-Regulations-2024.pdf

Content Organization

  • Upload one regulation per document
  • Split very large documents by major parts
  • Keep related documents grouped by chatbot

Detection Confidence

The system reports confidence levels:

| Level | Meaning |
|---|---|
| High | 10+ section patterns found |
| Medium | 5-9 section patterns found |
| Low | 3-4 section patterns found |

Higher confidence means more accurate section detection and citations.
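The thresholds in the table map directly to a small lookup. In this sketch `detection_confidence` is a hypothetical function name. Returning `None` below three patterns reflects the three-pattern minimum for regulatory detection mentioned under Troubleshooting.

```python
from typing import Optional


def detection_confidence(pattern_count: int) -> Optional[str]:
    """Map a count of detected section patterns to a confidence level."""
    if pattern_count >= 10:
        return "High"
    if pattern_count >= 5:
        return "Medium"
    if pattern_count >= 3:
        return "Low"
    return None  # fewer than 3 patterns: not detected as regulatory


for count in (2, 3, 7, 12):
    print(count, detection_confidence(count))
```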

Supported Formats

| Format | Extension | Smart Chunking |
|---|---|---|
| PDF | .pdf | Yes |
| Word | .docx | Yes |

Troubleshooting

Document Shows as "Standard" Instead of "Regulatory"

  • Check that section headings are on their own lines
  • Verify headings use patterns like "Section 1.1" or "CHAPTER 1"
  • Documents need at least 3 section patterns for detection

Citations Not Appearing

  • Verify document processing completed successfully
  • Check that the question relates to document content
  • Try asking about specific sections by number

Scanned Document Warning

  • The document contains mostly images rather than extractable text
  • Re-export the document from its original source as a text-based PDF
  • OCR support is planned for a future update

Next Steps