Uploading Regulatory Documents
ChatFlow includes advanced document processing for regulatory and compliance documents. The system automatically detects document types and applies specialized processing to preserve structure and enable precise citations.
How It Works
When you upload a document, ChatFlow automatically:
- Detects the document type - Regulatory, standard, or scanned
- Applies smart chunking - Preserves sections, chapters, and hierarchy
- Generates citation metadata - Section numbers, titles, and page references
- Enables source tracking - Shows exact sources in chatbot responses
Document Types
Regulatory Documents
Documents with a formal structure, such as:
- Government regulations and acts
- Compliance manuals
- Policy documents
- Legal frameworks
- Industry standards
Detected patterns:
- PART 1, PART 2, etc.
- CHAPTER 1, CHAPTER 2, etc.
- Section 1.1, Section 4.2.1, etc.
- ARTICLE I, ARTICLE II, etc.
- Regulation, Rule, Schedule, Appendix sections
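As a rough illustration of how headings like these could be recognized, the sketch below counts lines that begin with one of the listed patterns. The regular expressions and the function name `count_section_headings` are assumptions made for this example; ChatFlow's actual detection rules are not documented here and may differ.

```python
import re

# Hypothetical patterns modeled on the headings listed above.
# These are illustrative only, not ChatFlow's real rules.
SECTION_PATTERNS = [
    re.compile(r"^PART\s+\d+\b"),
    re.compile(r"^CHAPTER\s+\d+\b"),
    re.compile(r"^Section\s+\d+(?:\.\d+)*\b"),
    re.compile(r"^ARTICLE\s+[IVXLCDM]+\b"),
    re.compile(r"^(?:Regulation|Rule|Schedule|Appendix)\b"),
]


def count_section_headings(text: str) -> int:
    """Count lines that start with one of the heading patterns."""
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if any(pattern.match(stripped) for pattern in SECTION_PATTERNS):
            count += 1
    return count


sample = "\n".join([
    "PART 4 - PILOT LICENSING",
    "Section 4.1 General Requirements",
    "All pilots must hold a valid license.",
    "ARTICLE II Scope",
    "Appendix A",
])
heading_count = count_section_headings(sample)
```

Because the patterns are anchored to the start of a line, a heading that shares a line with body text would likely be missed in a scheme like this one.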
Standard Documents
General documents without regulatory structure:
- Product manuals
- FAQ documents
- Training materials
- General guides
Scanned Documents
Documents with limited extractable text:
- Image-based PDFs
- Scanned paper documents
Note: Scanned documents show a warning. OCR support is planned for a future update.
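One common way to spot a document with limited extractable text is to look at how many characters come out of each page. The sketch below assumes the text has already been extracted per page; the function name `looks_scanned` and the threshold of 100 characters per page are hypothetical, not values taken from ChatFlow.

```python
def looks_scanned(page_texts: list[str], min_chars_per_page: int = 100) -> bool:
    """Return True when the average extracted text per page is very short.

    The threshold is an illustrative assumption, not a ChatFlow setting.
    """
    if not page_texts:
        # No pages yielded any text at all.
        return True
    total_chars = sum(len(text.strip()) for text in page_texts)
    return total_chars / len(page_texts) < min_chars_per_page


image_only = looks_scanned(["", "  ", "12"])
text_based = looks_scanned(["x" * 500, "y" * 300])
```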
Smart Chunking for Regulatory Documents
Standard document chunking splits text at arbitrary points. For regulatory documents, ChatFlow uses section-aware chunking that:
- Keeps logical sections together
- Preserves hierarchy (Part > Chapter > Section > Subsection)
- Adds section context to each chunk
- Enables precise citations
Example
A regulatory document like:
PART 4 - PILOT LICENSING
Section 4.1 General Requirements
All pilots must hold a valid license issued by...
Section 4.2 Medical Certificates
4.2.1 Class 1 Medical Certificate
Required for commercial pilots...
is chunked with full context:
[Section 4.2.1: Class 1 Medical Certificate]
Required for commercial pilots...
Hierarchy: PART 4 > Section 4.2 > 4.2.1
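The idea behind section-aware chunking can be sketched in a few lines: start a new chunk at every heading and keep a stack of parent headings to build the hierarchy path. The code below is a minimal sketch under stated assumptions, not ChatFlow's implementation. The `HEADING` pattern, the `SectionChunk` class, and the depth rule (PART is the top level; numbered headings nest by their number of dots) are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical heading pattern covering only the forms in the example above.
HEADING = re.compile(
    r"^(?P<label>PART\s+\d+|Section\s+\d+(?:\.\d+)*|\d+(?:\.\d+)+)\b"
    r"[\s\-:]*(?P<title>.*)$"
)


@dataclass
class SectionChunk:
    section_number: str
    section_title: str
    hierarchy_path: str
    text: str = ""


def chunk_by_section(document: str) -> list[SectionChunk]:
    """Start a new chunk at each heading and record its parent headings."""
    chunks: list[SectionChunk] = []
    stack: list[tuple[int, str]] = []  # (depth, heading label) of open parents
    current = None
    for raw_line in document.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            label = re.sub(r"\s+", " ", match.group("label"))
            number = label.split()[-1]
            # Assumed depth rule: PART is level 0, "4.1" is 1, "4.2.1" is 2.
            depth = 0 if label.startswith("PART") else number.count(".")
            # Close any headings at the same or a deeper level.
            while stack and stack[-1][0] >= depth:
                stack.pop()
            stack.append((depth, label))
            current = SectionChunk(
                section_number=number,
                section_title=match.group("title").strip(),
                hierarchy_path=" > ".join(name for _, name in stack),
            )
            chunks.append(current)
        elif current is not None:
            # Body text belongs to the most recent heading.
            current.text = f"{current.text} {line}".strip()
    return chunks


example = """PART 4 - PILOT LICENSING
Section 4.1 General Requirements
All pilots must hold a valid license issued by...
Section 4.2 Medical Certificates
4.2.1 Class 1 Medical Certificate
Required for commercial pilots...
"""
chunks = chunk_by_section(example)
last = chunks[-1]
```

In this sketch a body line that happens to start with a dotted number would be mistaken for a heading, which is one reason headings on their own clearly formatted lines matter.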
Each chunk includes metadata for source tracking:
| Field | Description | Example |
|---|---|---|
| section_number | Section identifier | 4.2.1 |
| section_title | Section heading | Class 1 Medical Certificate |
| page_start | Estimated start page | 12 |
| page_end | Estimated end page | 13 |
| hierarchy_path | Full path | PART 4 > Section 4.2 > 4.2.1 |
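For readers who consume this metadata in their own tooling, the fields in the table map naturally onto a small typed record. The class name `ChunkMetadata` and the Python types are assumptions for illustration; only the field names and example values come from the table above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChunkMetadata:
    """Illustrative record mirroring the metadata fields in the table."""
    section_number: str
    section_title: str
    page_start: int   # estimated, per the table
    page_end: int     # estimated, per the table
    hierarchy_path: str


example = ChunkMetadata(
    section_number="4.2.1",
    section_title="Class 1 Medical Certificate",
    page_start=12,
    page_end=13,
    hierarchy_path="PART 4 > Section 4.2 > 4.2.1",
)
record = asdict(example)  # plain dict, convenient for JSON serialization
```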
How Citations Appear
When your chatbot answers questions using regulatory documents, users see:
- Document title - Which document the answer came from
- Section reference - Exact section number and title
- Page numbers - Where to find the original content
- Confidence score - How relevant the source is
Example citation:
[Source: Aviation Regulations Manual, Section 4.2.1, Pages 12-13]
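A citation string in this shape can be assembled directly from the chunk metadata. The helper below is a hypothetical sketch: `format_citation` is not a ChatFlow function, and the single-page form ("Page 12") is an assumption, since the source only shows a page range.

```python
def format_citation(
    document_title: str,
    section_number: str,
    page_start: int,
    page_end: int,
) -> str:
    """Build a citation string in the format shown in the example above."""
    if page_end < page_start:
        raise ValueError("page_end must not be smaller than page_start")
    if page_start == page_end:
        pages = f"Page {page_start}"  # assumed form for a single page
    else:
        pages = f"Pages {page_start}-{page_end}"
    return f"[Source: {document_title}, Section {section_number}, {pages}]"


citation = format_citation("Aviation Regulations Manual", "4.2.1", 12, 13)
```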
Uploading Regulatory Documents
Step 1: Navigate to Documents
- Go to Chatbots in the sidebar
- Select your chatbot
- Click the Documents tab
Step 2: Upload Your Document
- Click Upload Document or drag and drop
- Select your PDF or DOCX file
- Wait for processing to complete
Step 3: Review Detection Results
After processing, you'll see badges indicating:
| Badge | Meaning |
|---|---|
| Regulatory | Detected as a regulatory document; smart chunking applied |
| Citations | Citation metadata has been generated |
| Standard | Processed with standard chunking |
| Scanned | Limited text extraction (OCR coming soon) |
Step 4: Verify in Playground
- Go to the Playground tab
- Ask a question about the document
- Check that citations appear with section references
Best Practices
Document Preparation
- Use text-based PDFs - Not scanned images
- Keep formatting clean - Clear headings and sections
- Include a table of contents - Helps detection accuracy
Naming Conventions
- Use descriptive file names
- Include version or date if applicable
- Example: Aviation-Regulations-2024.pdf
Content Organization
- Upload one regulation per document
- Split very large documents by major parts
- Keep related documents grouped by chatbot
Detection Confidence
The system reports confidence levels:
| Level | Meaning |
|---|---|
| High | 10+ section patterns found |
| Medium | 5-9 section patterns found |
| Low | 3-4 section patterns found |
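Higher confidence means more accurate section detection and citations. Documents with fewer than 3 section patterns are not detected as regulatory and are processed as standard documents.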
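The thresholds in the table can be read as a simple mapping from pattern count to level. The function below is an illustrative sketch; the name `detection_confidence` is hypothetical, and returning `None` below 3 patterns reflects the stated minimum for regulatory detection.

```python
from typing import Optional


def detection_confidence(pattern_count: int) -> Optional[str]:
    """Map a count of section patterns to the levels in the table above."""
    if pattern_count >= 10:
        return "High"
    if pattern_count >= 5:
        return "Medium"
    if pattern_count >= 3:
        return "Low"
    # Fewer than 3 patterns: not detected as a regulatory document.
    return None


levels = [detection_confidence(n) for n in (2, 3, 5, 9, 10)]
```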
Supported Formats
| Format | Extension | Smart Chunking |
|---|---|---|
| PDF | .pdf | Yes |
| Word | .docx | Yes |
Troubleshooting
Document Shows as "Standard" Instead of "Regulatory"
- Check that section headings are on their own lines
- Verify headings use patterns like "Section 1.1" or "CHAPTER 1"
- Documents need at least 3 section patterns for detection
Citations Not Appearing
- Verify document processing completed successfully
- Check that the question relates to document content
- Try asking about specific sections by number
Scanned Document Warning
- The document contains mostly images, not text
- Re-export from the original source as a text-based PDF
- OCR support is planned for a future update