Troubleshooting

Data Hub

Document Type

Scope

FAQs and quick references for creating, annotating, and registering Document Types in Document Manager, including supported attribute types and redaction.

Glossary: Supported Attribute Types

Attribute Type	Description	Example
Alpha Numeric	A mix of letters, numbers, and sometimes special characters.	Invoice Number: `INV-2025-08-001`
Barcode	Machine‑readable image for quick scanning.	Barcode printed on invoices
Checkbox	Binary box that is checked or unchecked.	`[X]` Organ Donor: Yes
Date	A specific date. Standard formats supported.	Date of Birth: `15/05/1990`
Enumeration	Pick one option from a list.	Payment Method: `Credit Card`
Free Form Text	General text not fitting a specific category.	Remarks: "See medical history attached"
Name	Full name of a person or company.	Vendor Name: `ABC Suppliers Ltd.`
Numeric	Numeric value for quantities/amounts/IDs.	Total Due: `$1,500.00`
Signature	Handwritten confirmation/identity mark.	Signed authorization
Table	Rows and columns for structured data.	Line items: Quantity, Unit Price, Total

Data Redaction

Redaction hides or masks sensitive data (PII) in documents and extracted values to protect privacy and comply with regulations.

Steps to enable on an attribute:

Open the attribute in the Taxonomy panel (or create one).
Toggle Enable Redaction.
Choose a method:
- All Characters: Replace all characters (e.g., XXXXXXXXXX).
- First N Characters: Redact the first N characters.
- Last N Characters: Redact the last N characters.
Click Save and repeat for other sensitive fields.

Supported types for redaction: Alpha Numeric, Free Form Text, Numeric.
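
The three redaction methods can be illustrated with a short Python sketch. This is not Document Manager's implementation; the function name `redact`, the method labels `all`, `first_n`, and `last_n`, and the `X` mask character are assumptions chosen to mirror the options above.

```python
def redact(value, method, n=0, mask="X"):
    """Mask characters of value according to the chosen redaction method.

    method is one of the hypothetical labels "all", "first_n", "last_n".
    """
    text = str(value)
    if method == "all":
        return mask * len(text)
    if method in ("first_n", "last_n"):
        if n < 0:
            raise ValueError("n must be non-negative")
        k = min(n, len(text))  # never mask more characters than exist
        if method == "first_n":
            return mask * k + text[k:]
        return text[: len(text) - k] + mask * k
    raise ValueError(f"unknown redaction method: {method}")


print(redact("INV-2025-08-001", "all"))         # XXXXXXXXXXXXXXX
print(redact("INV-2025-08-001", "first_n", 4))  # XXXX2025-08-001
print(redact("INV-2025-08-001", "last_n", 3))   # INV-2025-08-XXX
```

If N exceeds the length of the value, this sketch masks the whole value rather than raising an error; the product may behave differently.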

Annotation Toolbar

Use these tools on the annotation screen:

Hand (H): Pan the document
Box (C): Create a box‑annotated attribute
Table (T): Create a table attribute
Zoom In (I) / Zoom Out (O)
Fit to Screen: Fit page to viewport

FAQs

What is a Document Type?
A reusable template (schema/taxonomy) that teaches the AI what fields to extract for a specific document category.

How do I verify extraction is correct?
Use the Document Extraction tab to review values and confidence. If incorrect, adjust annotations, data types, or instructions; then re‑run extraction.

How can I improve extraction quality?
Use high‑resolution samples, precise bounding boxes, correct data types, clear descriptions/instructions, and accurate table schemas.

Who can create/register Document Types?
Users with Document Manager permissions. If access is restricted, contact your admin.

Data Profiling

How does data profiling work and what insights does it provide?

Data profiling analyzes your data's structure, content, and quality:

Analysis Types:

Schema Analysis: Data types, column names, relationships
Content Analysis: Value distributions, patterns, anomalies
Quality Assessment: Missing values, duplicates, inconsistencies
Statistical Summary: Min/max, averages, standard deviations

Generated Reports:

Data quality scores
Column-level statistics
Anomaly detection results
Recommendations for improvement
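
To make the kind of column-level statistics listed above concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the platform's profiler: the function name `profile_column` and the keys of the returned summary are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def profile_column(values):
    """Summarise one column: missing count, distinct values, numeric stats."""
    present = [v for v in values if v is not None and v != ""]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        # each repeated occurrence beyond the first counts as a duplicate
        "duplicates": sum(c - 1 for c in Counter(present).values()),
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    # only add numeric statistics when every present value is a number
    if numeric and len(numeric) == len(present):
        summary.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean(numeric),
            stdev=pstdev(numeric),  # population standard deviation
        )
    return summary


ages = [25, 31, None, 31, 40]
print(profile_column(ages))
```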

Data Quality Management

What data quality features are available?

Comprehensive data quality management includes:

Quality Features:

Validation Rules: Custom rules for data validation
Cleansing Operations: Automated data cleaning
Quality Monitoring: Continuous quality assessment
Error Reporting: Detailed error logs and notifications

Quality Metrics:

Completeness (missing data percentage)
Accuracy (data correctness)
Consistency (data uniformity)
Validity (adherence to rules)
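
Two of these metrics, completeness and validity, are easy to compute by hand. The sketch below shows one plausible definition of each; the function names and the email rule are hypothetical, and the platform's own formulas may differ (for example, it may report the missing percentage, which is one minus the completeness computed here).

```python
def completeness(rows, column):
    """Share of rows where column holds a non-empty value (0.0 to 1.0)."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(column) not in (None, ""))
    return filled / len(rows)


def validity(rows, column, rule):
    """Share of non-empty values in column that satisfy rule."""
    values = [r[column] for r in rows if r.get(column) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if rule(v)) / len(values)


rows = [
    {"email": "a@example.com"},
    {"email": ""},
    {"email": "not-an-email"},
    {"email": "b@example.com"},
]
print(completeness(rows, "email"))                 # 0.75
print(validity(rows, "email", lambda v: "@" in v))
```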

Data Connectors

What types of data connectors are supported?

Vue.ai supports 250+ connectors across various categories:

Connector Categories:

Databases: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB
Cloud Storage: Amazon S3, Azure Blob, Google Cloud Storage
CRM/ERP Systems: Salesforce, HubSpot, SAP, Oracle ERP
APIs: REST APIs, GraphQL, webhook integrations
Streaming: Kafka, Kinesis, Pub/Sub
File Formats: CSV, JSON, Parquet, Avro, Excel

Pipeline Monitoring

How can I monitor data pipeline health and performance?

Pipeline monitoring provides comprehensive oversight:

Monitoring Features:

Health Dashboards: Real-time pipeline status
Performance Metrics: Throughput, latency, error rates
SLA Monitoring: Service level agreement tracking
Automated Alerts: Notifications for issues and failures

Key Metrics:

Data freshness and latency
Processing success rates
Resource utilization
Data quality scores

Automation Hub

CSV Dataset Reader Node

What is the CSV Dataset Reader node and when should I use it?

The CSV Dataset Reader node serves as a bridge between your datasets and custom code nodes in automation workflows. Use it when you need to:

Read CSV data into custom processing workflows
Convert dataset formats for downstream processing
Access dataset metadata and schema information

Common Use Cases:

Data preprocessing before ML model training
Format conversion between different data types
Integration with custom Python processing logic
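
As a rough picture of what a downstream custom code node receives, the sketch below parses CSV text into column names and row dictionaries with Python's standard `csv` module. The helper name `read_csv_dataset` and the sample columns are assumptions; the node's actual output format is not specified here.

```python
import csv
import io


def read_csv_dataset(text):
    """Parse CSV text into (column names, list of row dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    # fieldnames is None for empty input, so fall back to an empty list
    return list(reader.fieldnames or []), rows


sample = "customer_id,revenue\nC1,1200\nC2,800\n"
columns, rows = read_csv_dataset(sample)
print(columns)  # ['customer_id', 'revenue']
print(rows[0])  # {'customer_id': 'C1', 'revenue': '1200'}
```

Note that `csv.DictReader` returns every value as a string, so numeric columns need an explicit conversion before arithmetic.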

Creating Custom Code Nodes

How do I create and configure custom code nodes?

Follow these steps to create custom code nodes:

Add Code Node: Click "Add Node" → "Custom Code Node"
Configure Environment: Select Python version and required libraries
Write Code: Use the integrated VS Code server to write your logic
Test Locally: Test with sample data before deployment
Deploy: The system automatically builds Docker containers

Best Practices:

Use requirements.txt for dependency management
Implement proper error handling
Include logging for debugging
Test with various data inputs
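
The error-handling and logging practices above might look like the following inside a custom code node. This is a sketch only: the function name `process_rows`, the logger name, and the `revenue` column are hypothetical, and the node's real entry-point signature is not shown in this guide.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("custom_node")  # hypothetical logger name


def process_rows(rows):
    """Convert 'revenue' to float, skipping and logging rows that fail."""
    good, bad = [], []
    for index, row in enumerate(rows):
        try:
            good.append({**row, "revenue": float(row["revenue"])})
        except (KeyError, TypeError, ValueError) as exc:
            # keep going, but record which row failed and why
            logger.warning("row %d skipped: %r", index, exc)
            bad.append(row)
    logger.info("processed %d rows, skipped %d", len(good), len(bad))
    return good, bad


good, bad = process_rows([{"revenue": "10.5"}, {"revenue": "abc"}, {}])
```

Returning the rejected rows alongside the converted ones makes it easier to test the node with malformed input.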

Drop Node Operations

What is the purpose of the Drop Node?

The Drop Node removes unnecessary columns from your datasets to:

Reduce memory usage and processing time
Clean data for downstream operations
Remove sensitive or irrelevant fields
Optimize data transfer between nodes

Configuration:

Select columns to drop from dropdown list
Preview changes before applying
Save column mapping for reuse
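
Conceptually, dropping columns is equivalent to the following Python sketch over a list of row dictionaries. The function name `drop_columns` and the sample fields are hypothetical.

```python
def drop_columns(rows, columns_to_drop):
    """Return new rows without the named columns; input rows are unchanged."""
    drop = set(columns_to_drop)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


rows = [{"customer_id": "C1", "ssn": "123-45-6789", "revenue": 1200}]
print(drop_columns(rows, ["ssn"]))  # [{'customer_id': 'C1', 'revenue': 1200}]
```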

Filter Node Usage

How do I apply conditions with the Filter Node?

The Filter Node allows you to apply conditions to include or exclude rows:

Example Conditions:

age > 25 AND status = 'active'
revenue >= 1000 OR customer_type = 'premium'
date_created > '2023-01-01'

Supported Operators:

Comparison: >, <, >=, <=, =, !=
Logical: AND, OR, NOT
Pattern matching: LIKE, IN
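
For readers more familiar with Python than with SQL-style conditions, the three example conditions above correspond to the predicates below. The helper `filter_rows` and the sample rows are illustrative assumptions, not part of the Filter Node.

```python
from datetime import date


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


rows = [
    {"id": 1, "age": 30, "status": "active", "revenue": 500,
     "customer_type": "basic", "date_created": date(2023, 6, 1)},
    {"id": 2, "age": 22, "status": "active", "revenue": 1500,
     "customer_type": "basic", "date_created": date(2022, 3, 15)},
    {"id": 3, "age": 41, "status": "inactive", "revenue": 200,
     "customer_type": "premium", "date_created": date(2024, 1, 10)},
]

# age > 25 AND status = 'active'
active_adults = filter_rows(
    rows, lambda r: r["age"] > 25 and r["status"] == "active")
# revenue >= 1000 OR customer_type = 'premium'
high_value = filter_rows(
    rows, lambda r: r["revenue"] >= 1000 or r["customer_type"] == "premium")
# date_created > '2023-01-01'
recent = filter_rows(rows, lambda r: r["date_created"] > date(2023, 1, 1))

print([r["id"] for r in active_adults])  # [1]
print([r["id"] for r in high_value])     # [2, 3]
print([r["id"] for r in recent])         # [1, 3]
```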

Join Node Functionality

How do I merge datasets using the Join Node?

The Join Node merges datasets based on common keys:

Join Types:

Inner Join: Returns only matching records
Left Join: Returns all records from the left table, plus matching records from the right table
Right Join: Returns all records from the right table, plus matching records from the left table
Full Outer Join: Returns all records from both tables

Configuration:

Select join type
Choose join keys from each dataset
Handle duplicate column names
Preview results before execution
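
The difference between the four join types is easiest to see on a small example. The sketch below is a simplified, standard-library illustration rather than the Join Node's implementation; the function `join`, its `how` labels, and the sample tables are hypothetical. When both sides share a non-key column name, this sketch keeps the left value, and unmatched rows simply lack the other side's columns.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on key. how: inner, left, right, or outer."""
    if how not in ("inner", "left", "right", "outer"):
        raise ValueError(f"unknown join type: {how}")
    right_index = {}
    for r in right:
        right_index.setdefault(r[key], []).append(r)
    result, matched_keys = [], set()
    for l in left:
        matches = right_index.get(l[key], [])
        if matches:
            matched_keys.add(l[key])
            # on duplicate column names the left value wins
            result.extend({**r, **l} for r in matches)
        elif how in ("left", "outer"):
            result.append(dict(l))
    if how in ("right", "outer"):
        result.extend(dict(r) for r in right if r[key] not in matched_keys)
    return result


customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
orders = [{"id": 1, "total": 50}, {"id": 3, "total": 70}]

for how in ("inner", "left", "right", "outer"):
    print(how, len(join(customers, orders, "id", how)))
```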

GroupBy Node Operations

How do I aggregate data with the GroupBy Node?

The GroupBy Node aggregates data with various functions:

Available Aggregation Functions:

COUNT: Count of records in each group
SUM: Sum of numeric values
AVG: Average of numeric values
MIN/MAX: Minimum/maximum values
FIRST/LAST: First/last values in group

Example Usage:

Group by customer_id, aggregate SUM(revenue)
Group by product_category, aggregate COUNT(*)
Group by date, aggregate AVG(rating)
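
The first two usage examples can be reproduced in plain Python as follows. This is a sketch under stated assumptions: the helper `group_by` and the sample orders are hypothetical, with `sum`, `len`, and `statistics.mean` standing in for SUM, COUNT, and AVG.

```python
from collections import defaultdict
from statistics import mean


def group_by(rows, key, column, func):
    """Group rows by key and apply func to each group's column values."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {k: func(v) for k, v in groups.items()}


orders = [
    {"customer_id": "C1", "revenue": 100},
    {"customer_id": "C2", "revenue": 40},
    {"customer_id": "C1", "revenue": 60},
]

print(group_by(orders, "customer_id", "revenue", sum))  # {'C1': 160, 'C2': 40}
print(group_by(orders, "customer_id", "revenue", len))  # {'C1': 2, 'C2': 1}
print(group_by(orders, "customer_id", "revenue", mean))
```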

Customer Hub

Audience Builder

How does the Audience Builder work?

The Audience Builder helps create targeted customer segments:

Segment Creation Process:

Define Goals: What business outcome do you want to achieve?
Set Criteria: Choose visitor characteristics and behaviors
Add Conditions: Use AND/OR logic for complex segmentation
Preview Results: See estimated segment size
Activate: Deploy segment for targeting

Common Segment Types:

High-value customers
Cart abandoners
First-time visitors
Loyalty program members

Preset Audiences

What are preset audiences and how often are they updated?

Preset audiences are pre-built customer segments that update automatically:

Update Frequency: Every 24 hours

Available Presets:

High-value customers (top 20% by revenue)
At-risk customers (declining engagement)
New customers (first purchase within the last 30 days)
Frequent buyers (multiple purchases)
Cart abandoners (items left in cart for more than 24 hours)

Digital Experience Manager

How do I create personalized digital experiences?

The Digital Experience Manager enables experience personalization:

Creation Steps:

Page Selection: Choose target pages/components
Audience Targeting: Select audience segments
Module Linking: Connect personalization modules
Content Variants: Create different content versions
Testing Setup: Configure A/B testing parameters
Launch: Activate experience campaigns

Strategy Definition

How do I define and configure personalization strategies?

Strategies tailor model parameters and content recommendations:

Strategy Components:

Objective: Business goal (engagement, conversion, retention)
Model Parameters: Algorithm settings and weights
Content Rules: What content to recommend when
Fallback Logic: Default behavior when no match found
Performance Metrics: KPIs to track success

Audience Overlap Analysis

How can I analyze overlap between different audience segments?

Audience overlap analysis helps understand segment relationships:

Analysis Features:

Venn Diagrams: Visual representation of overlaps
Intersection Metrics: Exact overlap percentages
Unique Segments: Users in one segment but not others
Comparison Tables: Side-by-side segment characteristics

Use Cases:

Avoid audience conflicts in campaigns
Identify cross-sell opportunities
Optimize segment definitions
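
The intersection metrics and unique-segment counts described above reduce to set operations. The sketch below shows the arithmetic on two small segments of user IDs; the function `overlap_report`, its result keys, and the IDs are hypothetical, and the Jaccard index is included only as one common way to summarize overlap.

```python
def overlap_report(segment_a, segment_b):
    """Compare two segments of user IDs using set operations."""
    a, b = set(segment_a), set(segment_b)
    both, union = a & b, a | b
    return {
        "overlap": both,
        "only_a": a - b,
        "only_b": b - a,
        # overlap as a percentage of each segment's own size
        "pct_of_a": 100 * len(both) / len(a) if a else 0.0,
        "pct_of_b": 100 * len(both) / len(b) if b else 0.0,
        # Jaccard index: overlap relative to the combined audience
        "jaccard": len(both) / len(union) if union else 0.0,
    }


high_value = {"u1", "u2", "u3", "u4"}
cart_abandoners = {"u3", "u4", "u5"}
report = overlap_report(high_value, cart_abandoners)
print(sorted(report["overlap"]))  # ['u3', 'u4']
print(report["pct_of_a"])         # 50.0
```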

Developer Hub

ML Framework Support

Which machine learning frameworks are supported?

Vue.ai supports popular ML frameworks out of the box:

Supported Frameworks:

Scikit-learn: General-purpose ML library
XGBoost: Gradient boosting framework
TensorFlow: Deep learning and neural networks
PyTorch: Dynamic neural network framework
LightGBM: Gradient boosting for large datasets
Statsmodels: Statistical modeling and analysis

Additional Libraries:

Pandas, NumPy for data manipulation
Matplotlib, Seaborn for visualization
Jupyter for interactive development

Model Performance Monitoring

How do I monitor ML model performance?

Comprehensive model monitoring includes:

Monitoring Components:

MLflow Integration: Experiment tracking and model registry
Performance Dashboards: Real-time metrics and visualizations
Automated Alerts: Notifications for performance degradation
Drift Detection: Data and model drift monitoring

Key Metrics:

Accuracy, precision, recall, F1-score
Model latency and throughput
Feature importance changes
Prediction distribution shifts
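
For reference, accuracy, precision, recall, and F1-score for a binary classifier can be computed from true and predicted labels as in the sketch below. It uses the standard definitions; the function name `classification_metrics` and the sample labels are hypothetical, and the platform dashboards may compute these metrics differently (for example, averaged across classes).

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this sample there are 3 true positives, 1 false positive, and 1 false negative, so all four metrics come out at 0.75.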

Model Deployment Options

What are the available model deployment options?

Several deployment options are available for different use cases:

Deployment Methods:

REST API: Real-time inference via HTTP endpoints
Batch Processing: Scheduled batch predictions
Real-time Inference: Low-latency streaming predictions
Docker Containers: Containerized deployments
Kubernetes: Scalable container orchestration

Deployment Features:

Auto-scaling based on demand
A/B testing for model versions
Rollback capabilities
Health checks and monitoring

Custom Library Integration

How do I integrate custom libraries and dependencies?

Custom libraries and dependencies can be integrated in several ways:

Integration Methods:

requirements.txt: Standard Python dependency management
Conda Environments: Conda package manager support
Docker Customization: Custom Docker images with dependencies
Virtual Environments: Isolated Python environments

Best Practices:

Pin specific versions for reproducibility
Use virtual environments for isolation
Test custom libraries thoroughly
Document dependencies clearly
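
One way to enforce the version-pinning practice is a small check that flags requirement lines without an exact `==` pin. This is a simplified sketch, not a platform feature: the function `unpinned_requirements` is hypothetical, the package versions are illustrative, and the parsing ignores less common requirement forms such as URLs and environment markers.

```python
def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r or -e
        if "==" not in line:
            unpinned.append(line)
    return unpinned


requirements = """\
pandas==2.2.2
numpy>=1.26
requests
# test tooling is listed elsewhere
"""
print(unpinned_requirements(requirements))  # ['numpy>=1.26', 'requests']
```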

General Platform Questions

Support Access

How do I access support and get help?

Several support channels are available:

Support Options:

Support Portal: Online ticket system with tracking
Documentation: Comprehensive guides and tutorials
Account Manager: Dedicated support for enterprise customers
Community Forums: User community and discussions

Response Times:

Critical issues: 2-4 hours
High priority: 8-12 hours
Standard issues: 24-48 hours

Browser Support

Which browsers are supported?

Vue.ai works best with modern browsers:

Recommended Browsers:

Google Chrome: Latest version (recommended)
Mozilla Firefox: Latest version
Safari: Version 14+ (macOS)
Microsoft Edge: Chromium-based version

Browser Requirements:

JavaScript enabled
Cookies enabled
Local storage support
Modern CSS support

Platform Updates

How often is the platform updated?

The platform is updated regularly with new features and improvements:

Update Schedule:

Major Releases: Quarterly with significant features
Minor Updates: Monthly with improvements and fixes
Security Patches: As needed for security issues
Hotfixes: Critical bug fixes deployed immediately

Notification Methods:

In-platform notifications
Email updates to administrators
Release notes documentation
Changelog on support portal

User Permission Management

How do I manage user permissions and access control?

User and permission management is handled through the following features:

Permission Features:

Admin Console: Centralized user management
Role-Based Access: Predefined and custom roles
Team Management: Organize users into teams
Resource Permissions: Fine-grained access control

Available Roles:

Platform Administrator
Data Manager
Workflow Developer
Data Analyst
Viewer (read-only)

Security Features

What security features are available?

Enterprise-grade security features are available across the platform:

Security Features:

Single Sign-On (SSO): SAML and OIDC integration
Two-Factor Authentication: Additional security layer
Role-Based Access Control: Granular permissions
Data Encryption: End-to-end encryption
Audit Logging: Comprehensive activity tracking
IP Whitelisting: Network access restrictions

Compliance:

SOC 2 Type II certified
GDPR compliant
HIPAA compliant options
Industry-standard security practices

Still Need Help?

If you can't find the answer to your question in this troubleshooting guide, we're here to help!

Contact Support

Open a support ticket for technical assistance:

Priority Support: For urgent issues affecting production
Standard Support: For general questions and guidance
Feature Requests: Suggest new features and improvements

Browse Documentation

Explore our comprehensive documentation:

Getting Started: Platform introduction and basics
How-to Guides: Step-by-step instructions for specific tasks
API Reference: Complete API documentation with examples
Best Practices: Recommended approaches and patterns

Contact Your Account Manager

Enterprise customers can reach out to their dedicated account manager for:

Strategic guidance and best practices
Custom integration support
Training and onboarding assistance
Escalation of critical issues
