Troubleshooting

Data Hub

Document Type

Scope

FAQs and quick references for creating, annotating, and registering Document Types in Document Manager, including supported attribute types and redaction.

Glossary: Supported Attribute Types

Attribute Type | Description | Example
Alpha Numeric | A mix of letters, numbers, and sometimes special characters. | Invoice Number: INV-2025-08-001
Barcode | Machine-readable image for quick scanning. | Barcode printed on invoices
Checkbox | Binary box that is checked or unchecked. | [X] Organ Donor: Yes
Date | A specific date. Standard formats supported. | Date of Birth: 15/05/1990
Enumeration | Pick one option from a list. | Payment Method: Credit Card
Free Form Text | General text not fitting a specific category. | Remarks: "See medical history attached"
Name | Full name of a person or company. | Vendor Name: ABC Suppliers Ltd.
Numeric | Numeric value for quantities/amounts/IDs. | Total Due: $1,500.00
Signature | Handwritten confirmation/identity mark. | Signed authorization
Table | Rows and columns for structured data. | Line items: Quantity, Unit Price, Total

Data Redaction

Redaction hides or masks sensitive data, such as personally identifiable information (PII), in documents and extracted values to protect privacy and comply with regulations.

Steps to enable on an attribute:

  1. Open the attribute in the Taxonomy panel (or create one).
  2. Toggle Enable Redaction.
  3. Choose a method:
    • All Characters: Replace all characters (e.g., XXXXXXXXXX).
    • First N Characters: Redact the first N characters.
    • Last N Characters: Redact the last N characters.
  4. Click Save and repeat for other sensitive fields.

Supported types for redaction: Alpha Numeric, Free Form Text, Numeric.
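
The three redaction methods above can be illustrated with a minimal Python sketch. This is not the product's implementation: the function name `redact`, the method keys (`all`, `first_n`, `last_n`), and the `X` mask character are assumptions chosen for illustration.

```python
def redact(value: str, method: str, n: int = 0, mask: str = "X") -> str:
    """Mask characters in value according to the chosen redaction method."""
    if method == "all":
        return mask * len(value)
    n = max(0, min(n, len(value)))  # clamp N to the length of the value
    if method == "first_n":
        return mask * n + value[n:]
    if method == "last_n":
        return value[: len(value) - n] + mask * n
    raise ValueError(f"unknown redaction method: {method}")


print(redact("INV-2025-08-001", "all"))         # XXXXXXXXXXXXXXX
print(redact("INV-2025-08-001", "first_n", 4))  # XXXX2025-08-001
print(redact("INV-2025-08-001", "last_n", 3))   # INV-2025-08-XXX
```

Clamping N means a short value is fully masked rather than raising an error when N exceeds its length.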

Annotation Toolbar

Use these tools on the annotation screen:

  • Hand (H): Pan the document
  • Box (C): Create a box‑annotated attribute
  • Table (T): Create a table attribute
  • Zoom In (I) / Zoom Out (O)
  • Fit to Screen: Fit page to viewport

FAQs

What is a Document Type?
A reusable template (schema/taxonomy) that teaches the AI what fields to extract for a specific document category.

How do I verify extraction is correct?
Use the Document Extraction tab to review the extracted values and their confidence. If a value is incorrect, adjust the annotations, data types, or instructions, then re-run extraction.

How can I improve extraction quality?
Use high‑resolution samples, precise bounding boxes, correct data types, clear descriptions/instructions, and accurate table schemas.

Who can create/register Document Types?
Users with Document Manager permissions. If access is restricted, contact your admin.

Data Profiling

How does data profiling work and what insights does it provide?

Data profiling analyzes your data's structure, content, and quality:

Analysis Types:

  • Schema Analysis: Data types, column names, relationships
  • Content Analysis: Value distributions, patterns, anomalies
  • Quality Assessment: Missing values, duplicates, inconsistencies
  • Statistical Summary: Min/max, averages, standard deviations

Generated Reports:

  • Data quality scores
  • Column-level statistics
  • Anomaly detection results
  • Recommendations for improvement
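
To make the analysis types above concrete, here is a small standard-library sketch of a column-level profile (missing values, duplicates, and a statistical summary). It only illustrates the kind of statistics a profiler computes, not how the platform computes them; `profile_column` and the sample rows are hypothetical.

```python
import statistics
from collections import Counter


def profile_column(values):
    """Summarize one column: missing count, duplicates, and numeric stats."""
    present = [v for v in values if v is not None and v != ""]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        # Each repeated value counts once per extra occurrence.
        "duplicates": sum(c - 1 for c in Counter(present).values() if c > 1),
    }
    numeric = [v for v in present
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric and len(numeric) == len(present):
        summary.update(
            min=min(numeric),
            max=max(numeric),
            mean=statistics.mean(numeric),
            stdev=statistics.stdev(numeric) if len(numeric) > 1 else 0.0,
        )
    return summary


# Hypothetical sample data.
rows = [
    {"customer_id": 1, "revenue": 100.0},
    {"customer_id": 2, "revenue": None},
    {"customer_id": 3, "revenue": 300.0},
    {"customer_id": 3, "revenue": 200.0},
]
report = {col: profile_column([r[col] for r in rows]) for col in rows[0]}
print(report["revenue"])
```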

Data Quality Management

What data quality features are available?

Comprehensive data quality management includes:

Quality Features:

  • Validation Rules: Custom rules for data validation
  • Cleansing Operations: Automated data cleaning
  • Quality Monitoring: Continuous quality assessment
  • Error Reporting: Detailed error logs and notifications

Quality Metrics:

  • Completeness (percentage of values that are not missing)
  • Accuracy (data correctness)
  • Consistency (data uniformity)
  • Validity (adherence to rules)
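
Two of the metrics above, completeness and validity, can be sketched as simple ratios. The exact formulas the platform uses are not documented here, so treat the definitions below (and the helper names `completeness` and `validity`) as assumptions.

```python
def completeness(rows, column):
    """Share of rows where the column is present and non-empty (0.0 to 1.0)."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(column) not in (None, ""))
    return filled / len(rows)


def validity(rows, column, rule):
    """Share of non-missing values that satisfy a validation rule."""
    values = [r[column] for r in rows if r.get(column) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if rule(v)) / len(values)


# Hypothetical sample data: one missing age and one out-of-range age.
people = [{"age": 34}, {"age": None}, {"age": -5}, {"age": 51}]

age_completeness = completeness(people, "age")
age_validity = validity(people, "age", lambda a: 0 <= a <= 120)
print(age_completeness)  # 0.75
```

Here validity is measured only over non-missing values, so a missing value lowers completeness without also lowering validity.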

Data Connectors

What types of data connectors are supported?

Vue.ai supports 250+ connectors across various categories:

Connector Categories:

  • Databases: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB
  • Cloud Storage: Amazon S3, Azure Blob, Google Cloud Storage
  • CRM/ERP Systems: Salesforce, HubSpot, SAP, Oracle ERP
  • APIs: REST APIs, GraphQL, webhook integrations
  • Streaming: Kafka, Kinesis, Pub/Sub
  • File Formats: CSV, JSON, Parquet, Avro, Excel

Pipeline Monitoring

How can I monitor data pipeline health and performance?

Pipeline monitoring provides comprehensive oversight:

Monitoring Features:

  • Health Dashboards: Real-time pipeline status
  • Performance Metrics: Throughput, latency, error rates
  • SLA Monitoring: Service level agreement tracking
  • Automated Alerts: Notifications for issues and failures

Key Metrics:

  • Data freshness and latency
  • Processing success rates
  • Resource utilization
  • Data quality scores

Automation Hub

CSV Dataset Reader Node

What is the CSV Dataset Reader node and when should I use it?

The CSV Dataset Reader node serves as a bridge between your datasets and custom code nodes in automation workflows. Use it when you need to:

  • Read CSV data into custom processing workflows
  • Convert dataset formats for downstream processing
  • Access dataset metadata and schema information

Common Use Cases:

  • Data preprocessing before ML model training
  • Format conversion between different data types
  • Integration with custom Python processing logic
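
As an illustration of the kind of hand-off this node performs, the sketch below reads CSV text into typed rows and exposes the column names using Python's standard `csv` module. It does not use the node's actual interface, which is not described here; the inline CSV text and column names are hypothetical.

```python
import csv
import io

# Hypothetical CSV content standing in for a registered dataset.
raw = "customer_id,revenue\n1,100.5\n2,200\n"

reader = csv.DictReader(io.StringIO(raw))
columns = reader.fieldnames  # schema information: the header row

# csv yields strings, so convert each field to the type downstream code expects.
rows = [
    {"customer_id": int(r["customer_id"]), "revenue": float(r["revenue"])}
    for r in reader
]

print(columns)  # ['customer_id', 'revenue']
print(rows[0])  # {'customer_id': 1, 'revenue': 100.5}
```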

Creating Custom Code Nodes

How do I create and configure custom code nodes?

Follow these steps to create custom code nodes:

  1. Add Code Node: Click "Add Node" → "Custom Code Node"
  2. Configure Environment: Select Python version and required libraries
  3. Write Code: Use the integrated VS Code server to write your logic
  4. Test Locally: Test with sample data before deployment
  5. Deploy: The system automatically builds Docker containers

Best Practices:

  • Use requirements.txt for dependency management
  • Implement proper error handling
  • Include logging for debugging
  • Test with various data inputs
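
The error-handling and logging practices above might look like this inside a code node. This is a sketch only: the `process` function, the logger name, and the `revenue` column are hypothetical and not part of any platform API.

```python
import logging

logger = logging.getLogger("custom_node")  # hypothetical logger name


def process(rows):
    """Convert revenue to float, skipping and logging rows that fail."""
    cleaned, failed = [], []
    for i, row in enumerate(rows):
        try:
            cleaned.append({**row, "revenue": float(row["revenue"])})
        except (KeyError, TypeError, ValueError) as exc:
            # Log the row index and the reason, then keep going.
            logger.warning("Skipping row %d: %r", i, exc)
            failed.append(i)
    logger.info("Processed %d rows, skipped %d", len(cleaned), len(failed))
    return cleaned, failed


# Test with varied inputs: valid, non-numeric, missing key, and None.
cleaned, failed = process([
    {"revenue": "10.5"},
    {"revenue": "n/a"},
    {},
    {"revenue": None},
])
print(len(cleaned), failed)  # 1 [1, 2, 3]
```

Catching only the expected exception types keeps genuine bugs visible instead of silently swallowing them.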

Drop Node Operations

What is the purpose of the Drop Node?

The Drop Node removes unnecessary columns from your datasets to:

  • Reduce memory usage and processing time
  • Clean data for downstream operations
  • Remove sensitive or irrelevant fields
  • Optimize data transfer between nodes

Configuration:

  • Select columns to drop from dropdown list
  • Preview changes before applying
  • Save column mapping for reuse
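
Conceptually, dropping columns amounts to the following. This is a plain-Python sketch with a hypothetical `drop_columns` helper and sample rows, not the node's implementation.

```python
def drop_columns(rows, columns):
    """Return copies of rows without the listed columns."""
    to_drop = set(columns)
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]


# Hypothetical sample rows containing a sensitive field.
records = [
    {"customer_id": 1, "ssn": "123-45-6789", "revenue": 100},
    {"customer_id": 2, "ssn": "987-65-4321", "revenue": 250},
]
slim = drop_columns(records, ["ssn"])
print(slim[0])  # {'customer_id': 1, 'revenue': 100}
```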

Filter Node Usage

How do I apply conditions with the Filter Node?

The Filter Node allows you to apply conditions to include or exclude rows:

Example Conditions:

age > 25 AND status = 'active'
revenue >= 1000 OR customer_type = 'premium'
date_created > '2023-01-01'

Supported Operators:

  • Comparison: >, <, >=, <=, =, !=
  • Logical: AND, OR, NOT
  • Pattern matching: LIKE, IN
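
For readers who think in code, the three example conditions above correspond to the following Python predicates. The sample rows and the `filter_rows` helper are hypothetical; the Filter Node itself is configured with the condition syntax shown above, not with Python.

```python
from datetime import date

# Hypothetical sample rows; column names follow the example conditions.
rows = [
    {"id": 1, "age": 31, "status": "active", "revenue": 500,
     "customer_type": "basic", "date_created": date(2023, 3, 1)},
    {"id": 2, "age": 22, "status": "active", "revenue": 1500,
     "customer_type": "basic", "date_created": date(2022, 11, 15)},
    {"id": 3, "age": 40, "status": "inactive", "revenue": 200,
     "customer_type": "premium", "date_created": date(2023, 6, 10)},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate returns True."""
    return [r for r in rows if predicate(r)]


# age > 25 AND status = 'active'
active_adults = filter_rows(
    rows, lambda r: r["age"] > 25 and r["status"] == "active")
# revenue >= 1000 OR customer_type = 'premium'
high_value = filter_rows(
    rows, lambda r: r["revenue"] >= 1000 or r["customer_type"] == "premium")
# date_created > '2023-01-01'
recent = filter_rows(rows, lambda r: r["date_created"] > date(2023, 1, 1))

print([r["id"] for r in active_adults])  # [1]
print([r["id"] for r in high_value])     # [2, 3]
print([r["id"] for r in recent])         # [1, 3]
```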

Join Node Functionality

How do I merge datasets using the Join Node?

The Join Node merges datasets based on common keys:

Join Types:

  • Inner Join: Returns only matching records
  • Left Join: Returns all records from the left table, plus matching records from the right table
  • Right Join: Returns all records from the right table, plus matching records from the left table
  • Full Outer Join: Returns all records from both tables

Configuration:

  1. Select join type
  2. Choose join keys from each dataset
  3. Handle duplicate column names
  4. Preview results before execution
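
The difference between an inner and a left join, and one way to handle duplicate column names, can be sketched in plain Python. The `join` function and the `_right` suffix are assumptions for illustration; the node's actual naming rule for duplicate columns may differ.

```python
def join(left, right, key, how="inner", suffix="_right"):
    """Join two lists of dict rows on a shared key (inner or left join)."""
    if how not in ("inner", "left"):
        raise ValueError(f"unsupported join type: {how}")
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    right_cols = sorted({c for row in right for c in row if c != key})
    empty = {c: None for c in right_cols}  # filler for unmatched left rows

    def merge(l_row, r_row):
        merged = dict(l_row)
        for col, val in r_row.items():
            if col != key:
                # Rename right-side columns that clash with left-side ones.
                merged[col + suffix if col in l_row else col] = val
        return merged

    out = []
    for l_row in left:
        matches = index.get(l_row[key], [])
        if not matches and how == "left":
            matches = [empty]
        out.extend(merge(l_row, m) for m in matches)
    return out


# Hypothetical datasets that share the "name" column.
customers = [{"customer_id": 1, "name": "Ana"}, {"customer_id": 2, "name": "Ben"}]
orders = [{"customer_id": 1, "name": "Order A", "total": 50}]

inner = join(customers, orders, "customer_id", how="inner")
left = join(customers, orders, "customer_id", how="left")
print(len(inner), len(left))  # 1 2
```

In the left join, the customer with no orders is kept and its right-side columns are filled with `None`.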

GroupBy Node Operations

How do I aggregate data with the GroupBy Node?

The GroupBy Node aggregates data with various functions:

Available Aggregation Functions:

  • COUNT: Count of records in each group
  • SUM: Sum of numeric values
  • AVG: Average of numeric values
  • MIN/MAX: Minimum/maximum values
  • FIRST/LAST: First/last values in group

Example Usage:

  • Group by customer_id, aggregate SUM(revenue)
  • Group by product_category, aggregate COUNT(*)
  • Group by date, aggregate AVG(rating)
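
The usage examples above follow the same pattern: collect the values for each group, then apply an aggregation function. A minimal sketch with a hypothetical `group_by` helper and sample rows:

```python
import statistics
from collections import defaultdict


def group_by(rows, key, column, func):
    """Group rows by key and aggregate the given column with func."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {k: func(v) for k, v in groups.items()}


# Hypothetical sample rows.
sales = [
    {"customer_id": "c1", "revenue": 100, "rating": 4},
    {"customer_id": "c1", "revenue": 50, "rating": 5},
    {"customer_id": "c2", "revenue": 75, "rating": 3},
]

total_revenue = group_by(sales, "customer_id", "revenue", sum)            # SUM
order_count = group_by(sales, "customer_id", "revenue", len)              # COUNT
avg_rating = group_by(sales, "customer_id", "rating", statistics.mean)    # AVG

print(total_revenue)  # {'c1': 150, 'c2': 75}
print(order_count)    # {'c1': 2, 'c2': 1}
```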

Customer Hub

Audience Builder

How does the Audience Builder work?

The Audience Builder helps create targeted customer segments:

Segment Creation Process:

  1. Define Goals: What business outcome do you want to achieve?
  2. Set Criteria: Choose visitor characteristics and behaviors
  3. Add Conditions: Use AND/OR logic for complex segmentation
  4. Preview Results: See estimated segment size
  5. Activate: Deploy segment for targeting

Common Segment Types:

  • High-value customers
  • Cart abandoners
  • First-time visitors
  • Loyalty program members

Preset Audiences

What are preset audiences and how often are they updated?

Preset audiences are pre-built customer segments that update automatically:

Update Frequency: Every 24 hours

Available Presets:

  • High-value customers (top 20% by revenue)
  • At-risk customers (declining engagement)
  • New customers (first purchase < 30 days)
  • Frequent buyers (multiple purchases)
  • Cart abandoners (items in cart > 24 hours)
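
As an example of how a preset rule translates into logic, the sketch below selects the top 20% of customers by revenue. How the platform ranks customers and rounds the cutoff is not specified here, so rounding up and the `top_fraction` helper are assumptions.

```python
import math


def top_fraction(revenue_by_customer, fraction=0.2):
    """Return customer IDs in the top fraction by revenue, highest first."""
    ranked = sorted(revenue_by_customer, key=revenue_by_customer.get, reverse=True)
    keep = math.ceil(len(ranked) * fraction)  # assumption: round the cutoff up
    return ranked[:keep]


# Hypothetical revenue totals for five customers.
revenue = {"a": 100, "b": 900, "c": 300, "d": 50, "e": 400}
high_value = top_fraction(revenue)
print(high_value)  # ['b']
```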

Digital Experience Manager

How do I create personalized digital experiences?

The Digital Experience Manager enables experience personalization:

Creation Steps:

  1. Page Selection: Choose target pages/components
  2. Audience Targeting: Select audience segments
  3. Module Linking: Connect personalization modules
  4. Content Variants: Create different content versions
  5. Testing Setup: Configure A/B testing parameters
  6. Launch: Activate experience campaigns

Strategy Definition

How do I define and configure personalization strategies?

Strategies tailor model parameters and content recommendations:

Strategy Components:

  • Objective: Business goal (engagement, conversion, retention)
  • Model Parameters: Algorithm settings and weights
  • Content Rules: What content to recommend when
  • Fallback Logic: Default behavior when no match found
  • Performance Metrics: KPIs to track success

Audience Overlap Analysis

How can I analyze overlap between different audience segments?

Audience overlap analysis helps understand segment relationships:

Analysis Features:

  • Venn Diagrams: Visual representation of overlaps
  • Intersection Metrics: Exact overlap percentages
  • Unique Segments: Users in one segment but not others
  • Comparison Tables: Side-by-side segment characteristics

Use Cases:

  • Avoid audience conflicts in campaigns
  • Identify cross-sell opportunities
  • Optimize segment definitions
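
The intersection metrics and unique-segment counts described above reduce to set operations. A minimal sketch, with hypothetical user IDs and an `overlap_report` helper that is not part of the product:

```python
def overlap_report(a, b):
    """Compare two segments given as sets of user IDs."""
    both = a & b
    return {
        "intersection": len(both),
        "pct_of_a": 100 * len(both) / len(a) if a else 0.0,
        "pct_of_b": 100 * len(both) / len(b) if b else 0.0,
        "only_a": a - b,  # users unique to the first segment
        "only_b": b - a,  # users unique to the second segment
    }


# Hypothetical segments.
cart_abandoners = {"u1", "u2", "u3", "u4"}
loyalty_members = {"u3", "u4", "u5"}

report = overlap_report(cart_abandoners, loyalty_members)
print(report["intersection"], report["pct_of_a"])  # 2 50.0
```

A high overlap percentage suggests that two campaigns targeting these segments would reach many of the same users.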

Developer Hub

ML Framework Support

Which machine learning frameworks are supported?

Vue.ai supports popular ML frameworks out of the box:

Supported Frameworks:

  • Scikit-learn: General-purpose ML library
  • XGBoost: Gradient boosting framework
  • TensorFlow: Deep learning and neural networks
  • PyTorch: Dynamic neural network framework
  • LightGBM: Gradient boosting for large datasets
  • Statsmodels: Statistical modeling and analysis

Additional Libraries:

  • Pandas, NumPy for data manipulation
  • Matplotlib, Seaborn for visualization
  • Jupyter for interactive development

Model Performance Monitoring

How do I monitor ML model performance?

Comprehensive model monitoring includes:

Monitoring Components:

  • MLflow Integration: Experiment tracking and model registry
  • Performance Dashboards: Real-time metrics and visualizations
  • Automated Alerts: Notifications for performance degradation
  • Drift Detection: Data and model drift monitoring

Key Metrics:

  • Accuracy, precision, recall, F1-score
  • Model latency and throughput
  • Feature importance changes
  • Prediction distribution shifts
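
For reference, the first group of metrics above can be computed from true and predicted labels as follows. This is a plain-Python sketch for binary classification; it does not use MLflow or any platform API, and the labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics["precision"], metrics["recall"])  # 0.75 0.75
```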

Model Deployment Options

What are the available model deployment options?

Several deployment options are available for different use cases:

Deployment Methods:

  • REST API: Real-time inference via HTTP endpoints
  • Batch Processing: Scheduled batch predictions
  • Real-time Inference: Low-latency streaming predictions
  • Docker Containers: Containerized deployments
  • Kubernetes: Scalable container orchestration

Deployment Features:

  • Auto-scaling based on demand
  • A/B testing for model versions
  • Rollback capabilities
  • Health checks and monitoring

Custom Library Integration

How do I integrate custom libraries and dependencies?

Custom libraries can be integrated in several ways:

Integration Methods:

  • requirements.txt: Standard Python dependency management
  • Conda Environments: Conda package manager support
  • Docker Customization: Custom Docker images with dependencies
  • Virtual Environments: Isolated Python environments

Best Practices:

  • Pin specific versions for reproducibility
  • Use virtual environments for isolation
  • Test custom libraries thoroughly
  • Document dependencies clearly

General Platform Questions

Support Access

How do I access support and get help?

Several support channels are available:

Support Options:

  • Support Portal: Online ticket system with tracking
  • Documentation: Comprehensive guides and tutorials
  • Account Manager: Dedicated support for enterprise customers
  • Community Forums: User community and discussions

Response Times:

  • Critical issues: 2-4 hours
  • High priority: 8-12 hours
  • Standard issues: 24-48 hours

Browser Support

Which browsers are supported?

Vue.ai works best with modern browsers:

Recommended Browsers:

  • Google Chrome: Latest version (recommended)
  • Mozilla Firefox: Latest version
  • Safari: Version 14+ (macOS)
  • Microsoft Edge: Chromium-based version

Browser Requirements:

  • JavaScript enabled
  • Cookies enabled
  • Local storage support
  • Modern CSS support

Platform Updates

How often is the platform updated?

Regular updates with new features and improvements:

Update Schedule:

  • Major Releases: Quarterly with significant features
  • Minor Updates: Monthly with improvements and fixes
  • Security Patches: As needed for security issues
  • Hotfixes: Critical bug fixes deployed immediately

Notification Methods:

  • In-platform notifications
  • Email updates to administrators
  • Release notes documentation
  • Changelog on support portal

User Permission Management

How do I manage user permissions and access control?

Comprehensive user and permission management:

Permission Features:

  • Admin Console: Centralized user management
  • Role-Based Access: Predefined and custom roles
  • Team Management: Organize users into teams
  • Resource Permissions: Fine-grained access control

Available Roles:

  • Platform Administrator
  • Data Manager
  • Workflow Developer
  • Data Analyst
  • Viewer (read-only)

Security Features

What security features are available?

Enterprise-grade security across the platform:

Security Features:

  • Single Sign-On (SSO): SAML and OIDC integration
  • Two-Factor Authentication: Additional security layer
  • Role-Based Access Control: Granular permissions
  • Data Encryption: End-to-end encryption
  • Audit Logging: Comprehensive activity tracking
  • IP Whitelisting: Network access restrictions

Compliance:

  • SOC 2 Type II certified
  • GDPR compliant
  • HIPAA compliant options
  • Industry-standard security practices

Still Need Help?

If you can't find the answer to your question in this troubleshooting guide, we're here to help!

Contact Support

Open a support ticket for technical assistance:

  • Priority Support: For urgent issues affecting production
  • Standard Support: For general questions and guidance
  • Feature Requests: Suggest new features and improvements

Browse Documentation

Explore our comprehensive documentation:

  • Getting Started: Platform introduction and basics
  • How-to Guides: Step-by-step instructions for specific tasks
  • API Reference: Complete API documentation with examples
  • Best Practices: Recommended approaches and patterns

Contact Your Account Manager

Enterprise customers can reach out to their dedicated account manager for:

  • Strategic guidance and best practices
  • Custom integration support
  • Training and onboarding assistance
  • Escalation of critical issues