Troubleshooting
Data Hub
Document Type
Scope
FAQs and quick references for creating, annotating, and registering Document Types in Document Manager, including supported attribute types and redaction.
Glossary: Supported Attribute Types
Attribute Type | Description | Example |
---|---|---|
Alpha Numeric | A mix of letters, numbers, and sometimes special characters. | Invoice Number: INV-2025-08-001 |
Barcode | Machine‑readable image for quick scanning. | Barcode printed on invoices |
Checkbox | Binary box that is checked or unchecked. | [X] Organ Donor: Yes |
Date | A specific date. Standard formats supported. | Date of Birth: 15/05/1990 |
Enumeration | Pick one option from a list. | Payment Method: Credit Card |
Free Form Text | General text not fitting a specific category. | Remarks: "See medical history attached" |
Name | Full name of a person or company. | Vendor Name: ABC Suppliers Ltd. |
Numeric | Numeric value for quantities/amounts/IDs. | Total Due: $1,500.00 |
Signature | Handwritten confirmation/identity mark. | Signed authorization |
Table | Rows and columns for structured data. | Line items: Quantity, Unit Price, Total |
Data Redaction
Redaction hides or masks sensitive data (PII) in documents and extracted values to protect privacy and comply with regulations.
Steps to enable on an attribute:
- Open the attribute in the Taxonomy panel (or create one).
- Toggle Enable Redaction.
- Choose a method:
  - All Characters: Replace all characters (e.g., XXXXXXXXXX).
  - First N Characters: Redact the first N characters.
  - Last N Characters: Redact the last N characters.
- Click Save and repeat for other sensitive fields.
Supported types for redaction: Alpha Numeric, Free Form Text, Numeric.
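The three redaction methods can be illustrated with a short Python sketch. This shows the masking behavior only and is not Document Manager's implementation; the function name `redact`, the method keys (`all`, `first_n`, `last_n`), and the `X` mask character are assumptions chosen for the example.

```python
def redact(value: str, method: str = "all", n: int = 0, mask: str = "X") -> str:
    """Mask characters of `value` according to the chosen redaction method.

    Hypothetical helper: method names mirror the UI options described above.
    """
    if method == "all":
        return mask * len(value)
    n = max(0, min(n, len(value)))  # clamp N to the length of the value
    if method == "first_n":
        return mask * n + value[n:]
    if method == "last_n":
        return value[: len(value) - n] + mask * n
    raise ValueError(f"unknown redaction method: {method}")
```

For example, `redact("1234567890", "first_n", 6)` keeps only the last four characters visible, which is a common way to display account numbers.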
Annotation Toolbar
Use these tools on the annotation screen:
- Hand (H): Pan the document
- Box (C): Create a box‑annotated attribute
- Table (T): Create a table attribute
- Zoom In (I) / Zoom Out (O)
- Fit to Screen: Fit page to viewport
FAQs
What is a Document Type?
A reusable template (schema/taxonomy) that teaches the AI what fields to extract for a specific document category.
How do I verify extraction is correct?
Use the Document Extraction tab to review values and confidence. If incorrect, adjust annotations, data types, or instructions; then re‑run extraction.
How can I improve extraction quality?
Use high‑resolution samples, precise bounding boxes, correct data types, clear descriptions/instructions, and accurate table schemas.
Who can create/register Document Types?
Users with Document Manager permissions. If access is restricted, contact your admin.
Data Profiling
How does data profiling work and what insights does it provide?
Data profiling analyzes your data's structure, content, and quality:
Analysis Types:
- Schema Analysis: Data types, column names, relationships
- Content Analysis: Value distributions, patterns, anomalies
- Quality Assessment: Missing values, duplicates, inconsistencies
- Statistical Summary: Min/max, averages, standard deviations
Generated Reports:
- Data quality scores
- Column-level statistics
- Anomaly detection results
- Recommendations for improvement
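To make the analysis types concrete, here is a minimal sketch of a column-level profile using only the Python standard library. It is not the platform's profiler; the function `profile_column` and the keys of the returned report are hypothetical, and empty strings are assumed to count as missing.

```python
import statistics
from collections import Counter

def profile_column(values):
    """Summarize one column: missing values, duplicates, and basic statistics."""
    # Assumption: None and "" both count as missing.
    present = [v for v in values if v is not None and v != ""]
    numeric = [v for v in present
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    counts = Counter(present)
    report = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "duplicates": sum(c - 1 for c in counts.values()),
    }
    if numeric:
        report.update(
            min=min(numeric),
            max=max(numeric),
            mean=statistics.mean(numeric),
            # Sample standard deviation needs at least two values.
            stdev=statistics.stdev(numeric) if len(numeric) > 1 else 0.0,
        )
    return report
```

Calling `profile_column([10, 20, 20, None, 30])` reports one missing value and one duplicate alongside the min, max, mean, and standard deviation of the remaining numbers.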
Data Quality Management
What data quality features are available?
Comprehensive data quality management includes:
Quality Features:
- Validation Rules: Custom rules for data validation
- Cleansing Operations: Automated data cleaning
- Quality Monitoring: Continuous quality assessment
- Error Reporting: Detailed error logs and notifications
Quality Metrics:
- Completeness (missing data percentage)
- Accuracy (data correctness)
- Consistency (data uniformity)
- Validity (adherence to rules)
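Two of these metrics are simple ratios, sketched below under stated assumptions: rows are dictionaries, `None` and empty strings count as missing, and a validation rule is any function returning `True` or `False`. The names `missing_percentage` and `validity_percentage` are illustrative, not platform APIs.

```python
def _is_missing(value):
    # Assumption: None and "" are the only "missing" markers.
    return value is None or value == ""

def missing_percentage(rows, column):
    """Percentage of rows (0-100) where `column` is absent or empty."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if _is_missing(row.get(column)))
    return 100.0 * missing / len(rows)

def validity_percentage(rows, column, rule):
    """Percentage of non-missing values (0-100) that satisfy `rule`."""
    values = [row[column] for row in rows if not _is_missing(row.get(column))]
    if not values:
        return 100.0  # nothing to violate the rule
    return 100.0 * sum(1 for v in values if rule(v)) / len(values)
```

A rule can be as small as `lambda v: "@" in v` for a rough email check; stricter rules plug into the same function.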
Data Connectors
What types of data connectors are supported?
Vue.ai supports 250+ connectors across various categories:
Connector Categories:
- Databases: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB
- Cloud Storage: Amazon S3, Azure Blob, Google Cloud Storage
- CRM/ERP Systems: Salesforce, HubSpot, SAP, Oracle ERP
- APIs: REST APIs, GraphQL, webhook integrations
- Streaming: Kafka, Kinesis, Pub/Sub
- File Formats: CSV, JSON, Parquet, Avro, Excel
Pipeline Monitoring
How can I monitor data pipeline health and performance?
Pipeline monitoring provides comprehensive oversight:
Monitoring Features:
- Health Dashboards: Real-time pipeline status
- Performance Metrics: Throughput, latency, error rates
- SLA Monitoring: Service level agreement tracking
- Automated Alerts: Notifications for issues and failures
Key Metrics:
- Data freshness and latency
- Processing success rates
- Resource utilization
- Data quality scores
Automation Hub
CSV Dataset Reader Node
What is the CSV Dataset Reader node and when should I use it?
The CSV Dataset Reader node serves as a bridge between your datasets and custom code nodes in automation workflows. Use it when you need to:
- Read CSV data into custom processing workflows
- Convert dataset formats for downstream processing
- Access dataset metadata and schema information
Common Use Cases:
- Data preprocessing before ML model training
- Format conversion between different data types
- Integration with custom Python processing logic
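Inside a custom Python step, reading CSV content and inspecting its column names might look like the sketch below. It uses the standard `csv` module rather than any Vue.ai API; how the node actually hands data to your code is not covered here, and `read_csv_dataset` is a hypothetical helper.

```python
import csv
import io

def read_csv_dataset(text):
    """Parse CSV text into (column_names, rows); each row is a dict of strings."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return list(reader.fieldnames or []), rows

SAMPLE = "customer_id,revenue\nc1,100\nc2,40\n"
columns, rows = read_csv_dataset(SAMPLE)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns need explicit conversion before downstream processing.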
Creating Custom Code Nodes
How do I create and configure custom code nodes?
Follow these steps to create custom code nodes:
- Add Code Node: Click "Add Node" → "Custom Code Node"
- Configure Environment: Select Python version and required libraries
- Write Code: Use the integrated VS Code server to write your logic
- Test Locally: Test with sample data before deployment
- Deploy: The system automatically builds Docker containers
Best Practices:
- Use requirements.txt for dependency management
- Implement proper error handling
- Include logging for debugging
- Test with various data inputs
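The error-handling and logging practices above could look like this in a code node. This is a generic sketch: the logger name `custom_node`, the `amount` column, and the function `process_rows` are assumptions for illustration, not part of the platform.

```python
import logging

logger = logging.getLogger("custom_node")  # hypothetical logger name

def process_rows(rows):
    """Convert the 'amount' field to float; log and set aside rows that fail."""
    good, bad = [], []
    for index, row in enumerate(rows):
        try:
            good.append({**row, "amount": float(row["amount"])})
        except (KeyError, TypeError, ValueError) as exc:
            # Keep going on bad input instead of failing the whole run.
            logger.warning("row %d skipped: %r", index, exc)
            bad.append(row)
    return good, bad
```

Returning the rejected rows alongside the converted ones makes it easy to test the node with varied inputs and to inspect what was skipped.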
Drop Node Operations
What is the purpose of the Drop Node?
The Drop Node removes unnecessary columns from your datasets to:
- Reduce memory usage and processing time
- Clean data for downstream operations
- Remove sensitive or irrelevant fields
- Optimize data transfer between nodes
Configuration:
- Select columns to drop from dropdown list
- Preview changes before applying
- Save column mapping for reuse
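Conceptually, dropping columns is a projection over each row. The sketch below shows the equivalent operation on a list of dictionaries; `drop_columns` is a hypothetical name and this is not how the node is implemented.

```python
def drop_columns(rows, columns):
    """Return new rows without the named columns; the input is left unchanged."""
    to_drop = set(columns)
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
```

For instance, `drop_columns(rows, ["ssn"])` removes a sensitive field before the data reaches downstream nodes.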
Filter Node Usage
How do I apply conditions with the Filter Node?
The Filter Node allows you to apply conditions to include or exclude rows:
Example Conditions:
- age > 25 AND status = 'active'
- revenue >= 1000 OR customer_type = 'premium'
- date_created > '2023-01-01'
Supported Operators:
- Comparison: >, <, >=, <=, =, !=
- Logical: AND, OR, NOT
- Pattern matching and set membership: LIKE, IN
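For readers who think in code, the three example conditions correspond to the Python predicates below. This is only an analogy for how the conditions select rows; the Filter Node evaluates its own expression syntax, and the sample rows and function names are made up for the example.

```python
from datetime import date

def filter_rows(rows, predicate):
    """Keep only the rows for which `predicate` returns True."""
    return [row for row in rows if predicate(row)]

def is_active_over_25(row):        # age > 25 AND status = 'active'
    return row["age"] > 25 and row["status"] == "active"

def is_high_revenue_or_premium(row):  # revenue >= 1000 OR customer_type = 'premium'
    return row["revenue"] >= 1000 or row["customer_type"] == "premium"

def created_after_2023_01_01(row):    # date_created > '2023-01-01'
    return row["date_created"] > date(2023, 1, 1)

# Hypothetical sample data.
ROWS = [
    {"age": 30, "status": "active", "revenue": 500,
     "customer_type": "standard", "date_created": date(2023, 6, 1)},
    {"age": 22, "status": "active", "revenue": 1500,
     "customer_type": "standard", "date_created": date(2022, 3, 15)},
    {"age": 40, "status": "inactive", "revenue": 200,
     "customer_type": "premium", "date_created": date(2024, 1, 10)},
]
```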
Join Node Functionality
How do I merge datasets using the Join Node?
The Join Node merges datasets based on common keys:
Join Types:
- Inner Join: Returns only matching records
- Left Join: Returns all records from the left table, plus matching records from the right table
- Right Join: Returns all records from the right table, plus matching records from the left table
- Full Outer Join: Returns all records from both tables
Configuration:
- Select join type
- Choose join keys from each dataset
- Handle duplicate column names
- Preview results before execution
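The difference between the join types is easiest to see on a small example. The sketch below joins two lists of dictionaries on a single key; it is an illustration, not the node's implementation. Two simplifications are assumptions of this sketch: when both sides share a column name the left value wins, and unmatched rows are returned without the other side's columns rather than padded with nulls.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`. `how` is inner, left, right, or full."""
    if how not in {"inner", "left", "right", "full"}:
        raise ValueError(f"unknown join type: {how}")
    # Index the right side by key so each left row finds its matches quickly.
    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)
    result, matched_keys = [], set()
    for l_row in left:
        matches = right_index.get(l_row[key], [])
        if matches:
            matched_keys.add(l_row[key])
            # On duplicate column names, the left value overrides the right.
            result.extend({**r_row, **l_row} for r_row in matches)
        elif how in ("left", "full"):
            result.append(dict(l_row))
    if how in ("right", "full"):
        result.extend(dict(r) for r in right if r[key] not in matched_keys)
    return result

# Hypothetical sample data.
LEFT = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
RIGHT = [{"id": 2, "total": 50}, {"id": 3, "total": 70}]
```

With this data an inner join returns one row (id 2), left and right joins return two rows each, and a full outer join returns three.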
GroupBy Node Operations
How do I aggregate data with the GroupBy Node?
The GroupBy Node aggregates data with various functions:
Available Aggregation Functions:
- COUNT: Count of records in each group
- SUM: Sum of numeric values
- AVG: Average of numeric values
- MIN/MAX: Minimum/maximum values
- FIRST/LAST: First/last values in group
Example Usage:
- Group by customer_id, aggregate SUM(revenue)
- Group by product_category, aggregate COUNT(*)
- Group by date, aggregate AVG(rating)
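The aggregation functions listed above map directly onto ordinary Python functions. The following sketch groups a list of dictionaries by one key and aggregates one column; the `AGGREGATIONS` table, `group_by`, and the sample orders are assumptions for illustration, and FIRST/LAST here follow input order.

```python
from collections import defaultdict
from statistics import mean

AGGREGATIONS = {
    "COUNT": len,
    "SUM": sum,
    "AVG": mean,
    "MIN": min,
    "MAX": max,
    "FIRST": lambda values: values[0],
    "LAST": lambda values: values[-1],
}

def group_by(rows, key, column, func):
    """Group `rows` by `key` and apply the named aggregation to `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {group: AGGREGATIONS[func](values) for group, values in groups.items()}

# Hypothetical sample data.
ORDERS = [
    {"customer_id": "c1", "revenue": 100, "rating": 4},
    {"customer_id": "c2", "revenue": 40, "rating": 5},
    {"customer_id": "c1", "revenue": 50, "rating": 2},
]
```

Here `group_by(ORDERS, "customer_id", "revenue", "SUM")` totals revenue per customer, mirroring the first example usage above.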
Customer Hub
Audience Builder
How does the Audience Builder work?
The Audience Builder helps create targeted customer segments:
Segment Creation Process:
- Define Goals: What business outcome do you want to achieve?
- Set Criteria: Choose visitor characteristics and behaviors
- Add Conditions: Use AND/OR logic for complex segmentation
- Preview Results: See estimated segment size
- Activate: Deploy segment for targeting
Common Segment Types:
- High-value customers
- Cart abandoners
- First-time visitors
- Loyalty program members
Preset Audiences
What are preset audiences and how often are they updated?
Preset audiences are pre-built customer segments that update automatically:
Update Frequency: Every 24 hours
Available Presets:
- High-value customers (top 20% by revenue)
- At-risk customers (declining engagement)
- New customers (first purchase within the last 30 days)
- Frequent buyers (multiple purchases)
- Cart abandoners (items left in cart for more than 24 hours)
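As an illustration of how two of these preset definitions could be computed, the sketch below selects the top 20% of customers by revenue and customers whose first purchase falls within the last 30 days. The exact rules the platform applies (tie-breaking, rounding, time zone) are not documented here, so the field names, rounding up, and the strict date comparison are all assumptions.

```python
from datetime import date, timedelta

def top_percent_by_revenue(customers, percent=20):
    """Return the highest-revenue customers making up `percent` of the list."""
    ranked = sorted(customers, key=lambda c: c["revenue"], reverse=True)
    keep = (len(ranked) * percent + 99) // 100  # integer ceiling, an assumption
    return ranked[:keep]

def new_customers(customers, today, days=30):
    """Return customers whose first purchase is less than `days` days old."""
    cutoff = today - timedelta(days=days)
    return [c for c in customers if c["first_purchase"] > cutoff]

# Hypothetical sample data: ten customers with increasing revenue.
CUSTOMERS = [
    {"id": i, "revenue": i * 100,
     "first_purchase": date(2024, 1, 1) + timedelta(days=i)}
    for i in range(1, 11)
]
```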
Digital Experience Manager
How do I create personalized digital experiences?
The Digital Experience Manager enables experience personalization:
Creation Steps:
- Page Selection: Choose target pages/components
- Audience Targeting: Select audience segments
- Module Linking: Connect personalization modules
- Content Variants: Create different content versions
- Testing Setup: Configure A/B testing parameters
- Launch: Activate experience campaigns
Strategy Definition
How do I define and configure personalization strategies?
Strategies tailor model parameters and content recommendations:
Strategy Components:
- Objective: Business goal (engagement, conversion, retention)
- Model Parameters: Algorithm settings and weights
- Content Rules: What content to recommend when
- Fallback Logic: Default behavior when no match found
- Performance Metrics: KPIs to track success
Audience Overlap Analysis
How can I analyze overlap between different audience segments?
Audience overlap analysis helps understand segment relationships:
Analysis Features:
- Venn Diagrams: Visual representation of overlaps
- Intersection Metrics: Exact overlap percentages
- Unique Segments: Users in one segment but not others
- Comparison Tables: Side-by-side segment characteristics
Use Cases:
- Avoid audience conflicts in campaigns
- Identify cross-sell opportunities
- Optimize segment definitions
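The intersection and uniqueness metrics reduce to set operations on user IDs. The sketch below computes them for two segments; the function name `overlap` and the keys of its result are hypothetical, and the percentages are expressed relative to each segment's own size.

```python
def overlap(segment_a, segment_b):
    """Compare two segments (iterables of user IDs) using set operations."""
    a, b = set(segment_a), set(segment_b)
    both = a & b
    return {
        "intersection": both,
        "only_a": a - b,   # users unique to segment A
        "only_b": b - a,   # users unique to segment B
        "pct_of_a": 100.0 * len(both) / len(a) if a else 0.0,
        "pct_of_b": 100.0 * len(both) / len(b) if b else 0.0,
    }
```

A high `pct_of_a` suggests that most of segment A would also receive campaigns aimed at segment B, which is the conflict case mentioned above.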
Developer Hub
ML Framework Support
Which machine learning frameworks are supported?
Vue.ai supports popular ML frameworks out of the box:
Supported Frameworks:
- Scikit-learn: General-purpose ML library
- XGBoost: Gradient boosting framework
- TensorFlow: Deep learning and neural networks
- PyTorch: Dynamic neural network framework
- LightGBM: Gradient boosting for large datasets
- Statsmodels: Statistical modeling and analysis
Additional Libraries:
- Pandas, NumPy for data manipulation
- Matplotlib, Seaborn for visualization
- Jupyter for interactive development
Model Performance Monitoring
How do I monitor ML model performance?
Comprehensive model monitoring includes:
Monitoring Components:
- MLflow Integration: Experiment tracking and model registry
- Performance Dashboards: Real-time metrics and visualizations
- Automated Alerts: Notifications for performance degradation
- Drift Detection: Data and model drift monitoring
Key Metrics:
- Accuracy, precision, recall, F1-score
- Model latency and throughput
- Feature importance changes
- Prediction distribution shifts
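For reference, the classification metrics in the first bullet can be computed from true and predicted labels as sketched below. This is the standard binary definition written in plain Python, not the platform's monitoring code; `classification_metrics` is a hypothetical name, and undefined ratios (for example, no positive predictions) are reported as 0.0 by assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Tracking these values over successive evaluation windows is one simple way to notice the performance degradation that alerts are meant to catch.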
Model Deployment Options
What are the available model deployment options?
Multiple deployment options for different use cases:
Deployment Methods:
- REST API: Real-time inference via HTTP endpoints
- Batch Processing: Scheduled batch predictions
- Real-time Inference: Low-latency streaming predictions
- Docker Containers: Containerized deployments
- Kubernetes: Scalable container orchestration
Deployment Features:
- Auto-scaling based on demand
- A/B testing for model versions
- Rollback capabilities
- Health checks and monitoring
Custom Library Integration
How do I integrate custom libraries and dependencies?
Several options for custom library integration:
Integration Methods:
- requirements.txt: Standard Python dependency management
- Conda Environments: Conda package manager support
- Docker Customization: Custom Docker images with dependencies
- Virtual Environments: Isolated Python environments
Best Practices:
- Pin specific versions for reproducibility
- Use virtual environments for isolation
- Test custom libraries thoroughly
- Document dependencies clearly
General Platform Questions
Support Access
How do I access support and get help?
Multiple support channels available:
Support Options:
- Support Portal: Online ticket system with tracking
- Documentation: Comprehensive guides and tutorials
- Account Manager: Dedicated support for enterprise customers
- Community Forums: User community and discussions
Response Times:
- Critical issues: 2-4 hours
- High priority: 8-12 hours
- Standard issues: 24-48 hours
Browser Support
Which browsers are supported?
Vue.ai works best with modern browsers:
Recommended Browsers:
- Google Chrome: Latest version (recommended)
- Mozilla Firefox: Latest version
- Safari: Version 14+ (macOS)
- Microsoft Edge: Chromium-based version
Browser Requirements:
- JavaScript enabled
- Cookies enabled
- Local storage support
- Modern CSS support
Platform Updates
How often is the platform updated?
Regular updates with new features and improvements:
Update Schedule:
- Major Releases: Quarterly with significant features
- Minor Updates: Monthly with improvements and fixes
- Security Patches: As needed for security issues
- Hotfixes: Critical bug fixes deployed immediately
Notification Methods:
- In-platform notifications
- Email updates to administrators
- Release notes documentation
- Changelog on support portal
User Permission Management
How do I manage user permissions and access control?
Comprehensive user and permission management:
Permission Features:
- Admin Console: Centralized user management
- Role-Based Access: Predefined and custom roles
- Team Management: Organize users into teams
- Resource Permissions: Fine-grained access control
Available Roles:
- Platform Administrator
- Data Manager
- Workflow Developer
- Data Analyst
- Viewer (read-only)
Security Features
What security features are available?
Enterprise-grade security across the platform:
Security Features:
- Single Sign-On (SSO): SAML and OIDC integration
- Two-Factor Authentication: Additional security layer
- Role-Based Access Control: Granular permissions
- Data Encryption: End-to-end encryption
- Audit Logging: Comprehensive activity tracking
- IP Whitelisting: Network access restrictions
Compliance:
- SOC 2 Type II certified
- GDPR compliant
- HIPAA compliant options
- Industry-standard security practices
Still Need Help?
If you can't find the answer to your question in this troubleshooting guide, we're here to help!
Contact Support
Open a support ticket for technical assistance:
- Priority Support: For urgent issues affecting production
- Standard Support: For general questions and guidance
- Feature Requests: Suggest new features and improvements
Browse Documentation
Explore our comprehensive documentation:
- Getting Started: Platform introduction and basics
- How-to Guides: Step-by-step instructions for specific tasks
- API Reference: Complete API documentation with examples
- Best Practices: Recommended approaches and patterns
Contact Your Account Manager
Enterprise customers can reach out to their dedicated account manager for:
- Strategic guidance and best practices
- Custom integration support
- Training and onboarding assistance
- Escalation of critical issues