Overview
This tutorial walks you through creating your first data view in Pylar. You’ll learn how to join data from multiple sources—Snowflake, HubSpot, and Salesforce—into a unified view that an AI agent can use to answer complex business questions.Use Case: Customer 360 View
We’ll create a Customer 360 View that combines:- Customer transaction data from Snowflake
- Marketing engagement data from HubSpot
- Sales pipeline data from Salesforce
- “What’s the total revenue for customers who engaged with our Q4 campaign?”
- “Show me customers who have deals in Salesforce but haven’t made a purchase yet”
- “Which HubSpot contacts have the highest lifetime value in Snowflake?”
Prerequisites
Before you begin, ensure you have:- ✅ A Pylar account with access to create views
- ✅ Active connections to:
- Snowflake (with transaction data)
- HubSpot (with marketing engagement data)
- Salesforce (with sales pipeline data)
- ✅ A project created in Pylar to organize your views
If you haven’t connected your data sources yet, check out the Making Connections section to set them up first.
Step 1: Create a New View
- Navigate to your project in the Pylar dashboard
- Click “Create View” or “New View”
- The SQL IDE will appear in the center of your screen
Step 2: Understand the Data Structure
Before writing your query, understand what data you’re working with:Snowflake Data (Transaction Database)
- Table:
transactions - Key Columns:
customer_id,transaction_date,amount,product_id
HubSpot Data (Marketing Platform)
- Table:
contacts - Key Columns:
contact_id,email,campaign_engagement,last_contacted
Salesforce Data (CRM)
- Table:
opportunities - Key Columns:
account_id,opportunity_name,stage,amount,close_date
The exact table and column names may vary based on your specific setup. Adjust the query to match your actual schema.
Step 3: Write Your Cross-Database Query
Now we’ll create a query that joins data from all three sources. Here’s the complete view:Step 4: Understanding the Query
Let’s break down what this query does:Data Aggregation
Snowflake Subquery:- Aggregates transaction data by customer
- Calculates total revenue, transaction count, and dates
- Filters to last 12 months of data
- Retrieves marketing engagement data
- Includes email and campaign interaction history
- Filters to active contacts only
- Aggregates sales pipeline data
- Calculates active opportunities and pipeline value
- Excludes closed deals from pipeline calculations
Cross-Database Joins
The query usesFULL OUTER JOIN to:
- Include customers who have transactions but no marketing data
- Include prospects who have sales pipeline but no transactions yet
- Include contacts who have marketing engagement but no sales activity
- Combine all three sources into a unified customer view
Customer Segmentation
TheCASE statement creates a customer segment based on:
- Whether they have revenue (Snowflake)
- Whether they have active pipeline (Salesforce)
- Combining both signals for a complete picture
Using
FULL OUTER JOIN ensures you capture all customers, prospects, and leads regardless of which systems they appear in. This is perfect for a comprehensive 360-degree view.Step 5: Select Your Primary Data Source
- In the SQL IDE, locate the “Select Datasource” dropdown
- Since we’re joining across multiple sources, select your primary source (Snowflake in this example)
- The query will execute using that connection, but can reference other sources
Pylar handles cross-database joins automatically. You can reference tables from different sources using the format:
datasource.schema.table.Step 6: Run and Test Your Query
- Review your query in the SQL IDE
- Click “Run Query” in the top right
- Wait for the query to execute (you’ll see a loader)
- Review the results:
- Check the row count
- Verify data looks correct
- Ensure joins worked as expected
- Review the customer segments
If your query runs successfully, you’ll see a table with all the combined customer data. The row count should reflect customers from all three sources.
Step 7: Save Your View
Once you’ve verified the query works:- Click “Save View” or “Save”
- Enter a descriptive name:
customer_360_view - Add a description:
- Click “Save” to finalize
How an AI Agent Uses This View
Once you create MCP tools on this view and publish them, an AI agent can use it to answer complex questions:Example Agent Interactions
Question: “Show me customers who have engaged with our Q4 campaign but haven’t made a purchase yet” Agent Process:- Agent queries the view with filters:
campaign_engagementcontains “Q4”total_revenueis NULL or 0customer_segment= “Prospect with Pipeline” or “Lead”
- Returns customers who are marketing-qualified but haven’t converted
- Can provide actionable insights for sales follow-up
- Agent queries the view filtering for:
active_opportunities> 0total_revenueis not NULL
- Calculates average revenue from the results
- Provides both the average and context about pipeline value
- Agent queries the view with:
total_revenue> threshold (high value)last_contacted_dateis old or NULL
- Identifies high-value customers needing re-engagement
- Can suggest next steps for marketing team
Why This Works
The unified view enables the agent to:- Answer Complex Questions: Questions requiring data from multiple sources can be answered in a single query
- Understand Context: Customer segments provide context for better responses
- Make Connections: The agent can see relationships between transactions, marketing, and sales
- Provide Insights: Combined data enables more sophisticated analysis
This view becomes a single source of truth for customer data. The AI agent doesn’t need to know about Snowflake, HubSpot, or Salesforce—it just queries this unified view.
Next Steps
Now that you’ve created your first view:- Create MCP Tools: Build tools on this view that agents can use
- Test Your View: Use Evals to see how agents interact with it
- Optimize: Refine based on agent usage patterns
Common Issues and Solutions
Issue: Query Fails with “Table Not Found”
Solution: Verify your table references are correct:- Check datasource names match your connections
- Verify schema and table names are exact
- Ensure you have access permissions
Issue: Join Returns Unexpected Results
Solution:- Verify join keys match across sources
- Check for NULL values that might affect joins
- Use
COALESCEto handle mismatched identifiers - Test joins incrementally (start with two sources)
Issue: Query Runs Slowly
Solution:- Add filters to reduce data volume early
- Use aggregated subqueries (as in our example)
- Limit date ranges to recent data
- Consider materializing frequently-used views
Best Practices Recap
- Use Descriptive Names:
customer_360_viewis clear and specific - Add Comments: Document your query logic
- Test Incrementally: Start with one source, then add joins
- Aggregate When Possible: Pre-calculate metrics in subqueries
- Handle NULLs: Use
COALESCEfor unified identifiers - Filter Early: Apply WHERE clauses in subqueries
Build MCP Tools
Next: Create MCP tools so agents can query this view