Joining Data Sets

In the world of data analysis, combining datasets can unveil hidden patterns, correlations, and insights . By merging related data from multiple sources, you can gain a more comprehensive understanding.

Using Querri to Join Your Data Sets

To join two or more data sets within Querri, follow these simple steps:

  1. Create a Project: Log in to your Querri account and create a new project
  2. Upload Your Data Sets: Upload the related datasets you wish to merge. You can either upload CSV files directly or connect Querri to your Dropbox to import the files.
  3. Bring in Your Data: One by one, bring in each file into your project within Querri.
  4. Start talking to your data! Keep reading for insights into various methods of merging data and practical examples to help you get started.

Expanding Your Data Horizontally

Scenario: Merging Customer Orders with Product Details

Imagine you have a table listing customer order dates, including product IDs, and prices. Separately, you have another table with product IDs alongside product names, category and brand.
(Example Data Sets you can download to experiment with: Products & Prices)

Products view

 

Prices View

 

 

Prompt: "Add product details to the customer orders based on product ID."

 

Add Product details to the customer order based on product ID

By executing this prompt, you're essentially placing the prices table next to the products table, aligning each product's details with the corresponding order. This action doesn't change the total number of rows in your prices table but expands it sideways, adding new columns that provide more context about each product.

Products & Prices files compbined

Combining Data Vertically

Scenario: Aggregating Yearly Sales Reports

Consider you have two separate tables, each representing a years sales data. The structure (columns) of these tables might vary slightly, with some quarters having extra promotional details.

(Example Data Sets: Order Details 2003 & Order Details 2004)

Screenshot 2024-03-23 at 5.41.05 PM

 

Car sales details year 2004

Prompt: "Combine all yearly sales reports into a single report."

This command stacks the yearly tables on top of each other, row by row. The final table will have rows from all years and will include all unique columns. If some quarters have extra columns, these will appear in the combined table, with empty cells in rows originating from quarters where this data was not applicable.

Combine all yearly sales reports into a single report.

Contextualizing Data Side by Side

Scenario: Aligning Employee Data with Office Locations

Suppose you have one dataset listing employees, their roles, and department IDs, and another detailing office locations with corresponding department IDs.

Prompt: "Show employee details along with their office locations based on department ID."

After this action, your employees' table will grow horizontally, with information about office locations appending to the right side. Each row remains an employee record, but now you'll see where each person's department is located, as if you've slid the location details right next to the corresponding department in your original table.

Adding many rows from one table to a single row in another

Scenario: Enhancing Event Data with Participant Feedback

Imagine an events table that includes event IDs, dates, and titles, and a feedback table that collects participant comments and ratings, keyed by event ID. In this scenario, you might have 100 events in one table but 1000 or more pieces of feedback. 

Here, you should choose what is important to know about this joined data and prompt Querri accordingly. One option would be:

Prompt: "Add the feedback count and average rating to each event from the feedback table."

In response, the event table will broaden, integrating feedback summary columns to its right. This doesn't alter the event count but enriches each event's row with direct insights from participants, aligning summary information from the feedback to the event they pertain to.

The other option you have is to add every row of feedback to the events table and have the event information duplicated in each row.

Pompt: "Join all of the event and feedback data, with a row for each of the feedback items"

This will both broaden and lengthen your events table, with a row for each of your feedback items. From here, you can perform a wide range of analysis on feedback, and include in that analysis information about the date, size or location of the event. For example, you can now see how feedback is improving over time, or how it differs from one venue to another.

Conclusion

There are numerous strategies to merge and unify your data, each serving different needs. Whether you're expanding tables horizontally to add detailed context, stacking datasets vertically to create comprehensive records, or aligning related information side by side, understanding these methods enhances your ability to extract meaningful insights from your data. Recognizing how to apply these techniques will empower you to make the most of your information, turning isolated datasets into coherent, informative resources.