There are a variety of ways to search unstructured content inside TIBCO Spotfire. The most common is by using the built-in filters to select a subset of data. With this method there is usually one filter per column.

But sometimes you need to match two values in the same column and evaluate how close together they appear within the same content block. This tip walks you through that method and looks at what can be done with it.

Introducing guest author, Marcio Arbex. Marcio is a Senior Solutions Consultant with Spotfire’s Latin America team. As a Solutions Consultant he helps customers understand the impact that analytics can have in improving their business. In this post, he walks us through a process for searching unstructured data. –Dave

First, let’s assume this data table:

[Image: sample data table]

We’ll create two custom filters by adding two input fields to the text area, along with a slider that sets how close together the matched terms must be. The three controls are defined as follows:

Type: input field
Name: firstWord
DataType: String

Type: input field
Name: secondWord
DataType: String

Type: slider
Name: maxWordSeparation
DataType: Integer

We then link those parameter values with the data table by creating a Calculated Column that uses the “~=” operator to match a regular expression.

[Image: Calculated Column expression]
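To make the matching logic concrete, here is a minimal Python sketch of a proximity regular expression. It is not the Spotfire expression itself. The function name `proximity_tag`, the "No Match" label, the sample sentences, and the choice to match only when the first word comes before the second are all assumptions made for illustration. The parameter names mirror the `firstWord`, `secondWord`, and `maxWordSeparation` controls defined above.

```python
import re


def proximity_tag(text, first_word, second_word, max_word_separation):
    """Tag text as "Match" when second_word follows first_word with at most
    max_word_separation other words in between (hypothetical helper)."""
    # Escape the user input so it is treated as literal text, not regex syntax.
    first = re.escape(first_word)
    second = re.escape(second_word)
    # Allow between 0 and max_word_separation intervening words.
    gap = r"\W+(?:\w+\W+){0,%d}" % max_word_separation
    pattern = r"\b" + first + gap + second + r"\b"
    found = re.search(pattern, text, flags=re.IGNORECASE)
    return "Match" if found else "No Match"


# Made-up sample rows, standing in for a free-text column.
rows = [
    "The pump failed after the oil leak was found",
    "Oil leak detected",
]

# Four words separate "pump" and "leak" in the first row.
tags_wide = [proximity_tag(r, "pump", "leak", 4) for r in rows]
tags_narrow = [proximity_tag(r, "pump", "leak", 3) for r in rows]
```

Raising the slider value widens the allowed gap, so the first row matches with a separation of 4 but not with 3. A value of 0 requires the two words to be adjacent.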

The new Calculated Column tags each row as matched or not matched according to your parameters. The final step is to use “limit data by expression” on a visualization, with an expression that limits the data based on the match status, like this:

[Matches Proximity Search]="Match"

Here is a view of an analysis that shows one table with all data, and another table with just the matched data:

[Image: analysis showing the full table and the matched rows]

For more robust content searching, TIBCO Spotfire Content Analytics can also be used. Content Analytics provides a mechanism for finding contextual meaning and relationships in the data, along with full text search and sentiment analysis. For more information on Content Analytics, click here.
