In my previous blog I introduced you to TIBCO Spotfire and explained why it’s such a powerful tool. In this blog, I’ll show you that, together with additional TIBCO components, it really does provide you with the stuff of dreams: the magic of understanding what is happening now.
We’ll dive into the three components needed to make this magic happen: TIBCO Data Virtualization, TIBCO Data Science, and TIBCO StreamBase / Data Streams. (In case you’re wondering: No, I’m not secretly working for TIBCO. I’m merely sharing my experience, and I haven’t come across any other suite of products that can do what this one can.)
TIBCO Data Virtualization
Spotfire is perfectly able to consume multiple data sources, join them together, wrangle the data into the correct format, and create models that can be reused. But in terms of data governance it might be better to have a central place where we create data models, and make them available in a controlled way, for self-service BI, and for use outside of our analytics environment.
The first time I came across a case like this, where many data sources had to come together, I immediately thought about having a data warehouse created, or a data mart on top of a data warehouse. That has been the standard route for decades. Of course, this works, but there are issues. We would need to duplicate the necessary data, with all the risks and maintenance that involves. This could be costly, error-prone, and time-consuming. But as mentioned earlier, the sheer power of the hardware and networking we now have makes it possible to do things differently, when feasible.
How can we approach this differently? Well, with TIBCO Data Virtualization. This amazing environment makes it possible to build virtual models, virtual data marts, and virtual data layers. Only the structures are built, and the data stays in its own place: in the database, in the Data Warehouse, in the Cloud, in the Hadoop environment, you name it. Every time we need the data, we just access it at its original location. A query optimizer makes it fast, too.
Does this mean that we don’t need a DWH anymore? Well, no, not per se. But it does mean we can start thinking about other solutions. In fact, a growing number of organizations are indeed phasing out their Data Warehouses by putting a Data Virtualization environment in place.
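To make the idea of "only the structures are built" a bit more tangible, here is a minimal Python sketch. It is not the TIBCO Data Virtualization API; the `VirtualView` class, the `customers` table, and the `loans` list are all hypothetical stand-ins I made up for illustration. The view stores only the definition of a join and fetches rows from each source every time it is queried.

```python
import sqlite3

# Source 1: a relational database (an in-memory SQLite database stands in for it).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

# Source 2: records that live somewhere else (a plain list stands in for a cloud store).
loans = [
    {"customer_id": 1, "amount": 5000},
    {"customer_id": 2, "amount": 12000},
]


class VirtualView:
    """Hypothetical virtual view: holds only the join definition, never a copy of the data."""

    def __init__(self, fetch_customers, fetch_loans):
        self.fetch_customers = fetch_customers
        self.fetch_loans = fetch_loans

    def rows(self, min_amount=0):
        # Both sources are read at query time, at their original location.
        names = dict(self.fetch_customers())
        return [
            {"name": names[loan["customer_id"]], "amount": loan["amount"]}
            for loan in self.fetch_loans()
            if loan["amount"] >= min_amount and loan["customer_id"] in names
        ]


customer_loans = VirtualView(
    lambda: crm.execute("SELECT id, name FROM customers").fetchall(),
    lambda: loans,
)

print(customer_loans.rows(min_amount=10000))
```

Because nothing is copied, a new loan added to the source shows up in the very next query on the view. A real data virtualization layer adds things this sketch leaves out, such as query optimization, caching, and access control.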
TIBCO Data Science
TIBCO Data Science is a unified platform that combines the capabilities of TIBCO Statistica, TIBCO Spotfire Data Science (formerly Alpine Data), TIBCO Spotfire Statistics Services, and TIBCO Enterprise Runtime for R (TERR). It contains functionality that is very useful for creating Machine Learning models and data preparation pipelines. These models are basically statistical representations of the available data and can be used to predict behavior or the outcomes of future events. Because Spotfire can run models written in, for example, R or Python, Data Science is the tooling that will help you big time in creating these models. This is the place you want to be when you are looking to implement ModelOps: it gives you a structured way to create, modify, train, and adapt your models, and it simplifies bringing them to production.
Data Science has a load of models on board, but also gives you the opportunity to build or import your own model. It will compare the created models for you and tell you which one has the best fit. In addition, the models can be used wherever you want. As said before, you can load them in Spotfire and use them there, or they can be used in TIBCO Data Streams.
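What "compare the created models and tell you which one has the best fit" boils down to can be sketched in a few lines of plain Python. This is only an illustration of the principle, not how TIBCO Data Science does it internally: the two toy models, the hold-out data, and the choice of accuracy as the fit measure are all assumptions of mine.

```python
# Made-up hold-out data: (annual income in thousands, 1 = defaulted / 0 = repaid).
holdout = [(20, 1), (25, 1), (30, 1), (45, 0), (50, 0), (60, 0), (35, 0), (28, 1)]


def always_repays(income):
    """Baseline candidate: predicts that nobody defaults."""
    return 0


def income_threshold(income, cutoff=32):
    """Second candidate: predicts default when income is below a cutoff."""
    return 1 if income < cutoff else 0


def accuracy(model, data):
    """Fraction of hold-out records the model predicts correctly."""
    hits = sum(1 for income, outcome in data if model(income) == outcome)
    return hits / len(data)


candidates = {"always_repays": always_repays, "income_threshold": income_threshold}
scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
best = max(scores, key=scores.get)

print(scores)
print("best fit:", best)
```

In practice you would compare real trained models on more than one metric, but the loop is the same: score every candidate on data it has not seen and keep the winner.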
TIBCO StreamBase / Data Streams
So far, we have discussed old-fashioned BI: looking back at what happened, analyzing what happened, visualizing data to turn it into information. We also just discussed more modern BI: looking forward, trying to predict the future based on what our models make of it. But there is one tiny slice that we have not yet touched on. The tiny slice between what happened in the past and what might happen in the future, being: What is happening NOW?!
The data that can tell us what is happening now is different from the data we are used to. Data we gathered about the past is stored in, for example, a database. Predictions about the future are stored in a model. But the things that are happening now are just streaming in. So, we need a means to capture this information. Data Streams captures the data streaming in, and gives us the possibility to wrangle, adapt, filter, and transform the tuples of data coming in. Data can be stored temporarily and can be passed on to analytical tools.
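The filter, transform, and temporary storage steps can be sketched with ordinary Python generators. This is not StreamBase or Data Streams code; the event fields, the function names, and the window size of three are assumptions I picked for the example. Each step handles one tuple at a time as it arrives, and a small sliding window plays the role of the temporary storage.

```python
from collections import deque


def only_valid(stream):
    """Filter: drop tuples with a non-positive amount."""
    for event in stream:
        if event["amount"] > 0:
            yield event


def enrich(stream):
    """Transform: add a derived field to every tuple."""
    for event in stream:
        yield {**event, "large": event["amount"] >= 10000}


def windowed_average(stream, size=3):
    """Temporary storage: keep only the last `size` amounts and average them."""
    window = deque(maxlen=size)
    for event in stream:
        window.append(event["amount"])
        yield {**event, "avg_amount": sum(window) / len(window)}


# Made-up incoming events; in a real setup these would arrive continuously.
events = [
    {"id": 1, "amount": 5000},
    {"id": 2, "amount": -1},
    {"id": 3, "amount": 12000},
    {"id": 4, "amount": 7000},
    {"id": 5, "amount": 20000},
]

results = list(windowed_average(enrich(only_valid(iter(events)))))
for row in results:
    print(row)
```

The generators are chained, so nothing is computed until a tuple flows through, which is a reasonable mental model for how a streaming pipeline behaves.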
Bringing it all together
By bringing the four components together (Spotfire plus the three described above), we are creating the big Transformer that can do even cooler stuff than the small ones can separately. Imagine being able to have historical data available, being able to predict the future, and using all this information to interpret what is happening at this moment.
We have a very nice demo running that brings all of this together. The company in the demo is a bank, and people can apply online for a loan. Of course, we have a lot of historical data that we can look at and analyze. We can use TIBCO Data Virtualization to bring all this data together and prep it for usage.
We know that certain customers defaulted on their loans, but why did they do that? Just looking at the data does give us some hints, but we would need to closely inspect the data to get a meaningful answer, using statistical tools that will tell us which correlations we have in our data. Are men more likely to default than women? Do single parents pay their loans less often than married couples?
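The simplest version of such an inspection is comparing default rates between groups. Below is a small Python sketch; the records are invented and the `existing_customer` attribute is just an example, but the same hypothetical `default_rate_by` function would work for any attribute in the data.

```python
from collections import defaultdict

# Invented historical records, for illustration only.
history = [
    {"existing_customer": True, "defaulted": False},
    {"existing_customer": True, "defaulted": False},
    {"existing_customer": True, "defaulted": True},
    {"existing_customer": True, "defaulted": False},
    {"existing_customer": False, "defaulted": True},
    {"existing_customer": False, "defaulted": False},
    {"existing_customer": False, "defaulted": True},
    {"existing_customer": False, "defaulted": False},
]


def default_rate_by(records, key):
    """Return the share of defaulted loans for every value of `key`."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for record in records:
        totals[record[key]] += 1
        defaults[record[key]] += int(record["defaulted"])
    return {group: defaults[group] / totals[group] for group in totals}


rates = default_rate_by(history, "existing_customer")
print(rates)
```

A difference in rates is only a hint. Whether it is statistically meaningful, or explained by another attribute, is exactly what proper statistical tooling has to tell us.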
The fact that we have a lot of data that is difficult to interpret is one part of the problem we are trying to solve. In the meantime, applications keep coming in. It would be a good idea to have a model in place that helps us predict which customers will be likely to default on their loans, agreed? This is where TIBCO Data Science comes in. Using Data Science, we can set up and train a model to predict whether an application will be successful or not.
Then we make sure the online applications come in through TIBCO Data Streams. Here we preprocess the application to flow into our visualization tool Spotfire, but we also run it through our predictive model. The model will use all relevant information about the applicant and the application. What is her annual salary, his home situation? Is he an existing customer? How did she behave on her last loan?
The result is a fully automated system that can potentially approve or reject the application for a loan in real time, based on the data we already have. Or, we can choose to use the model to help us in the approval process. Whichever option suits our needs and purposes best.
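Both options, fully automated and model-assisted, can live in the same decision step. The Python sketch below is not the model from our demo: the logistic form, the weights, the field names, and the two thresholds are all made up to show the mechanics. Clear cases are decided automatically, and anything in between is handed to a human.

```python
import math

# Hypothetical weights; a real model would learn these from historical data.
WEIGHTS = {
    "bias": -1.0,
    "income_k": -0.05,           # annual income in thousands
    "existing_customer": -0.8,   # 1 if the applicant is already a customer
    "late_payments": 0.9,        # late payments on the previous loan
}


def default_probability(application):
    """Toy logistic model: estimated probability that the applicant defaults."""
    z = WEIGHTS["bias"] + sum(
        WEIGHTS[field] * application[field]
        for field in ("income_k", "existing_customer", "late_payments")
    )
    return 1 / (1 + math.exp(-z))


def decide(application, approve_below=0.2, reject_above=0.6):
    """Automate the clear cases and leave the doubtful ones to a person."""
    p = default_probability(application)
    if p < approve_below:
        return "approve"
    if p > reject_above:
        return "reject"
    return "manual_review"


incoming = [
    {"income_k": 60, "existing_customer": 1, "late_payments": 0},
    {"income_k": 20, "existing_customer": 0, "late_payments": 4},
    {"income_k": 30, "existing_customer": 0, "late_payments": 2},
]
decisions = [decide(app) for app in incoming]
print(decisions)
```

Moving the two thresholds closer together makes the system more automated; moving them apart sends more applications to a human reviewer.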
Your needs and purposes
Did any part of this blog sound like it could be useful to you or your company? Then please do not hesitate to reach out to us. We’d be happy to answer your questions or schedule a no-strings-attached meeting to discover what BI dreams you’d like to make reality.