How predictive data mining can help forecast the online behavior of consumers (podcast episode)

April 28, 2023

In today’s episode of the Brains Byte Back podcast, we speak with Walter Paliska, Vice President of Marketing at dotData, a company that democratizes the use of AI and Machine Learning by making it simple for organizations to leverage the power of their data through fast, unique, and easy-to-use tools.

In this episode, we discuss how the company first began, how it has grown, and the solutions it currently offers. We also explore predictive data mining and how it helps businesses leverage historical data to make accurate predictions about future behavior or outcomes. 

Paliska draws on the example of marketing, arguing that a company could use predictive data mining to predict the behavior of visitors on a website for personalized targeting.

Additionally, Paliska shares how dotData was founded, after the company’s CEO and founder, Ryohei Fujimaki, noticed that his team of data scientists would spend a huge amount of time on the feature engineering process. 

Since this was the most time-consuming and manual part of the data science process, Fujimaki decided to use automation to solve the problem, and dotData was born.

Paliska also shares how the company helps clients save time and money on data science projects, democratizes data science for non-data scientists, and improves the accuracy and speed of machine learning models.

He argues that these are just a few of the ways dotData distinguishes itself from the competition, but maintains that one of the biggest differentiators is dotData’s core engine, which automates the feature engineering process.

Feature engineering is a vital step in building effective machine learning models, but it is also complex and time-consuming. Normally, data scientists would spend months building feature tables that then have to be fed manually into machine learning algorithms.

However, dotData automates this process, identifying the connections between different tables and building feature tables automatically. According to Paliska, this approach allows dotData to stand out and is one of the key reasons why the company has achieved such a high level of success in the market.

You can listen to the episode below, or on Spotify, Anchor, Apple Podcasts, Breaker, Google Podcasts, Stitcher, Overcast, Listen Notes, PodBean, and Radio Public.

Alternatively, you can find a transcript below:

Walter: My name is Walter Paliska. I’m the VP of Marketing for dotData. I’ve been with dotData since May of 2019, so almost four years now. dotData is a leading provider of data science automation solutions. Broadly speaking, we have two customer types that we tend to target. One is experienced data science teams, primarily in larger organizations, that are looking to automate the feature engineering part of their work. The other is companies that are just getting started in the world of predictive analytics and data science, and are looking for automation solutions to empower non-data scientists in the process of building machine learning models and data science processes to do predictive analytics.


Sam: Awesome, fantastic. Well, thank you so much for joining me today. It’s a pleasure to have you here. And I’m really curious to know, when and how did dotData first start?

Walter: Yeah, so great question. The dotData story actually goes back quite a ways. Ryohei Fujimaki, the CEO and founder of dotData, is a former employee of NEC in Japan. He was what’s known as an NEC Research Fellow. Now, there haven’t been that many in the history of the company, and NEC is quite an old company that goes back well over 100 years. I forget the exact number, and I don’t want to lie to you, but I know there have been very few NEC research fellows in the history of the company, and he was the youngest one ever. He was part of their data science team, and pretty much ran their data science organization from a services perspective, so they would do project-based work going into accounts. The idea behind dotData originated through the course of his experience with NEC. One of the things he consistently kept noticing was that his team of data scientists would always spend an inordinate amount of time on specific parts of the data science process. Getting a little bit into the technical weeds here, there’s one part especially, known as feature engineering, which is really the most time-consuming, the most manual part of the process. He kept seeing that they would literally be spending months on the feature engineering process, and they would still be at a point where they hadn’t even played with any machine learning algorithms yet to figure out what the model should look like. That’s obviously an aha moment that tells you there’s something there. There’s a need here, and automation could perhaps solve that problem. That’s where the idea behind dotData originated, and the company was founded as a spin-off from NEC Corporation in 2018. So it was originally born in Japan, but it is fully headquartered in the United States. We have our headquarters staff here, and we’re a pretty distributed company. We’re worldwide: we have people in Europe, we have people in Japan, we have people in the United States. And that brings us to where we are today.

Sam: Awesome. That’s a fantastic success story. And I’m also very curious to know what is the story behind the name dotData because for our listeners, it’s spelled like dot d o t, but with a lowercase d, and then data with a capital D right after it. Where did that come from?

Walter: Great question. So the original inspiration behind the name dotData really comes from one of the ideas that the group of people who founded the company was toying with. We’re talking about when this idea was first being kicked around, probably 2016, 2017. There were a lot of conversations going on in the world about data, the volume of data, and how much data was being generated around the world on a daily basis. And it sort of dawned on them that the previous iteration of the internet, so to speak, in the 1990s and the 2000s, was all about .net, right? The network. And they thought, well, the next iteration of the world is really going to be about data. So instead of .net, dot data. That was the original intent behind the dotData name.

Sam: Okay, yeah, that makes sense, with a lot more clarity with that in mind. I also really do love the alliteration of it, the D D, the dot data. It’s got a really nice sound to it when you say it.

Walter: Yeah, it is a memorable name, and it’s easy to market.


Sam: Yeah, yeah, I completely get that. And I also want to know: in November last year, you folks at dotData published an article called “What is predictive data mining?” Obviously, I would highly recommend listeners go check it out. But while you’re here, could you give us a brief overview of what predictive data mining is?


Walter: Sure, absolutely. For those in the audience that are not familiar with it, you may have heard of it in different terminology. You may have heard it referred to as predictive analytics, predictive data mining, or data mining by itself. Technically, they’re not quite the same thing if you really want to split hairs, but for a broad audience, predictive data mining and predictive analytics are really about leveraging historical data that you have in your organization. For example, a good use case might be in marketing. You might want to predict the behavior of visitors on your website for a shopping cart, right? You have historical data about the actions that certain people take before they purchase a specific product. You want to mine that data, and use very specific techniques and algorithms, things like decision tree analysis, rule induction, clustering, outlier detection, and other types of data mining techniques, to identify the patterns and build the insights. That’s the insights part of predictive analytics. These are also sometimes referred to as features in the world of machine learning. You figure out the insights that tell you, okay, this is what tends to happen every time somebody buys a particular product, in the example we just had. And then you take it one step further and say, okay, now I can use certain machine learning algorithms to try and predict the probability of somebody purchasing a product when they take certain specific actions. The reason that’s important to you as a marketer, for the example that I just gave, is that if I can predict with a certain degree of accuracy what’s going to happen when somebody takes certain specific actions, I can drive people towards those actions. I can now leverage that information to optimize my marketing campaigns. So that’s predictive data mining in a nutshell. It gets a lot more complicated very quickly, and there’s a lot more to talk about. I’ve given you literally the marketing guy’s 30-second version. There’s a lot more information available on our website, and we’re happy to meet with anybody that wants to learn more about this. We’re very keen on educating the market as much as possible.
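
The shopping-cart example above can be made concrete with a small sketch. It is only an illustration of the idea of estimating purchase probability from historical actions, not dotData's method and not one of the algorithms Walter lists: it uses a plain frequency count, and the session data and action names (`viewed_reviews`, `added_to_cart`, `used_search`) are invented for the example.

```python
from collections import defaultdict

# Hypothetical historical sessions: the actions a visitor took,
# and whether that visit ended in a purchase.
SESSIONS = [
    ({"viewed_reviews", "added_to_cart"}, True),
    ({"viewed_reviews"}, False),
    ({"added_to_cart"}, True),
    ({"viewed_reviews", "added_to_cart"}, True),
    ({"used_search"}, False),
    ({"used_search", "added_to_cart"}, False),
]

def purchase_rate_by_action(sessions):
    """Estimate P(purchase | action) as a simple frequency over past sessions."""
    seen = defaultdict(int)    # sessions in which the action occurred
    bought = defaultdict(int)  # of those, sessions that ended in a purchase
    for actions, purchased in sessions:
        for action in actions:
            seen[action] += 1
            if purchased:
                bought[action] += 1
    return {action: bought[action] / seen[action] for action in seen}

rates = purchase_rate_by_action(SESSIONS)
# The action most associated with purchases in this toy data set.
best_action = max(rates, key=rates.get)
```

A marketer could use a ranking like `rates` to decide which actions to steer visitors toward. A real model would combine many signals at once and be validated on data it has not seen.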

Sam: Yeah, I can imagine that can get pretty complicated pretty quickly. So I really do appreciate you giving us that brief overview. And I think you did a good job of summarising what appears to be a very complex topic. Now, I also want to know, are there other companies operating in this space? And if so, how do you folks at dotData differentiate yourselves from the competition?

Walter: Great question. So the short answer is yes, of course, there are plenty of other companies that operate in this space. Having said that, the one thing that is probably also a truism about the machine learning and predictive analytics space is that it is developing and changing at an incredibly fast pace. If you look at the positioning of companies, say, three or four years ago, when I first joined dotData, versus how those same companies are positioning themselves today, how their products are built today, what they do today, those are radically different conversations. And that’s really been driven largely by how quickly the market is developing. Through all of that, however, the one huge differentiator for dotData really comes down to the core engine of dotData and how dotData works. One of the things that we haven’t talked about, and again, this gets a little bit into the technical depths of this conversation, but it’s important, is that in the world of predictive analytics, when you go and use these machine learning algorithms to build your predictive models, these machine learning algorithms like flat tables, essentially. They’re not happy otherwise. I don’t know how much you know about enterprise data, but especially in the world of enterprise data, if you think about something like salesforce.com, for example, as a user of salesforce.com, I just see a leads screen. It has lead information, and it has my activities against my leads, all in one location. But if I take the covers off of that and look underneath in the guts, so to speak, of how this system operates, it’s basically what’s called a relational database. All of those fields that I’m seeing are really parts of different tables living in different parts of Salesforce, and they’re all connected together. Well, machine learning algorithms don’t like those things. Machine learning algorithms like flat tables. Machine learning algorithms like things that look like CSVs, that look like spreadsheets. So a big part of machine learning is what’s called feature engineering, which is essentially a process of taking these complex relational data tables, figuring out the patterns that make sense for your machine learning algorithm, and building these flat tables that you then have to feed into machine learning algorithms. The biggest core differentiator for dotData is that we do that part automatically. That’s traditionally a very hands-on process. If I go back to your very first question about how dotData originated, that was the aha moment that our CEO had: watching these data scientists spend literally months building these feature tables that they would then have to manually put into the machine learning algorithms, and realizing there’s got to be a better way. We have to be able to build a system that will automatically find the connections between these tables, automatically identify the patterns that are relevant and purposeful, and build these feature tables automatically. So that’s by far our biggest differentiator. And today, we’re really the only company in the market that provides that functionality.
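
To illustrate what turning relational tables into a flat table means, here is a minimal hand-written sketch. It is not dotData's engine, which according to Walter discovers such features automatically. The `LEADS` and `ACTIVITIES` tables, their columns, and the `n_<type>` feature names are all hypothetical, loosely modeled on the leads-and-activities example above.

```python
from collections import Counter

# Hypothetical relational data: one row per lead, many activity rows per lead.
LEADS = [
    {"lead_id": 1, "industry": "retail"},
    {"lead_id": 2, "industry": "finance"},
]
ACTIVITIES = [
    {"lead_id": 1, "type": "email"},
    {"lead_id": 1, "type": "call"},
    {"lead_id": 1, "type": "email"},
    {"lead_id": 2, "type": "call"},
]

def build_feature_table(leads, activities):
    """Join activities to leads and aggregate them into one flat row per lead."""
    counts = {lead["lead_id"]: Counter() for lead in leads}
    for activity in activities:
        if activity["lead_id"] in counts:
            counts[activity["lead_id"]][activity["type"]] += 1
    # One count column per activity type seen anywhere in the data.
    activity_types = sorted({a["type"] for a in activities})
    table = []
    for lead in leads:
        row = dict(lead)
        for kind in activity_types:
            row[f"n_{kind}"] = counts[lead["lead_id"]][kind]
        table.append(row)
    return table

feature_table = build_feature_table(LEADS, ACTIVITIES)
```

Each row of `feature_table` is now a single spreadsheet-like record per lead, which is the shape most machine learning algorithms expect. In practice, the hard part is deciding which joins and aggregations are worth building, and that is the step Walter describes automating.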

Sam: Okay, that makes sense. Yeah. And I always love the fact that whenever I interview people, it always seems like there is some kind of aha moment at the basis of all of these companies I speak with. And it’s really, it’s a really fun part of my job getting to that core drive, I guess, so that makes absolute sense. And I’m curious to know, what is next on the horizon for you folks at dotData?

Walter: So I think from a couple of perspectives, right. One is that, obviously, as a business, growth is the biggest area of interest for us. And one of the things that we think, especially given the economic uncertainty that’s happening right now, is that systems like dotData actually become even more beneficial to organizations. When investment capital was plentiful, organizations didn’t have to worry about headcount. You could just hire to solve the problem, right? You needed to do things faster, you hired more data scientists. You needed to build products faster, you hired more data engineers, and so on. Well, with the economy doing what it’s doing right now, we’re actually seeing an uptick in demand. We’re seeing more companies saying, I don’t have the ability to expand my team, I don’t have the permission, so to speak, financially, to go and hire 10, 15, 20 more data scientists. So how do I make my existing team more productive? And that’s where dotData can give them a lot of help. So we see a lot of opportunity in the short term, as well as the long term, from that perspective. And obviously, we have a whole lot of ideas and a whole lot of new things coming down the line from a product perspective, most of which I can’t really talk about just yet, but there are some very exciting things coming in the second half of this year that will continue to expand on the capabilities of the product and also take us into some new areas we haven’t been in before.

Sam: Fantastic. Well, it sounds like you folks have a lot going on. And I wish you the best of luck with that. And if people are listening, and they’re interested in keeping up with you, personally, Walter or dotData, where can they go to do that?

Walter: Great question. So dotData is the easiest one: just go to dotdata.com. To connect with me personally, you can find me on the leadership page. If you go to our about page and then leadership, you’ll see my picture and my bio, and you can click directly to my LinkedIn profile. Or my LinkedIn profile is simple enough, it’s just https://www.linkedin.com/in/walterpaliska/. Go to my LinkedIn profile and just reach out to me.

Sam: Excellent. Well, we’ll include links as well in the description of this episode, so listeners can go there. But otherwise, Walter, thank you so much for joining me today.

Walter: Thank you very much for the opportunity, and thank you to all your listeners.

Disclosure: This episode includes a client of an Espacio portfolio company
