Building a Visual Text Analytics app using Qlik and Machine Learning techniques in NodeJS — Part 1

Dipankar Mazumdar
Aug 9, 2021

In my last post, I tried to explain what Visual Analytics is and, with some examples, highlighted how it differs from Data Visualization in general. Today, we are going to talk about one particular research area within Visual Analytics: Visual Text Analytics. This post focuses on the nitty-gritty of this area of research; in my next post, I will walk step by step through how you can actually develop the application.

VISUAL TEXT ANALYTICS:

With the surge in digital text generated on the web in the form of product reviews, descriptions, feedback, etc., there has been a growing demand for text mining techniques that help understand and analyze this unstructured data. Typically, organizations would like to identify patterns, specific keywords (the ones that make an impact), similarities, etc. through text mining. However, uncovering hidden patterns in a large, noisy text corpus can be a huge and at times daunting challenge for analysts. To address this challenge, Visual Text Analytics aims to bring text mining, text visualization, and Human-Computer Interaction together to make sense of the data.

SOLUTION:

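In the past, I have built a couple of Visual Text Analytics applications using a technology stack that included D3.js, Plotly/Dash, and Python Flask (for APIs), and I thought it might be interesting to try developing an app using Qlik Sense’s open-source solutions. For this blog, we will primarily be looking at two of Qlik’s frameworks: Nebula.js and Picasso.js. If you are not aware of them, here is a quick gist: Nebula.js is a collection of JavaScript libraries for building and integrating visualizations on top of Qlik’s Associative Engine, and Picasso.js is a charting library for composing custom, interactive visualizations from components.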

So, what will we be building?

My idea is to build an exploratory visual analytics app to discover insights from a Cannabis dataset. This will be a full-stack application for analyzing fields such as ‘Effects’, ‘Flavors’, ‘Type of cannabis strains’ and ‘Description’. In this particular dataset, the ‘Description’ field is free text containing a summary of a particular strain, so it will be our focus for the text analytics part. Below is an example of the ‘Description’ field:

Strain(A-10): A-10 has an earthy, hashy taste that provides a very heavy body stone. frequently used to treat insomnia and chronic pain.
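To give a feel for what "text mining" on a field like this can look like in NodeJS, here is a minimal sketch that tokenizes the description above and counts term frequencies. It is only an illustration and not necessarily the technique the final app uses: the function names (`tokenize`, `termFrequencies`, `topTerms`) and the tiny stop-word list are assumptions made up for this example.

```javascript
// Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
const STOP_WORDS = new Set(['a', 'an', 'and', 'has', 'that', 'the', 'to', 'used', 'very']);

// Lowercase the text, keep alphabetic runs only, and drop stop words.
function tokenize(text) {
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  return words.filter((word) => !STOP_WORDS.has(word));
}

// Count how often each term occurs across a list of descriptions.
function termFrequencies(descriptions) {
  const counts = new Map();
  for (const description of descriptions) {
    for (const term of tokenize(description)) {
      counts.set(term, (counts.get(term) || 0) + 1);
    }
  }
  return counts;
}

// Return the n most frequent terms as [term, count] pairs, highest count first.
function topTerms(counts, n) {
  return [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, n);
}

// The sample 'Description' value quoted above.
const descriptions = [
  'A-10 has an earthy, hashy taste that provides a very heavy body stone. frequently used to treat insomnia and chronic pain.',
];

console.log(topTerms(termFrequencies(descriptions), 5));
```

With a single description every term appears once, so the counts only become interesting when the whole dataset is passed in. A real application would likely swap the hand-written stop-word list for a fuller one and add steps such as stemming.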

To start developing the application, I have designed a high-level architecture to portray the various components involved in building the app. Hopefully, this will give a better picture for our next steps.

We will go through each of these components in detail in our next tutorial and see them in action as we finish developing the app.

Dipankar Mazumdar

Dipankar is currently a Staff Data Engineering Advocate at Onehouse.ai where he focuses on open source projects in the data lakehouse space.