2 Open source business analytics tools
Guillem Borrell Nogueras edited this page 1 year ago

I've already used many business analytics tools (free, hosted, locked...), and all of us should feel fortunate that Metabase and Superset are free.

Metabase

I started using Metabase in projects around the end of '19. I liked its simple-but-get-stuff-done approach. The no-code interface feels unintuitive at first, and you will be confused sometimes about how dashboard-global filters work. But if I had to choose something really great about Metabase, it's that it's incredibly simple to deploy and it's probably the most efficient business analytics tool that there is. CPU and memory pressure is always ridiculously low. It really does a great job at pushing the load to the database, like it's not doing any serialization.

You deploy Metabase, not a service that needs some additional cache, a task queue... You just deploy a container with Metabase, point requests to the corresponding port, configure the database connectors, and that's it. It is really a component that you can get up and running in your stack in no time.

Metabase has a paid edition with some additional features. The most relevant of them by far is SSO support. Other "enterprise" features, like the authorization layer, made it into the community edition.

Superset

Metabase was my comfort zone until I needed some advanced geo representation and I tried Superset. It's harder to install; I remember struggling for a couple of hours to get dependencies right for 2.0. I then packaged them in a script so I could have reproducible installs, but it was nothing close to the absolute simplicity of Metabase.

The instructions recommend installing Superset with Celery and some cache, either Memcached or Redis. I've never installed them, and the server could always sustain the load of teams of 5-6 analysts and consultants with 4 gunicorn workers. But these workers often hit 100% CPU load.
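
For reference, this is roughly what the optional cache and task queue look like in a `superset_config.py`, which is a plain Python module. It's a minimal sketch, not the setup I ran (I skipped both pieces). The Redis host, port, database numbers, timeout and key prefixes are all assumptions for illustration, and the setting names are the ones I recall from the Superset and Flask-Caching documentation, so check them against the version you install.

```python
# superset_config.py (sketch): optional Redis cache and Celery task queue.
# Everything below is illustrative; the setup described above ran without it.

REDIS_HOST = "localhost"  # assumption: Redis runs next to Superset
REDIS_PORT = 6379         # assumption: default Redis port

# Cache for Superset's own metadata, backed by Flask-Caching.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,  # seconds, arbitrary choice
    "CACHE_KEY_PREFIX": "superset_metadata_",
    "CACHE_REDIS_URL": f"redis://{REDIS_HOST}:{REDIS_PORT}/0",
}

# Cache for chart query results; same backend, different key prefix.
DATA_CACHE_CONFIG = {**CACHE_CONFIG, "CACHE_KEY_PREFIX": "superset_data_"}


class CeleryConfig:
    """Celery settings, using Redis as both broker and result backend."""

    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/1"
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/2"


CELERY_CONFIG = CeleryConfig
```

With a cache like this in place, repeated dashboard loads should hit Redis instead of keeping the gunicorn workers busy, which is probably where the 100% CPU spikes came from.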

It's more powerful but less polished than Metabase in almost every single aspect. It's trivial to see who has access to which dataset or dashboard with Metabase, but Superset's fine-grained permissions are harder to grasp, and you need some careful reading to understand how they work.

Superset is just a Flask application with a React frontend, so you get some of the enterprise-ish features out of the box. This is relevant if you want to make it part of a client deliverable. Keeping separate authentication ends up becoming an annoying and time-consuming task.

Which one to pick?

I remember Metabase becoming the way in which the Analytics team replaced internal presentations. Metabase kind of pushes you to generate simple visualizations and dashboards. The fact that Metabase analyzes the data and suggests which kinds of visualization it can produce is of great help. Maybe you thought your data would fit into a waterfall, but it doesn't... I remember the partners of the case logging in, taking a look at a couple of key metrics, and leaving us alone. It was a great tool to leverage in internal meetings, but its simplicity always required some amount of voice-over.

Superset always made a great deliverable. My team could deliver better dashboards with Superset, particularly if we had to expose very complex data. The amount and variety of geoanalytics capabilities of Superset are a lifesaver sometimes. I remember combining H3 hexes with geo-hexes, grids and scatterplots to generate very fast and eye-catching visualizations.

But business consultants kept using Alteryx and Tableau

One important point about these tools is that they're designed with analysts in mind. Analysts can use them efficiently because they build upon SQL semantics, even if you're using the no-code interfaces. You must understand joins and aggregations, and these concepts are not at the fingertips of business consultants. And, let's be honest, business consultants are the final users of these tools.
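
To make "joins and aggregations" concrete, here is the kind of question a no-code query builder ends up expressing in SQL. It's a minimal sketch using Python's built-in `sqlite3` module; the `customers` and `orders` tables and their contents are made up for illustration.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'North');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0), (4, 3, 30.0);
    """
)

# "Revenue and number of orders per region":
# the JOIN attaches a region to every order, and
# GROUP BY with SUM/COUNT collapses the orders into one row per region.
revenue_by_region = conn.execute(
    """
    SELECT c.region, SUM(o.amount) AS revenue, COUNT(*) AS n_orders
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()
conn.close()

print(revenue_by_region)  # [('North', 180.0, 3), ('South', 70.0, 1)]
```

Picking "orders", joining "customers" and summarizing by "region" in a point-and-click editor is the same operation, which is why the interface only helps if you already think in these terms.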

One underestimated aspect of project execution is that, while most business consultants are fine with working long hours preparing a couple of slides, they won't invest a second learning a tool that they will use for a single project, even if it improves teaming. They will pull an all-nighter to improve a Tableau dashboard or finish an Alteryx workbook, but they won't spend the 4-5 hours needed to learn the basic bits of SQL.

If your goal in choosing a business analytics platform is to trick consultants into using it, you'll probably fail. My advice here is to pick the one an analyst will be happier and more productive with.