Keboola projects tend to grow over time. Transformations are added, data is adjusted, and unless you have good habits and a strong moral sense, there is rarely time to check thoroughly whether, for example, some data can be deleted. The situation is even worse when you inherit a project from someone else.
You can find a lot of useful information in Keboola Telemetry. However, you cannot easily find out which data is being downloaded but never used (tables that are not used in any transformation or written by any writer). For this more detailed analysis, I wrote a Python script that not only gives me basic information about the project (all buckets, components, and tables) but can also list the tables in Storage that are not used. Are you interested in what the result looks like? Check out the sample report.
To make the script easy to use, I have shared it in Google Colab, where you can run it for free. The source code is also available (under the CC-BY license), so you can download the entire script, explore the code, and customize it to your needs. Do you want to know the details about your project? Then go for it.
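To give a feel for the idea behind the unused-table check, here is a minimal sketch. It is not the shared script itself: the function name `find_unused_tables`, the `input_tables` field, and the sample table IDs are all hypothetical, and the sketch assumes you have already collected the list of tables and the tables each transformation or writer reads.

```python
from typing import Dict, Iterable, List


def find_unused_tables(all_tables: Iterable[str],
                       configurations: Iterable[Dict]) -> List[str]:
    """Return tables that no configuration reads (hypothetical data model)."""
    used = set()
    for config in configurations:
        # "input_tables" is an assumed field name used only for this sketch.
        used.update(config.get("input_tables", []))
    # Anything in storage that never appears as an input is a candidate.
    return sorted(set(all_tables) - used)


# Made-up sample data: three tables, two configurations.
all_tables = ["in.c-crm.customers", "in.c-crm.orders", "in.c-ads.clicks"]
configurations = [
    {"name": "Orders transformation", "input_tables": ["in.c-crm.orders"]},
    {"name": "Customers writer", "input_tables": ["in.c-crm.customers"]},
]

print(find_unused_tables(all_tables, configurations))
# prints ['in.c-ads.clicks']
```

A plain set difference like this only flags candidates. Whether a table can really be deleted is still a decision for someone who knows the project.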
How to Find the Correct Keboola API Address
The Keboola API address depends on where your project is "hosted." You can easily find it in the address bar after logging into your project, because it is the initial part of the address. So, if your address is in the following format, the part highlighted in red (including the .com ending) is also the Keboola API address:
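If you prefer not to cut the address by hand, the same rule can be sketched in a few lines of Python. The helper name `api_base_url` and the sample address are assumptions made for illustration; use whatever address you actually see in your own browser.

```python
from urllib.parse import urlsplit


def api_base_url(browser_url: str) -> str:
    """Keep only the scheme and host of the address copied from the browser."""
    parts = urlsplit(browser_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("Expected a full address starting with https://")
    return f"{parts.scheme}://{parts.netloc}"


# Illustrative address only; the project path is made up.
example = "https://connection.keboola.com/admin/projects/1234/dashboard"
print(api_base_url(example))
# prints https://connection.keboola.com
```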
Creating an API Token
You can create an API token in your project in five simple steps: