If you’re going to take the time to attend a conference, you want it to deliver on both the art of the possible and practical value. I attended the second (tech) day of the Everyday AI Conference in New York City and learned a lot more about Dataiku and what it can do. For those of you who are diving more deeply into Dataiku in your everyday work, here are a few tips and tricks I picked up that can help you make the most of this powerful data tool.
Code Studios: Your Developer’s Playground
Dataiku’s Code Studios is a game-changer for developers. It offers an environment akin to popular integrated development environments (IDEs) like Visual Studio Code (VS Code). This separate sandbox within Dataiku allows developers to work without affecting the main instance’s load. You can even run a terminal within Code Studios, providing a seamless development experience. But here’s the real gem: you can sync your code from Code Studios into your Dataiku Flow, making collaboration and integration a breeze.
Managed Folders: Organize Your Data Your Way
With Managed Folders, Dataiku gives you the flexibility to store various types of files within a project. Whether it’s shell scripts, pickle files, or CSVs, you can neatly organize and access your resources. This feature streamlines your workflow and ensures that all your project-related assets are at your fingertips.
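As a rough sketch of what that organization might look like from a Python recipe or notebook, the helper below groups folder paths by file extension. It is plain Python, so it runs anywhere. The commented lines show how the paths would typically come from a managed folder through the `dataiku` package. The folder name `project_assets` and the helper `group_paths_by_extension` are hypothetical, made up for this illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_paths_by_extension(paths):
    """Group managed-folder paths (e.g. '/scripts/run.sh') by file extension."""
    groups = defaultdict(list)
    for path in paths:
        # Extensions are lower-cased; files without one land under the "" key.
        ext = PurePosixPath(path).suffix.lower()
        groups[ext].append(path)
    return {ext: sorted(items) for ext, items in groups.items()}


# Inside Dataiku, the paths would come from a managed folder, for example:
#   import dataiku
#   folder = dataiku.Folder("project_assets")  # hypothetical folder name
#   paths = folder.list_paths_in_partition()
# A hard-coded list stands in here so the sketch runs outside Dataiku.
paths = [
    "/scripts/cleanup.sh",
    "/models/churn.pkl",
    "/data/customers.csv",
    "/data/orders.CSV",
]
grouped = group_paths_by_extension(paths)
print(grouped)
```

Keeping the grouping logic separate from the folder access makes it easy to try out before pointing it at a real project.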
Personal API Keys: Access Everything with Ease
Personal API Keys let you run any task you have permission for within your project. They simplify and speed up your work by eliminating unnecessary steps and providing direct access to the tools and functions you need.
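To make that a little more concrete, here is a minimal sketch of how a personal API key might be attached to a call against the Dataiku public REST API, using only the Python standard library. It rests on two assumptions you should check against your own instance's documentation: that the key is sent as the HTTP Basic username with an empty password, and that the API lives under `/public/api`. The host, the key, and the helper `build_dss_request` are hypothetical. The request is only built, never sent.

```python
import base64
import urllib.request


def build_dss_request(host, api_key, path):
    """Build (but do not send) an authenticated request to the DSS public API."""
    # Assumption: the API key is the Basic-auth username, with an empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    # Assumption: public API endpoints sit under /public/api on the instance.
    url = host.rstrip("/") + "/public/api" + path
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Hypothetical host and key; substitute your instance URL and personal API key.
request = build_dss_request(
    "https://dss.example.com", "my-personal-api-key", "/projects/"
)
print(request.full_url)
# Sending it would look like: urllib.request.urlopen(request).read()
```

In practice, the `dataikuapi` Python client wraps calls like this, so you would rarely build requests by hand.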
MLflow Integration: Seamlessly Import and Reuse Models
If you’re into machine learning, Dataiku’s MLflow integration is a big time-saver. It allows you to import models, regardless of what they were originally trained on. What’s even better is that you don’t need to rewrite your code to make these models work within Dataiku. MLflow has zero dependencies on Dataiku-specific packages, which keeps model integration simple.
Join: Beyond the Basics
Dataiku’s Join function goes beyond the basics. It lets you join across multiple tables, attaching each new table to any table already in the recipe rather than only to the leftmost one. What’s more, you can join another table directly from the Join screen, even if it isn’t already an input in your recipe. Say goodbye to unnecessary steps and enjoy the flexibility of a single tool for single or multi-joins. It’s an improvement over other platforms, like Alteryx.
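For readers who think in SQL, joining to a table other than the leftmost one looks roughly like the sketch below, run here with Python's built-in `sqlite3` module. It illustrates the idea only and is not how Dataiku runs the recipe. The tables `orders`, `customers`, and `regions` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (customer_id INTEGER, region_id INTEGER);
    CREATE TABLE regions (region_id INTEGER, region_name TEXT);
    INSERT INTO orders VALUES (1, 10), (2, 11);
    INSERT INTO customers VALUES (10, 100), (11, 101);
    INSERT INTO regions VALUES (100, 'East'), (101, 'West');
    """
)

# regions joins to customers (the second table), not to orders (the leftmost).
rows = conn.execute(
    """
    SELECT o.order_id, r.region_name
    FROM orders AS o
    LEFT JOIN customers AS c ON c.customer_id = o.customer_id
    LEFT JOIN regions AS r ON r.region_id = c.region_id
    ORDER BY o.order_id
    """
).fetchall()
print(rows)
conn.close()
```

The `regions` table has no key in common with `orders`, so a tool that can only join to the leftmost table would need an extra step to produce the same result.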
Prepare: Data Transformation Made Easy
Dataiku’s Prepare is a breakthrough tool for data transformation. It’s packed with features that make your work more efficient. For instance, it displays updated values in real time, which is a significant improvement over some alternatives. When working with data, you can easily keep rows that match a specific condition. Keyboard shortcuts are your friend – simply press “C” to get a column, and double-clicking cells brings up in-context actions. Working with dates? Extract Date Components splits dates into Year, Month, Day, and more. Process with Formula provides autocomplete and in-context documentation, and rounding numbers from Float to Int or Double is just as straightforward.
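If it helps to see those steps spelled out, the sketch below mimics three of them in plain Python: keeping rows that match a condition, splitting a date into components, and rounding a float to a whole number. This is an analogy for what the Prepare steps produce, not Dataiku's implementation or its formula syntax. The sample rows and the column names are invented.

```python
from datetime import date

rows = [
    {"order_date": "2023-09-26", "amount": 19.99},
    {"order_date": "2023-10-03", "amount": 5.49},
    {"order_date": "2024-01-15", "amount": 42.75},
]

# Keep only the rows that match a condition (here: amount of at least 10).
kept = [row for row in rows if row["amount"] >= 10]

prepared = []
for row in kept:
    parsed = date.fromisoformat(row["order_date"])
    prepared.append(
        {
            **row,
            # Split the date into separate year, month, and day columns.
            "year": parsed.year,
            "month": parsed.month,
            "day": parsed.day,
            # Round the float to the nearest whole number.
            "amount_rounded": round(row["amount"]),
        }
    )

print(prepared)
```

One caveat if you prototype this way: Python's `round` sends exact halves to the nearest even number, so the results may differ from a tool that rounds halves up.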
These tips and tricks just scratch the surface of what you can achieve with Dataiku. As a power user, you have a wealth of tools and functions available to streamline your workflow and get the most out of your data. Dive in, experiment, and unlock the potential of Dataiku for your everyday work.
Happy data crunching!
Also See Shaun’s Everyday AI Conference Report:
“We Were Made for This moment:” Dataiku Positions Itself as the Platform for Everyday AI