MuleSoft Development Fundamentals: DataWeave Best Practices

Published on June 18, 2021

‘MuleSoft development fundamentals’ is a blog series that takes you through various aspects of MuleSoft development from “How to structure your Mule code” to “Things to cater to when you deploy to production”. We would love to share our expertise with the Community, having worked with several MuleSoft Enterprise clients.

This post walks through best practices, tips and tricks for writing DataWeave in your Mule applications. It will help you with:

  • Structuring your DWL
  • Making it reusable
  • Optimizing it for lower maintenance and future changes
  • Writing your DWL
  • Handling some tricky DWL situations you might face

Writing your DWL

Writing a simple DWL script is straightforward, but as a script grows more complex, so does the effort required to change it. Maintaining common functionality across different DWL scripts can be a challenge too.

WRITING IN FILES

By default, the Transform Message component writes the script inline in the Mule configuration XML file. This has several disadvantages:

  • Really big Mule configuration XML files – The length of the XML file depends directly on the number of lines in your DWL script. Depending on the complexity of the script, the file can get big and become tough to manage.
  • Studio bug which does not save the script code – Anypoint Studio has a bug that at certain times will not save the changes you have made to an inline script in the XML file even if you click the ‘save’ button right after modifying your code. The studio just saves it back to the older version without incorporating the changes you have made. This does not happen when DWL script is written in a dedicated file.
  • Not reusable – If a certain script needs to be used at more than one location due to the application’s requirement, writing the script directly in the XML file will not let you do this. You will have to replicate the code in that desired location as well and then update both/all the locations whenever you want to make any changes.

All these problems can be avoided by writing the DWL script in a dedicated file stored in src/main/resources. Anypoint Studio gives an option to do this, as shown in the image below.

DWL files written this way should be organized into use-case-specific folders inside src/main/resources/dwl and named to reflect the purpose they serve.

DEFINING METADATA

Always define metadata when using a Transform Message component. Doing so has the following advantages:

  • The Transform Message component can easily recognize and parse the incoming data. This is a must when dealing with CSV input; otherwise, the DW component may in some cases treat the incoming payload as a string instead of parsing it as CSV or any other type.
  • It makes the DW script easier to write, because you don’t have to refer to another file containing the sample input. You can just view it on the left-hand side.
  • Defining output metadata in conjunction with input metadata lets you draw mappings by dragging your mouse instead of writing one-to-one mappings for fields from input to output.
  • The sample file used for defining metadata also serves as a sample payload while writing your DW script, so you can preview how your code will work on the same screen. You do not have to wait for the application to compile and run to check whether the DW script works as expected.

The image below shows a sample of the above-mentioned points when put to use.

WRITING FUNCTIONS

One of the most convenient ways to handle complex scenarios in DataWeave is to write functions. Logic written inline in the script body can be used only once, but once it is moved into a function it can be called as many times as required throughout the script. Functions offer a straightforward, reusable way of passing in a set of parameters and getting back the desired output. In DataWeave, functions are also first-class citizens, just like objects, which lets you pass functions as parameters to other functions.

In the examples below, you can see how writing a function for something as simple as removing special characters reduces the effort needed when changes are required, such as replacing a different set of special characters, compared to writing the same logic multiple times.

DWL: Special characters are replaced separately for every element in the output. The disadvantage is that every single line of the script has to be reworked when the regex used for the replacement needs to be updated.
DWL: The special-character replacement is written as a function that can be reused throughout the script. This gives a single point of change when the replacement logic needs to be updated, but the function still has to be called on every line of data it applies to, and on every new line added to the script that requires the same treatment.
DWL: A function is written to traverse an entire payload of any structure and apply a given function to it. This is the most generic way of applying a modification to the entire payload. The recursive function can traverse any Java/JSON payload and apply the passed function to it, so the script does not need to be modified line by line and new lines of code do not need to cater for it.
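The second and third approaches described above can be sketched roughly as follows. This is a minimal illustration rather than the original script: the function names `removeSpecialChars` and `applyToStrings`, and the regex that defines a "special character", are assumptions made for this example.

```dataweave
%dw 2.0
output application/json

// Hypothetical helper: removes every character that is not a letter,
// a digit or a space. Adjust the regex to your own definition of "special".
fun removeSpecialChars(value: String): String =
    value replace /[^A-Za-z0-9 ]/ with ""

// Recursively walks objects and arrays and applies stringFn to every
// string value. Any other value (numbers, booleans, null) is returned as-is.
fun applyToStrings(data, stringFn) =
    data match {
        case obj is Object -> obj mapObject ((value, key) -> { (key): applyToStrings(value, stringFn) })
        case arr is Array -> arr map ((item) -> applyToStrings(item, stringFn))
        case str is String -> stringFn(str)
        else -> data
    }
---
// The function itself is passed as a parameter.
applyToStrings(payload, removeSpecialChars)
```

Because `removeSpecialChars` is passed in as a value, the same traversal could be reused with any other string-to-string function, and a change to the regex stays in one place.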

Note that in the above examples functions can be called either in the header or directly in the script body; when a function is called in the header, its result has to be assigned to a variable so that it can be used in the body.

CREATING VARIABLES

Creating variables promotes reusability and improves maintainability. The most common uses of variables are:

  • Splitting complex logic between a variable and the main script

Sometimes a particular transformation can be too complex and writing a bunch of those in the script may just make things look messy and will not communicate the intention to the next developer. Writing complex logic in a script also increases the probability of syntax and logical errors while trying to modify the script. Declaring a variable and defining logic there makes the code look clean, less error-prone and increases the maintainability.

  • Creating a variable for a payload item that is used many times in the script

There can be situations where a particular component of the incoming payload is used in several places, e.g. address extraction or just a key whose path is prone to change. It’s always better to store such keys in DWL variables and use the variables in the script, so that any change to them can be made at a single point.

Script using two variables addressing the two purposes highlighted above

In the above example, there are two variables. The first, custAddressPath, stores the long path to the customer address object, so that any change to that path means changing only one line in the entire script.

The second variable created is custAddress which joins all the components of the customer address object and puts them together in a readable format. Since this logic is needed at two places both for the customer’s billing address and delivery address, this variable reduces the lines of code by storing them in a single place.
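A minimal sketch of a script with these two variables might look like the one below. The payload path and the address field names (`street`, `city`, `postcode`) are assumptions made for illustration, and all three fields are assumed to be present.

```dataweave
%dw 2.0
output application/json

// The only place that knows the (assumed) path to the customer address object.
var custAddressPath = payload.order.customer.details.address

// Readable one-line address, built once and reused in the body.
var custAddress = [
    custAddressPath.street,
    custAddressPath.city,
    custAddressPath.postcode
] joinBy ", "
---
{
    billingAddress: custAddress,
    deliveryAddress: custAddress
}
```

If the address object moves elsewhere in the payload, only the `custAddressPath` line needs to change.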

MAKING CHANGES AT SCRIPT LEVEL

There are situations where a particular change, check or function has to be applied to the entire payload, and writing it line by line does not make sense. For complex scenarios there are recursive functions to traverse the payload, but DataWeave also has features that let you apply, for example, a null check to the entire payload with just one line of code.

output application/xml skipNullOn="everywhere"
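As a small sketch of how this writer property behaves in a DataWeave 2.0 script (the field names are hypothetical): with `skipNullOn="everywhere"`, elements and attributes whose value is null are left out of the generated XML, so no per-field null check is needed.

```dataweave
%dw 2.0
output application/xml skipNullOn="everywhere"
---
{
    customer: {
        name: payload.name,
        // Left out of the output entirely when payload.middleName is null
        middleName: payload.middleName,
        surname: payload.surname
    }
}
```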

USING DWL FILES EVERYWHERE ELSE

Writing DataWeave scripts in files has another advantage: DWL files can be used not only in Transform Message components but in any Mule component that accepts DataWeave (which is almost everything in Mule 4). If a DWL file is stored in src/main/resources, you can use the ${file::filename} syntax to pass the script in that file to any XML tag that expects an expression. See the example in HTTP:body:

A file called transform.dwl stored in src/main/resources. If the DWL file is stored inside another folder in src/main/resources, it should be reached using the :: separator, e.g. ${file::dwlScripts::http::transform.dwl}

Make it reusable

It’s always easier if common functions are maintained centrally, so that when a change comes through you need to modify code in only one place. Just as you can include the DWL libraries provided by Mule in your DW script, you can also include a custom-written DW file to reuse your own functions. The idea is to identify all the functions that apply to more than one script in your application and store them in a single file. You can then include this common file in any DW script in your application and use the functions.

e.g. a function is written in a file dwl/modules/functions.dwl like this:

File with a function that can be exported. These files need not have an output type or the classic “---” separator that divides the script body from the header.
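A module file along these lines could look like the sketch below. The function shown is a hypothetical example, not necessarily the one from the original file.

```dataweave
%dw 2.0
// Assumed location: src/main/resources/dwl/modules/functions.dwl
// A module only declares functions and variables, so it has
// no output directive and no body separator.

// Hypothetical shared helper: removes every character that is not
// a letter, a digit or a space.
fun removeSpecialChars(value: String): String =
    value replace /[^A-Za-z0-9 ]/ with ""
```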

And can be used like this in the implementation:

Import functions using the import keyword, where dwl::modules::functions is a reference to the path dwl/modules/functions.dwl, and “as common” is an alias for this path to be used in the script, as seen in line 9.
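A script that uses the module might look roughly like this. It assumes that dwl/modules/functions.dwl exists under src/main/resources and exports a function named `removeSpecialChars`; that function name and the payload fields are hypothetical.

```dataweave
%dw 2.0
// dwl::modules::functions resolves to dwl/modules/functions.dwl on the classpath.
import dwl::modules::functions as common
output application/json
---
{
    // Functions from the module are called through the alias.
    firstName: common::removeSpecialChars(payload.firstName),
    lastName: common::removeSpecialChars(payload.lastName)
}
```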

IMPORTANCE OF PARAMETERIZATION

Any static data that is used in a DWL script and can be stored as a property should be stored as a property. Whether the script looks for a string in a set of values or performs an exact string match, keeping this static data in a property lets you change or update conditions without touching the DWL script. Such properties are usually not environment-dependent and can be stored in dedicated DWL properties files that apply to every environment the application is deployed to.

Extracting a simple property from a property file for string matching. The actual property stored in the application is shown in line 3 of the image for reference.
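As a rough sketch of this exact-match case, the script below reads a property with the Mule `p` function and compares it to an incoming value. The property key and the payload field are assumptions for illustration.

```dataweave
%dw 2.0
output application/json
// Assumed entry in a properties file:
//   order.status.completed=COMPLETED
---
{
    // true only when the incoming status equals the configured value
    isCompleted: payload.status == p('order.status.completed')
}
```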

After being done with an exact string match, let’s see how to deal with matching an incoming value against a set of values.

Comparing an incoming string with a set of values from a property file. The actual property used in the application and the input value are shown in lines 3 and 4 respectively. Here, the splitBy function splits the comma-separated property string and returns an array of strings. The incoming payload value is then checked against this array using the contains operator.

Sometimes a single key-value pair is just not enough and the property storage becomes complicated. Although it is not the best approach, a JSON document can be stored as a string in a property file and used in a DWL script in the following way.
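The same idea for a set of values can be sketched as follows, again with a hypothetical property key and payload field:

```dataweave
%dw 2.0
output application/json
// Assumed entry in a properties file:
//   order.status.open=NEW,PENDING,ON_HOLD

// Split the comma-separated property into an array of strings.
var openStatuses = p('order.status.open') splitBy ","
---
{
    // true when the incoming status is one of the configured values
    isOpen: openStatuses contains payload.status
}
```

Adding or removing an allowed value then only means editing the property, not the script.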

Storing a JSON object as property and using it in the script as needed. The intent here is to first read the JSON stored in the property file as a JSON(converting from string) and then assign it to a variable created inside the script so that other operations and manipulations can be applied on top of it.
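A minimal sketch of this approach, assuming a hypothetical property that holds a small JSON lookup table:

```dataweave
%dw 2.0
output application/json
// Assumed entry in a properties file (the value is a JSON string):
//   country.names={"AU":"Australia","NZ":"New Zealand"}

// Parse the property string into a DataWeave object so it can be queried.
var countryNames = read(p('country.names'), "application/json")
---
{
    // Look up the incoming code, falling back when it is not configured
    country: countryNames[payload.countryCode] default "Unknown"
}
```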

Developer’s Experience: DW Playground

For all the benefits of DataWeave, there is still a major hiccup when it comes to writing it while developing an application. Anypoint Studio lets you see how your DataWeave code behaves using the preview functionality, but this comes with the problem of Anypoint Studio getting stuck. Anypoint Studio is not very high-performing in general, and it chokes when DataWeave (the Transform Message component) is used. Although Anypoint Studio is unlikely to be fixed soon, there is an alternative approach, called DataWeave Playground.

DataWeave Playground is a Docker image that lets you write and run DataWeave right in a browser. It can be found here: https://hub.docker.com/r/machaval/dw-playground It is very easy to spin up a local or cloud instance of this image, but MuleSoft already hosts a free version to practice and have fun with, which can be found here: https://developer.mulesoft.com/learn/dataweave It comes with all the DataWeave documentation, so the next time there is an issue with Anypoint Studio, or you just don’t want to open it to write code, go there.

Other Dataweave references

https://docs.mulesoft.com/mule-runtime/4.3/dataweave-create-module
