Setup python3 to be the default version on Mac

After installing XCode on Mac, the system now has both python and python3 available. By default, python still points to the python 2.7 binary. In order to make python3 to be your default version, add the following PATH to your .zshrc file

export PATH="/usr/local/opt/python/libexec/bin:${PATH}"

Restart the terminal and check out the python version, and you’ll notice that python now is pointing to the latest version.

varun~ # python --version
Python 3.9.7

Python – Avoid this common mistake of writing functions with default parameters

Writing functions with default arguments seems like a fairly straightforward task. Default the argument in the function definition and you’re all set. That’s right works like a charm as shown in the following example.

def greetings(name, message = "Good Day"):
  print(f'Hello {name}. Have a {message}')

greetings('John')
greetings('Daniel')
greetings('Jim', 'Good Afternoon')
greetings('Michael')

Returns the following output. Perfect! All good.

Hello John. Have a Good Day
Hello Daniel. Have a Good Day
Hello Jim. Have a Good Afternoon
Hello Michael. Have a Good Day

Now, let’s try to introduce a mutable argument in the function. We’re printing the time of execution for each statement and giving in a 2 second pause between each call to capture the time increments.

from datetime import datetime
import time

def greetingsWithTime(name, currentTime = datetime.now()):
  print(f"Hello {name}. The time is {currentTime.strftime('%B %d, %Y %H:%M:%S')}")

greetingsWithTime('John')
time.sleep(2)
greetingsWithTime('Daniel')
time.sleep(2)
greetingsWithTime('Jim', datetime.now())
time.sleep(2)
greetingsWithTime('Michael')

And, unfortunately, the output is not what you expected. With a timer increment between each call, we expected a new timestamp on each greeting message, but that didn’t happen. But first let’s look at the output you get from running the above code.

Hello John. The time is June 26, 2021 17:26:24
Hello Daniel. The time is June 26, 2021 17:26:24
Hello Jim. The time is June 26, 2021 17:26:28
Hello Michael. The time is June 26, 2021 17:26:24

Why? Let’s find out. The argument time is a mutable object, and therefore it only gets initialized when the function is first called. The name argument being immutable gets re-initialized every time the function is invoked and therefore picks the new value. As a result, what we see in the function greetingsWithTime is that the first time the method was invoked, the time variable was initialized and continued to hold it’s value.

How do we solve this problem? The solution is not not initialize the argument within the function definition, and instead initialize it within the function body. Transforming the function slightly as shown below would ensure that the value is set properly every time the function is invoked.

def greetingsWithTime(name, currentTime = None):
  if currentTime is None:
    currentTime = datetime.now()
  print(f"Hello {name}. The time is {currentTime.strftime('%B %d, %Y %H:%M:%S')}")

Results in the following output, that we expected.

Hello John. The time is June 26, 2021 17:36:57
Hello Daniel. The time is June 26, 2021 17:36:59
Hello Jim. The time is June 26, 2021 17:37:01
Hello Michael. The time is June 26, 2021 17:37:03

DataWeave Cheat Sheet

Here is my most frequently used DataWeave cheat sheet that I keep handy.

Access Mule variable without DW expression

On occassions when you do not have the ability to add a DW expression and only have a TextBox in Mule, you can access the variables within by using the following #[varName] notation. e.g.

#[vars.sftp_password]
#[payload]

DateTime manipulations

The following code block describes how we can effectively accomplish the following:

  1. Convert string to DateTime
  2. Format current DateTime into the desired string format
  3. Combine separate Date and Time components represented as strings, and convert them to a DateTime format
  4. Use a string variable and convert it to Period to compute a relative DateTime
%dw 2.0
output application/json
var dateTime = "2021-05-26 12:03:00 AM UTC" as String
var downtime_daily_start_at = "1970-01-01T12:00:00"

// Format current DateTime into the desired string format
var curDate = now() as Date{format: "yyyy-MM-dd"}
var startTimeStr = (downtime_daily_start_at splitBy("T"))[1] ++ " UTC"
var startDateTimeStr = (curDate ++ " " ++ startTimeStr) as String
var duration = "PT3H"

// Convert string to DateTime
var startDateTime = startDateTimeStr as DateTime {format: "yyyy-MM-dd HH:mm:ss z"}
// Use string variable and convert it to Period to compute a relative DateTime
var endDateTime = (startDateTime + duration) as DateTime {format: "yyyy-MM-dd HH:mm:ss z"}
---
{
    "strToDateTime": (dateTime as DateTime {format: "yyyy-MM-dd hh:mm:ss a z"}),
    "curDate": curDate,
    "startDateTime": startDateTime,
    "endDateTime": endDateTime,
    "downtime": (startDateTime < now()) and (now() < endDateTime)
}

Converting Unix timestamp/epoch to Datetime in dw

%dw 2.0
output application/java
var epoch = "1609459200000"
---
(epoch/1000) as DateTime  as String {format:'yyyy-MM-dd'}

Convert a JSON object to String

%dw 2.0
output application/java
---
write(payload, "application/json")

Convert a XML to Base64 format

Converting an XML to Base64 is a 2 step process:

  1. Convert the XML to String
  2. Convert the String to Base64
%dw 2.0
import * from dw::core::Binaries

var xmlString = write(payload, "application/xml") //convert xml object to string
---
toBase64(xmlString)

Merge the response from scatter-gather payloads

When using a scatter-gather component in Mulesoft, the payloads from corresponding sub-reference-flows are aggregated together under a common payload. So if you had 2 sub-reference-flows in a scatter gather, the combined payload response is going to look similar to this:

{
  "0": {
    "inboundAttachmentNames": [
      
    ],
    "exceptionPayload": null,
    "inboundPropertyNames": [
      
    ],
    "outboundAttachmentNames": [
      
    ],
    "outboundPropertyNames": [
      
    ],
    "payload": [
      {
        "action_type": "EXPORTED",
        "created_at": "2020-11-05T23:55:48",
        "event_id": "31ce5def-2c48-4095-9a49-3fe59177b2b3",
        "event_type": "eventUpdated",
        "id": 1,
        "status": "RETRY"
      }
    ],
    "attributes": null
  },
  "1": {
    "inboundAttachmentNames": [
      
    ],
    "exceptionPayload": null,
    "inboundPropertyNames": [
      
    ],
    "outboundAttachmentNames": [
      
    ],
    "outboundPropertyNames": [
      
    ],
    "payload": [
      {
        "action_type": "SUBMITTED",
        "created_at": "2021-05-26T08:54:24",
        "event_id": "89f5671e-5db4-40b4-bb90-06ff2ba71479",
        "event_type": "eventUpdated",
        "id": 4,
        "status": "SUSPENDED"
      },
      {
        "action_type": "SUBMITTED",
        "created_at": "2021-05-26T08:55:53",
        "event_id": "89f5671e-5db4-40b4-bb90-06ff2ba71481",
        "event_type": "eventUpdated",
        "id": 9,
        "status": "SUSPENDED"
      }
    ],
    "attributes": null
  }
}

Most often, the useful piece of information with that object is the payload itself. So to combine the payload, you could use the following dataweave expression:

%dw 2.0
output application/json
---
payload[0].payload + payload[1].payload

However, if you anticipate the sub-reference-flows to grow gather over time and don’t feel like updating the expression to combine them, you could use the mapObject function to gather all the payload response and flatten out the list to get the desired list of payloads.

%dw 2.0
output application/json
var mapped = payload mapObject(v,k,i) -> (
    v
)
---
flatten(mapped.*payload)

Either of them resulting in the following output:

[
  {
    "action_type": "EXPORTED",
    "created_at": "2020-11-05T23:55:48",
    "event_id": "31ce5def-2c48-4095-9a49-3fe59177b2b3",
    "event_type": "eventUpdated",
    "id": 1,
    "status": "RETRY"
  },
  {
    "action_type": "SUBMITTED",
    "created_at": "2021-05-26T08:54:24",
    "event_id": "89f5671e-5db4-40b4-bb90-06ff2ba71479",
    "event_type": "eventUpdated",
    "id": 4,
    "status": "SUSPENDED"
  },
  {
    "action_type": "SUBMITTED",
    "created_at": "2021-05-26T08:55:53",
    "event_id": "89f5671e-5db4-40b4-bb90-06ff2ba71481",
    "event_type": "eventUpdated",
    "id": 9,
    "status": "SUSPENDED"
  }
]

Array list intersection

Given two list/arrays, find the intersection, i.e. elements that are common in both arrays. e.g.

listA = [1,2,3,5]

listB = [2,4,5]

The expected output should be [2,5]

%dw 2.0
output application/json
var listA = [1,2,3,5]
var listB = [2,4,5]
---
listA filter((item, index) -> listB contains item)

Conditionally add Sibling node to XML

%dw 2.0
output application/xml
var payload = payload
var context = true
var soapEnv = {uri: "http://schemas.xmlsoap.org/soap/envelope/", prefix: "SOAP-ENV"} as Namespace
ns xsd http://www.w3.org/2001/XMLSchema
ns xsi http://www.w3.org/2001/XMLSchema-instance
---
soapEnv#Envelope @('xmlns:xsd': xsd, 'xmlns:xsi': xsi):
{
    soapEnv#Header: vars.jXSoapHeader,
    soapEnv#Body: {
        CustSrch: {
            SrchMsgRqHdr: vars.jXchangeHdr ++ ("MaxRec":"10") if(context==true),
            UserId : payload.userId,
        },
    },
}

Conditionally adding items to JSON

Given a payload which has an optional element metadata, which further down may have some optional items. Build an output JSON with some selected elements from the input payload. e.g. In the given payload, filter out only the primaryUser and secondaryUser elements.

{
   "metadata": {
        "primaryUser": {
            "_id": "238d15d9-5843-4c39-a14d-cb0249a0ae53",
            "fullName": "Tommy Jones"
        },
        "secondaryUser": {
            "_id": "238d15d9-5843-4c39-a14d-cb0249a0ae53",
            "fullName": "Bob Smith"
        },
        "primaryPhone": {
            "value": "514-019-9276",
            "type": "CELL"
        }
   }
}

Dataweave expression:

%dw 2.0
output application/json
---
{
    users: {} 
    ++  (if(payload.metadata != null and payload.metadata.primaryUser != null) ({"primaryAssignedReviewer": payload.metadata.primaryUser}) else {})
    ++ (if(payload.metadata != null and payload.metadata.secondaryUser != null) ({"secondaryUser": payload.metadata.secondaryUser}) else {})
}

Resulting output:

{
  "users": {
    "primaryAssignedReviewer": {
      "_id": "238d15d9-5843-4c39-a14d-cb0249a0ae53",
      "fullName": "Tommy Jones"
    },
    "secondaryUser": {
      "_id": "238d15d9-5843-4c39-a14d-cb0249a0ae53",
      "fullName": "Bob Smith"
    }
  }
}

Extract selective elements from a List of Objects

Given the following input, extract all the document Id’s as a list

{
	"documents": [{
			"id": "1001",
			"name": "Account Statement"
		},
		{
			"id": "1002",
			"name": "Tax Statement"
		},
		{
			"id": "1002",
			"name": "Rental Agreement"
		}
	]
}

Dataweave:

%dw 2.0
output application/json
---
payload.documents.id

Output:

[
  "1001",
  "1002",
  "1002"
]

Extract selective elements from an Object

For a given object, extract a given set of values. e.g. retrieve all the userId’s from the object.

{
  "users": {
    "primaryUser": {
      "_id": "238d15d9-5843-4c39-a14d-cb0249a0ae53",
      "fullName": "Tommy Jones"
    },
    "secondaryUser": {
      "_id": "238d15d9-5843-4c39-a14d-cb0249a0ae53",
      "fullName": "Bob Smith"
    }
  }
}

Dataweave:

%dw 2.0
output application/json
---
payload.users..*"_id" 

Output:

[
  "238d15d9-5843-4c39-a14d-cb0249a0ae53",
  "238d15d9-5843-4c39-a14d-cb0249a0ae53"
]

The DW expression can be adjusted to get a unique set of values, e.g.

%dw 2.0
output application/json
---
payload.users..*"_id" distinctBy (value) -> { "unique" : value }

would output

[
  "238d15d9-5843-4c39-a14d-cb0249a0ae53"
]

Get a list of keys from object/dictionary to check on a value

To check if a value belongs to a key within an object/dictionary/json – use the keySet() method to get a list of keys and contains to check containment.

Input json/object

{
    "specialtyAccounts": {
        "01NA": "BUSINESS",
        "0178NA": "BUSINESS",
        "0188NA": "BUSINESS",
        "0170NA": "TRUST",
        "0175NA": "TRUST"
	}
}

Dataweave:

%dw 2.0
import * from dw::core::Objects
output application/json
var accountId = "0188NA"
---
contains(namesOf(vars.specialtyAccounts), accountId)

Postgres – change column from type text to uuid

I created a table where I was entering UUID’s as text. But then realized postgres supports uuid column types.

But a simple alter table alter column didn’t work and got me the following error:

$ ALTER TABLE packages USING package_id::uuid;
ERROR:  syntax error at or near "USING"
LINE 1: ALTER TABLE packages USING package_id::uuid;

Here’s the solution to change the column type from text to uuid.

ALTER TABLE packages ALTER COLUMN package_id TYPE uuid USING package_id::uuid;

GitHub – Creating a Pull Request without Fork

Ran into a situation where Forking was disabled on the Git instance and had to create a pull request. Here’s the steps that I followed to create a new Pull Request, Commit changes that were already made on the local dev branch, merge the pull request to dev and finally remove the PR Branch.

1. Login to GitHub and navigate to the repository
2. Switch to the dev branch
3. Create a new branch off the dev branch and let’s call it JIRA-123
4. On your local machine, let’s say you are on dev branch and you have the changes ready – an existing file modified and a new file created. To view the changes, type git status -bs

## dev...origin/dev
 M README.md
?? src/main/mule/truststore-salesforce-connector-api.p12

5. Since you’d like to commit the changes onto the new branch JIRA-123, execute git pull to get a copy of the branch locally.
6. Execute git branch -a to get a list of all branches

* dev
  master
  remotes/origin/HEAD -> origin/master
  remotes/origin/JIRA-123
  remotes/origin/dev
  remotes/origin/master

7. If for any reason GIT doesn’t allow you to switch branches and asks you to commit changes first, you may have to stash the changes. To stash the changes, execute the command git stash that will save the changes in a queue with your latest changes being in top.
8. Now you can check out to the branch by executing git checkout JIRA-123
9. Apply the stash with the command git stash apply. This would apply the latest stashed code but will still preserve the changes on the stash. If you’d like to apply the stash locally and drop it at the same time, you could use git stash pop.
10. Add the new file to the commit git add src/main/mule/truststore-salesforce-connector-api.p12. Commit the changes and push the branch git commit -a -m "Message for the commit"
11. Finally push the changes to the remote repository using git push. Now your changes have been pushed to GitHub and you’re ready to create a Pull Request
12. Getting back to GitHub, stay on branch JIRA-123 and click on the button New Pull Request
13. Select your base: dev and compare: JIRA-123. Add comments and click on Create pull request.
14. Request is created and ready for review. As a reviewer, review changes and click on Merge pull request and Confirm merge.
15. GitHub gives you the option to delete the branch JIRA-123 which you should go ahead and delete as it is no longer needed. So click on Delete branch.
16. After merging a pull request and deleting the branch from GIT, you’ll noticed that the local git repository still shows the remote and local branches. When branches get deleted on origin, your local repository won’t take notice of that. You’ll still have your locally cached versions of those branches (which is actually good) but git branch -a will still list them as remote branches.
17. You can clean up the remote branches locally using git fetch --prune.
18. This would still leave your local branch intact that you probably won’t need. Delete the branch locally like this git branch -d JIRA-123

Sending emails from Mulesoft

You could use the SMTP connector to send emails from Mulesoft. I am sure you can find plenty of examples online to send emails via Gmail. That was easy to do with the following configuration:

<email:smtp-config name="Email_SMTP" doc:name="Email SMTP" doc:id="4af35a74-e02e-41be-9125-7fb2da34317c" >
    <email:smtp-connection host="smtp.gmail.com" user="your-email@gmail.com" password="your-password" port="587">
        <email:properties >
            <email:property key="mail.smtp.starttls.enable" value="true" />
        </email:properties>
    </email:smtp-connection>
</email:smtp-config>

But it got a little challenging to send emails through secure SMTPS protocol. Here’s the configuration that you could use to setup the email. In a nutshell, the SMTPS requires you to provide truststore configuration but according to the documentation, if the tls:context is empty (no tls:key-store or tls:trust-store defined), then the default values of the JVM are used, which usually include a truststore with certificates for all the major certifying authorities.

<email:smtp-config name="Email_SMTP" doc:name="Email SMTP" doc:id="0352f798-fd91-4307-bf7d-8f3a092dddd5">
    <email:smtps-connection host="email-smtp.us-east-1.amazonaws.com" user="your-username" password="your-password">
        <tls:context enabledProtocols="TLSv1.2">
            <tls:trust-store />
        </tls:context>
    </email:smtps-connection>
</email:smtp-config>

Priority Queue/Heap in Python

Priority Queues are an efficient way to get the min or max element from a list with O(1) time v/s using a min() function that loops through the entire list and gets the min or max element in O(n) time. The cons of using this approach is that every time an entry is added or removed from the heap, the elements are re-arranged so that the heap property of storing the min/max element at the root of the data structure is satisfied.

Python comes with an in-built module named “heapq” to implement priority queues.

Note

  • heapq implements min heap, i.e. the smallest element is stored at the root node of the data structure
  • max-heap isn’t supported by the heapq module by default. The workaround is to multiply the input numbers by -1 and add them to the heapq so that the lowest element now is the biggest element from the source list. While returning the largest element, multiply the popped element by -1 to get the original value back.
  • The elements are modified in-place. To perform heapify operation on a list, pass the list to the heapify() method which would convert it to a min-heap in place.
  • Every time an element is added or removed to/from the heap, the heap is re-ordered to maintain the priority queue
  • Re-arranging the elements in a priority heap takes O(log n) time v/s removing an element from a list takes O(n) and re-arranging them takes O(n log n)

Operations on a heap

  • heapify(iterable): This function is used to convert the iterable into a heap data structure. i.e. in heap order.
  • heappush(heap, element): This function is used to insert the element mentioned in its arguments into heap. The order is adjusted, so as heap structure is maintained.
  • heappop(heap): This function is used to remove and return the smallest element from heap. The order is adjusted, so as heap structure is maintained.
  • heappushpop(heap, element): This function combines the functioning of both push and pop operations in one statement, increasing efficiency. Heap order is maintained after this operation.
  • heapreplace(heap, element): This function also inserts and pops element in one statement, but it is different from above function. In this, element is first popped, then the element is pushed.i.e, the value larger than the pushed value can be returned. heapreplace() returns the smallest value originally in heap regardless of the pushed element as opposed to heappushpop().
  • nlargest(k, iterable, key = fun): This function is used to return the k largest elements from the iterable specified and satisfying the key if mentioned.
  • nsmallest(k, iterable, key = fun): This function is used to return the k smallest elements from the iterable specified and satisfying the key if mentioned.

Useful tricks for solving coding problems

Sharing some useful tricks for solving coding problems.

1. Get count of adjacent characters in a string

Use Cases

  1. Helpful for solving the Count and Say problem. e.g. given the string ‘aaabbcccccddddefggghh’ the expected output should be ‘3a2b5c4d1e1f3g2h’.
  2. Counting the number of non-space segments in a string.
  3. String compression
  4. Attendance check
  5. Long pressed name

Algorithm:

  1. Have 2 pointers start and end that serve as a window. Initialize them to begin of string. The idea is to keep moving end until it reaches a different character than start OR it reaches end of word.
  2. Run an outer while loop that ensures that end doesn’t go beyond the length of string
  3. Run an inner while loop to check if end of word or same character than start is encountered. Until the end is same as start, keep incrementing end. Most importantly: Check for end of string before char matching so that the following condition (char match) isn’t executed if end of string is reached.
  4. After the inner while loop ends, end is either end of word or different than start. Compute the length of the char occurrences. Append the length and the last char to the output and move start to end.
  5. Execution will return to the outer loop, will ensure end of string is not reached and begin execution with the new start position.
def countAndSayString(string):
  start, end = 0, 0
  result = ''
  # Outer while loop to ensure end is less than string length
  while end < len(string):
    # Inner while loop to move end until it equals start. 
    # In addition also ensure that end doesn't go beyond length
    # Always put the len check before so following condition isn't executed if end is over limit
    while end < len(string) and string[start] == string[end]:
      end += 1
    len_chars = end-start
    result += str(len_chars) + string[start]
    start = end
  #results.append(nw)
  return result

string = 'aaabbcccccddddefggghh'
result = countAndSay(string)
print(f'Result is: {result} | {result=="3a2b5c4d1e1f3g2h"}')

2. Adding very large numbers represented as string

Given the restriction, you cannot convert the strings to integers because the language won’t support it.

Algorithm

  1. Left pad the shorter string by 0
  2. Initiate carry with 0 and keep adding numbers from the LSB with carry. The remainder will be left appended to the result and the dividend will be passed to carry
  3. If carry remains, convert to string and left append to result

3. Palindrome by removing at most one character

Check to see if string can be made palindrome by removing at most one character.

Algorithm

  1. Start from left and right and keep moving pointers inwards
  2. If chars don’t match, then taking either one of them out should result in a palindrome string if so
  3. Compare the remaining string without left char and string without the right char. If either one is palindrome, then we can remove the other char that would result in string being palindrome

4. Repeated string match

Given two strings A and B where A is shorter than B, find out how many times you will have to repeat A such that B becomes a sub-string of A given this is possible., e.g. A = “abcd” and B = “cdabcdab”, A must be repeated 3 times so B becomes a sub-string of A.

Algorithm

  1. The length of A must be at least B’s length so B can be a sub-string of A
  2. Give an additional buffer to repeat A (draw on paper paper to find out why)
  3. Divide len(B) by len(A) to get the min repeatitions and add another one and loop through to check if B is sub-string and return the min possible number

5. Useful to check on upper and lower case string problems

You can convert characters to ASCII and vice-versa in python by using the ord() and chr() methods. It’s useful to check on upper/lower case characters and to convert them without using in built binaries.

  1. ord(c) converts the character c to it’s ASCII value. Upper case starts from 65 and lower case starts from 97
  2. chr(n) converts the ASCII value to it’s character equivalent.

Node.js equivalent of python’s if __name__ == ‘__main__’

As I was building some modules for the automated testing framework, I realized that it would be nice to run the functions in the modules themselves to test individually but not have them executed when imported by another file. That’s when I loved missed Python’s if __name__ == ‘__main__’ but quickly found out that the same can be achieved in node.js as well.

When a file is run directly from Node.js, require.main is set to its module. That means that it is possible to determine whether a file has been run directly by testing require.main === module. For a file foo.js, this will be true if run via node foo.js, but false if run by require(‘./foo’).

var fnName = function() {
    // main code
}

if (require.main === module) {
    fnName();
}

Installing Node.js on Mac

Had a hard time when I shouldn’t have. I installed it from the installer utility on nodejs.org. Found some issues and then installed using brew probably as root which screwed up a few things. But just listing down what users should be going.

1. Run the installer as yourself – not root as it would mess up the directory ownership on /usr/local and if you did that already you could fix that by using this command

sudo chown -R $(whoami) $(brew --prefix)/*

2. Check the version that you are about to install, to check if you’re getting the stable release or the latest release:

brew info nod

3. Install node using brew

brew install node