
How to Perform Unit Testing for AWS Glue Jobs in an Azure DevOps Pipeline

Unit testing AWS Glue jobs is challenging because the Glue environment is hard to replicate locally. Fortunately, AWS publishes Glue container images that make it possible to run unit tests, as described in the official AWS documentation. In this blog post, we will run AWS Glue job unit tests within an Azure DevOps pipeline and discuss how to calculate and publish code coverage for these tests.

To begin with, the Glue container image runs as a dedicated user named glue_user, which is set in the image's Dockerfile:

USER glue_user

Assume you have developed your Glue job in a Python script named myawesomegluejob.py, stored in an Azure DevOps (AzDO) Git repository. Creating a pipeline for it might initially seem straightforward. However, running the build steps directly inside the Glue container is not feasible because of permission constraints on glue_user.

To work around this limitation, we use Docker commands in the pipeline to pull the Glue image and then mount the pipeline's source directory inside the Glue container. This lets the container write test results and code coverage data back to the Azure DevOps pipeline, where later steps can use them.

By default, the Azure DevOps pipeline file system is not writable by glue_user. To address this, we grant access to all users by running the command chmod -R 0777 $(Build.SourcesDirectory).
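
The effect of that command can be tried outside the pipeline with a minimal sketch like the one below. It runs against a temporary directory rather than a real checkout: SOURCES_DIR and the test file name are stand-ins invented for illustration, since $(Build.SourcesDirectory) is only expanded by Azure DevOps. The stat -c call assumes GNU coreutils, as found on a Linux agent.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for $(Build.SourcesDirectory); the real value is supplied by Azure DevOps.
SOURCES_DIR="$(mktemp -d)"
mkdir -p "$SOURCES_DIR/test"
touch "$SOURCES_DIR/myawesomegluejob.py" "$SOURCES_DIR/test/test_myawesomegluejob.py"

# Open the whole tree to every user so that a different user inside the
# container (glue_user) can create junit/ and coverage.xml in it.
chmod -R 0777 "$SOURCES_DIR"

# Print the resulting mode of every entry (GNU stat syntax).
find "$SOURCES_DIR" -exec stat -c '%a %n' {} +
```

Mode 0777 is deliberately broad. That is usually acceptable on a throwaway build agent, but it is worth reconsidering on a long-lived self-hosted agent.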

Next, we can execute the following command:

docker run -v $(Build.SourcesDirectory):/home/glue_user/workspace -w /home/glue_user/workspace public.ecr.aws/glue/aws-glue-libs:glue_libs_4.0.0_image_01 -c "pip install pytest pytest-azurepipelines pytest-cov; python3 -m pytest test --doctest-modules --junitxml=junit/test-results.xml --cov=main --cov-report=xml"

This command mounts $(Build.SourcesDirectory) at /home/glue_user/workspace inside the container and sets that folder as the working directory. It then installs the required Python libraries and runs the unit tests. As a result, a coverage.xml file is generated in $(Build.SourcesDirectory). However, because the file is created inside the container, its sources node contains the container's paths (/home/glue_user/workspace) rather than the paths on the build agent. To fix this, we replace the container path with $(Build.SourcesDirectory) using the sed command.
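
The path rewrite can be tried in isolation with a minimal sketch like the following. The coverage.xml here is a trimmed, made-up sample rather than real pytest-cov output, and HOST_DIR is a hypothetical stand-in for $(Build.SourcesDirectory). The -i flag is used in its GNU sed form, as on a Linux agent.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical host path standing in for $(Build.SourcesDirectory).
HOST_DIR="$(mktemp -d)"

# Made-up, trimmed coverage.xml whose <source> node holds the container path.
cat > "$HOST_DIR/coverage.xml" <<'EOF'
<?xml version="1.0" ?>
<coverage>
  <sources>
    <source>/home/glue_user/workspace/main</source>
  </sources>
</coverage>
EOF

# Same substitution as in the pipeline. Using '|' as the delimiter means the
# slashes in both paths need no escaping.
sed -i "s|/home/glue_user/workspace|$HOST_DIR|g" "$HOST_DIR/coverage.xml"

# Show the rewritten file; the <source> node now points at the host path.
cat "$HOST_DIR/coverage.xml"
```

One caveat with this approach: if the host path ever contained a | or & character, sed would misread the replacement, so the path would need escaping first.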

Here is the Azure DevOps pipeline snippet that combines these steps:

- job: 'Scan_and_Build'
  steps:
      - script: |
          docker pull public.ecr.aws/glue/aws-glue-libs:glue_libs_4.0.0_image_01
        displayName: 'Pull Glue Image'

      - script: |
          chmod -R 0777 $(Build.SourcesDirectory)
          docker run -v $(Build.SourcesDirectory):/home/glue_user/workspace -w /home/glue_user/workspace public.ecr.aws/glue/aws-glue-libs:glue_libs_4.0.0_image_01 -c "pip install pytest pytest-azurepipelines pytest-cov; python3 -m pytest test --doctest-modules --junitxml=junit/test-results.xml --cov=main --cov-report=xml"
          sed -i "s|/home/glue_user/workspace|$(Build.SourcesDirectory)|g" $(Build.SourcesDirectory)/coverage.xml
        displayName: 'Run tests'

      - task: PublishTestResults@2
        condition: succeeded()
        inputs:
            testResultsFiles: '**/test-*.xml'
        displayName: 'Publish unit test results'

Conclusion

This blog post showed how to run unit tests for AWS Glue jobs within an Azure DevOps pipeline. By using the Glue container images together with Docker commands, you can run the tests in the pipeline and publish the test results and code coverage data, which helps keep your Glue jobs reliable.
