Jenkins – Cleaning Workspace but Keeping Python .venv Intact

Tags: Jenkins, pip, python, virtualenv

This question is about a Python + Poetry project in a Jenkins pipeline and how to retain the .venv/ directory when the workspace is cleaned.


SCENARIO:

I have a Jenkins Pipeline job that runs a Python project. The project uses Poetry to create a virtual environment in .venv within the workspace. On each subsequent run, the job re-uses the .venv as expected, so the pip packages are not re-downloaded every time (unless there is a diff in the poetry.lock file). Everything works as expected.

I want to change the Pipeline to use the Jenkins Workspace Cleanup Plugin. I want to clobber the workspace files but keep some of them, including the pip/poetry/venv environment files. This would let the job re-use the pip packages from the previous run that are still stored in .venv, just as the working Pipeline does today.

Full pipeline file example is at the bottom of this post, but here is a snippet of the cleanWs() portion I've added to the existing Pipeline:

post {
  always {
    cleanWs(
      deleteDirs: true,
      notFailBuild: true,
      patterns: [
        [pattern: '.venv', type: 'EXCLUDE'],
        [pattern: '.venv/**', type: 'EXCLUDE']
      ]
    )
  }
}

HERE IS THE ISSUE:

  • The first time the job runs, it works perfectly fine and the workspace cleanup works as expected. The .venv/ directory is retained as expected.

  • (problem) On subsequent runs of the job, poetry will re-install all the packages and will not re-use the .venv directory:

    Creating virtualenv test in /data/jenkins_home/workspace/test-cleanup/.venv

    This forces a full re-download of every package, even though .venv already exists. It has been confirmed that /data/jenkins_home/workspace/test-cleanup/.venv already exists before the job runs.

  • Here is the strange part: if I go to the workspace directory manually on the Jenkins server and run the exact same command (poetry install), it works as expected: the .venv is reused and the packages are not reinstalled. So something about the way the command runs inside the job makes Poetry want to recreate the .venv directory.

NOTE: virtualenvs.in-project = true is already set for Poetry, so it will always try to use .venv within the current working directory.


EXAMPLES:

Here is a simple example pipeline that works as expected. When the poetry install step runs, it does not re-download all the packages on every run, only on the first run or when there is a diff:

pipeline {

  agent any

  stages {

    stage("Prep Build Environment") {
      steps {

        script {
          scmVars = git branch: "main", poll: false, url: "[email protected]:my-org/private-repo.git"
        }

        sh "poetry install"
      }
    }
  }
}

Here is the new Jenkinsfile with cleanWs() added. After this was added, the project no longer re-uses the .venv on each run, even though it still exists:

pipeline {

  agent any

  stages {

    stage("Prep Build Environment") {
      steps {

        script {
          scmVars = git branch: "main", poll: false, url: "[email protected]:my-org/private-repo.git"
        }

        sh "poetry install"
      }
    }
  }

  post {
    always {
      cleanWs(
        deleteDirs: true,
        notFailBuild: true,
        patterns: [
          [pattern: '.venv', type: 'EXCLUDE'],
          [pattern: '.venv/**', type: 'EXCLUDE']
        ]
      )
    }
  }

}

Best Answer

Typically, a Jenkins Pipeline will clean the workspace at the start of a build, which is why the .venv folder exists after a build finishes but not when the virtualenv step runs during the next build.
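
One way to check whether this is what happens on your agent is to inspect the virtualenv from inside the build, right before poetry install runs. The following is only a rough sketch: venv_status is a hypothetical helper name, and it assumes the usual Unix venv layout with an interpreter at .venv/bin/python. WORKSPACE is the variable Jenkins sets for the job's workspace directory.

```shell
# Hypothetical helper: report whether a virtualenv directory looks reusable.
# Assumes the usual Unix layout, where the interpreter lives at <venv>/bin/python.
venv_status() {
  dir="${1:-.venv}"
  if [ -x "$dir/bin/python" ]; then
    # The interpreter (or the symlink to it) resolves and is executable.
    echo "reusable: $dir"
  elif [ -d "$dir" ]; then
    # The directory survived, but the interpreter is missing or a broken symlink.
    echo "incomplete: $dir"
  else
    echo "missing: $dir"
  fi
}

# Jenkins sets WORKSPACE; fall back to the current directory for a manual run.
venv_status "${WORKSPACE:-.}/.venv"
```

If this prints "missing" or "incomplete" at the start of the second build, the directory did not survive between builds in a usable state, which would explain why Poetry creates a new virtualenv.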

If you want to cache or retain some files between builds, the most reliable way to do so is to store those files outside of the workspace. You need to be extremely careful with this, however, because simultaneously running builds that access the same files can lead to resource contention, race conditions, and corrupted files.
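
As a minimal sketch of that idea (not a tested recipe for your setup), the shell step could point Poetry at a virtualenv directory outside the workspace and serialize installs with a file lock. VENV_CACHE_ROOT, VENV_CACHE_DIR, and install_deps are hypothetical names chosen for illustration. POETRY_VIRTUALENVS_IN_PROJECT and POETRY_VIRTUALENVS_PATH are the environment-variable forms of Poetry's virtualenvs.in-project and virtualenvs.path settings, and flock comes from util-linux, so this assumes a Linux agent.

```shell
# Assumption: VENV_CACHE_ROOT is a hypothetical directory outside any Jenkins workspace.
VENV_CACHE_ROOT="${VENV_CACHE_ROOT:-${HOME:-/tmp}/.cache/jenkins-venvs}"

# JOB_NAME is set by Jenkins; fall back to a placeholder for manual runs.
VENV_CACHE_DIR="$VENV_CACHE_ROOT/${JOB_NAME:-local-job}"
mkdir -p "$VENV_CACHE_DIR"

# Tell Poetry not to use an in-project .venv and to keep virtualenvs in the cache directory.
export POETRY_VIRTUALENVS_IN_PROJECT=false
export POETRY_VIRTUALENVS_PATH="$VENV_CACHE_DIR"

# Hypothetical wrapper: hold an exclusive lock while installing so that
# concurrent builds of the same job do not modify the virtualenv at the same time.
install_deps() {
  flock "$VENV_CACHE_DIR/.install.lock" poetry install
}
```

In the pipeline, the step would then call install_deps where it currently calls poetry install. With the virtualenv stored outside the workspace, cleanWs() no longer needs the EXCLUDE patterns for .venv. The lock only protects the install step; builds that run at the same time still share one environment, so the warning above still applies.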