subprocess

Resources

run

When scripting a task, it is common to need to run an external process, for instance a program that does a particular data analysis. This external process will be a subprocess launched by our running Python process. The subprocess module includes the run function to run external processes.

Let’s imagine that we want to run the ls command (dir on Windows).

from subprocess import run

cmd = ['ls']
run(cmd)
comprehensions.html
comprehensions.qmd
comprehensions.quarto_ipynb
counter.qmd
counter.quarto_ipynb
enumerate.html
enumerate.qmd
enumerate.quarto_ipynb
everyday_python.html
everyday_python.qmd
lambda.html
lambda.qmd
lambda.quarto_ipynb
paths.html
paths.qmd
paths.quarto_ipynb
range.html
range.qmd
range.quarto_ipynb
shutil.html
shutil.qmd
shutil.quarto_ipynb
subprocess.qmd
subprocess.quarto_ipynb
typing.html
typing.qmd
typing.quarto_ipynb
zip.qmd
zip.quarto_ipynb
CompletedProcess(args=['ls'], returncode=0)

By default, run expects a list of strings, one per argument, not a single string containing the whole command. For instance, imagine that our command includes a parameter.

from subprocess import run

cmd = ['ls', '-l']
run(cmd)
total 604
-rw-rw-r-- 1 jose jose 79216 nov  4 15:10 comprehensions.html
-rw-rw-r-- 1 jose jose  4361 nov  3 15:07 comprehensions.qmd
-rw-rw-r-- 1 jose jose  6545 nov  4 15:10 comprehensions.quarto_ipynb
-rw-rw-r-- 1 jose jose  3719 nov  3 15:37 counter.qmd
-rw-rw-r-- 1 jose jose  5739 nov  4 15:10 counter.quarto_ipynb
-rw-rw-r-- 1 jose jose 57776 nov  4 15:10 enumerate.html
-rw-rw-r-- 1 jose jose  1786 oct 30 14:43 enumerate.qmd
-rw-rw-r-- 1 jose jose  3137 nov  4 15:10 enumerate.quarto_ipynb
-rw-rw-r-- 1 jose jose 43105 nov  4 15:10 everyday_python.html
-rw-rw-r-- 1 jose jose   640 oct 30 09:43 everyday_python.qmd
-rw-rw-r-- 1 jose jose 55679 nov  4 15:10 lambda.html
-rw-rw-r-- 1 jose jose  2191 oct 31 11:41 lambda.qmd
-rw-rw-r-- 1 jose jose  3434 nov  4 15:10 lambda.quarto_ipynb
-rw-rw-r-- 1 jose jose 64459 nov  4 15:10 paths.html
-rw-rw-r-- 1 jose jose  4608 sep 15  2024 paths.qmd
-rw-rw-r-- 1 jose jose  6430 nov  4 15:10 paths.quarto_ipynb
-rw-rw-r-- 1 jose jose 68060 nov  4 15:10 range.html
-rw-rw-r-- 1 jose jose  1657 oct 31 10:02 range.qmd
-rw-rw-r-- 1 jose jose  3159 nov  4 15:10 range.quarto_ipynb
-rw-rw-r-- 1 jose jose 63207 nov  4 15:10 shutil.html
-rw-rw-r-- 1 jose jose  2885 nov  3 14:38 shutil.qmd
-rw-rw-r-- 1 jose jose  4383 nov  4 15:10 shutil.quarto_ipynb
-rw-rw-r-- 1 jose jose  3762 nov  3 15:48 subprocess.qmd
-rw-rw-r-- 1 jose jose  6913 nov  4 15:10 subprocess.quarto_ipynb
-rw-rw-r-- 1 jose jose 57159 nov  4 15:10 typing.html
-rw-rw-r-- 1 jose jose  2450 oct 30 10:54 typing.qmd
-rw-rw-r-- 1 jose jose  3721 nov  4 15:10 typing.quarto_ipynb
-rw-rw-r-- 1 jose jose  1356 oct 30 14:46 zip.qmd
-rw-rw-r-- 1 jose jose  2549 nov  4 15:10 zip.quarto_ipynb
CompletedProcess(args=['ls', '-l'], returncode=0)
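
As a rough sketch of how such a list can be built, the standard shlex module can split a command written as a single string into its arguments. The directory name "my data" and the small Python one-liner used as a child process are made up for illustration; sys.executable is used only so that the example does not depend on ls being available.

```python
import shlex
import sys
from subprocess import run

# A command written as one string, as it would be typed in a shell.
# "my data" is a hypothetical directory name containing a space.
command_line = 'ls -l "my data"'

# shlex.split turns the string into the list of arguments that run expects.
cmd = shlex.split(command_line)
print(cmd)

# When the list is written by hand, each argument is its own item,
# so a space inside an argument needs no extra quoting.
# The child here is a Python one-liner that prints its first argument.
portable_cmd = [sys.executable, '-c', 'import sys; print(sys.argv[1])', 'my data']
process = run(portable_cmd, capture_output=True, encoding='utf-8')
print(process.stdout)
```

The argument 'my data' reaches the child process as a single item, which is the main reason to prefer a list over a single command string.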

In every case, run launches the external process and waits for it to finish; only then does it return a CompletedProcess object.

from subprocess import run

cmd = ['ls', '-l']
process = run(cmd)
print(process.returncode)
total 604
-rw-rw-r-- 1 jose jose 79216 nov  4 15:10 comprehensions.html
-rw-rw-r-- 1 jose jose  4361 nov  3 15:07 comprehensions.qmd
-rw-rw-r-- 1 jose jose  6545 nov  4 15:10 comprehensions.quarto_ipynb
-rw-rw-r-- 1 jose jose  3719 nov  3 15:37 counter.qmd
-rw-rw-r-- 1 jose jose  5739 nov  4 15:10 counter.quarto_ipynb
-rw-rw-r-- 1 jose jose 57776 nov  4 15:10 enumerate.html
-rw-rw-r-- 1 jose jose  1786 oct 30 14:43 enumerate.qmd
-rw-rw-r-- 1 jose jose  3137 nov  4 15:10 enumerate.quarto_ipynb
-rw-rw-r-- 1 jose jose 43105 nov  4 15:10 everyday_python.html
-rw-rw-r-- 1 jose jose   640 oct 30 09:43 everyday_python.qmd
-rw-rw-r-- 1 jose jose 55679 nov  4 15:10 lambda.html
-rw-rw-r-- 1 jose jose  2191 oct 31 11:41 lambda.qmd
-rw-rw-r-- 1 jose jose  3434 nov  4 15:10 lambda.quarto_ipynb
-rw-rw-r-- 1 jose jose 64459 nov  4 15:10 paths.html
-rw-rw-r-- 1 jose jose  4608 sep 15  2024 paths.qmd
-rw-rw-r-- 1 jose jose  6430 nov  4 15:10 paths.quarto_ipynb
-rw-rw-r-- 1 jose jose 68060 nov  4 15:10 range.html
-rw-rw-r-- 1 jose jose  1657 oct 31 10:02 range.qmd
-rw-rw-r-- 1 jose jose  3159 nov  4 15:10 range.quarto_ipynb
-rw-rw-r-- 1 jose jose 63207 nov  4 15:10 shutil.html
-rw-rw-r-- 1 jose jose  2885 nov  3 14:38 shutil.qmd
-rw-rw-r-- 1 jose jose  4383 nov  4 15:10 shutil.quarto_ipynb
-rw-rw-r-- 1 jose jose  3762 nov  3 15:48 subprocess.qmd
-rw-rw-r-- 1 jose jose  6913 nov  4 15:10 subprocess.quarto_ipynb
-rw-rw-r-- 1 jose jose 57159 nov  4 15:10 typing.html
-rw-rw-r-- 1 jose jose  2450 oct 30 10:54 typing.qmd
-rw-rw-r-- 1 jose jose  3721 nov  4 15:10 typing.quarto_ipynb
-rw-rw-r-- 1 jose jose  1356 oct 30 14:46 zip.qmd
-rw-rw-r-- 1 jose jose  2549 nov  4 15:10 zip.quarto_ipynb
0
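
The waiting behaviour of run can be made visible by timing the call. This is only a sketch: the child process is a hypothetical Python one-liner that sleeps for half a second, launched through sys.executable so that the example does not depend on any external program.

```python
import sys
import time
from subprocess import run

# Hypothetical child process: a Python one-liner that sleeps for 0.5 seconds.
cmd = [sys.executable, '-c', 'import time; time.sleep(0.5)']

start = time.perf_counter()
process = run(cmd)  # run does not return until the child has finished
elapsed = time.perf_counter() - start

print(f'run returned after {elapsed:.2f} s with return code {process.returncode}')
```

The elapsed time should be at least about half a second, because the call to run only returns once the child has exited.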

Launching without waiting

return code

Every process, once it has finished, returns a return code, also called exit status. This return code is an integer: by convention it is 0 when everything went fine and any other number when an error happened in the subprocess. You can access the exit code of the subprocess through the returncode attribute of the CompletedProcess object.
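
A small sketch of reading the return code, assuming two made-up child processes: a Python one-liner that finishes normally and another one that exits with status 3.

```python
import sys
from subprocess import run

# Hypothetical children, launched with the current Python interpreter.
ok = run([sys.executable, '-c', 'pass'])                          # finishes normally
failed = run([sys.executable, '-c', 'import sys; sys.exit(3)'])   # exits with status 3

print(ok.returncode)      # 0
print(failed.returncode)  # 3
```

Note that run itself does not complain about the non-zero status here; it is up to the calling code to inspect returncode.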

If you want the run function to fail whenever the called process has a problem, you can use the check argument: with check=True, run raises a CalledProcessError if the return code is not 0.
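
One possible way to use check, sketched with a hypothetical child process that always exits with status 2. The variable name failed_code is just an illustration.

```python
import sys
from subprocess import CalledProcessError, run

# Hypothetical child process that always fails with exit status 2.
cmd = [sys.executable, '-c', 'import sys; sys.exit(2)']

failed_code = None
try:
    # With check=True a non-zero return code raises CalledProcessError.
    run(cmd, check=True)
except CalledProcessError as error:
    # The exception carries the return code of the failed command.
    failed_code = error.returncode
    print(f'the command failed with return code {failed_code}')
```

This keeps a failing external program from going unnoticed in the middle of a longer script.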

stdout and stderr

You can capture stdout and stderr and store them as attributes of the CompletedProcess object by passing capture_output=True.

from subprocess import run

cmd = ['ls', '/hello']
process = run(cmd, capture_output=True)
print(process.stdout)
print(process.stderr)
b''
b"ls: cannot access '/hello': No such file or directory\n"
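
To see that the two streams are kept apart, here is a sketch with a hypothetical child process that writes one line to stdout and another one to stderr. The child is a Python one-liner run through sys.executable, chosen only to make the example portable.

```python
import sys
from subprocess import run

# Hypothetical child: writes "result" to stdout and "warning" to stderr.
child_code = 'import sys; print("result"); print("warning", file=sys.stderr)'

process = run([sys.executable, '-c', child_code], capture_output=True)

# Each stream is stored in its own attribute, as a bytes object.
print(process.stdout)
print(process.stderr)
```

Nothing is printed to the terminal by the child itself, because both streams have been redirected into the CompletedProcess object.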

Be aware that, by default, the captured standard streams are bytes objects. If you want them to be strings, you have to provide an encoding.

from subprocess import run

cmd = ['ls', '/hello']
process = run(cmd, capture_output=True, encoding='utf-8')
print(process.stdout)
print(process.stderr)

ls: cannot access '/hello': No such file or directory
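
The difference between both modes can be checked by looking at the type of the captured output. This is a sketch with a hypothetical child process that just prints hello; the variable names raw and decoded are only illustrative.

```python
import sys
from subprocess import run

# Hypothetical child: a Python one-liner that prints "hello".
cmd = [sys.executable, '-c', 'print("hello")']

raw = run(cmd, capture_output=True)                        # bytes by default
decoded = run(cmd, capture_output=True, encoding='utf-8')  # str, decoded as UTF-8

print(type(raw.stdout).__name__)      # bytes
print(type(decoded.stdout).__name__)  # str

# The bytes output can also be decoded by hand afterwards.
print(raw.stdout.decode('utf-8').strip())
```

Passing text=True instead of an encoding also returns strings, decoded with the default encoding of the system.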

Popen

The run function will wait for the subprocess to finish before returning. If you just want to launch the process, but not wait for it to finish, you can use the Popen class.

Using Popen is very similar to using run. The main difference is that Popen returns a Popen object immediately, without waiting for the process to finish.

Once you have that object, you can check whether the process has already finished with the poll method, which returns None while it is still running, or you can wait for it to finish with the wait method.

from subprocess import Popen

cmd = ['ls']
process = Popen(cmd)
print(process.poll())
print(process.wait())
print(process.returncode)
None
comprehensions.html
comprehensions.qmd
comprehensions.quarto_ipynb
counter.qmd
counter.quarto_ipynb
enumerate.html
enumerate.qmd
enumerate.quarto_ipynb
everyday_python.html
everyday_python.qmd
lambda.html
lambda.qmd
lambda.quarto_ipynb
paths.html
paths.qmd
paths.quarto_ipynb
range.html
range.qmd
range.quarto_ipynb
shutil.html
shutil.qmd
shutil.quarto_ipynb
subprocess.qmd
subprocess.quarto_ipynb
typing.html
typing.qmd
typing.quarto_ipynb
zip.qmd
zip.quarto_ipynb
0
0
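
Since Popen returns immediately, the Python script can keep working while the subprocess runs. A minimal sketch, assuming a hypothetical child process that sleeps for half a second and a polling interval of 0.1 seconds chosen arbitrarily:

```python
import sys
import time
from subprocess import Popen

# Hypothetical child process: a Python one-liner that sleeps for 0.5 seconds.
cmd = [sys.executable, '-c', 'import time; time.sleep(0.5)']

process = Popen(cmd)         # returns right away, the child keeps running
first_poll = process.poll()  # None while the child has not finished yet

polls = 0
while process.poll() is None:
    # Other work could be done here while the child is running.
    polls += 1
    time.sleep(0.1)

print(first_poll)
print(process.returncode)
```

Once poll stops returning None, the return code is also available in the returncode attribute, as in the example above.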