Python Unit Testing With PyTest 101
Unit testing is the practice of writing small test cases that validate the behavior of production code at the level of individual functions and classes, in isolation. In simpler terms, each piece of code written during development is paired with unit test cases that cover the happy path, positive and negative cases, and edge cases. Unfortunately, most developers today do not follow this practice rigorously.
In this article, I will walk you through what unit testing is and introduce the Python unit testing framework PyTest for beginners.
What kind of output can we expect from unit testing?
First of all, unit testing is not about debugging the source code or fixing defects/bugs. It helps developers find:
- Syntax errors
- Logic errors
Why is Unit Testing needed?
Today and in the recent past, software failures have affected banks, airlines, and many other sectors, causing millions of pounds in damage and directly impacting a huge customer base. In my experience, delivering a bug-free product is a never-ending process. I have seen and heard of developers skipping unit testing to meet project deadlines, or lacking the toolset knowledge to do unit testing effectively. Let us look at a real-world use case.
- Scenario: Developers debug the issue on full-fledged, live clusters.
- Scenario: Developers validate actual against expected output in the UI/backend.
Both approaches delay the entire delivery process, because the developers who worked on that specific code have to stop feature development to debug and fix issues. As a result, time and cost are high by the time the product is delivered to the field.
In software testing, unit testing is the lowest level of testing; it reduces complexity by testing functions and classes in isolation. Unit testing is necessary because it:
- Measures the quality of the software.
- Provides immediate feedback about your code.
- Reduces the overall level of risk in delivered software.
- Finds bugs easily and quickly: every function should get unit test cases and be validated with a series of positive and negative test cases before the code is promoted to the next level in the development process.
- Saves time and money: fixing defects and issues during local development saves an organization a lot of cost and time over the short and long term of product delivery.
- Serves as live documentation: it is a treasure for new joiners and existing team members for understanding how an application works, and a reference for later enhancements.
Why Test-Driven Development?
Test-Driven Development (TDD) is a software development process in which unit test cases are written or updated before the production code, and passing the tests is the key driver of development. The TDD workflow advises writing unit test cases incrementally, then writing the associated piece of code to make them pass. Unit testing and TDD are often discussed together but, in my experience, rarely implemented.
The TDD workflow cycle has the following phases:
- Red Phase – First, add a unit test case and watch it fail. It is expected to fail because the feature is not yet developed.
- Green Phase – Write the minimal code needed, run all the unit test cases, and confirm that they all pass.
- Refactor Phase – Eliminate redundancy while keeping the tests passing, then repeat the cycle.
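One pass through the cycle above can be sketched as follows. This is only an illustration: add_numbers is a hypothetical function invented for the example, not something from a real project.

```python
# Step 1 (Red): the test is written first. Running it before add_numbers
# exists fails with a NameError, which is the expected starting point.
def test_add_numbers():
    assert add_numbers(2, 3) == 5
    assert add_numbers(-1, 1) == 0


# Step 2 (Green): the minimal code that makes the test pass.
# add_numbers is a hypothetical example function, not part of any library.
def add_numbers(a, b):
    return a + b

# Step 3 (Refactor): clean up duplication while keeping the test green,
# then repeat the cycle for the next piece of behaviour.
```

In a real project the test and the function would usually live in separate files; they are shown together here only to keep the sketch short.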
Overall, TDD provides immediate feedback for every piece of logic you add to your code. Implementing this process may be hard initially, but in the long run you will reap the benefits. It is also not limited to Python testing: it is a general software development approach that can be applied with any testing framework.
Note: Check the best practices of TDD and the difference between TDD and BDD.
Why PyTest Framework?
- PyTest is more advanced than unittest in terms of debugging and flexibility.
- We can achieve reusability using global/module fixtures.
- Test parametrization: one test function can cover many test cases.
- Auto-discovery: tests are detected automatically based on the standard naming convention.
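As a small sketch of the parametrization point above, the example below feeds several input/expected pairs into one test function with pytest.mark.parametrize. The function is_even is a hypothetical example written for this illustration.

```python
import pytest


# Hypothetical function under test, used only for illustration.
def is_even(number):
    return number % 2 == 0


# One test function, several input/expected pairs; PyTest reports each
# pair as a separate test case.
@pytest.mark.parametrize(
    "number, expected",
    [
        (2, True),
        (3, False),
        (0, True),
        (-4, True),
    ],
)
def test_is_even(number, expected):
    assert is_even(number) == expected
```

Adding another case is then a one-line change to the list rather than a new test function.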
Note: Check the detailed comparison between unittest and pytest.
To install PyTest, run the following command in your command line:
pip install -U pytest
Check that you installed the correct version:
pytest --version
A. Basic Hello world in PyTest
To use PyTest features such as fixtures in our test code, we just need the following import statement (plain assert-based tests like the one below run without it):
import pytest
Let's get started. Below is a basic hello world example; running it shows that PyTest has been set up in the project and that the test case passes.
# FileName: test_helloworld.py

# Function
def functionHelloWorld(value):
    return value + "World"

# Test case
def test_helloworld():
    retval = functionHelloWorld("Hello")
    assert retval == "HelloWorld"
Result: test_helloworld.py::test_helloworld PASSED [100%]
B. How to execute the unit test cases in the PyCharm IDE?
C. How to run PyTest unit test cases from the command line
By default, PyTest runs all the test cases in the current directory and its subdirectories, using auto-discovery on file names that start with "test_" or end with "_test" ("test_helloworld.py" or "helloworld_test.py"). Test functions and methods must also be prefixed with "test" (and test classes with "Test"); otherwise they won't be collected or executed.
i. Run all the test cases found in the directory.
pytest -v -s
ii. Run all the test cases in a specific directory; auto-discovery applies to every file whose name starts with "test_" or ends with "_test".
pytest -v -s testdirectory
iii. Run the unit tests in a specific file by passing its path.
pytest -v -s test_helloworld.py
iv. Run specific test cases by name (-k selects the tests whose names match the given expression).
pytest -v -s -k test_helloworld
D. Pytest Assertions
An assert statement compares the actual and expected values in a unit test case, and PyTest reports the test as PASSED or FAILED accordingly. If an assert fails because of a logic error, PyTest prints the failing expression along with the values involved (for example, the return value of a function call), which makes the failure easy to trace.
def test_true():
    assert True

def test_false():
    assert False

def test_Integer():
    assert 4 == 4

def test_String():
    assert "hello" == "hello"

Result:
test_samples.py::test_true PASSED
test_samples.py::test_false FAILED
test_samples.py::test_Integer PASSED
test_samples.py::test_String PASSED
How do I assert an exception in PyTest?
In negative test scenarios, we can validate that a function raises an exception under certain conditions. In the example below, the "test_exp_readCsv_withWrongFile" function tests the "readCsvData" function with a wrong file name. "readCsvData" raises a ValueError with the message "csvFileNotValidError", and the unit test asserts on that exception.
Handling ValueError exception example
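import pyspark.sql.utils
import pytest

# Method of the Spark job class: reads CSV data into a DataFrame
def readCsvData(self, filename):
    try:
        data = (
            self.spark.read.format(self.readFormat)
            .option("inferSchema", "true")
            .option("header", "true")
            .csv(filename)
        )
        return data
    except pyspark.sql.utils.AnalysisException as e:
        raise ValueError("csvFileNotValidError") from e

# Test method
# sparkJob (an instance of the class that owns readCsvData) and wrongCSVFile
# are assumed to be fixtures defined elsewhere; sparkJob is a hypothetical name.
def test_exp_readCsv_withWrongFile(sparkJob, wrongCSVFile):
    with pytest.raises(ValueError) as excinfo:
        sparkJob.readCsvData(wrongCSVFile)
    # excinfo.value is the exception instance that was raised
    assert "csvFileNotValidError" in str(excinfo.value)

Note: If the function raises the expected ValueError, the test passes; if it does not raise ValueError, the test fails.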
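The same pattern can be tried without a Spark cluster. The sketch below is a simplified stand-in, not the code from the project above: validate_csv_name is a hypothetical helper that raises the same ValueError message for a wrong file name.

```python
import pytest


# Hypothetical function under test: rejects anything that is not a .csv path.
def validate_csv_name(filename):
    if not filename.endswith(".csv"):
        raise ValueError("csvFileNotValidError")
    return filename


def test_validate_csv_name_rejects_wrong_extension():
    # The block must raise ValueError, otherwise pytest.raises fails the test.
    with pytest.raises(ValueError) as excinfo:
        validate_csv_name("data.txt")
    # excinfo.value is the raised exception instance.
    assert "csvFileNotValidError" in str(excinfo.value)


def test_validate_csv_name_accepts_csv():
    assert validate_csv_name("data.csv") == "data.csv"
```

Passing the specific exception type to pytest.raises, rather than the generic Exception, keeps the test from passing on an unrelated error.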
E. PyTest Setup and TearDown Method
Setup and teardown methods will be familiar to those coming from a unittest background. PyTest calls them before and after the tests at module, class, method, and function level (setup_module/teardown_module, setup_class/teardown_class, setup_method/teardown_method, and setup_function/teardown_function).
In a real-world scenario, we can load test data or default configuration once for each function, method, or class, which reduces code duplication. The following example shows setup and teardown at function level.
import pytest

def test_helloworld():
    print("\n Test_HelloWorld function initiated")
    assert True

def setup_function(function):
    print("\n setup_function called")

def teardown_function(function):
    print("\n teardown_function called")

Result:
setupandteardown.py::test_helloworld
 setup_function called
PASSED [100%]
 Test_HelloWorld function initiated
 teardown_function called

Note: If the setup fails or is skipped, the teardown function will not be called. A setup/teardown pair can be invoked multiple times per testing process.
F. PyTest fixtures
@fixture(callable_or_scope=None, *args, scope='function', params=None, autouse=False, ids=None, name=None)
With a fixture, setup/teardown logic is moved into a standalone helper function that multiple test functions, classes, and modules can share, depending on the scope defined (see below). A test requests a fixture by naming the fixture function as one of its parameters.
The scope decides how often a function with the fixture decorator is called:
- Function: called once per test function and destroyed at the end of the test (the default).
- Class: called once per class and destroyed at the end of the last test in the class.
- Module: called once per module and destroyed at the end of the last test in the module.
- Package: called once per package and destroyed at the end of the last test in the package.
- Session: called once per session and destroyed at the end of the last test in the session.
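To make the scopes above concrete, here is a minimal sketch with one module-scoped and one function-scoped fixture. The fixture names app_config and scratch_list, and the values they return, are hypothetical.

```python
import pytest


# Hypothetical shared resource: built once per module because of scope="module".
@pytest.fixture(scope="module")
def app_config():
    return {"env": "test", "retries": 3}


# Default scope is "function": a fresh list is created for every test.
@pytest.fixture
def scratch_list():
    return []


def test_config_env(app_config):
    assert app_config["env"] == "test"


def test_scratch_list_starts_empty(app_config, scratch_list):
    scratch_list.append(app_config["retries"])
    assert scratch_list == [3]
```

A wider scope saves setup time for expensive resources, at the cost of tests sharing one object, so it fits best when the tests do not modify that object.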
i. Fixture Example
Use Case – Initialize a database with known parameters before running a test and close the connection or clear the database tables after the test method.
import pytest

# autouse=True: the fixture runs automatically for every test function in this module.
@pytest.fixture(autouse=True)
def setup(request):
    print("\n setup called")

    def teardown():
        print("\nTearDown called")

    request.addfinalizer(teardown)

def test_helloworld2():
    print("\n Test_HelloWorld 2 function initiated")
    assert True

def test_helloworld1():
    print("\n Test_HelloWorld 1 function initiated")
    assert True

Result:
test_samples.py::test_helloworld2
 setup called
 Test_HelloWorld 2 function initiated
PASSED
TearDown called
test_samples.py::test_helloworld1
 setup called
 Test_HelloWorld 1 function initiated
PASSED
TearDown called
ii. Create a single Spark context for all your unit test cases – Example
Use Case – Without a shared fixture, every test function that needs Spark would have to initialize a resource-intensive Spark context of its own. Instead, we can have a common file named "config_test" that returns a single Spark session, which is stopped once the test session finishes (note that PyTest only picks up shared fixtures automatically from a file named conftest.py). Refer to the example below.
"""
pytest fixtures that can be reused across tests
"""
from pyspark.sql import SparkSession
import pytest

@pytest.fixture(scope="session")
def sparkSession(request):
    """
    fixture for creating a spark context
    Args:
        request: pytest.FixtureRequest object
    """
    sparkSession = (
        SparkSession
        .builder
        .appName("myspark")
        .master("local")
        .getOrCreate()
    )
    # Stop the session when the test session ends
    request.addfinalizer(lambda: sparkSession.stop())
    return sparkSession
Yield vs Finalizer in Fixture
- Yield – the code after the yield statement runs as teardown and is used to clean up resources. A fixture function must yield exactly once; more than one yield results in an error.
- Finalizer – inside a fixture, request.addfinalizer allows registering multiple finalizer functions, which run in the reverse order of registration. Finalizers that are already registered run even if the fixture setup raises an exception afterwards.
Yield vs Finalizer example
import pytest

@pytest.fixture()
def setup_for_helloworld1(request):
    print("\n setup called")
    yield

@pytest.fixture()
def setup_for_helloworld2(request):
    print("\nSetup2")

    def teardown_helloworld1():
        print("\nTearDown A")

    def teardown_helloworld2():
        print("\nTearDown B")

    request.addfinalizer(teardown_helloworld2)
    request.addfinalizer(teardown_helloworld1)

def test_helloworld2(setup_for_helloworld2):
    print("\n Test_HelloWorld 2 function initiated")
    assert True

def test_helloworld1(setup_for_helloworld1):
    print("\n Test_HelloWorld 1 function initiated")
    assert True

Results:
test_jobs/test_samples.py::test_helloworld2
Setup2
 Test_HelloWorld 2 function initiated
PASSED
TearDown A
TearDown B
test_jobs/test_samples.py::test_helloworld1
 setup called
 Test_HelloWorld 1 function initiated
PASSED
Note: A fixture declared with @pytest.fixture(autouse=True) runs for every test within its scope, without the test having to request it by name.
My two cents on unit testing
- Keep all the test cases in a common directory (test_projectname) and organize them by project feature.
- Have a proper unit testing setup in your local environment.
- Define shared setup logic (a single Spark context, database connections, configuration properties, logging, test data) as global fixtures.
- Group related unit test cases in classes.
- Because unit test cases are written from the very beginning of development, redundancy tends to creep in. Revisit the test cases regularly and refactor them to remove duplicates.
- Last but not least, don't confuse unit testing with TDD. Always keep in mind: unit testing is about what you are testing, and TDD is about when you do the testing.
- Use code coverage (pytest-cov) to keep track of how much of your code the tests exercise.
- Explore PyTest more in the official documentation.
In my view, a good programmer always unit tests their code before promoting it to QA/UAT/Production. That is all about unit testing with PyTest. My advice: roll up your sleeves and get your hands dirty with the PyTest unit testing framework. Try it yourself, fail, and try again until you are familiar with Python unit testing with PyTest.