Unit testing is the practice of writing small test cases that validate the behavior of production code at the level of individual functions and classes, in isolation. In simpler terms, each piece of logic written during development should have an associated unit test that checks the happy path, negative cases, and edge cases. Unfortunately, many developers do not follow this practice consistently.

In this article, I will walk you through what unit testing is and introduce PyTest, a Python unit testing framework, for beginners.

What kind of output can we expect from unit testing?


First of all, unit testing is not about debugging the source code or fixing defects. It helps developers find:

  1. Syntax errors
  2. Logic errors

Why is Unit Testing needed?


Recent software failures have affected banks, airlines, and many other sectors, causing millions of pounds in damage and directly impacting a huge customer base. In my experience, delivering a bug-free product is a never-ending process. I have seen and heard of developers skipping unit testing to meet project deadlines, or lacking the tooling knowledge to do it effectively. Let us look at two real-world scenarios.

  1. Scenario: Developers debug an issue directly on full-fledged, live clusters.
  2. Scenario: Developers manually compare actual and expected output in the UI or backend.

Both approaches delay the entire delivery process, because the developers who worked on that code have to stop feature development to debug and fix issues. As a result, the time and cost of a defect are high once the product has been delivered to the field.

Bottom line


In software testing, unit testing is the lowest level of testing; it reduces complexity by testing functions and classes in isolation. Unit testing is necessary because:

  1. It measures the quality of the software.
  2. It provides immediate feedback about your code.
  3. It reduces the overall level of risk in the delivered software.
  4. Bugs are found more easily and faster: every function should have unit tests covering a series of positive and negative cases before the code is promoted to the next stage of the development process.
  5. Unit tests save time and money: fixing defects in local development costs far less, in both the short and the long term, than fixing them after delivery.
  6. Living documentation: the tests are a treasure for new joiners and existing team members alike, showing how the application works and serving as a reference for later enhancements.

Why Test-Driven Development?


Test-Driven Development (TDD) is a software development process in which unit test cases are written or updated before the production code, and passing the tests is the critical driver of development. The TDD workflow advises writing unit test cases incrementally, then writing the associated piece of code and making it pass. Unit testing and TDD are often discussed together but, in my experience, rarely implemented.

The TDD workflow cycle has the following phases:

Test Driven Development Workflow Cycle
  1. Red Phase – First add a unit test case and watch it fail. It is expected to fail because the feature has not been developed yet.
  2. Green Phase – Write the minimal code needed, run all the unit test cases, and confirm that they all pass.
  3. Refactor Phase – Eliminate redundancy while keeping the tests passing, then repeat the cycle.
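
The three phases can be sketched with a tiny helper. The function name slugify and its behaviour are assumptions chosen purely for illustration. In the red phase only the two tests exist and they fail; the green phase adds the simplest implementation that passes; the refactor phase tidies it while the tests stay green.

```python
import re

# Red phase: these tests are written first and fail until slugify() exists.
def test_slugify_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"

def test_slugify_drops_punctuation():
    assert slugify("Unit testing, with PyTest!") == "unit-testing-with-pytest"

# Green/refactor phase: a small implementation that makes both tests pass,
# kept tidy with a single regular expression.
def slugify(title):
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

Running pytest before slugify is defined gives a NameError in both tests (red); adding the function turns them green.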

Overall, TDD provides immediate feedback for every piece of logic you add to your code. Implementing this process may be hard initially, but in the longer run you will reap the benefits. It is also not limited to Python testing: it is an overall software development approach that can be applied with any testing framework.

Note: check the best practices of TDD and the difference between TDD vs BDD.

Why PyTest Framework?


  1. PyTest is more flexible than unittest and reports more detail when an assertion fails, which helps debugging.
  2. We can achieve reusability using fixtures shared at module or global (session) level.
  3. Test parameterization – the same test function can cover many test cases.
  4. Tests are detected automatically based on the standard naming convention (auto-discovery).

Note: Check the detailed comparison between unittest and pytest.
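
As a minimal sketch of test parameterization, the example below runs one test function against several inputs using pytest.mark.parametrize. The add function is a hypothetical stand-in for real production code.

```python
import pytest

def add(a, b):
    """Hypothetical function under test."""
    return a + b

# One test function, executed once per tuple in the list.
@pytest.mark.parametrize(
    "a, b, expected",
    [
        (1, 2, 3),     # happy path
        (-1, 1, 0),    # negative operand
        (0, 0, 0),     # edge case: zeros
    ],
)
def test_add(a, b, expected):
    assert add(a, b) == expected
```

PyTest reports each tuple as a separate test case, so a failure points at the exact input that broke.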

PyTest installation


Install PyTest

Run the following command in your command line:

pip install -U pytest

Check that you installed the correct version:

$ pytest --version

A. Basic Hello world in PyTest


To use PyTest features such as fixtures or pytest.raises, import PyTest in the test file with the statement below. Plain assert-based tests do not need the import.

import pytest

Let’s get started. Below is a basic hello world example; running it shows that PyTest has been set up in the project and that the test case passes.

Example

#FileName:test_helloworld.py
#Function 
def functionHelloWorld(value):        
   return value + "World"

#Test case
def test_helloworld():                
   retval = functionHelloWorld("Hello")
   assert retval == "HelloWorld"
Result:

test_helloworld.py::test_helloworld PASSED  [100%]

B. How to execute the Unit test cases in Pycharm IDE?


Test runner – pycharm IDE

C. How to test PyTest unit test case by command line


By default, PyTest runs all the test cases in the current directory and its subdirectories, auto-discovering files whose names start with “test_” or end with “_test” (“test_helloworld.py” or “helloworld_test.py”). The same applies inside the files: test functions and methods must have names starting with “test”, and test classes must have names starting with “Test”; otherwise they will not be executed.
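
The default naming rules can be sketched as follows. The file name and all function names here are hypothetical; the comments mark which items PyTest would collect under its default conventions.

```python
# File: test_discovery.py  (hypothetical name; matches the test_*.py pattern)

def helper_double(x):
    # Not collected: the name does not start with "test".
    return x * 2

def test_double():
    # Collected: a module-level function whose name starts with "test".
    assert helper_double(2) == 4

class TestDouble:
    # Collected: the class name starts with "Test" and it defines no __init__.
    def test_double_negative(self):
        assert helper_double(-3) == -6

    def check_zero(self):
        # Not collected: the method name lacks the "test" prefix.
        assert helper_double(0) == 0
```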

i. Run all the test cases found in the directory.

pytest -v -s

ii. Run all the test cases in a specific directory; auto-discovery applies to every file in it whose name is prefixed with “test_” or suffixed with “_test”.

pytest -v -s  testdirectory

iii. Run the unit tests in a specific file by passing its path.

pytest -v -s test_helloworld.py

iv. Run a specific test case by name. The -k option selects every test whose name matches the given expression.

pytest -v -s -k test_helloworld

D. Pytest Assertions


An assert statement compares actual and expected values in a Python unit test, and the test case is reported as passed or failed accordingly. If an assert statement fails, for example because of a logic error, PyTest prints the failing expression together with the values involved, including the return values of any function calls in it.

Assert example

def test_true():
   assert True

def test_false():
   assert False

def test_Integer():
   assert 4==4

def test_String():
   assert "hello" == "hello"
Result:

test_samples.py::test_true PASSED
test_samples.py::test_false FAILED
test_samples.py::test_Integer PASSED
test_samples.py::test_String PASSED
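
Beyond simple equality, a few more assert patterns are sketched below: comparing collections, attaching a failure message, and comparing floats with pytest.approx. The data values are made up for illustration.

```python
import pytest

def test_list_equality():
    # Plain == works on lists; on failure PyTest shows both sides.
    assert sorted([3, 1, 2]) == [1, 2, 3]

def test_membership_with_message():
    roles = {"admin", "editor"}
    # The text after the comma is shown if the assertion fails.
    assert "admin" in roles, "expected the admin role to be present"

def test_float_comparison():
    # 0.1 + 0.2 is not exactly 0.3 in binary floating point,
    # so compare within a tolerance using pytest.approx.
    assert 0.1 + 0.2 == pytest.approx(0.3)
```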

How do I assert exception in PyTest ?


In negative-case scenarios, we can check that a function raises an exception under certain conditions. In the example below, the “test_exp_readCsv_withWrongFile” function tests the “readCsvData” function with a wrong file name. “readCsvData” raises a ValueError with the message “csvFileNotValidError”, and the unit test asserts that this exception is raised.

Handling ValueError exception example

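import pytest
from pyspark.sql.utils import AnalysisException

# This method is responsible for reading CSV data
def readCsvData(self, filename):
   try:
      data = (self.spark.read.format(self.readFormat)
              .option("inferSchema", "true")
              .option("header", "true")
              .csv(filename))
      return data
   except AnalysisException as e:
      # Translate Spark's error into the ValueError the test expects
      raise ValueError("csvFileNotValidError") from e


# Test method
# Assumption: csvReader and wrongCSVFile are fixtures defined elsewhere. csvReader is an
# instance of the class that owns readCsvData; wrongCSVFile is a path to a missing file.
def test_exp_readCsv_withWrongFile(csvReader, wrongCSVFile):
   with pytest.raises(ValueError) as excinfo:
      csvReader.readCsvData(wrongCSVFile)
   assert excinfo.type is ValueError
   assert str(excinfo.value) == "csvFileNotValidError"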

Note: If the function raises the expected ValueError, the test passes; if it does not raise ValueError, the test fails.
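
For readers without a Spark setup, the same idea can be sketched with a self-contained example. The parse_age function is hypothetical; the sketch shows pytest.raises with the match argument and the excinfo object it returns.

```python
import pytest

def parse_age(text):
    """Hypothetical function under test: convert text to a non-negative int."""
    age = int(text)  # raises ValueError for non-numeric text
    if age < 0:
        raise ValueError("age must not be negative")
    return age

def test_parse_age_rejects_negative():
    # match= is a regular expression searched for in the exception message.
    with pytest.raises(ValueError, match="must not be negative"):
        parse_age("-5")

def test_parse_age_exposes_exception_details():
    with pytest.raises(ValueError) as excinfo:
        parse_age("abc")
    # excinfo.type is the exception class, excinfo.value the instance.
    assert excinfo.type is ValueError
    assert "invalid literal" in str(excinfo.value)
```

Note that the assertions on excinfo sit outside the with block; code placed after the raising call inside the block would never run.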

E. PyTest Setup and TearDown Method


Setup and teardown methods will be familiar to those coming from a unittest background. These methods are called before and after the tests at module, class, method, or function level.

In a real-world scenario, we can load test data or a default configuration once per function, method, or class, which reduces code duplication. The following example illustrates the setup and teardown functions.

Setup/Teardown example

import pytest

def test_helloworld():
   print("\n Test_HelloWorld function initiated")
   assert True

def setup_function(function):
   print("\n setup_function called")

def teardown_function(function):
   print("\n teardown_function called")
Result:

setupandteardown.py::test_helloworld 
setup_function called
PASSED                              [100%]
Test_HelloWorld function initiated
teardown_function called

Note: If setup fails or is skipped, the teardown function will not be called. Setup and teardown can run multiple times per test session, once for each test they apply to.

F. PyTest fixtures


@fixture(callable_or_scope=None, *args, scope='function', params=None, autouse=False, ids=None, name=None)

Reference

With fixtures, we can move setup/teardown logic into a standalone helper function that multiple test functions, classes, or modules can share, depending on the scope defined (see below). A test requests a fixture by declaring a parameter with the same name as the fixture function.

The scope decides how often a function carrying the fixture decorator is called:

  • Function: called once per test function and destroyed at the end of that test (the default).
  • Class: called once per class and destroyed at the end of the last test in the class.
  • Module: called once per module and destroyed at the end of the last test in the module.
  • Package: called once per package and destroyed at the end of the last test in the package.
  • Session: called once per session and destroyed at the end of the last test in the session.
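
As a rough sketch of how scope changes behaviour, the example below defines one module-scoped and one function-scoped fixture. The fixture names and the returned values are assumptions made for illustration.

```python
import pytest

@pytest.fixture(scope="module")
def shared_config():
    # Runs once for the whole file, however many tests request it.
    return {"env": "test"}

@pytest.fixture  # default scope is "function"
def fresh_list():
    # Runs again for every test that requests it.
    return []

def test_one(shared_config, fresh_list):
    fresh_list.append(1)
    assert shared_config["env"] == "test"
    assert fresh_list == [1]

def test_two(shared_config, fresh_list):
    # fresh_list is a new, empty object here, while shared_config is
    # the same dict instance that test_one received.
    assert fresh_list == []
    assert shared_config["env"] == "test"
```

A wider scope suits expensive, read-only resources; anything a test mutates is usually safer at function scope so tests stay independent.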

i. Fixture Example

Use Case – Initialize a database with known parameters before running a test and close the connection or clear the database tables after the test method.


# With autouse=True the fixture runs for every test function automatically.
@pytest.fixture(autouse=True)   
def setup(request):
   print("\n setup  called")
   def teardown():
       print("\nTearDown called")

   request.addfinalizer(teardown)


def test_helloworld2():
   print("\n Test_HelloWorld 2 function initiated")
   assert True

def test_helloworld1():
   print("\n Test_HelloWorld 1 function initiated")
   assert True
Result:

test_samples.py::test_helloworld2 
setup  called
Test_HelloWorld 2 function initiated
PASSED
TearDown called
test_samples.py::test_helloworld1 
setup  called
Test_HelloWorld 1 function initiated
PASSED
TearDown called
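
The database use case itself can be sketched with the standard library's sqlite3 module. The table and the rows are made up for the example; the fixture builds an in-memory database with known contents and registers a finalizer that closes the connection.

```python
import sqlite3
import pytest

@pytest.fixture
def db(request):
    # Setup: an in-memory database with a known table and one known row.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    conn.commit()
    # Teardown: close the connection after the test, pass or fail.
    request.addfinalizer(conn.close)
    return conn

def test_known_user_exists(db):
    row = db.execute("SELECT name FROM users WHERE id = 1").fetchone()
    assert row == ("alice",)

def test_insert_is_isolated(db):
    db.execute("INSERT INTO users (name) VALUES ('bob')")
    count = db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    # Each test gets a fresh database, so only alice and bob are present.
    assert count == 2
```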

ii. Create a single spark context for all your unit test cases – Example


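Use Case – If every test function initializes its own Spark context, a resource-intensive setup is repeated for each test. Instead, we can define a session-scoped fixture in a common file that returns a single Spark session, which is stopped when the test session ends. Note that PyTest automatically picks up shared fixtures only from a file named conftest.py; a file with another name, such as “config_test”, has to be imported by the test modules. Refer to the example below.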

""" pytest fixtures that can be reused across tests
"""
from pyspark.sql import SparkSession
import pytest

@pytest.fixture(scope="session")
def sparkSession(request):
   """ fixture for creating a spark context
   Args:
       request: pytest.FixtureRequest object

   """

   sparkSession = (
       SparkSession
           .builder
           .appName("myspark")
           .master("local")
           .getOrCreate()
   )
   request.addfinalizer(lambda: sparkSession.stop())
   return sparkSession

Yield vs Finalizer in Fixture

  1. Yield – the code after the yield statement runs as teardown and is used to clean up resources. A fixture function may yield exactly one value; more than one yield results in an error.
  2. Finalizer – inside a fixture, request.addfinalizer allows registering multiple finalizer functions. Finalizers that were already registered still run, so resources can be cleaned up even if the rest of the fixture setup raises an exception.

Yield vs Finalizer example

import pytest

@pytest.fixture()
def setup_for_helloworld1(request):
   print("\n setup  called")
   yield

@pytest.fixture()
def setup_for_helloworld2(request):
   print("\nSetup2")
   def teardown_helloworld1():
       print("\nTearDown A")
   def teardown_helloworld2():
       print("\nTearDown B")

   request.addfinalizer(teardown_helloworld2)
   request.addfinalizer(teardown_helloworld1)



def test_helloworld2(setup_for_helloworld2):
   print("\n Test_HelloWorld 2 function initiated")
   assert True

def test_helloworld1(setup_for_helloworld1):
   print("\n Test_HelloWorld 1 function initiated")
   assert True
Results:

test_jobs/test_samples.py::test_helloworld2 
Setup2
Test_HelloWorld 2 function initiated
PASSED
TearDown A
TearDown B

test_jobs/test_samples.py::test_helloworld1 
setup  called
Test_HelloWorld 1 function initiated
PASSED
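
The yield fixture in the example above has no code after the yield, so it performs no teardown. A minimal sketch of a yield fixture that does clean up is shown below; the fixture name scratch_file and the temporary file are assumptions made for illustration.

```python
import os
import tempfile
import pytest

@pytest.fixture
def scratch_file():
    # Setup: everything before the yield.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    yield path  # the value handed to the test
    # Teardown: everything after the yield runs once the test has finished.
    if os.path.exists(path):
        os.remove(path)

def test_write_and_read(scratch_file):
    with open(scratch_file, "w") as handle:
        handle.write("hello")
    with open(scratch_file) as handle:
        assert handle.read() == "hello"
```

Keeping setup and teardown in one function, on either side of the yield, is often easier to read than a separately registered finalizer.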

Note: @pytest.fixture(autouse=True)
A fixture defined this way runs automatically for every test, without being requested as a parameter.

My two cents on unit testing


  1. Keep all the test cases in a common directory (test_projectname) and organize them by project feature.
  2. Have a proper unit testing setup in your local environment.
  3. Create shared resources (a single Spark context, database connections, configuration properties, logging, test data) as global configuration using fixtures.
  4. Use classes to group related unit test cases.
  5. Since you write unit test cases from the beginning of development, redundancy tends to creep into them. Revisit the test cases and refactor often to reduce duplication.
  6. Last but not least, don’t confuse unit testing with TDD. Keep in mind that unit testing is about what you are testing, and TDD is about when you do the testing.
  7. Use code coverage (pytest-cov) to keep track of how much of your code the tests exercise.
  8. Explore PyTest further in the official documentation.

Conclusion


In my view, a good programmer always unit tests their code before promoting it to QA/UAT/Production. That is all about unit testing with PyTest. My advice is to roll up your sleeves and get your hands dirty with the PyTest unit testing framework. Try it yourself, fail, and try again until you are familiar with Python unit testing with PyTest.

Related Online Courses

1. Unit Testing and Test-Driven Development

What you’ll get from it: Learn how unit testing and TDD will help you, with a walkthrough of the PyTest testing library, setup with best practices, and example programming sessions.

2. Elegant Automation Frameworks with Python and Pytest

What you’ll get from it: The course will discuss what makes a good framework, and maybe more importantly, what makes a bad one. Learn how fixtures can eliminate up to 80% of the code in a bloated codebase.