We start with unit tests that read a file line by line with QFile. Tests that access the file system are not considered unit tests, as they may be slow and may depend on each other in surprising ways (see A Set of Unit Testing Rules by Michael Feathers). In a first step, we encapsulate QFile in a class TextFile and provide a fake implementation that represents the file as a list of strings kept in main memory. In a second step, we introduce an interface, from which the product and fake implementations derive. We can then apply TDD to classes accessing files.
Bad Unit Tests Read from Files
# files/libffi/recipeinfo (Commit cdb19980)
LICENSE: MIT
PR: r0
PV: 3.2.1
The package scanner in the running example is part of a license compliance checker for Linux images built with Yocto or Buildroot. It reads the recipeinfo files for all packages in the Linux image.
// sources/package_scanner.cpp (Commit cdb19980)
PackageInfo PackageScanner::readRecipeInfo(QString packageName)
{
    QFile recipeInfo{QString{"files/%1/recipeinfo"}.arg(packageName)}; // A1
    if (!recipeInfo.open(QFile::ReadOnly)) // A2
    {
        qWarning().noquote().nospace()
            << "Cannot read file \'" << recipeInfo.fileName() << "\'.";
        return {};
    }
    QString licStr;
    QString version;
    QString revision;
    QTextStream is{&recipeInfo};
    while (!is.atEnd()) // A3
    {
        auto line = is.readLine(); // A4
        if (line.startsWith("LICENSE"))
        {
            licStr = line.split(':')[1].trimmed();
        }
        // Similar for the other lines ...
    }
    return PackageInfo{packageName, licStr, version, revision};
}
The function under test, PackageScanner::readRecipeInfo, reads the file files/<packageName>/recipeinfo for the package packageName (files/libffi/recipeinfo for the package libffi), splits each line of the form key: value into a key and its value, and stores the result in a PackageInfo object.
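The key/value split can be sketched without Qt as follows. This is a minimal sketch, not the code from the repository: std::string stands in for QString, and the helpers trimmed and splitKeyValue are hypothetical names introduced only for illustration. Unlike line.split(':')[1], which indexes out of range when a line contains no colon (split then returns a single-element list), the sketch reports such a line as "no value".

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: strips leading and trailing whitespace.
std::string trimmed(const std::string &text)
{
    const auto first = text.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) {
        return {};
    }
    const auto last = text.find_last_not_of(" \t\r\n");
    return text.substr(first, last - first + 1);
}

// Hypothetical helper: returns {key, value} for a line "key: value" and
// std::nullopt for a line without a colon.
std::optional<std::pair<std::string, std::string>>
splitKeyValue(const std::string &line)
{
    const auto colon = line.find(':');
    if (colon == std::string::npos) {
        return std::nullopt;
    }
    return std::make_pair(trimmed(line.substr(0, colon)),
                          trimmed(line.substr(colon + 1)));
}

// Usage: a well-formed line, an empty value and a line without a colon.
void checkSplitKeyValue()
{
    assert(splitKeyValue("LICENSE: MIT")->second == "MIT");
    assert(splitKeyValue("LICENSE: ")->second.empty());
    assert(!splitKeyValue("LICENSE MIT").has_value());
}
```

Splitting at the first colon only keeps values that contain colons themselves intact. Whether a line without a colon should be an error or a warning is a decision for the scanner, not for this helper.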
// tests/package_scanner_with_file_io/test_package_scanner_with_file_io.cpp
void TestPackageScannerWithFileIO::testReadRecipeInfo()
{
    PackageScanner scanner;
    auto package = scanner.readRecipeInfo("libffi");
    QCOMPARE(package.name(), "libffi");
    QCOMPARE(package.licenseString(), "MIT");
    QCOMPARE(package.version(), "3.2.1");
    QCOMPARE(package.revision(), "r0");
}
The test testReadRecipeInfo checks that the values from the recipeinfo file were correctly entered into the PackageInfo object package.
# tests/package_scanner_with_file_io/CMakeLists.txt
project(package_scanner_with_file_io)
find_package(Qt6 REQUIRED COMPONENTS Test Core)
file(COPY files/ DESTINATION files) # C1
...
The CMakeLists.txt file copies the directory tree rooted at files in the current source directory to the directory files in the current build directory (line C1).
The original package scanner has more tests. Each test has its own recipeinfo file. The tests check the following cases among others:
- The scanner flags an error if the license string is missing, if the recipeinfo file does not exist, or if the recipeinfo file is not readable.
- The scanner flags a warning if the version or revision is missing.
The section heading suggests that unit tests are bad when they read from or write to files. Why?
Before we can answer this question, we must understand what characterises good unit tests. Tim Ottinger and Jeff Langr explain that good Unit Tests Are FIRST: Fast, Isolated, Repeatable, Self-Verifying, and Timely.
- Fast. Test suites that run longer than 3-5 seconds are too slow for TDD, where we may run unit tests several times per minute. File access is slow compared to working on data in main memory. It gets slower the bigger the files become or the more often the files are accessed.
- Isolated. Unit tests should have only a single reason to fail. The recipeinfo files are only copied (line C1) when CMake configures the project, not on every build. Therefore, tests may work with out-of-date recipeinfo files and produce wrong results.
- Repeatable. No matter how often or in which order we run tests, they should always produce the same result. In particular, tests should not depend on each other. They should not share data, e.g., through static or global variables or through files.
- Self-verifying. A test that requires a human to check an output in the console is not self-verifying. Some developers use such “tests” to feign higher test coverage.
- Timely. We write tests before the code and not after the code.
The package scanner tests are not fast, as they access the file system. They are not isolated, as they may fail if CMake was not run at the right time. Tests accessing files can easily become non-repeatable, if we don’t take special care. The tests are self-verifying, as they contain checks. They are also timely, as I write my tests first.
Good Unit Tests Read from File Doubles
We create a file double that holds the file contents in main memory. A text file can be represented as a list of strings: QStringList. My first idea was to derive a QStringList-based subclass from QFile, QFileDevice or even QIODevice. This approach would come with considerable effort, as I knew from creating a Mock QCanBusDevice for TDD. The effort would be out of proportion for reading or writing text files line by line.
This approach would also change the interface of PackageScanner from
PackageInfo readRecipeInfo(QString packageName);
to
PackageInfo readRecipeInfo(QString packageName, QFileDevice *file);
Production code would pass a QFile object to readRecipeInfo, whereas test code would pass an InMemoryFile object. Going down this path would lead to few steps with big changes. I prefer many small steps with small changes. So, what now? That’s when I stumbled over this tweet from Michael “GeePaw” Hill.
I usually encapsulate the native collection classes — List, Set, Map, et al — within minutes of using them, and sometimes even *before* I use them. I recommend the practice to others, especially juniors.
Tweet by @GeePawHill
My idea was to encapsulate the class QFile (used in lines A1, A2, A3 and A4) in the class TextFile. The implementation of readRecipeInfo would change a little, but not the interface.
// sources/package_scanner.cpp (Commit 60387030)
PackageInfo PackageScanner::readRecipeInfo(QString packageName)
{
    TextFile recipeInfo{QString{"files/%1/recipeinfo"}.arg(packageName)}; // D1
    QString licStr;
    QString version;
    QString revision;
    while (!recipeInfo.isAtEnd()) // D2
    {
        auto line = recipeInfo.readLine(); // D3
        if (line.startsWith("LICENSE"))
        {
            licStr = line.split(':')[1].trimmed();
        }
        ...
    }
    return PackageInfo{packageName, licStr, version, revision};
}
The lines A1 and A2 from the original implementation are replaced by the line D1. Lines A3 and A4 are replaced by lines D2 and D3, respectively.
The function readRecipeInfo does not know how file access is implemented. It doesn’t know whether files are accessed through QFile and whether files are stored on the hard disk or in a QStringList. TextFile hides the implementation details from its clients. This makes it easy to change the implementation of TextFile for testing. Here is the slightly abridged production version of TextFile. It uses QFile to read from the file system.
// sources/text_file.cpp (Commit 60387030)
struct TextFile::Impl
{
    Impl(QString filePath);
    ~Impl();
    QFile m_file;
    QTextStream m_inStream;
};
TextFile::Impl::Impl(QString filePath)
    : m_file{filePath}
{
    if (!m_file.open(QFile::ReadOnly))
    {
        throw std::runtime_error(
            QString{"Cannot read file \'%1\'."}.arg(filePath).toStdString());
    }
    m_inStream.setDevice(&m_file);
}
TextFile::Impl::~Impl() { m_file.close(); }
TextFile::TextFile(QString filePath)
    : m_impl{new Impl{filePath}} {}
bool TextFile::isAtEnd() const { return m_impl->m_inStream.atEnd(); }
QString TextFile::readLine() { return m_impl->m_inStream.readLine(); }
And yes, the production version of readRecipeInfo must handle the exception thrown in the constructor. I left it out for brevity. Throwing the exception ensures that no TextFile object exists, and especially no partially created object, when an error condition occurs.
We use the pimpl pattern so that the TextFile header does not contain any traces of QFile or of any other implementation details. Our next step is to replace the QFile-based implementation of TextFile by a QStringList-based one.
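The post does not show the TextFile header. The following sketch illustrates what the pimpl split could look like, under stated assumptions: std::ifstream and std::string stand in for QFile and QString, std::unique_ptr holds the implementation (the original may manage m_impl differently), and header and source file are shown in one block for brevity.

```cpp
#include <cassert>
#include <fstream>   // in a real project, needed by the source file only
#include <memory>
#include <stdexcept>
#include <string>

// --- Header part: no file-handling type is visible to clients. ---
class TextFile
{
public:
    explicit TextFile(const std::string &filePath);
    ~TextFile();                  // defined below, where Impl is complete
    bool isAtEnd() const;
    std::string readLine();

private:
    struct Impl;                  // only declared here
    std::unique_ptr<Impl> m_impl;
};

// --- Source part: the only place that knows about std::ifstream. ---
struct TextFile::Impl
{
    std::ifstream m_file;
};

TextFile::TextFile(const std::string &filePath)
    : m_impl{std::make_unique<Impl>()}
{
    m_impl->m_file.open(filePath);
    if (!m_impl->m_file.is_open()) {
        throw std::runtime_error("Cannot read file '" + filePath + "'.");
    }
}

TextFile::~TextFile() = default;

bool TextFile::isAtEnd() const
{
    // peek() returns EOF once no character is left to read.
    return m_impl->m_file.peek() == std::ifstream::traits_type::eof();
}

std::string TextFile::readLine()
{
    std::string line;
    std::getline(m_impl->m_file, line);
    return line;
}

// Usage: opening a file that does not exist makes the constructor throw.
void checkMissingFileThrows()
{
    bool threw = false;
    try {
        TextFile file{"no-such-directory/recipeinfo"};
    } catch (const std::runtime_error &) {
        threw = true;
    }
    assert(threw);
}
```

Because clients see only the header part, a test build can link a different source file that defines TextFile::Impl in another way. That is the seam the fake implementation uses.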
// tests/doubles/fake_text_file.cpp (Commit ecaf965c)
struct TextFile::Impl
{
    Impl(QString filePath);
    ~Impl();
    bool m_isOpen{true};
    QStringList m_lines{
        "LICENSE: MIT",
        "PR: r0",
        "PV: 3.2.1"
    };
    int m_currentLine{0};
};
TextFile::Impl::Impl(QString filePath)
{
    if (!m_isOpen)
    {
        throw std::runtime_error(
            QString{"Cannot read file \'%1\'."}.arg(filePath).toStdString());
    }
}
TextFile::Impl::~Impl() {}
TextFile::TextFile(QString filePath)
    : m_impl{new Impl{filePath}} {}
bool TextFile::isAtEnd() const
{
    return m_impl->m_currentLine == m_impl->m_lines.count();
}
QString TextFile::readLine()
{
    auto line = m_impl->m_lines[m_impl->m_currentLine];
    ++m_impl->m_currentLine;
    return line;
}
We store the recipeinfo file in the member variable m_lines, which is a QStringList. The member variable m_currentLine is the index of the line returned by the next call to readLine. The function readLine increments m_currentLine at each call. The end of the “file” is reached when m_currentLine is equal to the number of lines in m_lines. “Opening” a file always succeeds, as m_isOpen is initialised with true. The exception is never thrown.
The fake implementation only works for a single recipeinfo file. That’s OK. We will grow the implementation test by test in the next section. Proceeding in small steps is the gist of TDD after all.
The tests using the fake TextFile are in the project tests/package_scanner_without_file_io. The test function TestPackageScannerWithoutFileIO::testReadRecipeInfo is identical to TestPackageScannerWithFileIO::testReadRecipeInfo above. The CMakeLists.txt file does not copy any recipeinfo files around but simply adds fake_text_file.cpp to the executable.
# tests/package_scanner_without_file_io/CMakeLists.txt
project(package_scanner_without_file_io)
find_package(Qt6 REQUIRED COMPONENTS Test Core)
add_executable(
    ${PROJECT_NAME}
    test_package_scanner_without_file_io.cpp
    ../../sources/package_scanner.cpp
    ../../sources/package_info.cpp
    ../doubles/fake_text_file.cpp
)
The test TestPackageScannerWithoutFileIO::testReadRecipeInfo is fast and isolated. It satisfies all FIRST criteria. That’s a big step, but we are not finished yet. We need the fake TextFile implementation to work with different file contents.
Growing the File Double Test by Test
The fake implementation of TextFile (in fake_text_file.cpp) is a bit simplistic. It works only with a single hard-wired recipeinfo file. Let us change this.
Cannot Open File
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
// (Commit: 50b1f5f2)
void TestPackageScannerWithoutFileIO::testCannotOpenRecipeInfo()
{
    PackageScanner scanner;
    auto package = scanner.readRecipeInfo(u"cannot-open"_qs);
    QVERIFY(!package.isValid());
}
The next test checks whether readRecipeInfo returns an invalid package if it cannot open the text file, that is, if the TextFile constructor throws an exception. readRecipeInfo now has a catch block that returns a default constructed PackageInfo object, which is always invalid.
TextFile::Impl::Impl(QString filePath)
{
    if (!m_isOpen)
    {
        throw std::runtime_error(
            QString{"Cannot read file \'%1\'."}.arg(filePath).toStdString());
    }
}
We make the fake implementation throw the exception by forcing m_isOpen to false. When TextFile is created with a different file path, say files/cannot-open/recipeinfo, the fake implementation initialises m_isOpen with false and m_lines with an arbitrary string list (e.g., the empty string list).
struct TextFileData
{
    bool m_isOpen{false};
    QStringList m_lines;
};
struct TextFile::Impl
{
    Impl(QString filePath);
    ~Impl();
    QHash<QString, TextFileData> m_fileSystem{ // E1
        {u"files/libffi/recipeinfo"_qs,
         {true, {u"LICENSE: MIT"_qs,
                 u"PR: r0"_qs,
                 u"PV: 3.2.1"_qs}}},
        {u"files/cannot-open/recipeinfo"_qs,
         {false, {}}},
    };
    bool m_isOpen{false};
    QStringList m_lines;
    int m_currentLine{0};
};
TextFile::Impl::Impl(QString filePath)
{
    auto textFileData = m_fileSystem.value(filePath); // E2
    m_isOpen = textFileData.m_isOpen; // E3
    m_lines = textFileData.m_lines; // E4
    if (!m_isOpen) // E5
    {
        throw std::runtime_error(
            QString{"Cannot read file \'%1\'."}.arg(filePath).toStdString());
    }
}
We are moving from a single file to multiple files. The hash map m_fileSystem from file paths to TextFileData (file contents and attributes) reflects this (see line E1). The first entry of the hash map provides a proper recipeinfo file, which can be opened, to the test testReadRecipeInfo. The second entry provides an empty file (key: u"files/cannot-open/recipeinfo"_qs), which cannot be opened (value: {false, {}}), to the test testCannotOpenRecipeInfo.
Line E2 retrieves the TextFileData for a given filePath from the hash map m_fileSystem. Lines E3 and E4 initialise m_isOpen and m_lines with the values from the hash map. For the file path u"files/cannot-open/recipeinfo"_qs, m_isOpen is false and condition E5 evaluates to true. The TextFile::Impl constructor throws the exception. readRecipeInfo catches the exception and returns an invalid PackageInfo object. The test testCannotOpenRecipeInfo passes.
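One consequence of line E2 is worth spelling out. QHash::value returns a default-constructed value when the key is missing, and a default-constructed TextFileData has m_isOpen set to false. So a path that is not listed in the map behaves like a file that cannot be opened. The following sketch reproduces this lookup with standard-library stand-ins (std::unordered_map for QHash, std::vector<std::string> for QStringList). The names FakeFileSystem, fakeFileSystem, openFakeFile and checkFakeFileSystem are hypothetical and exist only in this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

struct TextFileData
{
    bool m_isOpen{false};
    std::vector<std::string> m_lines;
};

using FakeFileSystem = std::unordered_map<std::string, TextFileData>;

// Hypothetical stand-in for the hash map m_fileSystem.
FakeFileSystem fakeFileSystem()
{
    return FakeFileSystem{
        {"files/libffi/recipeinfo",
         {true, {"LICENSE: MIT", "PR: r0", "PV: 3.2.1"}}},
        {"files/cannot-open/recipeinfo",
         {false, {}}},
    };
}

// Hypothetical stand-in for the constructor logic in lines E2 to E5.
std::vector<std::string> openFakeFile(const FakeFileSystem &fileSystem,
                                      const std::string &filePath)
{
    const auto it = fileSystem.find(filePath);
    // Unknown path: fall back to a default-constructed entry, which is closed.
    const TextFileData data =
        (it != fileSystem.end()) ? it->second : TextFileData{};
    if (!data.m_isOpen) {
        throw std::runtime_error("Cannot read file '" + filePath + "'.");
    }
    return data.m_lines;
}

// Usage: a listed open file, a listed closed file and an unlisted file.
void checkFakeFileSystem()
{
    const auto fileSystem = fakeFileSystem();
    assert(openFakeFile(fileSystem, "files/libffi/recipeinfo").size() == 3);
    for (const char *path : {"files/cannot-open/recipeinfo", "files/unknown/recipeinfo"}) {
        bool threw = false;
        try {
            openFakeFile(fileSystem, path);
        } catch (const std::runtime_error &) {
            threw = true;
        }
        assert(threw);
    }
}
```

A typo in a test's file path therefore surfaces as a "cannot read file" failure rather than as an empty file.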
Missing License
Thanks to the file system double, writing tests becomes easy. We just add another TextFileData entry to the file system map.
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
// Commit: 520b8d5e0
void TestPackageScannerWithoutFileIO::testLicenseMissingInRecipeInfo()
{
    PackageScanner scanner;
    auto package = scanner.readRecipeInfo(u"missing-license"_qs);
    QVERIFY(package.isValid());
    QCOMPARE(package.name(), u"missing-license"_qs);
    QVERIFY(package.licenseString().isEmpty());
}
The test checks that the license string is empty, if the recipeinfo file doesn’t give a value for LICENSE. We add the following entry to the filesystem map for the file path files/missing-license/recipeinfo.
// tests/doubles/fake_text_file.cpp
QHash<QString, TextFileData> m_fileSystem{
    ...,
    {u"files/missing-license/recipeinfo"_qs,
     {true, {u"LICENSE: "_qs,
             u"PR: r4"_qs,
             u"PV: 6.3.2"_qs}}},
};
The test testLicenseMissingInRecipeInfo and the other two tests pass.
File System Double for Different Test Cases
The fake TextFile still has a problem. The file system double is specific to the test case TestPackageScannerWithoutFileIO. We need different file system doubles for different test cases.
// tests/doubles/fake_text_file.cpp (Commit: be9f55b3)
#include "file_system_double.h"
#include "text_file.h"
struct TextFile::Impl
{
    Impl(QString filePath);
    ~Impl();
    QHash<QString, TextFileData> m_fileSystem = fileSystemDouble();
    // As before ...
};
We extract the definition of the file system double from fake_text_file.cpp and move it into the header file file_system_double.h.
// tests/package_scanner_without_file_io/file_system_double.h
struct TextFileData
{
    bool m_isOpen{false};
    QStringList m_lines;
};
inline static QHash<QString, TextFileData> fileSystemDouble()
{
    return QHash<QString, TextFileData>{
        {u"files/libffi/recipeinfo"_qs,
         {true, {u"LICENSE: MIT"_qs,
                 u"PR: r0"_qs,
                 u"PV: 3.2.1"_qs}}},
        // More files ...
    };
}
The header file_system_double.h is located in the same directory as the test case test_package_scanner_without_file_io.cpp (tests/package_scanner_without_file_io), because the file system double is specific to this test case. The fake TextFile, fake_text_file.cpp, is in the directory tests/doubles so that multiple test cases can use it. Each test case provides its own file_system_double.h and reuses fake_text_file.cpp.
Splendid: Abstract TextFile with Product and Test Implementations
So far, we made the class PackageScanner testable according to the FIRST principles without changing the interface of the class under test. This wasn’t easy, because we can’t set the text file through the interface of PackageScanner. The implementation hides whether a file is read from the real file system or from its double. We must provide the text files through the back door by including the file_system_double.h header that fits the respective test case.
// sources/package_scanner.cpp (Commit: 788eecdf)
PackageInfo PackageScanner::readRecipeInfo(QString packageName)
{
    try
    {
        TextFile recipeInfo{QString{"files/%1/recipeinfo"}.arg(packageName)};
        return readRecipeInfo(packageName, recipeInfo);
    }
    catch (...)
    {
        return {};
    }
}
PackageInfo PackageScanner::readRecipeInfo(QString packageName, TextFile &recipeInfo)
{
    QString licStr;
    QString version;
    QString revision;
    while (!recipeInfo.isAtEnd()) { ... }
    return {packageName, licStr, version, revision};
}
We temporarily duplicate the readRecipeInfo function in the class PackageScanner. The new function takes an additional second parameter: a reference to a TextFile object. This enables clients of PackageScanner to pass a TextFile object explicitly. The original readRecipeInfo function creates a TextFile object, recipeInfo, and calls the new function with packageName and recipeInfo.
At this point, we haven’t changed any client code yet. The clients still call the old version of readRecipeInfo, which calls the new version. The tests pass and give us high confidence that the new version works correctly. We can now make one client after the other call the new version of readRecipeInfo.
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
void TestPackageScannerWithoutFileIO::testReadRecipeInfo()
{
    PackageScanner scanner;
    TextFile recipeInfo{u"files/libffi/recipeinfo"_qs};
    auto package = scanner.readRecipeInfo(u"libffi"_qs, recipeInfo);
    QVERIFY(package.isValid());
    // More checks ...
}
Calling the new version works well for testReadRecipeInfo and testLicenseMissingInRecipeInfo. It crashes for testCannotOpenRecipeInfo, because the test doesn’t catch the exception thrown by the TextFile constructor.
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
void TestPackageScannerWithoutFileIO::testCannotOpenRecipeInfo()
{
    PackageScanner scanner;
    QVERIFY_EXCEPTION_THROWN(
        TextFile recipeInfo{u"files/cannot-open/recipeinfo"_qs};
        auto package = scanner.readRecipeInfo(u"cannot-open"_qs, recipeInfo),
        std::runtime_error
    );
}
The QVERIFY_EXCEPTION_THROWN macro passes if the expression in its first argument throws an exception of the type given in its second argument. The TextFile constructor throws the exception. Therefore, we could omit the call to readRecipeInfo, as it is never reached.
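What such an exception check boils down to can be sketched in plain C++. This is not how Qt Test implements its macro. The function template throwsException and the function openMissingFile are hypothetical names introduced for this sketch, and openMissingFile merely stands in for a TextFile constructor that fails.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: true only if calling `action` throws an exception
// that can be caught as ExpectedException.
template <typename ExpectedException, typename Action>
bool throwsException(Action &&action)
{
    try {
        action();
    } catch (const ExpectedException &) {
        return true;
    } catch (...) {
        return false;   // an exception of another type was thrown
    }
    return false;       // nothing was thrown
}

// Stand-in for a constructor that fails to open its file.
void openMissingFile()
{
    throw std::runtime_error("Cannot read file 'files/cannot-open/recipeinfo'.");
}

// Usage: the expected type matches, nothing is thrown, the type differs.
void checkThrowsException()
{
    assert(throwsException<std::runtime_error>(openMissingFile));
    assert(!throwsException<std::runtime_error>([] {}));
    assert(!throwsException<std::logic_error>(openMissingFile));
}
```

Statements after the throwing one never run, which is why the call to readRecipeInfo inside the macro has no effect on the outcome of the test.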
We can replace the old version of readRecipeInfo by the new one in TestPackageScannerWithFileIO. It will read from real files, because it uses sources/text_file.cpp instead of tests/doubles/fake_text_file.cpp.
We remove the old version of readRecipeInfo from PackageScanner. We also remove PackageInfo::isValid, as readRecipeInfo does not catch the TextFile exception and hence does not return an invalid package any more.
We are now ready to get rid of the crutch file_system_double.h. We turn TextFile into an interface AbstractTextFile and pass an AbstractTextFile reference to readRecipeInfo. We derive TextFile and FakeTextFile from AbstractTextFile. TextFile is the product version reading from real files and FakeTextFile is the version for unit testing reading from a string list. Let us do this small step by small step. Here is the interface AbstractTextFile.
// sources/abstract_text_file.h (Commit: c7530ab2)
class AbstractTextFile
{
public:
    virtual ~AbstractTextFile() = default;
    virtual bool isAtEnd() const = 0;
    virtual QString readLine() = 0;
};
The class TextFile inherits the interface AbstractTextFile and includes the header abstract_text_file.h.
#include "abstract_text_file.h"
class TextFile : public AbstractTextFile
{
    // As before ...
};
The function readRecipeInfo takes an AbstractTextFile reference as its second argument instead of the concrete TextFile reference. The header file forward declares AbstractTextFile instead of TextFile and the source file includes abstract_text_file.h instead of text_file.h.
// sources/package_scanner.h
class AbstractTextFile;
class PackageScanner
{
public:
    PackageInfo readRecipeInfo(QString packageName, AbstractTextFile &recipeInfo);
    // ...
};
The tests pass after these changes. We create a header for the class FakeTextFile, derived from AbstractTextFile, in tests/doubles/fake_text_file.h. The header looks exactly the same as sources/text_file.h except that TextFile is replaced by FakeTextFile. Similarly, we replace TextFile by FakeTextFile and text_file.h by fake_text_file.h in tests/doubles/fake_text_file.cpp.
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
// Commit: 48d62561
void TestPackageScannerWithoutFileIO::testReadRecipeInfo()
{
    PackageScanner scanner;
    FakeTextFile recipeInfo{u"files/libffi/recipeinfo"_qs};
    auto package = scanner.readRecipeInfo(u"libffi"_qs, recipeInfo);
    QCOMPARE(package.name(), u"libffi"_qs);
    // ...
}
The test case TestPackageScannerWithoutFileIO uses FakeTextFile for QStringList-based files, whereas the test case TestPackageScannerWithFileIO uses TextFile for real files. The time has come to sunset the crutch file_system_double.h used by TestPackageScannerWithoutFileIO.
// tests/doubles/fake_text_file.cpp (Commit: 358ff6c0)
#include "fake_text_file.h"
struct FakeTextFile::Impl
{
    bool m_isOpen{false};
    QStringList m_lines;
    int m_currentLine{0};
};
FakeTextFile::FakeTextFile(QString filePath, bool isOpen, QStringList lines)
    : m_impl{new Impl{isOpen, lines, 0}}
{
    if (!m_impl->m_isOpen)
    {
        throw std::runtime_error(
            QString{"Cannot read file \'%1\'."}.arg(filePath).toStdString());
    }
}
We pass the file attribute isOpen and the file contents lines to the FakeTextFile constructor, which stores these arguments in the structure FakeTextFile::Impl. The two new constructor arguments mirror one TextFileData entry of the hash map m_fileSystem. Instead of adding entries to the hash map in file_system_double.h, every FakeTextFile constructor call sets the file attribute and the file contents.
Each test contains all the information needed to understand it at a glance. That’s a lot better than having to find the file information in an extra header file file_system_double.h.
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
void TestPackageScannerWithoutFileIO::testReadRecipeInfo()
{
    PackageScanner scanner;
    FakeTextFile recipeInfo{
        u"files/libffi/recipeinfo"_qs, true,
        {u"LICENSE: MIT"_qs,
         u"PR: r0"_qs,
         u"PV: 3.2.1"_qs}
    };
    auto package = scanner.readRecipeInfo(u"libffi"_qs, recipeInfo);
    QCOMPARE(package.name(), u"libffi"_qs);
    // ...
}
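The whole arrangement (an interface, an in-memory fake and a client that depends only on the interface) can be condensed into a Qt-free sketch. It is an illustration under assumptions, not the code from the repository: std::string and std::vector replace QString and QStringList, the fake is written without pimpl, and findLicense is a hypothetical, much reduced stand-in for readRecipeInfo.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

class AbstractTextFile
{
public:
    virtual ~AbstractTextFile() = default;
    virtual bool isAtEnd() const = 0;
    virtual std::string readLine() = 0;
};

// In-memory double: the test supplies the file attribute and the contents.
class FakeTextFile : public AbstractTextFile
{
public:
    FakeTextFile(const std::string &filePath, bool isOpen,
                 std::vector<std::string> lines)
        : m_lines(std::move(lines))
    {
        if (!isOpen) {
            throw std::runtime_error("Cannot read file '" + filePath + "'.");
        }
    }

    bool isAtEnd() const override { return m_currentLine >= m_lines.size(); }

    std::string readLine() override
    {
        if (isAtEnd()) {
            return {};   // reading past the end yields an empty line
        }
        return m_lines[m_currentLine++];
    }

private:
    std::vector<std::string> m_lines;
    std::size_t m_currentLine{0};
};

// Hypothetical client: it sees only the interface, like readRecipeInfo.
std::string findLicense(AbstractTextFile &recipeInfo)
{
    const std::string key{"LICENSE:"};
    std::string license;
    while (!recipeInfo.isAtEnd()) {
        const std::string line = recipeInfo.readLine();
        if (line.compare(0, key.size(), key) == 0) {
            const auto start = line.find_first_not_of(' ', key.size());
            license = (start == std::string::npos) ? std::string{}
                                                   : line.substr(start);
        }
    }
    return license;
}

// Usage: the test states the file contents right where it needs them.
void checkFindLicense()
{
    FakeTextFile recipeInfo("files/libffi/recipeinfo", true,
                            {"LICENSE: MIT", "PR: r0", "PV: 3.2.1"});
    assert(findLicense(recipeInfo) == "MIT");
}
```

The guard in readLine is a design choice of this sketch. The fake in the post indexes m_lines directly and relies on callers checking isAtEnd first.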
The macro QVERIFY_EXCEPTION_THROWN gets confused by the brace initialisation of the FakeTextFile constructor. The preprocessor splits macro arguments at every comma that is not enclosed in parentheses, and braces do not count. So, the macro thinks that it receives 4 instead of 2 arguments. We can fix this by using parentheses instead of braces. The compiler is happy with this. QtCreator keeps flagging this as an error, wrongly. So, we ignore QtCreator.
// tests/package_scanner_without_file_io/test_package_scanner_without_file_io.cpp
void TestPackageScannerWithoutFileIO::testCannotOpenRecipeInfo()
{
    PackageScanner scanner;
    QVERIFY_EXCEPTION_THROWN(
        FakeTextFile recipeInfo(u"files/cannot-open/recipeinfo"_qs, false, {});
        auto package = scanner.readRecipeInfo(u"cannot-open"_qs, recipeInfo),
        std::runtime_error
    );
}
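The argument counts can be checked with a small preprocessor experiment. The macros ARG_COUNT and ARG_COUNT_IMPL are hypothetical helpers written for this sketch and are not part of Qt. The names Fake, file and use need no declaration, because the macro discards its arguments and expands to a number only.

```cpp
// Expands to the number of arguments the preprocessor sees (1 to 5).
#define ARG_COUNT_IMPL(a1, a2, a3, a4, a5, count, ...) count
#define ARG_COUNT(...) ARG_COUNT_IMPL(__VA_ARGS__, 5, 4, 3, 2, 1, 0)

// Braces do not protect commas: the preprocessor sees four arguments.
constexpr int bracedCount =
    ARG_COUNT(Fake file{"path", false, {}}; use(file), std::runtime_error);

// Parentheses do protect commas: the preprocessor sees two arguments.
constexpr int parenthesisedCount =
    ARG_COUNT(Fake file("path", false, {}); use(file), std::runtime_error);

static_assert(bracedCount == 4, "brace initialisation splits the expression");
static_assert(parenthesisedCount == 2, "parentheses keep the expression whole");
```

Both static_assert checks pass at compile time, which matches the count of 4 versus 2 arguments described above.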
All tests pass. We have successfully converted integration tests accessing the file system into unit tests accessing in-memory “files”. Hard-to-test code becomes easy to test. We did this using TDD in many small steps.
How to Follow Along
You can find the example code in the directory BlogPosts/TDDonClassesWithFileIO of the GitHub repository embeddeduse. Check out the commit SHAs given in the code snippets to follow along step by step.
Hello,
What about moving the file read operation out of the scanning code? This is a common practice to separate reading data from processing it.
The scanner class would be a simple function:
using FileContent = QStringList; // to be improved
PackageInfo scanPackageRecipeInfo(const FileContent &);
Usage:
const auto info = scanPackageRecipeInfo(mustReadFileContent("…"));
Then your tests can provide simple raw text that you split into a QStringList, which is IMHO simpler to maintain.
What do you think?
Hi Aurélien,
That is certainly an option, when I refactor the legacy code further. However, the focus of the post was on file I/O.
Cheers,
Burkhard
Hi,
Thx for the write up.
You may add an extra test where one line is missing the colon character. Example:
LICENSE MIT
PR: r0
PV: 3.2.1
Hi David,
Absolutely. I use this test and many others for the original code. For the post, all these tests would be a bit too much.
Cheers,
Burkhard