vueshot | Screen Recording + Network & Console logs


Context

The low-level design of a software application should closely mirror the business domain it is trying to model. A Domain-driven approach to designing components is therefore an apt choice, since it helps formulate a shared understanding of the domain. This rings especially true for early prototypes such as this one, where the codebase will be touched often during the various stages of the product’s evolution. The code needs the right domain vocabulary so that anyone reading it can naturally trace its behavior back to the business domain.

Besides its focus on the domain vocabulary and its general principle of avoiding a thick Services layer paired with an anemic Model layer, the DDD approach also recommends separating Domain entities from Application and Infrastructure code. An adjacent concept, Clean architecture, likewise suggests decoupling the layers to improve readability and maintainability.

Like every other component design, this one is a solution to a business case: the innovative Bug Reporting tool described in this link. It is worth reading the overview document first in order to follow the motivations behind the abstractions, and the natural disjunctions in the domain, that led to the decomposition of the components explained below.

Layers

The codebase is organized into cooperating logical units called layers, each having a specific set of responsibilities and abiding by the dependency rules laid out in the dependency inversion principle. The idea is to reduce, as far as possible, the tight coupling between layers shown in Fig. 1 by abstracting away the details, as shown in Fig. 2.

Fig. 1 - Tight coupling

Fig. 2 - Less coupling

The layers in the codebase are as follows.

1. services – This is the quintessential "Application Layer". If this layer were a person, it would ask itself, "What services can I provide to the users of Vueshot?", i.e., the business use cases. At the product's current stage of evolution, the AI use cases described here are still in the lab, so for now this layer:

Creates a Report,
Stores the incoming Screen Capture chunks, Console & Network events,
and at the end Consolidates all the pieces of information it has received to form a coherent piece of commentary of the events that transpired in the user’s browser.

There are two components that carry out the activities of this layer - “reporting.py” and “recorder.py”. The obvious next step is to provide the streaming and search services via “streaming.py” and “search.py” components respectively.
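The three responsibilities listed above can be sketched as plain functions. This is a minimal sketch and not the actual contents of reporting.py or recorder.py: the function names, the fields on Report, and the dict that stands in for a repository are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Hypothetical, simplified stand-in for the Report domain entity."""
    report_id: str
    description: str
    video_chunks: list = field(default_factory=list)
    console_events: list = field(default_factory=list)
    network_events: list = field(default_factory=list)


def create_report(report_id: str, description: str, store: dict) -> str:
    """Use case 1: create a Report and remember it in the store."""
    store[report_id] = Report(report_id, description)
    return report_id


def record_chunk(report_id: str, chunk: bytes, store: dict) -> None:
    """Use case 2a: append one incoming screen-capture chunk."""
    store[report_id].video_chunks.append(chunk)


def record_event(report_id: str, kind: str, message: str, store: dict) -> None:
    """Use case 2b: append a console or network event."""
    report = store[report_id]
    if kind == "console":
        report.console_events.append(message)
    elif kind == "network":
        report.network_events.append(message)
    else:
        raise ValueError(f"unknown event kind: {kind!r}")


def consolidate(report_id: str, store: dict) -> dict:
    """Use case 3: merge everything received into one summary."""
    report = store[report_id]
    return {
        "report_id": report.report_id,
        "description": report.description,
        "video": b"".join(report.video_chunks),
        "console": list(report.console_events),
        "network": list(report.network_events),
    }


# Usage: a plain dict plays the role of the repository (assumption).
store = {}
create_report("r-1", "Checkout button does nothing", store)
record_chunk("r-1", b"\x00\x01", store)
record_chunk("r-1", b"\x02", store)
record_event("r-1", "console", "TypeError: x is undefined", store)
record_event("r-1", "network", "POST /cart 500", store)
summary = consolidate("r-1", store)
```

Note that every service function takes only primitives plus a storage collaborator, which is what keeps this layer easy to test in isolation.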

2. vue – Named after a shortened form of the Product name, this module is the "Domain layer" of the application and houses the Domain Entities - Report, ConsoleLogEntry, NetworkLogEntry, and Payload. Besides the domain objects, it is also the home of (currently two) abstract repositories - AbstractReportRepository and AbstractUserRepository and also an abstract notifier class called AbstractUserNotification.

As their names suggest, these domain entities are reflections of their equivalent concepts in the business domain, and the abstract repositories and user notifier act as the "ports" whose implementations live in the "adapters" module, thus opening up the possibility of testing with mock repositories. The terms ports and adapters are borrowed from Clean/Hexagonal architecture, while the concept of keeping all the domain rules (of object creation and other domain responsibilities) in domain entities has been adapted from Domain-driven design.
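To make the two ideas in this paragraph concrete, here is a minimal sketch of a domain entity that enforces its own creation rule, next to a "port" declared as an abstract base class. The class names ConsoleLogEntry and AbstractReportRepository come from the text above; the fields, the validation rule, and the method names are assumptions chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsoleLogEntry:
    """Domain entity; the fields and the rule below are illustrative."""
    level: str
    message: str

    def __post_init__(self) -> None:
        # A domain rule lives with the entity, not in a service.
        if self.level not in {"log", "info", "warn", "error"}:
            raise ValueError(f"unsupported console level: {self.level!r}")


class AbstractReportRepository(ABC):
    """Port: declared in the domain layer, implemented in "adapters"."""

    @abstractmethod
    def add_console_entry(self, report_id: str, entry: ConsoleLogEntry) -> None:
        ...

    @abstractmethod
    def console_entries(self, report_id: str) -> list:
        ...


def port_is_abstract() -> bool:
    """Return True if the port refuses to be instantiated directly."""
    try:
        AbstractReportRepository()
    except TypeError:
        return True
    return False
```

Because the port cannot be instantiated, callers are forced to supply some implementation, whether a real adapter or a test double.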

In the future this module will need a Domain Service (invoked from the Application Service) that handles the pipeline of splitting a video into smaller chunks, transcribing it, and redacting it. It will also need domain methods in Report that can generate a “Title” and a “Friendly URL” for a Report based on a description, a transcription, or a combination of both, if available.

Also, there is a lurking idea to introduce a way to suggest defect prioritization based on Fuzzy Logic described here. The mathematical models described here would find their natural home inside the Domain layer as well.

3. adapters – The name is a tip of the hat to Hexagonal architecture (though the term’s origin can easily be traced back to the object-oriented world). This module holds the implementations of the “ports” defined in the Domain layer (i.e., the “vue” module above) and embodies the “Infrastructure layer” defined in Domain-driven design. The concrete implementations of the repositories for Report and User are ReportRepository and UserRepository respectively. These repositories store and retrieve data not only from the database, but also from file and object stores.

Another concrete implementation that resides in this layer is UserNotification. It is primarily used for account activation at this stage but will have more utility with the progression of the product’s other use cases. In short, this layer conducts all the necessary I/O activities with external systems like file/object stores, databases, and email service.

A schematic diagram of the dependencies between the vue, services, and adapters layers mentioned above is shown in Fig. 3, and a more low-level diagram illustrating the abstract dependencies of the services layer is shown in Fig. 4 below.

Fig. 3 - Dependencies between layers

Fig. 4 - Dependency inversion at a lower-level

While running the tests, a FakeRepository (working in memory) can be injected into the services to inspect the behavior of the APIs being tested.
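A minimal sketch of that injection follows. FakeRepository and AbstractReportRepository are named in the text; the save/find methods and the create_report service function are hypothetical and exist only to show the mechanics.

```python
from abc import ABC, abstractmethod


class AbstractReportRepository(ABC):
    """Port the service depends on (method names are assumptions)."""

    @abstractmethod
    def save(self, report_id: str, description: str) -> None:
        ...

    @abstractmethod
    def find(self, report_id: str):
        ...


class FakeRepository(AbstractReportRepository):
    """In-memory implementation used only by tests."""

    def __init__(self) -> None:
        self.saved = {}

    def save(self, report_id: str, description: str) -> None:
        self.saved[report_id] = description

    def find(self, report_id: str):
        return self.saved.get(report_id)


def create_report(report_id: str, description: str,
                  repo: AbstractReportRepository) -> str:
    """Hypothetical service function under test."""
    if repo.find(report_id) is not None:
        raise ValueError("report already exists")
    repo.save(report_id, description.strip())
    return report_id


def test_create_report_persists_trimmed_description() -> None:
    repo = FakeRepository()          # injected instead of the real adapter
    create_report("r-1", "  Login fails  ", repo)
    assert repo.saved == {"r-1": "Login fails"}  # inspect the fake's state


test_create_report_persists_trimmed_description()
```

The test never touches a database or object store; it only inspects the state the fake has accumulated.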

4. identity – This module provides a UUID to any newly created object. It sits in a separate layer so that any module in the application (besides the obvious one, i.e., the repositories) that needs an identifier (or slug) can obtain one by calling its method. Different identifier generation strategies may also be required in the future for easier identification of a resource, and this module is the natural home for such functionality. Strictly speaking, it too belongs to the "Infrastructure layer" of Domain-driven design.
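As a rough illustration, such a module could be as small as the sketch below, built on Python's standard uuid module. The function names new_id and new_slug, and the idea of a truncated hex slug as a second strategy, are assumptions and not taken from the codebase.

```python
import uuid


def new_id() -> str:
    """Return a random (version 4) UUID as a string."""
    return str(uuid.uuid4())


def new_slug(length: int = 8) -> str:
    """Return a short hex slug; shorter slugs collide more easily."""
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32")
    return uuid.uuid4().hex[:length]
```

Keeping both strategies behind one module means callers never import uuid directly, so the generation scheme can change in one place.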

5. entrypoint – This module drives the application and provides the outward facing "adapters" in the form of REST APIs and WebSocket connections. The web framework that supports the application at this stage is Flask. The WebSocket communication is powered by Flask-Sock. The WSGI server used is Gunicorn with asynchronous gevent workers.


Since the entrypoint is the starter of this application, it is the correct place to initialize the database connection pool. It initializes the Concrete Repositories using this connection pool and "injects" them into the services, which depend only on the Abstract Repositories.
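The wiring described here can be sketched as a small bootstrap function. This is an illustration under stated assumptions: SQLite from the standard library stands in for the real database, ConnectionPool is a toy pool written for this example, and ReportingService, bootstrap, and all method names are hypothetical. Only the names AbstractReportRepository and ReportRepository come from the text.

```python
import sqlite3
from abc import ABC, abstractmethod
from contextlib import contextmanager
from queue import Queue
from typing import Optional


class AbstractReportRepository(ABC):
    """Port: the only thing the service knows about persistence."""

    @abstractmethod
    def add(self, report_id: str, description: str) -> None:
        ...

    @abstractmethod
    def get_description(self, report_id: str) -> Optional[str]:
        ...


class ConnectionPool:
    """Toy pool (assumption): hands out pre-opened SQLite connections.

    With ":memory:" each connection is a separate database, so the
    default size of 1 keeps this sketch consistent.
    """

    def __init__(self, database: str, size: int = 1) -> None:
        self._idle: Queue = Queue()
        for _ in range(size):
            self._idle.put(sqlite3.connect(database))

    @contextmanager
    def connection(self):
        conn = self._idle.get()
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise
        finally:
            self._idle.put(conn)  # always hand the connection back


class ReportRepository(AbstractReportRepository):
    """Concrete adapter built on top of the pool."""

    def __init__(self, pool: ConnectionPool) -> None:
        self._pool = pool
        with pool.connection() as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS reports "
                "(id TEXT PRIMARY KEY, description TEXT)"
            )

    def add(self, report_id: str, description: str) -> None:
        with self._pool.connection() as conn:
            conn.execute("INSERT INTO reports VALUES (?, ?)",
                         (report_id, description))

    def get_description(self, report_id: str) -> Optional[str]:
        with self._pool.connection() as conn:
            row = conn.execute(
                "SELECT description FROM reports WHERE id = ?", (report_id,)
            ).fetchone()
        return row[0] if row else None


class ReportingService:
    """Application service; depends on the abstraction only."""

    def __init__(self, reports: AbstractReportRepository) -> None:
        self._reports = reports

    def create_report(self, report_id: str, description: str) -> str:
        self._reports.add(report_id, description)
        return report_id

    def describe(self, report_id: str) -> Optional[str]:
        return self._reports.get_description(report_id)


def bootstrap(database: str = ":memory:") -> ReportingService:
    """Composition root: create the pool, build adapters, inject them."""
    pool = ConnectionPool(database)
    return ReportingService(ReportRepository(pool))
```

In the real application the web framework's startup code would play the role of bootstrap, and the route handlers would call the returned service.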

A component diagram showing the flow described above (marked with "Steps") along with all the modules, their responsibilities and the layers they represent is shown in Fig. 5 below.

Fig. 5 - Component diagram of the prototype

The Component diagram in Fig. 5 above shows that the "services" module, i.e., the Application layer, handles all the use cases that are served as Flask REST APIs to the consumers of the services provided by Vueshot. It does so with the help of the Repositories in the adapters, i.e., the Infrastructure layer. So, in order to test these services, the Repositories in these adapters need to be mocked.

Tests

An oft-used metaphor for the conceptual layout of Unit, Integration, and E2E (end-to-end) tests is the Testing Pyramid. It is indeed quite a nice way to visualize these distinct types of tests. At the bottom of the Pyramid are the Unit tests, which test the Domain objects in the "vue" module. Above them are the Integration tests, which test the "services" module, and at the top of the pyramid are the E2E tests, which test the "entrypoint" module.


Following is an illustration of the test pyramid discussed above along with the layers in the codebase, scope of these tests, and their intricacies.

Fig. 6 - Test pyramid and various scopes for each type of test in the pyramid

Each of these levels of tests comes with its own set of dependencies, effort, level of granularity, speed, and coverage. As expected, a Unit (Domain layer) test, being fine-grained and closer to the domain, is more susceptible to change following a code refactor than an E2E test. Sitting at a higher level of abstraction, an E2E test does not need to change as long as the HTTP REST APIs remain unchanged over the development lifetime of the product. These E2E tests, while providing assurance that the codebase still works after a large-scale refactoring effort, are completely unaware of the low-level object design.

So, the services layer (Integration) tests can be a good middle ground: they offer reasonable coverage and are relatively unaffected by low-level changes to the codebase. A good way to decouple services layer code is to have it rely on the abstract behavior of adapters, fulfilled through dependency injection, and to take in primitives instead of domain entities.

In the design presented above, the services depend only on

1. primitives,
2. an abstract Repository object,
3. and sometimes an abstract Notification object

Hence, service-level tests can easily be written using Fake Repositories, which are essentially in-memory implementations of the Abstract Repositories. Also, since the service functions don't take in Domain objects as parameters, the Domain objects can easily evolve and be "refactored" without breaking the service-level Integration tests.
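The three kinds of dependency listed above can be seen together in a sketch of an account-activation service. AbstractUserRepository and AbstractUserNotification are named earlier in this document; the register_user function, the fakes, and every method name here are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class AbstractUserRepository(ABC):
    @abstractmethod
    def add(self, email: str) -> None:
        ...

    @abstractmethod
    def exists(self, email: str) -> bool:
        ...


class AbstractUserNotification(ABC):
    @abstractmethod
    def send(self, to: str, subject: str, body: str) -> None:
        ...


class FakeUserRepository(AbstractUserRepository):
    """In-memory fake: remembers e-mail addresses in a set."""

    def __init__(self) -> None:
        self.emails = set()

    def add(self, email: str) -> None:
        self.emails.add(email)

    def exists(self, email: str) -> bool:
        return email in self.emails


class FakeUserNotification(AbstractUserNotification):
    """Records messages instead of sending e-mail."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append((to, subject, body))


def register_user(email: str, activation_url: str,
                  users: AbstractUserRepository,
                  notifier: AbstractUserNotification) -> bool:
    """Hypothetical service: primitives in, abstractions injected."""
    if users.exists(email):
        return False  # already registered, nothing to send
    users.add(email)
    notifier.send(email, "Activate your account",
                  f"Follow {activation_url} to activate.")
    return True
```

A test can then assert on the fake notifier's recorded messages without any e-mail service being involved, and the signature stays stable even if the User entity is reshaped.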

Above all, this tool is evolving, and hence its components are changing too. However, the design philosophy and the test criteria described above are going to hold true in future versions of the codebase.

Low-level Design