Skip to content

Repository and Query Builder

samuelgfeller edited this page May 14, 2024 · 12 revisions

Introduction

Repository classes mediate between the application's domain/business logic and the data access logic, typically a database.
They handle all interactions with the data source, such as a database, file system, or other external services.
The primary purpose of a repository is to encapsulate the logic required to access the data and provide a simplified interface to the rest of the application.

Naming repositories

Just like every component of the slim-example-project, slim-api-starter or the slim-starter, repositories should follow the Single Responsibility Principle.
For every service class that needs access to an external data source, a repository has to exist.

The repository should be named after the service class that uses it with the suffix "Repository" to differentiate it.

For example, if the ClientCreator service wants to insert a client into the database, the ClientCreatorRepository would be responsible for this.

Repository content

Every public method of a repository contains a data access query that is prepared and executed.

The result is returned as a single value, as a DTO or an array.

Business logic should be kept outside the repository.

Example

<?php

namespace App\Domain\Client\Repository;

use App\Infrastructure\Factory\QueryFactory;

final readonly class ClientCreatorRepository
{
    public function __construct(
        private QueryFactory $queryFactory
    ) {
    }

    /**
     * Insert client in database.
     *
     * @param array $data keys are the column names
     * 
     * @return int lastInsertId
     */
    public function insertClient(array $data): int
    {
        return (int)$this->queryFactory->insertQueryWithData($data)->into('client')->execute()->lastInsertId();
    }
}

Database Access with the CakePHP Query Builder

For security reasons, it's highly recommended to use a query builder for database access.

CakePHP's cakephp/database library offers an efficient query builder with user-friendly interfaces.

Full documentation: Query Builder.

Configuration

Settings

The default database settings are stored in the config/defaults.php file.
The database name is set in the file config/env/env.dev.php for the dev environment, config/env/env.prod.php for production and config/env/env.test.php for the testing env.

The host and password are stored in the secret config/env/env.php file.

Read more about the configuration.

File: config/defaults.php

$settings['db'] = [
    'host' => '127.0.0.1',
    'database' => 'slim_example_project',
    'username' => 'root',
    'password' => '',
    'driver' => Cake\Database\Driver\Mysql::class,
    'encoding' => 'utf8mb4',
    'collation' => 'utf8mb4_unicode_ci',
    // Enable identifier quoting
    'quoteIdentifiers' => true,
    // Disable query logging
    'log' => false,
    // Turn off persistent connections
    'persistent' => false,
    // PDO options
    'flags' => [
        // Turn off persistent connections
        PDO::ATTR_PERSISTENT => false,
        // Enable exceptions
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        // Emulate prepared statements
        PDO::ATTR_EMULATE_PREPARES => true,
        // Set default fetch mode to array
        PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
    ],
];

Database connection

The database connection is established in the config/container.php file.

File: config/container.php

use Psr\Container\ContainerInterface;
use Cake\Database\Connection;

return [
    // ...
    
    // Establish database connection
    Connection::class => function (ContainerInterface $container) {
        $settings = $container->get('settings')['db'];
        return new Connection($settings);
    },
    
    // ...
];

QueryFactory

The slim-example-project and slim-starter use a custom QueryFactory class which offers an additional abstraction layer to the query builder library and provides practical methods to create queries.

Click to see file: src/Infrastructure/Factory/QueryFactory.php
<?php

namespace App\Infrastructure\Factory;

use Cake\Database\Connection;
use Cake\Database\Query;
use Cake\Database\Query\InsertQuery;
use Cake\Database\Query\SelectQuery;
use Cake\Database\Query\UpdateQuery;

/**
 * Query factory.
 * Documentation: https://github.com/samuelgfeller/slim-example-project/wiki/Repository-and-Query-Builder
 * CakePHP: https://book.cakephp.org/5/en/orm/query-builder.html
 */
final readonly class QueryFactory
{
    public function __construct(public Connection $connection)
    {
    }

    /**
     * Returns a select query instance.
     ** Exclude deleted_at records if soft-deleting.
     *
     * SELECT usage example:
     *     $query = $this->queryFactory->selectQuery()->select(['*'])->from('user')->where(
     *         ['deleted_at IS' => null, 'name LIKE' => '%John%']
     *     );
     *     return $query->execute()->fetchAll('assoc');
     *
     * @return SelectQuery<mixed>
     */
    public function selectQuery(): SelectQuery
    {
        return $this->connection->selectQuery();
    }

    /**
     * Returns an update query instance.
     * Include deleted_at in where clause if soft-deleting.
     *
     * UPDATE usage example:
     *     $query = $this->queryFactory->updateQuery()->update('user')->set($data)->where(['id' => 1]);
     *     return $query->execute()->rowCount() > 0;
     *
     * @return UpdateQuery
     */
    public function updateQuery(): UpdateQuery
    {
        return $this->connection->updateQuery();
    }

    /**
     * Returns an insert query instance.
     *
     * @return InsertQuery the insert query object
     */
    public function insertQuery(): InsertQuery
    {
        return $this->connection->insertQuery();
    }

    /**
     * Data is an assoc array of a row to insert where the key is the column name.
     *
     * Insert row usage example:
     *     return (int)$this->queryFactory->insertQueryWithData($data)->into('user')->execute()->lastInsertId();.
     *
     * @param array $data ['col_name' => 'Value', 'other_col' => 'Other value']
     *
     * @return InsertQuery
     */
    public function insertQueryWithData(array $data): InsertQuery
    {
        return $this->connection->insertQuery()->insert(array_keys($data))->values($data);
    }

    /**
     * Set the deleted_at column of the entry to the current datetime.
     *
     * Soft-delete usage example:
     *     $query = $this->queryFactory->softDeleteQuery('user')->where(['id' => $id]);
     *     return $query->execute()->rowCount() > 0;
     *
     * @param string $fromTable
     *
     * @return UpdateQuery
     */
    public function softDeleteQuery(string $fromTable): UpdateQuery
    {
        return $this->connection->updateQuery()->update($fromTable)->set(['deleted_at' => date('Y-m-d H:i:s')]);
    }

    /**
     * Returns a delete query instance for hard deletion.
     *
     * Hard-delete usage example:
     *     $this->queryFactory->hardDeleteQuery()->delete('table')->where(['id' => $id]);
     *     return $query->execute()->rowCount() > 0;
     *
     * @return Query\DeleteQuery the delete query object
     */
    public function hardDeleteQuery(): Query\DeleteQuery
    {
        // Return the delete query object created by the connection.
        return $this->connection->deleteQuery();
    }

    /**
     * Inserts multiple rows into a table.
     * Parameter is an array of rows to insert where each row is an array with the key
     * being the column name and the value being the value to insert.
     *
     * Insert multiple rows usage example:
     *     return (int)$this->queryFactory->newMultipleInsert($data)->into('user')->execute()->lastInsertId();
     *
     * @param array $arrayOfData [['col_name' => 'Value', 'other_col' => 'Other value'], ['col_name' => 'value']]
     *
     * @return InsertQuery
     */
    public function insertQueryMultipleRows(array $arrayOfData): InsertQuery
    {
        $query = $this->connection->insertQuery()->insert(array_keys($arrayOfData[array_key_first($arrayOfData)]));
        // According to the docs, chaining ->values is the way to go https://book.cakephp.org/4/en/orm/query-builder.html#inserting-data
        foreach ($arrayOfData as $data) {
            $query->values($data);
        }

        return $query;
    }
}

Basic operations with the Query Builder

Select query

$query = $this->queryFactory->selectQuery();
$results = $query->select(['field1', 'field2'])->from('table')->where(['id' => $id])->execute()->fetchAll('assoc');

Where condition

The function where() accepts an array of AND conditions.

The array values are escaped and bound to the query, but the keys are not.

$query->where(['id' => $id, 'name' => $name]);

OR conditions

For OR conditions, the key OR can be used with an array of OR conditions as value.

$query->where(['id' => $id, 'name' => $name, 'OR' => ['field1' => $value1, 'field2' => $value2]]);

The above example would result in the following SQL query:

WHERE id = :id AND name = :name AND (field1 = :value1 OR field2 = :value2)

IN clause

To check if a value is in a list of values, the IN condition can be used like this:

$query->where(['id' => $id, 'field IN' => ['value1', 'value2']]);

Documentation: Creating IN Clauses.

IS NULL

To check if a value is or is not NULL, the IS NULL or IS NOT NULL conditions can be used:

$query->where(['field1 IS' => null, 'field2 IS NOT' => null]);

Documentation: IS NULL Creation.

Greater than, less than

To check if a value is greater than or less than another value, the > and < or <= and >= conditions can be used:

$query->where(['field1 >' => $value1, 'field2 <' => $value2]);

Advanced where conditions

I recommend reading the documentation page Advanced Conditions and SQL Injection Prevention.

The array in the where() method also accepts single strings as conditions which are inserted into the generated query as-is.
This means that they can be used as raw SQL expressions, but it also means that they are vulnerable to SQL injection when untrusted values are used.

When using values in such a context, they should be bound:

$query->where([
        'MATCH (comment) AGAINST (:userData)',
        'created < NOW() - :moreUserData',
    ])
    ->bind(':userData', $userData, 'string')
    ->bind(':moreUserData', $moreUserData, 'datetime');

Join query

$results = $query->select(['field1', 'field2'])->from('table')
    ->join(['table2' => 'table2.id = table.foreign_key'])->execute()->fetchAll('assoc');

Joining multiple tables

$results = $query->select(['field1', 'field2'])->from('table')->join(
        't2' => ['table' => 'table2', 'type' => 'LEFT', 'conditions' => 't2.id = table.foreign_key'],
        't3' => ['table' => 'table3', 'type' => 'LEFT', 'conditions' => 't3.id = t2.foreign_key',],
    )->execute()->fetchAll('assoc');

Update query

$query = $this->queryFactory->updateQuery();
return $query->update('table')->set(['field' => 'new_value'])->where(['id' => $id])->execute()->rowCount();

Insert query

QueryBuilder insert query

To insert a row of data with custom columns, the CakePHP QueryBuilder can be used like this:

$query = $this->queryFactory->insertQuery();
return $query->insert(['field1', 'field2'])->values(['value1', 'value2'])->into('table')->execute()->lastInsertId();

Custom insert query

The QueryFactory class contains two methods to facilitate the insertion of data into a database:

  1. Insert a single row of data

    $insertQuery = $this->queryFactory->insertQueryWithData(['field1' => 'value1', 'field2' => 'value2']);
    return $insertQuery->into('table')->execute()->lastInsertId();
  2. Insert multiple rows of data

    $data = [
         ['field1' => 'value1', 'field2' => 'value2'],
         ['field1' => 'value3', 'field2' => 'value4'],
    ];
    $insertQuery = $this->queryFactory->insertQueryMultipleRows($data);
    return $insertQuery->into('table')->execute()->lastInsertId();

Delete query

Default delete query

The default QueryBuilder deleteQuery can be accessed via the QueryFactory with the hardDeleteQuery() method:

$query = $this->queryFactory->hardDeleteQuery()->delete('user')->where(['id' => 1]);
return $query->execute()->rowCount() > 0;

This deletes the record from the database forever.

Soft delete query

The QueryFactory class provides a softDeleteQuery() method that sets the deleted_at column to the current datetime marking it as deleted.

$query = $this->queryFactory->softDeleteQuery('user')->where(['id' => 1]);
return $query->execute()->rowCount() > 0;

Crucial additions

Below is a list of things I wish I knew earlier when working with the query builder.

Debugging

To see the SQL query that is executed, the sql() method can be used:

$query = $this->queryFactory->selectQuery();
$query->select(['field1', 'field2'])->from('table')->where(['id' => $id]);
echo $query->sql();

Using a column as a value

When a table column should be used as a value in a query, we have to tell CakePHP that it's a column and not a value.
This wouldn't work ->where(['table.field1' => 'table.field2']) because table.field2 is interpreted as string "table.field2" and not the actual value of the column.

The identifier() method can be used to reference a column in a query.

$query->where(['table.field1' => $query->identifier('table.field2')]);

Functions

CakePHP offers abstraction for some commonly used SQL functions.
These can be accessed with the func() method followed by the function name.

For example, to count the number of rows in a table, the count() function can be used like this:

$query->select(['count' => $query->func()->count('*')]);

Documentation: Using SQL Functions.

Function arguments

The advantage of using the func() method is that it automatically binds the arguments making them safe from SQL injection.

When the arguments should be interpreted as-is, they have to be specified as literals or identifiers.

Citing from the documentation:

When providing arguments for SQL functions, there are two kinds of parameters you can use, literal arguments and bound parameters.
Identifier/Literal parameters allow you to reference columns or other SQL literals.
Bound parameters can be used to safely add user data to SQL functions.

Identifiers and literals can be specified by making a key-value pair with the key being the identifier or literal and the value 'identifier' or 'literal'.

Bound parameters can just be added as values to the argument array.

Both literal and identifier arguments allow you to reference other columns and SQL literals while identifier will be appropriately quoted if auto-quoting is enabled. If not marked as literal or identifier, arguments will be bound parameters allowing you to safely pass user data to the function.

In the following example, ' - CAT: ' and ' - Age: ' are bound parameters.

$concat = $query->func()->concat([
    'Articles.title' => 'identifier',
    ' - CAT: ',
    'Categories.name' => 'identifier',
    ' - Age: ',
    $query->func()->dateDiff([
        'NOW()' => 'literal',
        'Articles.created' => 'identifier',
    ])
]);
$query->select(['link_title' => $concat]);

The above example generates something like this in MySQL.

SELECT CONCAT(Articles.title,
              :c0,
              Categories.name,
              :c1,
              (DATEDIFF(NOW(), Articles.created)))
FROM articles;

Custom functions

Citing from Custom Functions:

If func() does not already wrap the SQL function you need, you can call it directly through func() and still safely pass arguments and user data as described. Make sure you pass the appropriate argument type for custom functions, or they will be treated as bound parameters.

$customFuncResult = $query->func()->custom_func([
    'field' => 'identifier',
    $unsafeUserInput, // This will be treated as a bound parameter
    'other_field' => 'identifier',
    'bound_param',
]);
$time = $query->func()->date_format([
    'created' => 'identifier',
    "'%H:%i'" => 'literal'
]);
$query->select([
    'timeCreated' => $time,
    'customFuncResult' => $customFuncResult,
]);

These custom functions would generate something like this in MySql:

SELECT (date_format(`created`, '%H:%i'))                       AS `timeCreated`,
       (custom_func(`field`, :param0, `other_field`, :param1)) AS `customFuncResult`
FROM `table`

Raw SQL expressions

Sometimes we need to use a plain SQL expression in a query.
This should be used with caution because the QueryBuilder does not escape anything that is passed to the newExpr() method (unlike func()), meaning it's vulnerable to SQL injection.
Unsafe values must be bound.

$query->newExpr('DATE_SUB(NOW(), INTERVAL 1 DAY)');

Documentation: Raw Expressions.

Binding values

When potentially unsafe values from outside are used in a context where SQL is interpreted as-is, they have to be bound.
This can be done with a named placeholder preceded by a colon :placeholder in the expression.

$query = $this->queryFactory->selectQuery();
$query->select('*')->from('table')->where([
        'created_at >' => $query->newExpr('DATE_SUB(NOW(), INTERVAL :days DAY)'),
    ])->bind(':days', $days, 'integer');

Documentation: Binding values

Clone this wiki locally