A Table of Thai Consonants with IPA Transliteration and Russian Letter Names

The sources are available at https://github.com/vasily-kartashov/thai-consonants; the table itself looks like this:

Thai Consonants

Using DataDog on AWS Elastic Beanstalk

There is somewhat limited information available on how to use DataDog on the Elastic Beanstalk platform, and it took me quite a bit of time to get it working. Here is a summary of how to make the DataDog agent run and report logs to DataDog.

First, place two files in predefined locations: a minimal DataDog configuration file with an API key, DataDog endpoint, and a key enabling log collection, and a second file containing a list of log files you would like to watch and send to DataDog.

files:
  "/etc/datadog-agent/datadog.yaml":
    mode: "000644"
    owner: root
    group: root
    content: |
      dd_url: https://app.datadoghq.com
      api_key: 012301230123012301230123
      logs_enabled: true
  "/etc/datadog-agent/conf.d/app.d/app.yaml":
    mode: "000644"
    owner: root
    group: root
    content: |
      logs:
        - type: file
          path: /var/log/application.json.log
          service: application
          source: php

Next, execute the command to install the DataDog agent, noting that the install script returns an error code if you try to reinstall the agent. Therefore, you should include a guard checking for the existence of the /opt/datadog-agent directory before executing the install command:

commands:
  install_datadog:
    command: "[ -d /opt/datadog-agent ] || curl -L https://raw.githubusercontent.com/DataDog/datadog-agent/master/cmd/agent/install_script.sh | sudo DD_API_KEY=012301230123012301230123 bash"

Optionally, you may also choose to restart the agent on every application deployment. This is not strictly necessary, but provided here for completeness:

files:
  ...
  "/opt/elasticbeanstalk/hooks/appdeploy/post/50_restart_services":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      set -xe
      . /opt/elasticbeanstalk/support/envvars
      ...
      if status datadog-agent | grep -q start ; then
        echo "Restarting DataDog"
        restart datadog-agent
      else
        echo "Starting DataDog"
        start datadog-agent
      fi
      ...

And that’s it!

PHP Command Resource

There is an obvious similarity between web resources, command-line interface (CLI) commands, and Alexa resources. Web resources transform HTTP requests into HTTP responses, CLI commands accept user input and trigger business logic while occasionally reporting back to the user, and Alexa resources encode parsed voice commands and report back a JSON-formatted response. In fact, there is probably a 90% overlap between these three types of resources. Here is a short note on how to use this overlap to your advantage:

First, it is important to note that in the process of transforming input into output, we will go through multiple stages, and these stages are largely disjoint. An HTTP request must conform to a specific web service specification, an Alexa request must conform to its own specification, and a CLI request must conform to the flavor of command line utilities you prefer.

                                                +---------+
HTTP Request  -> Authorization -> Validation -> |         | -> HTTP Formatting  -> Output
Alexa Request -> Authorization -> Dispatch   -> |         | -> Alexa Formatting -> Output
CLI Request -> Validation                    -> |         | -> Output
                                                +---------+

To reuse code across these different types of resources, the interface should be the common denominator, and the output should be understandable by all possible clients of this interface. One possible interface is the Symfony\Component\Console\Command\Command, which can be extended to operate on the psr/http-message interfaces as well. For example:

protected function execute(
    InputInterface $input,
    OutputInterface $output,
    ResponseInterface $webResponse = null,
    AlexaResponse $alexaResponse = null) {

    $controllerId = $input->getOption('controller-id');
    ...
}

To extend the Symfony\Component\Console\Command\Command interface to work with psr/http-message interfaces, we can add the following method:

protected function executeWebRequest(RequestInterface $request, ResponseInterface $response)
{
    $input = new ArrayInput([
        '--controller-id' => $request->getParameter('controllerId')
    ], $this->getDefinition());
    // console output is irrelevant here, discard it
    $output = new NullOutput();

    $this->execute($input, $output, $response);
}

We can also add a method for handling Alexa requests:

protected function executeAlexaRequest(AlexaRequest $request, AlexaResponse $response)
{
    if ($request instanceof IntentRequest) {
        if ($request->intentName == 'execute') {
            $input = new ArrayInput([
                '--controller-id' => $request->getSlot('controllerId')
            ], $this->getDefinition());
        }
        ...
    }

    // console output is irrelevant here, discard it
    $output = new NullOutput();
    $this->execute($input, $output, null, $response);
}

With these methods, we can reuse the same class from web, Amazon Alexa, and CLI. It is convenient that PHP allows us to “implement” the execute method while adding parameters that default to null.
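
To illustrate, here is a sketch of how the same command class could be driven from all three entry points. ControllerCommand and the surrounding wiring are illustrative, not part of the code above:

// CLI entry point: standard Symfony console wiring
$console = new Symfony\Component\Console\Application();
$console->add(new ControllerCommand()); // hypothetical command extending Command
$console->run();

// Web entry point: given a PSR-7 $request/$response pair,
// route it to the same command
$command = new ControllerCommand();
$command->executeWebRequest($request, $response);

// Alexa entry point: the same idea with the Alexa request/response types
$command->executeAlexaRequest($alexaRequest, $alexaResponse);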

PHP Global Error Handler

When it comes to handling uncaught Throwables in PHP 7, there are a few things to consider:

  • It’s best to have only one handler defined
  • The handler should be stateful, meaning it should be able to capture something like a LoggerInterface
  • The additional handler shouldn’t hide the code location, meaning that in logs, we should see the file and line of the previous hop in the stack trace, not the handler’s
  • Registering handlers should be simple and straightforward

Here’s an example of an implementation that meets these criteria:

class ErrorHandler implements GenericHandler
{
    private $logger;

    public function __construct(LoggerInterface $logger)
    {
        $this->logger = $logger;
    }

    public function __invoke($code, $message, $file, $line)
    {
        switch ($code) {
            case E_USER_ERROR:
                if (extension_loaded('newrelic')) {
                    newrelic_notice_error($message);
                }
                $this->logger->critical($message);
                break;
            default:
                $this->logger->warning($message);
                break;
        }
        return true;
    }
}

And here’s an example of an exception handler:

class ExceptionHandler implements GenericHandler
{
    private $logger;

    public function __construct(LoggerInterface $logger)
    {
        $this->logger = $logger;
    }

    public function __invoke(Throwable $exception)
    {
        if (extension_loaded('newrelic')) {
            newrelic_notice_error($exception->getMessage(), $exception);
        }
        $this->logger->warning($exception->getMessage(), ['exception' => $exception]);
        http_response_code(500);
    }
}

And for the registration:

set_error_handler(new ErrorHandler($logger), E_ALL);
set_exception_handler(new ExceptionHandler($logger));

In addition, to make this work, you will need to use a fork of Log4Php, version 4.*, that:

  • includes namespaces,
  • implements the psr/log interface,
  • accounts for the new Throwable hierarchy change in PHP 7,
  • adds GenericHandler and GenericLogger marker interfaces that are used to skip additional logging layers in stack traces (sketched below).
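
The marker interfaces themselves carry no methods. Here is a minimal sketch of the idea, assuming the logger walks the stack trace and skips frames that belong to tagged classes; the loop is illustrative, not the fork’s actual code:

interface GenericHandler {}
interface GenericLogger {}

// When locating the code position to report, skip frames produced by
// handlers and logging layers tagged with the marker interfaces
$trace = debug_backtrace();
foreach ($trace as $frame) {
    $class = $frame['class'] ?? null;
    if ($class !== null && (is_subclass_of($class, GenericHandler::class)
            || is_subclass_of($class, GenericLogger::class))) {
        continue;
    }
    // the first remaining frame is the previous hop in the stack trace
    $file = $frame['file'] ?? 'unknown';
    $line = $frame['line'] ?? 0;
    break;
}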

How to Run Cronjobs on Elastic Beanstalk

Elastic Beanstalk offers a convenient option for specifying cron tasks and running them on the leader instance, but there can be issues with this approach. For example, if the leader instance is terminated, your cron tasks will be suspended until the next deployment.

To avoid this issue, you can run cron jobs on every instance but have them exit early if the current instance is not the “first” in the autoscaling group. There are different ways to define what “first” means in this context. One option is to sort instance IDs alphabetically.

The following PHP snippet demonstrates how this can be achieved:

$client = new ElasticBeanstalkClient(...);
$description = $client->describeEnvironmentResources([
    'EnvironmentName' => BEANSTALK_ENVIRONMENT
]);
$currentId = file_get_contents("http://instance-data/latest/meta-data/instance-id");
$availableIds = [];
foreach ($description['EnvironmentResources']['Instances'] as $instance) {
    $availableIds[] = $instance['Id'];
}
sort($availableIds);
if ($currentId != $availableIds[0]) {
    $logger->info('Skipping cron execution on {id}. Leader is {leader}', [
        'id' => $currentId,
        'leader' => $availableIds[0]
    ]);
    exit(0);
} else {
    $logger->info('Executing cron on {id}', ['id' => $currentId]);
}

...

This snippet first checks which instances are available, sorts them by ID alphabetically, and then compares the current instance’s ID with the leader’s. If the current instance is not the leader, the script exits early; otherwise it continues to execute the cron job. This way the cron scripts are installed and triggered on every instance, but the job itself runs on exactly one of them, so cron execution is not suspended when the leader instance is terminated.
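
For completeness, the guard script still has to be scheduled on every instance. Here is a minimal sketch of an .ebextensions entry that installs the cron job; the file name, schedule, and script path are illustrative:

files:
  "/etc/cron.d/app-cron":
    mode: "000644"
    owner: root
    group: root
    content: |
      # runs on every instance; the PHP guard above exits early on non-leaders
      */10 * * * * webapp php /var/app/current/bin/cron.php >> /var/log/app-cron.log 2>&1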

Result Set Processor

When it comes to fetching data from a database, there’s often a dilemma of whether to join tables and retrieve all the data in one go, or fetch only the necessary data for the moment and risk N+1 query problems. Let’s assume we have a list of controllers, each with a set of sensors attached. The result should look something like:

[
    {
        "controllerId": "c001",
        "sensors": []
    },
    {
        "controllerId": "c002",
        "sensors": [
            {
                "sensorId": "s001",
                "active": false
            }
        ]
    }
]

One solution is to use the following SQL query and then fold the flat rows back into a nested structure:

SELECT controllers.id AS controller_id,
       sensors.id AS sensor_id,
       sensors.active
  FROM controllers
       LEFT JOIN sensors
         ON controllers.id = sensors.controller_id;

$controller = null;
foreach ($tuples as $tuple) {
    if ($tuple->controller != $controller) {
        $controller = $tuple->controller;
    }
    $controller->append($tuple->sensor);
}

However, this can become tedious and complex for more advanced use cases. Another solution is to use a foreach loop and fetch data for each controller individually:

foreach (SELECT controllers.id FROM controllers => $id) {
    SELECT * FROM sensors WHERE sensors.controller_id = $id;
}

This second solution also has its downsides and can lead to N+1 query problems.

In this note I present a way to address this dilemma in PHP using the concept of processing contexts. The basic idea is to define a ProcessingContext class that keeps track of a set of records and provides methods for manipulating and extracting those records.

The job of processing result sets is a fascinating area to explore. The core data structure is a set of records:

abstract class ProcessingContext
{
    private $records;
}

When there is no information about the structure of these records, the only meaningful operations for this context are collectAll and collectHead. This context can be called a Collector:

class Collector extends ProcessingContext
{
    public function collectAll(): array ...

    public function collectHead() ...
}

Once we know that the $records are associative arrays, we can do more interesting things with them. However, whatever we want to do with these records, it should always start with picking a subset of each record for further processing. The next context is the Selector. Even though knowing that the records are associative arrays allows us to add more operations to the context, we can still do what the Collector does, i.e. collectHead and collectAll:

class Selector extends Collector
{
    public function selectValue(string $field): Converter ...

    public function selectFields(string ...$fields): Converter ...

    public function map(string $keyField, string $valueField): MapConverter ...
}

What is a Converter or a MapConverter? A selector allows you to pick fields from each record and place them in some sort of structure. For example, selectValue lets you pick a value of a field and store it as a scalar, selectFields lets you fetch an embedded associative array from each record, and map lets you create a new key/value pair from the values of two fields. The Converter is the context in which the API user must decide what to do with the selected subrecord.

class Converter extends ProcessingContext
{
    public function name(string $name): Selector ...

    public function group(): Collector ...

    public function groupInto(string $name): Selector ...

    ...
}

class MapConverter extends Converter
{
    public function flatten(): Collector ...

    public function flattenInto(string $name): Selector ...

    ...
}

So the name method returns the subrecord back into the record it was extracted from under a new name. The group method groups subrecords by using the remainder of each record as a group key. It does not return the group back into the record, so the result of group is actually a collector, i.e. the records are the groups extracted by the selector. The groupInto method not only groups subrecords but also pushes the groups back into the record.

I understand that the example provided may be complex and difficult to follow. Here is how I would simplify the example join above:

$query = "
    SELECT controllers.id,
           sensors.id AS sensor_id
           sensors.active AS sensor_active
      FROM controllers
           JOIN sensors
             ON contoller.id = sensors.controller_id;
";
$procedure = $database->prepare($query);
$procedure->processAll()
    ->selectByPrefix('sensor_')->group('sensors')
    ->collectAll();

The records would look like this:

| id   | sensor_id | sensor_active |
+------+-----------+---------------+
| c001 | NULL      | NULL          |
| c002 | s001      | false         |

Then, we select by prefix and group them into a record called sensors:

| id   | sensors                       |
+------+-------------------------------+
| c001 | []                            |
| c002 | [{ id: s001, active: false }] |

And that’s it! If you’d like to see a working example, you can check out the example implementation and some unit tests for further reference.

Arduino with C / make on Ubuntu

As I am not really a C / make person, let me document some rather trivial findings on how to compile and deploy an Arduino-based project without using the Arduino IDE. It’s a well-known fact that most Arduinos out there are exclusively utilized for blinking LEDs, so who am I to argue? This is how you do it in C:

#include <avr/io.h>
#include <util/delay.h>

#define BLINK_DELAY_MS 2000

int main (void) {
    /* set pin 5 of PORTB for output*/
    DDRB |= _BV(DDB5);

    while(1) {
        /* set pin 5 high to turn led on */
        PORTB |= _BV(PORTB5);
        _delay_ms(BLINK_DELAY_MS);

        /* set pin 5 low to turn led off */
        PORTB &= ~_BV(PORTB5);
        _delay_ms(BLINK_DELAY_MS);
    }
}

To compile and deploy, let’s use the following Makefile:

name = led

eeprom = target/$(name)
hex = $(eeprom).hex
source_files = $(wildcard src/*.c)
object_files = $(patsubst src/%.c, target/%.o, $(source_files))

.PHONY: clean

all: clean build

clean:
        @rm -rf target

build: $(hex)

run: $(hex)
        @avrdude -F -V -c arduino -p ATMEGA328P -P /dev/ttyUSB0 -b 115200 -U flash:w:$(hex)

$(hex): $(object_files)
        @avr-gcc -mmcu=atmega328p $< -o $(eeprom)
        @avr-objcopy -O ihex -R .eeprom $(eeprom) $(hex)

target/%.o: src/%.c
        @mkdir -p $(shell dirname $@)
        @avr-gcc -Os -DF_CPU=16000000UL -mmcu=atmega328p -c -o $@ $<

It contains quite a few magical constants, but that’s OK for now. Now we only need to install a few command-line tools:

sudo apt-get install gcc-avr binutils-avr gdb-avr avr-libc avrdude

That’s it: connect your Arduino to USB0 (if it’s elsewhere, fix the Makefile) and run make run. Et voilà, another blinking LED. Before we proceed with drawing the rest of the owl, let’s see how we could use CLion to make our work a bit more comfortable.

First, let’s add the following CMakeLists.txt to our project (stolen mostly from this guy):

cmake_minimum_required(VERSION 3.0)
project(led)

set(DEVICE "atmega328p")
set(FREQ "16000000")

set(CMAKE_C_COMPILER avr-gcc)
set(CMAKE_CXX_COMPILER avr-g++)

include_directories(/usr/lib/avr/include)

set(CMAKE_SHARED_LIBRARY_LINK_C_FLAGS "")
set(CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS "")

set(CMAKE_C_FLAGS  "-Os -mmcu=${DEVICE} -DF_CPU=${FREQ}UL -std=gnu99 -Wl,--gc-sections")
set(CMAKE_CXX_FLAGS "-Os -mmcu=${DEVICE} -DF_CPU=${FREQ}UL -Wl,--gc-sections")

set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/target")

set(SOURCE_FILES src/led.c)

add_executable(${PROJECT_NAME} ${SOURCE_FILES})

add_custom_command(TARGET ${PROJECT_NAME} POST_BUILD COMMAND avr-objcopy -O ihex -R .eeprom ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/${PROJECT_NAME} ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/${PROJECT_NAME}.hex)
add_custom_command(TARGET ${PROJECT_NAME} POST_BUILD COMMAND avr-objcopy -O ihex -j .eeprom --set-section-flags=.eeprom="alloc,load" --change-section-lma .eeprom=0 ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/${PROJECT_NAME} ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/${PROJECT_NAME}.eep)
add_custom_command(TARGET ${PROJECT_NAME} POST_BUILD COMMAND avr-size ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/${PROJECT_NAME} --mcu=${DEVICE} --format=avr)

Run build, and it compiles from the IDE. For deployment, create a simple run configuration in CLion (I am not sure whether there’s a way to do that with CMake alone). Anyhow, now you have a proper IDE to support you in your LED-blinking endeavour.

Functional programming

In functional programming, it is sufficient to consider the current scope and the name of a symbol (variable or function) to understand its meaning. Because functional languages are typically pure and immutable, the value bound to a name can be determined by examining the source code alone. In contrast, procedural programming often requires knowledge of the execution path, in addition to the scope and the name of a symbol, to understand its value. This is challenging because the execution path is a runtime property that can be affected by many factors, which is why many programmers reach for a debugger rather than attempt to reconstruct the path in their heads.
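
A contrived PHP sketch of the difference (illustrative only): in the first fragment the meaning of total follows from its definition alone, while in the second the value of $total at any given line depends on which branches were executed before it.

<?php

// Pure: the result is fully determined by the function's input
function total(array $prices): int {
    return array_sum($prices);
}

// Stateful: the value of $total depends on the execution path taken so far
$prices = [3, -1, 4];
$total = 0;
foreach ($prices as $price) {
    if ($price > 0) {
        $total += $price;
    }
}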

A simple caching web framework

A while ago I was working at an agency where we had to develop a fairly simple set of web apps that would scale very well and integrate tightly with a shared caching bus and a content distribution network. Performance was key, especially for markets around the globe, some of which have generally slow internet connections. This post is a short schematic description of the design I came up with.

So I started with two main considerations in mind. The first was to integrate with the HTTP protocol as transparently as possible. There’s a set of RFCs out there; I am specifically talking about RFC 2616. The set of abstractions was modelled as closely as possible to this specification. By working with the standards I hoped to integrate easily with numerous web caches and skip unnecessary refreshes. An additional but non-negligible benefit was that a person who knows how HTTP works would be able to understand the application logic easily.

The second, also important, consideration was to not invent any DSLs, embedded or otherwise, to replace constructs that programming languages are already equipped with. To give an example of what I am talking about: I didn’t want to write configurations / annotations / ant-matchers and similar nonsense in order to say that URLs starting with /admin require the user to be authenticated and to belong to the group admin. Instead, the following code would do just fine:

<?php

if ($request->pathStartsWith('/admin')) {
  if (!$user->isAuthorized() or !$user->hasRole('admin')) {
    return new NonAuthorizedResponse();
  }
  ...
}

I personally dislike frameworks that force you to do the right thing. They’re supposed to provide you with a rock-solid skeleton for your next application and equip you with enough tools to tweak the behavior without giving you a chance to break things. As usual, in real life the tools given are never enough and you end up hacking and breaking things anyway.

The third consideration was to help with code reuse as much as possible. Most websites are built quite simply: there’s a master template and numerous adjustments to it. There are generally two ways to reuse templates and the resources serving them. First, you can extract common parts into helpers, where you end up with headers, footers, navigation panels and so on. The second way is to use template inheritance to tweak just the specific areas of the page without caring about how to assemble the page from its parts.

After a couple of attempts I ended up with the following set of abstractions. From the HTTP specification I needed:

  • Request
  • Response
  • Resource
  • Entity

Additionally, there was some non-application-specific logic that needed the following set of abstractions:

  • Application
  • Cache
  • Template

Let’s go through that list one by one before explaining how those parts fit together.

Request

A very simple, albeit wide, interface that facades the HTTP request and adds a couple of path-matching methods.

<?php

interface Request {
    public function environmentNameEndsWith(string $suffix) : bool;
    public function getEnvironmentName() : string;
    public function getHeader(string $headerName) : string;
    public function getLanguageCodes() : array;
    public function getMethod() : string;
    public function getRemoteIpAddress() : string;
    public function getPath() : string;
    public function getBody() : string;
    public function hasParameter(string $name) : bool;
    public function getParameter(string $name, $defaultValue = null) : string;
    public function getParameters() : array;
    public function hasSessionParameter(string $name) : bool;
    public function getSessionParameter(string $name, $defaultValue = null) : string;
    public function getSessionParameters() : array;
    public function getCookie(string $name, $defaultValue = null) : string;
    public function pathMatches(string $path) : bool;
    public function pathMatchesPattern(string $pattern);
    public function pathStartsWith(string $prefix) : bool;
    public function pathStartsWithPattern(string $pattern);
    public function preconditionFulfilled(Entity $entity, Cache $cache) : bool;
}

The HTTP request is just an implementation of this interface that hides a lot of the ugliness of handling super-globals. There are also additional implementations of this interface for unit tests and command-line tools.
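
An implementation for unit tests can be a simple value object; here is a sketch (the class name and the chosen defaults are illustrative):

<?php

class StubRequest implements Request {

    private $path;

    public function __construct(string $path) {
        $this->path = $path;
    }

    public function getPath() : string {
        return $this->path;
    }

    public function pathMatches(string $path) : bool {
        return $this->path == $path;
    }

    public function pathStartsWith(string $prefix) : bool {
        return strpos($this->path, $prefix) === 0;
    }

    ...
}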

Response

The Response interface is trivial and tiny.

<?php

interface Response {
    public function output(Request $request, Cache $cache) : void;
}

There is just one non-trivial part about it. When the response writes its output to stdout (which, by the way, could easily be abstracted by adding a writer parameter to the output method), we need to know what the request was, because there might be headers with preconditions that modify the output. We also need access to the cache to see what is in it, return the cached value if possible without re-rendering the response, and update the cached value. This is the interesting, not so trivial part. Let’s see the implementation of AbstractResponse, which is the superclass for all other responses.

<?php

class AbstractResponse implements Response
{
    ...
    public function output(Request $request, Cache $cache) : void {
        ...
        if (!is_null($this->entity)) {
            $cacheEntry = $this->entity->loadOrGenerate($cache);
            $now = time();
            $maxAge = max(0, $cacheEntry['expires'] - $now);

            header('ETag: ' . $cacheEntry['tag']);
            header('Last-Modified: ' . $this->formatTimestamp($cacheEntry['modified']));
            header('Cache-Control: public, max-age=' . $maxAge);
            header('Expires: ' . $this->formatTimestamp($now + $maxAge));

            if ($this->embedEntity) {
                header('Content-Type: ' . $this->entity->getMediaType());
                header('Content-Length: ' . $cacheEntry['length']);
                header('Content-MD5: ' . $cacheEntry['digest']);
                $language = $this->entity->getContentLanguage();
                ...
                echo $cacheEntry['content'];
            }
        }
        exit;
    }
    ...
}

The code is pretty much self-explanatory, but I am going to walk through it anyway. If there’s an entity (payload) that we want to send, we load the latest cache entry for this entity. If the cache doesn’t have an entry, it gets generated. Then we update the expiration and version-control headers and output the result, unless someone explicitly told us not to embed the entity. And who might that be? Let’s take a look at OkOrNotModifiedResponse to see the usage:

<?php

class OkOrNotModifiedResponse extends AbstractResponse {

    public function __construct(Entity $entity, Request $request) {
        parent::__construct();
        $this->setEntity($entity);
    }

    public function output(Request $request, Cache $cache) : void {
        if ($request->preconditionFulfilled($this->entity, $cache)) {
            $this->setStatus('200 OK');
            $this->setEmbedEntity(true);
        } else {
            $this->setStatus('304 Not Modified');
            $this->setEmbedEntity(false);
        }
        parent::output($request, $cache);
    }
}

We need an entity to check whether it has expired or whether its ETag matches. That doesn’t necessarily mean we want to render the entity into the response, as is the case with the 304 response code.

Entity

The next abstraction is the above-mentioned Entity, which is pretty much the payload of the response plus additional headers like Content-Type, Content-Length and similar.

<?php

interface Entity {
    public function getCachingTime() : int;
    public function getContent() : string;
    public function getContentLanguage() : string;
    public function getKey() : string;
    public function getMediaType() : string;
    public function loadOrGenerate(Cache $cache) : CacheEntry;
}

It is important to note that every entity needs to be able to efficiently generate a key, which is used to locate its cache entry.
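
As a hypothetical illustration (not part of the framework), a key is typically derived from the class name plus whatever inputs influence the generated content:

<?php

class ProductListEntity extends AbstractEntity {

    private $categoryId;

    public function __construct(string $categoryId) {
        $this->categoryId = $categoryId;
    }

    // Cheap to compute, and distinct for every distinct payload
    public function getKey() : string {
        return md5(__CLASS__ . ':' . $this->categoryId);
    }

    ...
}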

The main implementation of the Entity interface is AbstractEntity

<?php

abstract class AbstractEntity implements Entity {
    ...
    public function loadOrGenerate(Cache $cache) : CacheEntry {
        if (!is_null($this->cacheEntry)) {
            return $this->cacheEntry;
        }

        $key = $this->getKey();
        list($this->cacheEntry, $found) = $cache->get($key);
        $now = time();
        $expires = $this->cacheEntry->expires ?? 0;

        if (!$found or $now >= $expires) {
            $content = $this->getContent();
            $tag = md5($content);
            if ($found and $tag == $this->cacheEntry->tag) {
                $this->cacheEntry->expires = $now + $this->getCachingTime();
            } else {
                $this->cacheEntry = new CacheEntry([
                    'content'  => $content,
                    'tag'      => $tag,
                    'digest'   => base64_encode(pack('H*', md5($content))),
                    'length'   => strlen($content),
                    'modified' => $now,
                    'expires'  => $now + $this->getCachingTime(),
                ]);
            }
            $cache->set($key, $this->cacheEntry);
        }

        return $this->cacheEntry;
    }
}

Again, the code is fairly trivial. We look into the cache, and if there’s already a value under this key, we check whether it has expired. If it’s fresh, this is what we return, without re-rendering the entity. Otherwise we render the entity and see if its content has changed since last time. If not, we only update the expiration; otherwise we add a new cache entry.

An Entity as such is not an abstraction for the payload. It’s an abstraction for a payload generator that has memory. We can ask an Entity object to give us the latest value, and this object is smart enough to:

  • skip generation if the cached value is fresh enough,
  • update only the expiration header if the newly generated value is the same as the old one,
  • generate a new value otherwise.

One example of an entity would be a JSON entity that only asks the extending classes to implement the getData logic. The rest of the caching and expiration logic is inherited from AbstractEntity. The complete class looks like the following:

<?php

abstract class AbstractJsonEntity extends AbstractEntity {

    abstract protected function getData();

    public function getContent() : string {
        return json_encode($this->getData());
    }

    public function getMediaType() : string {
        return 'application/json;charset=UTF-8';
    }
}

Resource

The last abstraction on the standard list is the Resource, which is also a tiny interface:

<?php

interface Resource {
    public function getResponse(Request $request) : Response;
}

The job of a Resource is to translate requests into responses; in other words, to interpret the request, see if it’s appropriate, validate the input, call the business methods, and generate the response. Some resources are trivial and really just return an entity. For this they can extend EntityResource:

<?php

class EntityResource implements Resource {

    protected $entity;
    protected $methods;

    public function __construct(Entity $entity, array $methods = ['GET']) {
        $this->entity = $entity;
        $this->methods = $methods;
    }

    public function getResponse(Request $request) : Response {
        if (in_array($request->getMethod(), $this->methods)) {
            $response = new OkOrNotModifiedResponse($this->entity, $request);
            return $response;
        }
        return new MethodNotAllowedResponse($this->methods);
    }
}

Application

There’s really no generic interface for applications; an application can do whatever you want. I usually extend the following AbstractApplication, which asks me for two things:

  • How to find the resource responsible for a given request, so that the request can be dispatched to it.
  • How to get to the cache server.

<?php

abstract class AbstractApplication {

    abstract protected function findResource(Request $request) : Resource;

    public function run(Request $request) : Response {
        $resource = $this->findResource($request);
        $response = $resource->getResponse($request);
        return $response;
    }

    abstract protected function getCache(Request $request) : Cache;

    public function output(Request $request, Response $response) : void {
        $response->output($request, $this->getCache($request));
    }
}

Cache

The next abstraction is the most trivial one. The only reason it exists is to add a layer on top of Memcached, Redis, ElastiCache and similar solutions.

<?php

interface Cache {
    public function get(string $key, $defaultValue = null) : CacheEntry;
    public function set(string $key, $value, int $timeToLive = 0);
    public function delete(string ...$keys);
}

class CacheEntry {
    private $content;
    private $tag;
    private $digest;
    private $length;
    private $modified;
    private $expires;

    ...
}

Example application

The most trivial application I can think of is one that gives you the current day of the week but doesn’t recalculate it every time a user visits the endpoint. We start with an entity that generates the day and sets the proper expiration time for the result.

<?php

class DayOfWeekEntity extends AbstractJsonEntity {

    // Only one cache entry shared across all entities of this type
    public function getKey() : string {
       return __CLASS__;
    }

    // Return the current day of the week
    protected function getData() {
      return date('l');
    }

    // Cache till the end of the day today
    public function getCachingTime() : int {
       return 86400 - time() % 86400;
    }
}

Now let’s create our own application:

<?php

class MyApplication extends AbstractApplication {

    // Check if the request is for a registered path
    // and return a standard entity resource;
    // otherwise let NotFoundResource handle the request
    protected function findResource(Request $request) : Resource {
        if ($request->pathMatches('/dayOfWeek')) {
           return new EntityResource(new DayOfWeekEntity(), ['GET']);
        }
        return new NotFoundResource();
    }

    // Return connection to the shared cache
    protected function getCache(Request $request) : Cache {
        return new Memcache('host001.elasticache.aws.', 11211);
    }
}

That’s all you need to do. The first visit to the endpoint generates the entity. The value is stored in the cache and sent back to the client. The response carries ETag, validation, and expiration headers explaining to the client and to the proxies that the value just generated won’t change till the end of the day. If the user hits refresh and the browser decides to send a new request with the ETag and validation headers, the response is going to be a 304 and no new entity will be generated. In fact, the response might as well come from a CDN. Such an application is well placed behind a load balancer, as the shared cache allows for better scalability.
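
For reference, the front controller tying it all together could look roughly like this (HttpRequest is the assumed name of the super-globals-backed Request implementation mentioned earlier):

<?php

$application = new MyApplication();
$request = new HttpRequest();               // assumed concrete Request implementation
$response = $application->run($request);    // dispatch to the matching resource
$application->output($request, $response);  // render through the shared cache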

Template inheritance

The last thing I want to mention here is template handling. Some of the renderings can:

  • be quite expensive,
  • collect a lot of data from different sources,
  • have a lot in common.

One thing we’re provided with is the abstract template entity, which expects you to supply the data, the path to the template, and the rendering engine, for example Twig or Smarty.

<?php

abstract class AbstractTemplateEntity extends AbstractEntity {
    abstract protected function getTemplatePath() : string;
    abstract protected function getTemplateData();
    abstract protected function getTemplateRenderer() : TemplateRenderer;
    public function getContent() : string {
        return $this->getTemplateRenderer()
            ->render($this->getTemplateData(), $this->getTemplatePath());
    }
}
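
The TwigTemplateEntity used below is then just a thin binding of this abstraction to a concrete engine; a sketch, where TwigTemplateRenderer is an assumed TemplateRenderer implementation wrapping a shared Twig environment:

<?php

abstract class TwigTemplateEntity extends AbstractTemplateEntity {

    protected function getTemplateRenderer() : TemplateRenderer {
        // assumed renderer wrapping a shared Twig environment
        return new TwigTemplateRenderer();
    }
}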

The way it works in general: you end up with an inheritance hierarchy of template entities, where each entity calls parent::getTemplateData() to collect the data required by the parent template, then adds its own data and renders the response. Take, for example, the parent template main.tpl:

<html>
  <body>
    <h1>[[header]]</h1>
    [% block content %]
      Welcome!
    [/ block ]
  </body>
</html>

And the child template details.tpl:

[% extends "main.tpl" %]

[% block content %]
    [[message]]
[/ block ]

The same hierarchy is repeated with the entities: MainEntity and DetailsEntity would contain the following code:

<?php

class MainEntity extends TwigTemplateEntity {

    protected function getTemplatePath() : string {
        return 'main.tpl';
    }

    protected function getTemplateData() {
        return [
            'header' => 'My Homepage'
        ];
    }
}

class DetailsEntity extends MainEntity {

    protected function getTemplatePath() : string {
        return 'details.tpl';
    }

    protected function getTemplateData() {
        return parent::getTemplateData() + [
            'message' => 'The page is under construction'
        ];
    }
}

The extending entities only contain the code generating the difference to the parent template’s data set, which helps with code reuse. As with any entity, you can specify the caching time, and if, for example, you want to regenerate templates depending on some timestamp from the database, that timestamp is what the entity’s getKey() should use.
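
A hypothetical sketch of that idea: mixing a content timestamp into the key means a newer timestamp produces a new key, so the stale cache entry is simply never looked up again (the repository collaborator is illustrative):

<?php

class NewsPageEntity extends TwigTemplateEntity {

    // Hypothetical: regenerate whenever the underlying content changes
    public function getKey() : string {
        return __CLASS__ . ':' . $this->repository->getLastModified();
    }

    ...
}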

The source code of the base framework is available on GitHub under the GPL 2 license. At some point I am going to publish a follow-up note with a non-trivial example application.

Solving Codility exercises in shell script

How about today we try to solve Codility exercise problems in one of the most “odd” languages? Bash! I never liked the syntax, the type system is horrendous, the performance is abysmal, and development is pretty much echo-driven. Let’s see if I feel more confident about this language after today. I will write down all the learnings and surprises.

The first exercise is the binary gap. The main job is done by the bc calculator. The takeaway here is the horrible pattern for dealing with trailing characters:

function convert {
  # convert to binary
  local b=$( echo "obase=2;$1" | bc )

  # remove trailing spaces
  local c="${b%"${b##*[!0]}"}"

  # split by 1s and run high water mark algorithm
  local i r=0
  for i in ${c//1/ } ; do
    l=${#i} && (( l > r )) && (( r=l ))
  done

  echo "$r"
}

convert 20
convert 1041
convert 2043403647

The second exercise is quite a bit of pain. Passing arrays into and out of functions is complex. I unwrap the values when calling the function and collect them back after shifting away the first argument. Clearly this won’t work for functions that accept multiple arrays as arguments. The algorithm itself is trivial though.

function rotate {

  # get the first element and shift as array is passed 'unwrapped'
  local k="${1}"
  shift
  # wrap back the array and get its length
  local a=("${@}")
  local n="${#i[@]}"

  # calculate the split, swap sides
  (( k = n - k % n ))
  local p=("${a[@]:0:k}")
  local q=("${a[@]:k}")
  local r=("${q[@]}" "${p[@]}")

  echo "${r[@]}"
}

i=(3 8 9 7 6)
rotate 3 "${i[@]}"

i=("dude" "there" "you go")
rotate 2 "${i[@]}"

The third exercise is trivial once you know the trick: xor-folding the array leaves you with the xor of the unpaired elements. The logic translates directly into shell script.

function pair {
  local i r=0
  for i in "$@" ; do
    (( r ^= i ))
  done
  echo "$r"
}

pair 9 3 9 3 9 7 9

The Frog Jump Exercise is another application of (( ... )) magic. Very little is learned, except that local x=1 y=$x will not work in bash, and that to catch those kinds of errors you should always run your bash scripts with set -o nounset.

function jump {
  local a=$1 b=$2 c=$3
  local d=$(( b - a ))
  local j=$(( d / c ))
  (( j * c == d )) || (( j += 1 ))
  echo "$j"
}

jump 10 85 30

The Tape Equilibrium Exercise is where we discover that we don’t have an abs function in the shell (in Korn shell we do, though).

function balance {
  local b=0 a=0 i d

  # compute the total weight of the bar
  for i in "$@" ; do
    (( b += i ))
  done

  # set max to unbalanced (0 -- all) distribution
  local m=$b
  for i in "$@" ; do

    # move a piece from left to right
    (( a += i )) && (( b -= i ))

    #  compute the absolute difference
    (( b > a )) && (( d = b - a )) || (( d = a - b ))

    # update low water mark
    (( d < m )) && (( m = d ))
  done
  echo "$m"
}

balance 3 1 2 4 3

To solve the Permutation Missing Element Exercise we can use the fact that the sum of an arithmetic progression has a closed form and can be computed in O(1); here the numbers 1 through n + 1 sum to (n + 2)(n + 1) / 2, where n is the length of the input. If we compare this theoretical sum to the actual sum of the array, the difference gives us the missing element.

function missing {
  local s=0 n="$#" i

  for i in "$@" ; do
    (( s += i ))
  done
  echo $(( (n + 2) * (n + 1) / 2 - s ))
}

missing 2 3 1 5

The Frog River One exercise kind of pushes us towards proper ADTs, in this case hash maps. I cheated a bit here, storing the numbers in a :-separated string a. As this datatype doesn’t track the size of our ‘list’, we’d better store the size in an additional state variable w.

function wait {
  # get the width of the river
  local x=$1
  shift

  # a is a list :1:6:7: with O(n) for membership check
  local i a=':' d c=0 b w=0
  for i in "$@" ; do
    if [[ "$a" != *":${i}:"* ]] ; then
      # add item to list
      a="${a}${i}:"
      (( w += 1 ))
      (( w == x )) && { echo "$c" ; return ; }
    fi
    (( c += 1 ))
  done
}

wait 5 1 3 1 4 2 3 5 4

Another Permutations Exercise, another cheat :). I just sort the values, which is an O(n log n) operation, and check whether the sorted order contains gaps. It’s not as fast as O(n), but definitely better than naive repeated linear scanning, which is O(n * n).

function permchk {
  local i c=1 a=$( echo "$@" | tr " " "\n" | sort -n | tr "\n" " " )
  for i in $a ; do
    (( c == i )) && (( c += 1 )) || { echo false ; return ; }
  done
  echo true
}

permchk 4 3 1 2
permchk 4 1 3

The next Missing Integer Exercise is pretty much a variation on the theme; just add uniqueness to the sort with -u to skip over duplicates:

function findmiss {
  local i c=1 a=$( echo "$@" | tr " " "\n" | sort -nu | tr "\n" " " )
  for i in $a ; do
    (( c == i )) && (( c += 1 )) || { echo $c ; return ; }
  done
}

findmiss 1 3 6 4 1 2

So far the ugliest solution, for the Max Counts Exercise, where I try to use a fixed-length delimited string as a poor man’s array. The thorny path led me to a few realizations. Firstly, {1..n} expansions don’t “understand” variables, therefore {1..$n} won’t work. Secondly, there’s no simple way to repeat a string multiple times like "hello" * 5 in Python; I had to write a function instead. Thirdly, it’s not so easy to work with strings as arrays, as the constant padding and trimming is no fun and adds significant clutter, so I’ll probably pass on this “data structure” in the future.

readonly W=5

# repeat string $1 $2 times
function rep {
  local a="" i
  for (( i = 0 ; i < $2 ; i++ )) ; do
    a="$a$1"
  done
  echo "$a"
}

# pad the value with spaces from left
function pad {
  printf "%${W}s" $1
}

function count {
  local n=$1
  local m=$( pad 0 )
  # initialize array "....0....0....0....0"
  local a=$( rep "$m" $n )
  shift

  local i v w b c
  for i in "$@" ; do
    if (( i > n )) ; then
      # refill with max
      a=$( rep "$m" $n )
    else
      (( b = (i - 1) * W ))
      (( c = i * W ))
      # the ugly bit, update the counter, and pad it again
      v=$( pad $(( ${a:b:W} + 1 )) )
      (( v > m )) && m=$v
      a="${a:0:b}${v}${a:c}"
    fi
    echo "$a"
  done
}

count 5 3 4 4 6 1 4 4

Next came another Counting Exercise. The solution is trivial; I am adding the code for completeness’ sake.

function pair {
  local i s=0 r=0
  for i in "$@" ; do
    (( s += i ))
  done
  for i in "$@" ; do
    (( i == 0 )) && (( r += s )) || (( s-- ))
  done
  echo "$r"
}

pair 0 1 0 1 1

After that, a simple exercise in rounding up and down. Move a right to the nearest number divisible by c, move b left to the nearest number divisible by c, compute the number of intervals fitting in between, and add one to count the endpoints.

function count {
  local a=$1 b=$2 c=$3 n=0
  (( a % c == 0 )) || (( a = (a / c + 1) * c ))
  (( b = (b / c) * c ))
  echo $(( (b - a) / c + 1 ))
}

count 6 11 2 # 6 8 10
count 6 10 2 # 6 8 10
count 7 12 3 # 9 12

At some point I’ll need to start using proper data types such as hash maps. Well, maybe next time. For the Genomic Range Query I’ve just created 4 arrays for the prefix counts, which will do for now but is clearly very verbose. The learnings from this exercise: looping through the characters of a string needs a call to fold, array concatenation +=(1) is very different from +=1, where the second expression just quietly promotes (demotes?) the array to a string, and && || && || chains don’t always work for long statements with arithmetic context (( )). I will need to explore more on the last point.

function nucleo {
  local n=$1
  shift

  local a=0 c=0 g=0 t=0
  declare -a as=(0) cs=(0) gs=(0) ts=(0)
  for i in $( echo "$n" | fold -w1 ) ; do
    case "$i" in
      A) (( a++ )) ;;
      C) (( c++ )) ;;
      G) (( g++ )) ;;
      T) (( t++ )) ;;
    esac
    as+=($a) ; cs+=($c) ; gs+=($g) ; ts+=($t)
  done

  declare -a b vs
  for i in "$@" ; do
    b=( ${i//:/ } )
    p=${b[0]}
    q=$(( ${b[1]} + 1 ))
    if   (( as[q] - as[p] > 0 )) ; then vs+=(1)
    elif (( cs[q] - cs[p] > 0 )) ; then vs+=(2)
    elif (( gs[q] - gs[p] > 0 )) ; then vs+=(3)
    else                                vs+=(4)
    fi
  done
  echo "${vs[@]}"
}

nucleo CAGCCTA 2:4 5:5 0:6

By the way, we probably don’t need ts, as we infer the existence of Ts indirectly in the else branch.

The next exercise is as boring as the rest of the summation exercises. The job is greatly obscured by the syntax. Learnings from this one are none, except that I am running into fewer and fewer gotchas with bash arrays.

function minavg {
  # scan, use s as threshold
  declare a=(0)
  local i s=0
  for i in "$@" ; do
    (( s+= i ))
    a+=($s)
  done

  # do not divide, work with the average multiplied by 6
  local v r
  for (( i = 2 ; i < ${#a[@]} ; i++ )) {
    (( v = (a[i] - a[i - 2]) * 3 ))
    (( v < s )) && (( s = v )) && (( r = i - 2 ))
    if (( i > 2 )) ; then
      (( v = (a[i] - a[i - 3]) * 2 ))
      (( v < s )) && (( s = v )) && (( r = i - 3 ))
    fi
  }
  echo "$r"
}

minavg 4 2 2 5 1 5 8

The triangle exercise is actually the first one making bash shine… a bit. Making an array out of an input string and sorting it numerically in one line is not bad at all. It took me literally 2 minutes to solve, especially because they’ve put O(n log n) into the expected complexity. As one of my professors used to say: if your math teacher asks you ‘which number’, the right answer is almost always “0”, sometimes “1”.

function triangle {
  local i v
  declare -a t r=()
  for i in "$@" ; do
    t=($( echo "$i" | tr ":" "\n" | sort -n | tr "\n" " " ))
    r+=($(( t[0] + t[1] > t[2] )))
  done
  echo "${r[@]}"
}

triangle 10:2:5 1:8:20 1:1:1

Followed up by another short exercise leveraging sort -u. Nothing learned, except maybe that uniq compares adjacent lines only, so there’s no way to avoid explicit sorting.

function distcount {
  declare -a v=($( echo "$@" | tr " " "\n" | sort -u | tr "\n" " " ))
  echo "${#v[@]}"
}

distcount 2 1 1 2 3 1

And another one aiming at the same “skill”.

function maxprod {
  declare -a i=($( echo "$@" | tr " " "\n" | sort -n | tr "\n" " " ))
  local a b n=${#i[@]}
  (( a = i[0] * i[1] * i[n - 1] ))
  (( b = i[n - 3] * i[n - 2] * i[n - 1] ))
  (( a > b )) && echo "$a" || echo "$b"
}

maxprod -3 1 2 -2 5 6

At this point I have to admit I don’t commit nearly as many syntactic errors as before. So let’s keep going and see where it takes me.

I spent quite some time understanding the next exercise but decided to cheat a bit and take someone else’s solution. Interestingly, bash really excelled in this exercise; the complete solution is not much longer than Luca’s Python code. The learnings are numerous. Firstly, you can pipe loop output into the next construct, thus avoiding intermediate storage. Secondly, pipes operate in their own namespace, so assigning an outer variable from within piped code is horribly complex. Thirdly, once in pipes, stay in pipes. Still, I don’t know how to reference the input array $@ from an arithmetic context (( )); I will have to explore that later.

function discint {
  declare -a a=("$@")
  local i b=0 c=0
  for (( i = 0 ; i < ${#@} ; i++ )) ; do
    echo "$(( i - a[i] )) 1"
    echo "$(( i + a[i] )) -1"
  done | grep -v "^$" | sort -k1,1n -k2,2rn | while read _ i ; do
    (( i > 0 )) && (( b += c ))
    (( c += i ))
    echo "${b}"
  done | tail -1
}

discint 1 5 2 1 4 0

And after the last high, another low with the fish. A considerable amount of time is spent bending the data structures around to reduce the amount of clutter. I took the longer path of creating a pipeable function. I still haven’t figured out how to better return values from a pipe, other than with tail -1. Another pain is adding things to an empty array; there must be a better way than checking -z "${s+_}". This experience reminded me of my time at university, where all our code was highly optimized to run faster by completely disregarding readability and using counter-intuitive data types.

function unwrap {
  tr ' :' "\n\t" | while read -r w d ; do
    (( d == 1 )) && echo $(( w * -1 )) || echo "$w"
  done
}

function fish {
  declare -a s=()
  local i a l=0
  unwrap | while read i ; do
    # push the new fish onto the stack by default
    a=true
    while (( s[0] < 0 )) ; do
      if (( s[0] * -1 > i )) ; then
        # our upstream fish was eaten by the top of the stack
        a=false ; break
      else
        # top of the stack was eaten, reduce count and pop
        (( l-- )) ; s=("${s[@]:1}")
      fi
    done
    # adding new fish to the stack
    if [ $a == true ] ; then
      (( l++ ))
      [ -z "${s+_}" ] && s=($i) || s=("$i" "${s[@]}")
    fi
    # echo current length
    echo "$l"
  done | tail -1
}

echo 4:0 3:1 2:0 1:0 5:0 | fish

Followed by another stack exercise. Again the learning is to use data structures that reduce the amount of work bash has to do, like translating brackets to numbers with an offset of 3, which lets us match opening and closing brackets in one operation.

function brackets {
  tr "{[(}])" "123456" | fold -w1 | while read c ; do
    if (( c < 4)) ; then
      [ -z "${s+_}" ] && s=($c) || s=($c "${s[@]}")
    else
      (( c - s[0] == 3 )) && s=("${s[@]:1}") || break
    fi
    [ -z "${s+_}" ] && echo true || echo false
  done | tail -1
}

echo "{[()()]}" | brackets
echo "([)()]" | brackets

The next one was quite interesting, as it required a bit of set-theoretic proof to bi-directionally map two sets, thus reducing the problem to a much simpler one. The learning is really just one, but an important one: you can create an initializing block inside piped code by hiding it behind some sort of [ -z "${s+_}" ] test. Getting more and more used to pipes and stuff.

function wall {
  tr " " "\n" | while read h ; do
    [ -z "${s+_}" ] && { declare -a s=() ; local n=0 c=0 ; }
    while (( n > 0 )) && (( s[n-1] > h )) ; do (( n-- )) ; done
    if (( n == 0 )) || (( s[n-1] != h )) ; then
      (( c++ ))
      s[n]=${h}
      (( n++ ))
    fi
    echo "$c"
  done | tail -1
}

echo 8 8 5 7 9 8 7 4 8 | wall

The EquiLeader is easier to solve if we recognize that if a number dominates both parts then it will dominate the complete sequence. Therefore we compute the dominating number and its total count into the array s, and then run through the sequence checking the balances:

function eqlead {
  declare -a s=($( echo "$@" | tr " " "\n" | sort | uniq -c | sort -k1,1nr | head -1 ))
  local i a=0 b=$# l=0 r="${s[0]}"

  for i in "$@" ; do
    (( a++ )) ; (( b-- ))
    if (( i == s[1] )) ; then
      (( l++ )) ; (( r-- ))
    fi
    (( l * 2 > a )) && (( r * 2 > b )) && echo $(( a - 1 ))
  done
}

eqlead 4 3 4 4 4 2

The Dominator should probably have come before EquiLeader. Nothing interesting really, except that I tried not to loop over things explicitly and to use pipes as much as possible. I still need to capture intermediate results with $(), because I don’t see any nice way of branching and merging pipes, although this probably points in the right direction. Another surprise is that uniq -c does some unexpected padding on the counter column, so I am not sure how to cut -f1 the counter out of the results and have to go with read -r n _ instead. By the way, read -r n alone would capture complete lines.

function dominator {
  declare -a c=($( tr ' ' "\n" \
                 | sort \
                 | uniq -c \
                 | sort -k1,1nr \
                 | while read -r n _ ; do echo "$n" ; done \
                 | tr "\n" ' ' ))
  local s=$( echo "${c[@]}" | tr ' ' '+' | bc )
  echo $(( c[0] * 2 >  s ))
}

echo 3 4 3 2 3 -1 3 3 | dominator

While looking into the next one I discovered a strangely popular topic among bash programmers: how to read the first and the last line of a file. For documentation purposes, { IFS= read -r a ; echo "$a" ; tail -1 ; } worked for me. The algorithm is straightforward: if the new value is below the running minimum, update the minimum; otherwise check whether trading at this point would increase the running max profit, and update it if required.

function mprofit {
  local i m=$1 b=0
  for i in "${@:1}" ; do
    if (( i > m )) ; then
      (( i - m > b )) && (( b = i - m ))
    else
      (( m = i ))
    fi
  done
  echo "$b"
}

mprofit 23171 21011 21123 21366 21013 21367

The next task made me realize I have to write routines to reverse arrays myself. I implemented Kadane’s algorithm.

function kadane {
  declare -a a=("$@") b=(0)
  local i n=$# c=0 d=0
  for (( i = 1 ; i < n - 1 ; i++ )) ; do
    (( c + a[i] > 0 )) && (( c += a[i] )) || (( c = 0 ))
    (( d < c )) && (( d = c ))
    b+=($d)
  done
  b+=(0)
  echo "${b[@]}"
}

function rev {
  local i
  declare -a a=()
  for i in "$@" ; do [ -z "${a+_}" ] && a=($i) || a=($i "${a[@]}") ; done
  echo "${a[@]}"
}

function mdslice {
  declare -a p=($( kadane "$@" )) q=($( rev $( kadane $( rev "$@" ) )))
  local n=$# m=0 s
  for (( i = 1 ; i < n - 1 ; i++ )) {
    (( s = p[i-1] + q[i+1] ))
    (( s > m )) && (( m = s ))
  }
  echo "$m"
}

mdslice 3 2 6 -1 4 5 -1 2

I am skipping the max slice sum, as it is just a simpler version of the last one.

Following is a set of exercises that I expect will make me use bc a lot. The first one is about finding an integer square root:

function minrec {
  local i n=$( echo "scale=0;sqrt($1)" | bc ) a=$1
  for (( i = n ; i > 0 ; i-- )) ; do
    (( a % i == 0 )) && { echo $(( 2 * (i + (a / i)) )) ; break ; }
  done
}

minrec 30

And the following Count Factors Exercise is pretty much the same loop, just with a different counter and a different break condition.

function nfactors {
  local i n=$( echo "scale=0;sqrt($1)" | bc ) a=$1 r=0
  # set the counter to -1 if a is a square to not double count sqrt(a)
  (( n * n == a )) && (( r = -1 ))
  for (( i = n ; i > 0 ; i-- )) ; do
    (( a % i == 0 )) && (( r += 2 ))
  done
  echo "$r"
}

nfactors 24

For the Peaks Exercise I just took someone else’s code, as it was a fairly uninspiring problem. I’ve added comments to the logic; one interesting nuance of the algorithm is the part with integer divisions, c / i > f and c / i == f, used to check whether a peak falls within the current group. Very nice indeed. No new bash gotchas.

function peaks {
  local n=$# i f g s c
  declare -a a=("$@") p=()

  # compute all the peaks
  for (( i = 2 ; i < n - 1; i++ )) ; do
    (( a[i-1] < a[i] )) && (( a[i] > a[i+1] )) && p+=($i)
  done

  for (( i = 1 ; i <= n ; i++ )) ; do
    # for all group sizes i that evenly divide n
    (( n % i != 0 )) && continue

    # number of peaks found f, number of groups g, success s
    f=0 ; g=$(( n / i )) ; s=1

    for c in "${p[@]}" ; do
      # overshot, breaking out
      (( c / i > f )) && { s=0 ; break ; }
      # falls into the group, increase counter
      (( c / i == f )) && (( f++ ))
    done
    # if found less than g set success status to 0
    (( f != g )) && s=0
    # super successful exit
    (( s == 1 )) && { echo "$g" ; break; }
  done
}

peaks 1 2 3 4 3 4 1 2 3 4 6 2

The count semi-primes exercise is trivial, but it’s nice to play with memoization in bash. I’ve added two indices, PRIMES_INDEX and PRIMES, containing respectively a primality flag for every number, to quickly answer whether a number is prime, and the list of primes, to loop quickly through the prime numbers.

declare -a PRIMES_INDEX=(0 0 1 1 0 1) PRIMES=(2 3 5)
P_CURRENT=5

function primes {
  local b=$1 f p
  while (( P_CURRENT * P_CURRENT <= b )) ; do
    (( P_CURRENT++ ))
    f=1
    for p in "${PRIMES[@]}" ; do
      (( P_CURRENT % p == 0 )) && { f=0 ; break; }
    done
    (( PRIMES_INDEX[P_CURRENT] = f ))
    (( f == 1 )) && PRIMES+=($P_CURRENT)
  done
  echo "${PRIMES[@]}"
}

function semipr {
  local i j c p

  for i in "$@" ; do
    r=(${i//:/ })
    (( c = 0 ))
    primes "${r[1]}"
    for (( j = r[0] ; j <= r[1] ; j++ )) ; do
      for p in "${PRIMES[@]}" ; do
        (( p * p > j )) && break;
        if (( j % p == 0 )) && (( PRIMES_INDEX[j / p] == 1 )) ; then
          (( c++ ))
          break
        fi
      done
    done
    echo "$c"
  done
}

semipr 1:26 4:10 16:20 12:2035

Very much the same applies to count non-divisible: it makes perfect sense to create indices during a linear scan to avoid searching later on. In this case we need an index of occurrences os, which for every number i gives the number of times os[i] it occurs in the input array. The array cs holds the number of non-divisors; it is uniformly set to n at the beginning and then reduced while we loop over the divisors.

function nondiv {
  declare -a cs=() os=() rs=()
  local i=0 n=$# j d

  while (( i < 2 * n )) ; do (( os[i++]=0 )) ; done
  for a in "$@" ; do (( os[a]++ )) ; done

  (( i = 0 )) ; while (( i < 2 * n )) ; do (( cs[i++] = n )) ; done

  for (( d = 1 ; d < 2 * n ; d++ )) ; do
    (( os[d] == 0 )) && continue
    for (( j = d ; j < 2 * n ; j += d )) ; do
      (( cs[j] -= os[d] ))
    done
  done

  for a in "$@" ; do
    rs+=("${cs[$a]}")
  done

  echo "${rs[@]}"
}

nondiv 3 1 2 3 6

Followed by chocolates by numbers, where you have two different velocities around a circle, m and n. Let’s linearize the positions, so a[i+1] = a[i] + n. To check that two linearized positions represent the same place on the circle, we wrap around m; in other words, steps p and q represent the same position on the circle if a[p] % m == a[q] % m, or (a[p] - a[q]) % m == 0. And for the same two positions to be on the same progression, we need the difference to be divisible by n as well: (a[p] - a[q]) % n == 0. The walk therefore returns to its start after a linearized distance of the lcm of the two numbers, and the number of chocolates eaten is the circle size divided by the gcd of the two numbers, which is exactly what choc computes.

function gcd {
  local a=$1 b=$2 c
  while (( a % b != 0 )) ; do
    c=$a ; a=$b ; (( b = c % b ))
  done
  echo "$b"
}

function choc {
  local n=$1 m=$2 d=$( gcd "$1" "$2" )
  echo $(( n / d ))
}

choc 10 4

The common prime divisors exercise is interesting because of its reduce function. The GCD of two numbers should be all you need in order to reduce the numbers. Given the initial value a and the gcd, a / gcd should be a product of divisors of that very same gcd. So we extract the common part of a / gcd and the gcd, and reduce a / gcd by this amount. We proceed till the number is either reduced to 1 or cannot be reduced any further.

function gcd {
  local a=$1 b=$2 c
  while (( a % b != 0 )) ; do
    c=$a ; a=$b ; (( b = c % b ))
  done
  echo "$b"
}

function reduce {
  local a=$1 g=$2 f d r=1
  # first division
  (( f = a / g ))
  # need to be reduced as f is not a divisor of g
  while (( g % f != 0 )) ; do
    # find the amount the f needs to be reduced by
    d=$( gcd "$g" "$f" )
    # check if can be reduced
    (( d == 1 )) && { r=0 ; break ; } || (( f /= d ))
  done
  echo "$r"
}

function cpd {
  local i a b g p q v
  declare -a r
  for i in "$@" ; do
    r=(${i//:/ }) ; a="${r[0]}" ; b="${r[1]}"
    g=$( gcd "$a" "$b" )
    p=$( reduce "$a" "$g" ) ; q=$( reduce "$b" "$g" )
    (( p == 1 )) && (( q == 1 )) && (( v++ ))
  done
  echo "$v"
}

cpd 15:75 10:30 3:5 22:7744 44:23232

The ladder is a counting exercise which is just an implementation of Fibonacci numbers. Pretty much the only complexity is understanding that the number of paths leading to the n-th rung is fib(n).

declare -a FIB=(1 1 2)
declare FIB_I=2

function fib {
  local n=$1 f
  while (( FIB_I < n )) ; do
    (( FIB_I++ ))
    (( f = FIB[FIB_I - 2] + FIB[FIB_I - 1] ))
    FIB+=($f)
  done
  echo "${FIB[$n]}"
}

function lad {
  declare -a r
  local a b i f d
  for i in "$@" ; do
    r=(${i//:/ }) ; a="${r[0]}" ; b="${r[1]}"
    f=$( fib "$a" )
    d=$( echo "2^$b" | bc )
    echo $(( f % d ))
  done
}

lad 4:3 4:2 5:4 5:3 1:1

The Fibonacci frog is actually a dynamic programming exercise, which I was too lazy to implement properly just yet. I am traversing all possible paths instead, which is very bad: when I reach the same point c on two different paths, I compute the value function at point c twice, and nobody should ever do that. The learning is that writing recursive functions in bash is a pain, probably even more pain than writing iterative versions with manual stack handling.

declare -a FIB=(1 2)
declare FIB_I=1

# generate fibonacci numbers up to n
function fib {
  local n=$1 f
  while (( FIB[FIB_I] <= n )) ; do
    (( FIB_I++ ))
    (( f = FIB[FIB_I - 2] + FIB[FIB_I - 1] ))
    FIB+=($f)
  done
}

function jmp {
  # c - current position
  # d - next position
  # f - iterator over fibonacci numbers
  # m - length of min path, -1 if no path from current point
  # v - the length of sub path starting from d = c + f
  local c=$1 d f m v
  shift
  # n is the length of the padded array
  local n=$#
  declare -a as=("$@")

  # check if we reached the end
  if (( c == n - 1 )) ; then
    echo "0"
  else
    m=-1
    for f in "${FIB[@]}" ; do
      # check if jumping over f is a valid point
      (( d = c + f ))
      (( d < n )) || continue
      (( as[d] == 1 )) || continue
      # find the optimal path length starting from d
      v=$( jmp $d "$@" )
      # update minimum
      if (( v >= 0 )) ; then
        if (( m == -1 )) || (( v + 1 < m )) ; then
          (( m = v + 1 ))
        fi
      fi
    done
    echo "$m"
  fi
}

function frog {
  # pre-compute fibonacci numbers
  fib "$#"
  # pad the array with terminal 1 to denote terminal position
  jmp -1 "$@" 1
}

frog 0 0 0 1 1 0 1 0 0 0 0

Enough.